Unlocking the World’s Largest Human Genome Sequence Dataset for Researchers

The sources of human variability can be revealed by plumbing our genetic code. Credit: Peter Summers/Getty

The world’s largest collection of full human genomes has just gone live. The UK Biobank — a repository of health, genomic and other biological data — today released complete genome sequences from every one of the 500,000 British volunteers in the database. Researchers around the world can apply for access to the data, which lack identifiable details, and use them to probe the genetic basis for health and disease.

“Scientists are looking at this like Google Maps,” Rory Collins, the UK Biobank’s chief executive, said at a press briefing. “When they want to know what are the pathways from lifestyle, environment, genetics to disease, they don’t go Google, they go to UK Biobank.”

Today’s bonanza releases the complete 3-billion-letter genome sequence for every UK Biobank participant, and follows the 2021 release of whole genomes from 200,000 Biobank participants. The £200-million (US$250-million) effort was funded by the biomedical-research funder Wellcome, the UK government and several pharmaceutical companies — which, in return, got access to the data 9 months before their wider release.

Previously, the UK Biobank’s genetic information included entire ‘exomes’ (each being the 2% of the genome that codes for proteins) and, before that, 850,000 common single-letter DNA variants that were spread across the genome. The latter information powers genome-wide association studies (GWAS) linking health and genetics.

Rare variants

But when researchers look for associations between genetics and disease or other traits, most of these ‘hits’ turn up in non-coding regions of the genome that are missing from exome sequences and are covered only at low resolution in existing genome-wide data. Whole genomes also allow researchers to spot very rare mutations, says Michael Weedon, a human geneticist at the University of Exeter, UK. “We’re hoping rare variants give us more insight into biology.”

That is already proving to be the case. In a 20 November preprint (ref. 1), a team led by Weedon and Gareth Hawkes, a human geneticist also at Exeter, mined the first 200,000 complete genomes in the UK Biobank data and found 29 rare DNA variants that were implicated in height differences as large as 7 centimetres; these variants had not been spotted in previous genetic research. The study was a pilot for analysing all 500,000 genomes, says Weedon, who plans to spend the day taking a first look at the genome data.

Ultimately, researchers will need many more than half a million full genomes to comprehensively map associations between rare gene variants and health, says Weedon. “I’d see this as a good next step to getting the millions of samples we probably need.”

Disease links

Those numbers are on the horizon. The All of Us study, funded by the US government, plans to eventually release whole genome and health data from one million or more people in the United States. The effort has released 250,000 genomes, but did not start accepting applications from non-US researchers to study the data until August. Databases such as All of Us will also be useful for confirming links uncovered with the UK Biobank, say researchers.

Going by what’s been learnt from the first 200,000 genomes in the UK Biobank, Andrea Ganna, a statistical geneticist at the University of Helsinki, isn’t yet convinced that they provide much bang for the buck. Many of the non-coding variants picked up by whole-genome studies such as Weedon and Hawkes’s are close to hits already found through GWAS. Still, complete genome sequences might help researchers to map disease links more accurately to structural variations (missing, extra or flipped-around chunks of DNA), says Ganna.

The UK Biobank has already given rise to more than 9,000 publications, and the true impact of the latest release might not be clear for some time, Collins said. “I think we’ll be surprised by how much comes out that we haven’t even imagined.”

Reference
