The Next Big Life Sciences Opportunity: Leveraging Data from Population-scale Sequencing Projects

Posted: May 2019

A sharp rise in population-scale genome sequencing projects creates opportunities for biopharma companies and genomics service providers to fuel the future of drug discovery and development.

Population-scale genome and exome sequencing is gathering pace, creating opportunities for biopharma companies and genomics service providers to speed up drug discovery and develop personalised treatments for a wide variety of diseases.

Advances in sequencing technology and a decline in sequencing costs have lowered the barriers to conducting population-scale projects. Our research identified more than 50 such projects underway worldwide, providing unprecedented insights into populations, diseases, targets, molecular pathways and drugs.


What does a population-scale project typically involve?

A typical project involves the sequencing or genotyping of DNA or RNA obtained from a large number of diseased or healthy individuals.

It involves the collection of blood specimens, along with other datasets in the form of electronic health records, surveys or medical reports. Genomic data can then be integrated with these datasets to help establish links between genomic variation and phenotypic traits.


50+ projects now underway worldwide

Through our work in population genomics, we have identified more than 50 ongoing or completed projects around the globe – conducted by biopharma companies, academic institutions, government bodies, consumer genomics companies and biobanks (see map).


Population-scale sequencing initiatives – data generated by key projects

Of these 50 projects, the top 16 are expected to account for ~92% of the world’s genomic data by 2022. Key initiatives include:


United Kingdom

  • Genomics England’s 100,000 Genomes Project has sequenced 100,000 genomes from around 85,000 people. Participants are National Health Service patients with a rare disease and their families, as well as patients with cancer
  • Following completion of the 100,000 Genomes Project, the UK government has shown a keen interest in making whole genome sequencing part of routine care

United States

  • The All of Us Research Program by the US National Institutes of Health describes itself as a historic effort to gather data from one million or more people living in the US to accelerate research and improve health
  • Geisinger – a US-based healthcare system – has already made DNA sequencing part of routine care for all its patients


Asia

  • The GenomeAsia 100K Project by MedGenome aims to sequence and analyse 100,000 genomes of Asian individuals to help accelerate Asian population-specific medical advances and precision medicine
  • In January 2019, the Hong Kong government announced a project aimed at sequencing 20,000 patients, beginning 2019–2020
  • Also in January 2019, the Bangladesh government inaugurated its first project, which is expected to sequence an initial 100 samples using the NovaSeq 6000 platform

Strong government and academic interest in driving these initiatives means the field of population-scale sequencing should continue to expand over the coming years, creating huge opportunities for biopharma companies and genomics service providers.


Opportunities for biopharma companies to accelerate drug discovery

The opportunity for biopharma companies lies in leveraging insights from population-scale initiatives to accelerate drug discovery and development of personalised therapies.

Large-scale initiatives using whole-genome or exome sequencing have led to the discovery of a huge number of novel sequence variants in known pharmacogenes. Selecting drug targets based on the molecular characterisation of pharmacogenes during drug discovery can lead to innovative drugs and also improve the success rate of clinical trials.

Moreover, during clinical trials, these initiatives serve a twofold purpose – helping to predict patients at risk for an altered drug response, and aiding dose selection for clinical trial participants with a high variability in genetic markers.

In January 2018, pharma giants including AbbVie, AstraZeneca and Pfizer formed a consortium with Regeneron to reshape drug development and speed up sequencing of 500,000 exomes of volunteer participants within the UK Biobank by three years (from 2022 to 2019).

The data generated will be integrated with UK Biobank’s patient details and de-identified medical and health records to understand the links between genetic variation and diseases.


Opportunities for genomics service providers to offer data analytics platforms

The huge amount of genomic data generated by population-scale initiatives is a key resource for understanding differential drug responses.

Genomics service providers have an opportunity to make this resource available to researchers or biopharma companies by:

  • Providing sequencing, analysis and storage services for population-scale projects: For instance, Seven Bridges Genomics, a cloud-based genomic data analysis provider, supports multiple population genomics projects, including Simons Foundation’s Genome Diversity Project, the Million Veteran Program and the 100,000 Genomes Project
  • Integrating genomics data with other datasets such as phenotypic and clinical data: The integrated dataset allows researchers to improve translational research, biomarker discovery or clinical-stage activities. Companies that provide data integration or harmonisation services include WuXi NextCODE, PerkinElmer and Oracle

Many biopharma companies are already making use of population genomics data by processing it on freely available or paid informatics platforms. Johnson & Johnson, AstraZeneca, Sanofi and Roche, for example, have all adopted tranSMART – a freely available informatics platform for ingestion and harmonisation of datasets obtained from population-scale sequencing projects.

Companies are now also looking at paid platforms, such as PerkinElmer’s Signals Translational platform, to improve drug discovery and development by making use of genomic-phenotypic datasets.


A major opportunity that demands close ongoing attention

Our work in this space suggests that the data generated by population-scale sequencing initiatives is likely to grow at a CAGR of 50% during 2018–2022, reaching 200 PB by 2022 (up from 40 PB in 2018), which represents a lucrative opportunity for genomics service providers.
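The growth figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes four annual compounding periods between 2018 and 2022, and the function names `cagr` and `project` are hypothetical helpers written for this example, not part of any analysis tool mentioned in this post.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` annually for `years` periods."""
    return start_value * (1 + rate) ** years


# Figures quoted in the post: 40 PB in 2018, 200 PB by 2022.
start_pb, end_pb = 40, 200
years = 2022 - 2018  # assumption: four annual compounding periods

implied_rate = cagr(start_pb, end_pb, years)
projected_pb = project(start_pb, 0.50, years)

print(f"CAGR implied by 40 PB -> 200 PB: {implied_rate:.1%}")
print(f"40 PB compounded at 50% for {years} years: {projected_pb:.1f} PB")
```

Under these assumptions the implied rate comes out just under 50%, and compounding 40 PB at exactly 50% a year lands slightly above 200 PB, so the two quoted figures are consistent after rounding.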

As population-scale sequencing initiatives continue to rise and generate more data, it becomes vital for biopharma companies and genomics service providers to keep an eye on the opportunities that lie ahead.


Leverage market intelligence from The Smart Cube to drive life sciences strategy

The Smart Cube has significant experience in helping clients navigate this space by providing insights into competitor activity, assessing market opportunities from the data that is likely to be generated, and evaluating collaborative opportunities.

If you’d like to know more about how The Smart Cube can help you better understand current market opportunities in the life sciences arena, please do get in touch.

Thanks to Deepanshu Jain, Senior Analyst, Life Sciences, for additional inputs into this blog.

  • Komal Khandelwal

    In her seven years with The Smart Cube, Komal has risen through the ranks from Senior Analyst to her current role of Senior Manager. Managing some of the firm’s key large accounts, she is responsible for providing solutions that enable operational excellence, innovation and demand management to clients in the Life Sciences sector, particularly in the procurement and supply chain domain. Komal specialises in consumer healthcare products and the OTC space, and manages end-to-end delivery of projects in these areas. Outside work, she likes to travel and experience different cultures across the globe.