We live in an era in which the generation of big datasets is commonplace. However, the reference datasets used for alignment and assignment in these analyses lack genetic diversity, which places limitations and caveats on the interpretation of results.
Liao et al. have published the interim results of the Human Pangenome Reference Consortium, which aimed to produce the first human pangenome reference sets for common big data analyses. Using near-complete diploid genomic information from 47 diverse individuals, the research team applied multiple RNA- and DNA-based sequencing modalities, including 10x Genomics, microarray, high-fidelity sequencing and structural variant genotyping, to generate the human pangenome.
Complete findings and datasets will be published in a follow-up to this study. However, these interim data provide valuable insight into how the pangenome references differ from, and add diversity to, current reference datasets. Moreover, the results of this study will bring important genetic diversity to the reference sets used for big datasets across multiple sequencing platforms.
Read the full article in Nature 617, 312–324.