Breaking the implementation barrier for polygenic scores
From embryo selection to clinical decision-making
Two weeks ago, polygenic risk scores (PRS) made headlines again, after a newly launched biotech company announced plans to offer PRS-based embryo selection for women undergoing IVF. While this is arguably the most controversial potential application of PRSs, far less attention was paid to a more basic reality: PRSs remain a research tool, not a clinical one. Despite nearly two decades of research, implementation in real-world patient care is absent. PRSs are not included in any routine medical guidelines, and no major specialty society currently recommends their use for clinical decision-making. Back in 2018, I asked a leading scientist at the forefront of PRS research when they might enter practice. “Five years,” he replied confidently. Here, I explore realistic scenarios for how PRSs could finally make the leap from research and hype into routine medicine. Ideally before we begin selecting embryos based on them.
The promise of polygenic scores for disease prediction
PRSs promised to translate the complex insights from the genomic revolution into clinical value for disease prediction and personalized decision-making. Genetic testing has long been routine for diseases with a clear genetic etiology, such as phenylketonuria, muscular dystrophy, hypertrophic cardiomyopathy, Down syndrome, or epileptic encephalopathies. But whether genetics could also inform risk prediction for the vast majority of so-called “complex” diseases has remained an open question. Complex diseases are those whose causes we only partly understand. They are not determined by single genetic mutations and have a multifactorial etiology, where both inherited genetic variation and environmental exposures play important roles. Common examples include heart disease, cancer, diabetes, and Alzheimer’s disease.
For most complex diseases, there is broad consensus that a genetic component contributes to their pathogenesis. Epidemiologic studies consistently demonstrate within-family clustering of disease outcomes. For example, women with first-degree relatives affected by breast cancer have up to 4 times higher risk themselves, a finding that directly informs clinical guidelines recommending earlier mammography screening for this group. Complementing these observations, twin studies, despite controversies about the precision of their heritability estimates, show higher concordance for most disease traits in monozygotic twins (sharing 100% of their DNA) than in dizygotic twins (sharing ca. 50% of their DNA), pointing to a role of genetic factors in their etiology.

Although highly variable across outcomes, these patterns clearly suggest that genetic variation contributes measurably to disease susceptibility. There are examples of complex traits where single variants drive a large proportion of this susceptibility (BRCA1 and BRCA2 in breast cancer or APOE4 in Alzheimer’s disease), but for most diseases, risk is spread across many genetic variants throughout the genome. Thus, quantifying genetic predisposition at the individual level is challenging.
PRSs were developed as biomarkers that address this gap. Over the past two decades, genome-wide association studies (GWAS) involving up to millions of participants have identified thousands of variants associated with human diseases (the timeline from the GWAS catalogue illustrates the rapid increase in number of variants associated with diverse traits over the years).
As GWAS sample sizes grew and more risk variants were identified with greater certainty, PRS performance steadily improved. Parallel advances in computational methods for modeling the joint effects of variants further enhanced predictive accuracy. The latest generation of PRSs now reach levels of predictive power comparable to well-established clinical biomarkers routinely used in risk stratification. However, this does not guarantee clinical utility.

Do polygenic risk scores have implementation potential?
It is reasonable to assume that PRSs have potential for clinical implementation. After all, family history is a core component of the standardized physician interview taught in medical school. Yet, family history is a low-resolution biomarker of genetic predisposition: a negative history does not rule out risk, and a positive history may reflect shared environmental exposures rather than genetic susceptibility. Why not replace, or at least complement, family history with much more precise quantifications of genetic predisposition? PRSs theoretically require only a one-time sample during our lifetimes, from which up-to-date scores for any disease could be calculated. Given that genotyping costs are now very low (< $50/sample), the approach appears highly cost-effective.
However, despite recent improvements, polygenic predictions are still not very accurate. For example, recent PRSs from Herasight explain at most 10–20% of the variance for most diseases.

The probabilistic, non-diagnostic nature of PRSs makes them challenging to interpret at an individual level, but also to communicate to both physicians and patients. Their interpretation depends heavily on baseline individual risk, including demographics such as age, sex, and ancestry, as well as other known and unknown risk factors. Most PRSs have been trained on datasets predominantly of European ancestry, limiting their transferability to individuals from other ancestral backgrounds.
Interpretation difficulties are compounded when PRSs are constructed using different training cohorts or computational methods. In a recent study comparing 46 PRSs for coronary artery disease, individual-level agreement was very poor: 80% of participants were classified in both the highest and lowest 20% of some PRS distributions. This variability raises concerns about interpretation, particularly in a landscape where multiple companies offer their own score calculators.

Finally, although genotyping costs are low, implementing PRSs in a clinical context requires infrastructure for data storage and computational analysis. Such resources remain unavailable in many laboratories.
Nevertheless, it would be unfair to claim that PRSs have no implementation potential. For all arguments above, there are very strong counterarguments. Clinicians are already familiar with risk prediction tools, which are routinely used in practice, and most can interpret and communicate them to patients. The argument of complexity is a very weak one to use when it comes to clinical routine. In my experience, researchers tend to underestimate the nuanced risk-benefit decisions that physicians navigate daily while advising patients.
The argument of modest predictive accuracy is valid. However, it can be countered by the fact that many predictive tools used routinely in clinical medicine also have limited precision. Cardiovascular risk estimation is a prime example. Despite decades of research, well-established risk factors, and large cohorts, our risk estimators remain imperfect. The PREVENT equations recommended by the American Heart Association achieve a 10-year risk discrimination of about 65%. In other words, if we randomly select two individuals, one who will develop cardiovascular disease in the next 10 years and one who will not, the model assigns a higher risk to the wrong individual 35% of the time. (Importantly, the reference point for discrimination is not 0% but 50%, the accuracy of a coin flip!) Yet these tools are widely accepted and guide preventive decision-making in everyday practice. By comparison, a polygenic risk score for coronary artery disease together with age and sex has demonstrated a c-index of around 70%, and when added to conventional estimators, improves risk discrimination by up to 5%.
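To make the pairwise reading of discrimination concrete, here is a minimal sketch of how a c-index can be computed for a binary outcome. The function name and the toy risks and outcomes are hypothetical and chosen only for illustration; real cardiovascular risk models use time-to-event data with censoring, which this simplified version ignores.

```python
from itertools import product


def concordance_index(risks, outcomes):
    """Share of (case, non-case) pairs in which the case has the higher predicted risk.

    Ties in predicted risk count as half a concordant pair.
    A value of 0.5 corresponds to a coin flip, 1.0 to perfect ranking.
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    score = 0.0
    for case_risk, non_case_risk in product(cases, non_cases):
        if case_risk > non_case_risk:
            score += 1.0
        elif case_risk == non_case_risk:
            score += 0.5
    return score / (len(cases) * len(non_cases))


# Hypothetical predicted 10-year risks and observed outcomes (1 = event occurred).
predicted = [0.02, 0.05, 0.08, 0.12, 0.20, 0.30]
observed = [0, 0, 1, 0, 1, 1]

print(f"c-index: {concordance_index(predicted, observed):.2f}")
```

In this toy data set, 8 of the 9 case/non-case pairs are ranked correctly. A model with a c-index of 65% would rank roughly 65 of every 100 such pairs correctly.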
Furthermore, PRSs show substantial separation at the extremes of their distributions. In the UK Biobank, individuals in the highest percentile of a BMI PRS had a 60% prevalence of obesity (BMI > 30 kg/m²), compared with <1% among those in the lowest percentile. Likewise, those in the top percentile of a coronary artery disease PRS had a disease prevalence of 16%, versus <1% in the bottom percentile. Such differences are meaningful and carry real potential to influence clinical decision-making.
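A score that explains only a modest share of variance can still separate the tails this strongly, and a small simulation helps show why. The sketch below assumes a simple liability-threshold model in which disease occurs when a normally distributed liability crosses a cutoff. The function name, the 15% variance explained, and the 5% prevalence are illustrative assumptions, not values taken from the studies cited above.

```python
import random
from statistics import NormalDist


def simulate_prevalence_by_percentile(n=200_000, r2=0.15, prevalence=0.05, seed=0):
    """Simulate a liability-threshold model (an assumption of this sketch).

    liability = sqrt(r2) * PRS + sqrt(1 - r2) * everything else, both standard normal.
    Returns disease prevalence in the bottom PRS percentile, overall,
    and in the top PRS percentile.
    """
    rng = random.Random(seed)
    # Liability cutoff that yields the requested population prevalence.
    threshold = NormalDist().inv_cdf(1 - prevalence)

    people = []
    for _ in range(n):
        prs = rng.gauss(0, 1)
        other = rng.gauss(0, 1)  # non-PRS genetic and environmental contributions
        liability = (r2 ** 0.5) * prs + ((1 - r2) ** 0.5) * other
        people.append((prs, liability > threshold))

    people.sort(key=lambda person: person[0])  # order by PRS
    k = n // 100  # size of one percentile
    bottom = sum(is_case for _, is_case in people[:k]) / k
    top = sum(is_case for _, is_case in people[-k:]) / k
    overall = sum(is_case for _, is_case in people) / n
    return bottom, overall, top


bottom, overall, top = simulate_prevalence_by_percentile()
print(f"bottom 1%: {bottom:.3f}  overall: {overall:.3f}  top 1%: {top:.3f}")
```

Under these assumptions, prevalence in the top percentile comes out several times higher than the population average, while the bottom percentile is nearly free of cases, even though the score explains only 15% of the liability variance.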

Regarding the ancestry argument, the academic community has long recognized this limitation, and recent years have seen substantial progress in cross-ancestry prediction. As more genotyping data are collected from underrepresented populations, PRS performance will continue to improve. In parallel, the field is developing standards for PRS computation that should enable greater consistency and comparability across scores. Finally, in the computational era, laboratories will need to scale their infrastructure toward data science regardless of whether PRSs ultimately enter routine clinical care.
Unfortunately, discussions around PRS implementation often focus on regulatory, administrative, ethical, or moral considerations and do not address concrete scenarios in which PRSs could have clear, actionable utility. Biomarkers do not exist to fuel philosophical debates. Their purpose is to guide decision-making and interventions.
Entry points
Let’s consider the potential of PRS implementation throughout the lifecourse. Broadly, strategies could be split into three categories based on timing:
pre-implantation testing
disease-agnostic implementation of genome sequencing at birth
at different timepoints throughout the lifetime
I will refrain from discussing the first approach in depth, as there is currently no reliable way to test PRS-based pre-implantation selection for late-life diseases in our lifetimes. Companies offering such services are taking substantial risks without meaningful evidence. Selecting embryos based on a probabilistic biomarker is fundamentally different from screening for established causal high-risk mutations, as pleiotropic effects make outcomes unpredictable. Did any of these companies test in detail the associations of their proposed PRSs with all potential outcomes to ensure safety? Most proposed applications focus on traits or diseases that manifest decades later, making short- or even mid-term testing (one could say conveniently) unrealistic. Most women pursue IVF to overcome infertility, not to produce a genius child. Why not focus on reproductive outcomes? Even the most specialized centers, after screening for chromosomal abnormalities, barely cross 50% live birth rates per implanted euploid embryo. Constructing a PRS for a live birth outcome, and testing it in a randomized selection vs. no selection framework, would arguably be far more sensible than directly implementing a controversial PRS for intelligence. Alternatively, embryo selection for childhood disease outcomes that impact quality of life, such as asthma, type 1 diabetes, or epilepsy, could be tested in a research setting within a realistic 10-year timeframe. And yet, testing directly for IQ? It is noteworthy that the current dystopic regulatory frameworks demand a decade of trials for a novel drug targeting incurable stage 4 cancer, while nothing prevents companies from offering tests with lifelong individual and potentially societal implications.
The second option is probably going to generate considerable discussion in the near future. Some healthcare systems are about to implement or are debating universal whole-genome sequencing for newborns, which would provide a permanent resource for calculating up-to-date PRSs across most diseases. This approach has transformative potential and in the long run is probably the way that genomics is going to affect healthcare the most. It would facilitate both observational and interventional research, as physicians and scientists could access PRSs without running additional tests for study purposes, helping identify actionable opportunities over time. In the long term, it would also enable a real transition towards precision preventive medicine with closer follow-up of individuals at higher risk of specific diseases. However, realizing the full potential of this strategy will likely take decades.
The third option involves identifying clinical scenarios in which PRSs have direct actionability. While epidemiological studies often present elegant population-level graphs, physicians make decisions for individual patients only when a test informs a downstream intervention. Linking PRSs in specific populations to clearly relevant disease outcomes may be less sexy than pre-implantation or neonatal screening, but it currently represents the fastest, safest, and most practical path toward implementation. This approach could be tested and integrated within a < 10-year timeframe.
Identifying clinical scenarios for PRS implementation
Several efforts have explored where PRSs could have real-world impact across the lifespan, from childhood-implemented prevention to cancer screening, cardiovascular risk prediction, and treatment decisions. Below, I outline selected examples at a high level, without diving into field-specific nuances.
Preventing obesity in childhood
Childhood presents a potential window for implementation of early prevention strategies. BMI trajectories are relatively stable from adolescence into adulthood, and preventing weight gain early is far easier than reversing obesity later. Early-onset obesity strongly predicts cardiometabolic complications, reduced quality of life, and increased healthcare burden. A recent study showed that a PRS for BMI trained on a massive dataset of 5.1M individuals can predict weight trajectories as early as age 8. Children in the top 10% of the PRS distribution already had considerably higher BMIs than those in the bottom 10%, with differences widening over time. The PRS outperformed common early-life predictors, such as maternal BMI and birthweight. This discrimination could allow pediatricians to target high-risk children for lifestyle interventions, including structured physical activity programs, nutritional counseling, and family-based support. Such PRS-guided strategies could, for some, bend the trajectory of obesity before adulthood, avoiding the difficult battle of later weight loss.

Guiding population-based cancer screening
Another area with potential applicability of PRS is personalized cancer screening. In the BARCODE1 study, 6,393 European men aged 55–69 years were screened for prostate cancer using MRI and biopsy if their PRS was in the top 10% (n=468). This approach detected prostate cancer in 187 participants, including 103 classified as intermediate/high risk, 72% of whom would have been missed under standard PSA/MRI screening alone.
Similar screening efforts have been pursued for other cancer types, e.g. for personalizing the age at breast cancer screening with mammography or colorectal cancer screening with colonoscopy. For breast cancer, individual-level risk in clinical settings is typically calculated with risk models, such as the Tyrer-Cuzick (TC) model, which include age, family history, anthropometrics, breast density, history of hormone use, and other factors. Adding a PRS for breast cancer to this model seems to improve risk prediction, identifying women at elevated risk who could benefit from targeted risk-reducing strategies. Population-based studies suggest that the average 5-year breast cancer risk at age 50 is reached at significantly younger ages by women with higher PRSs for breast cancer (50 was the recommended age for starting mammography screening; the threshold has in the meantime been shifted to 40 by the CDC). Similarly, for colorectal cancer, PRS-guided screening could personalize the starting age for colonoscopy, increasing the detection of clinically significant cases.

Preventing cardiovascular disease by estimating future risk
Cardiovascular disease remains the leading cause of death worldwide. Fortunately, decades of research have shown that cardiovascular disease is highly preventable. Several actionable risk factors have been identified, such as hyperlipidemia, hypertension, diabetes, and obesity, which promote the entrapment of lipoprotein particles within the arterial wall, leading to the development of atherosclerosis. When to start treatment with lipid-lowering agents (usually statins) in asymptomatic individuals has been a matter of intense research in recent years. Typically, we use scores that calculate the 10-year risk of cardiovascular events (e.g. myocardial infarction, stroke, sudden cardiovascular death), where an estimated risk of >7.5% or >10% is generally considered an indication for starting statin therapy. The scores we use to calculate risk are based on age, sex, race, and vascular risk factors and, while informative, have limited discriminative power in real-world settings, as mentioned above. Adding a PRS to these risk equations seems to improve risk stratification. Because it builds on existing equations, adding a PRS represents a relatively low barrier to clinical implementation.
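As a rough illustration of how a PRS could shift someone across a treatment threshold, the sketch below adjusts a conventional 10-year risk estimate on the log-odds scale. Everything specific here is an assumption made for the example: the function names, the odds ratio of 1.5 per standard deviation of the PRS, and the simplification that the PRS acts independently of the conventional risk factors. A clinically usable model would need to be fitted and calibrated on real cohort data.

```python
import math


def adjust_risk_with_prs(baseline_risk, prs_z, odds_ratio_per_sd=1.5):
    """Shift a baseline risk by a PRS z-score on the log-odds scale.

    odds_ratio_per_sd is a hypothetical effect size, not a published estimate.
    Assumes the PRS is independent of the factors behind baseline_risk.
    """
    log_odds = math.log(baseline_risk / (1 - baseline_risk))
    log_odds += prs_z * math.log(odds_ratio_per_sd)
    return 1 / (1 + math.exp(-log_odds))


def statin_indicated(risk, threshold=0.075):
    """Apply the >7.5% 10-year risk threshold mentioned in the text."""
    return risk > threshold


# Three hypothetical people with the same conventional 10-year risk of 6%
# but different PRS values (in standard deviations from the population mean).
baseline = 0.06
for prs_z in (-1.5, 0.0, 1.5):
    risk = adjust_risk_with_prs(baseline, prs_z)
    print(f"PRS z = {prs_z:+.1f}: risk = {risk:.3f}, statin indicated: {statin_indicated(risk)}")
```

With these assumed numbers, only the person with the high PRS crosses the 7.5% threshold, although all three would be treated identically by the conventional estimate alone.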

Informing risk-benefit decisions for anticoagulants in patients with atrial fibrillation
Atrial fibrillation is an important diagnosis in asymptomatic individuals, as it can increase the risk of stroke, which can lead to disability, dementia, or death. Atrial fibrillation is often an incidental finding on an ECG, and when it is detected, physicians face the dilemma of whether to prescribe anticoagulants for life. Anticoagulants increase the risk of bleeding, which in rare cases can even be fatal, making careful risk-benefit assessment crucial. To guide decision-making, risk scores have been developed, the most widely used being the CHA₂DS₂-VASc Score.
Adding PRSs to the decision-making could improve risk assessment, as CHA₂DS₂-VASc doesn’t discriminate risk perfectly. Post hoc analyses of anticoagulation trials provided evidence that considering a PRS for stroke in the highest tertile improved discrimination in the small group of patients with questionable benefit from anticoagulants (those scoring 2 at CHA₂DS₂-VASc, which would typically make them eligible for treatment).
Similarly, decisions about anticoagulation must be balanced against the risk of serious bleeding. Intracranial hemorrhage represents the most severe such risk, with a one-month mortality of approximately 40% and high levels of disability among survivors—an outcome that should be minimized, particularly in individuals with elevated genetic risk. The HAS-BLED score may be used to estimate bleeding risk, and adding a PRS for intracranial hemorrhage could further refine prediction in candidates for anticoagulation.
Challenges towards wider implementation of PRSs
How can we accelerate research in this direction? Despite extensive development in population-based studies, PRSs have seen limited application in clinical research. Patient cohorts, which are critical for evaluating biomarkers in real-world settings, remain largely underused. Testing PRSs in such cohorts could address issues ranging from technical robustness and outlier handling to stratification by established risk factors and patient responses to PRS communication. Furthermore, they could be used to validate PRS performance in relevant patient populations.
Clinical trials represent another underexploited setting. Post hoc analyses of trial cohorts offer unique opportunities to test PRSs in combination with interventions in randomized contexts, helping establish their value for decision-making and laying the groundwork for dedicated PRS-driven trials. While many trials already collect genetic data, these are rarely shared by the sponsors, limiting rapid progress and widespread utilization by the research community. Considering current genotyping costs, broader adoption in trials could be a low-cost but high-yield strategy.
Ultimately, I cannot foresee PRSs being implemented in clinical routine without some sort of clinical trial evidence for actionability. Such trials would ideally link PRS testing to an intervention and test this combined approach versus standard of care. Unfortunately, trials are notoriously slow. They demand years of preparation and regulatory approval and recruitment of probably thousands of participants, who need to be followed up for several years to show significant effects on clinical outcomes. This timeline is dramatically outpaced by the speed of genetic discovery and PRS development. By the time trial results arrive, the tested PRSs may be entirely outdated. For example, the MI-GENES trial, one of the first randomized studies of a PRS, tested the impact of disclosing a coronary artery disease PRS to patients in a primary prevention setting. The study found that communicating the PRS improved adherence to statins, lowered lipid levels, and even reduced cardiovascular event rates over 10 years of follow-up. However, the score itself was based on only 28 SNPs, which is primitive by today’s standards, when state-of-the-art PRSs for coronary artery disease incorporate more than a million variants. Encouragingly, several ongoing trials are now testing PRS-guided interventions in diverse clinical settings, with the potential to provide definitive evidence for the utility of current-generation PRSs in the coming years.
This is a truly excellent overview of the state of a field I have been obsessed with since starting grad school. For me, the individual-level variability and low phenotypic variance explained make it hard for me to see a future for PRSs. But you present a nice positive case — particularly about how PRS uncertainty compares to already widely used clinical risk scores
This was fascinating. Thanks. I'm reading this from the place of a clinician, though, and I can only imagine what it will feel like to parents of two-day-old newborns to hear "Congratulations on your new baby! Here are the hundreds of genetic diseases he is susceptible to based on his whole-genome-sequencing!" Also, regarding unknowns, IVF pregnancy has risk factors above spontaneously conceived pregnancy that no one on the genetic frontier seems to be acknowledging. I wrote about it here: https://open.substack.com/pub/annledbetter/p/why-i-dont-fear-an-ivf-takeover?r=8c5pl&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false