Human Data Era

By studying human genetics, scientists discovered mechanisms that, when defective, cause disease. While this type of data is powerful, additional information can provide more insight on the human condition.

In the Human Data Era podcast, Ray Deshaies, senior vice president of Global Research at Amgen, explores the potential of human data and the important transition scientists and clinicians are making to incorporate this wealth of information into drug research and development.

Launched October 20, 2022

Released Episodes

Human Data: Beyond the Genome with Rob Lenz, M.D., Ph.D., senior vice president of Global Development at Amgen

Transcript

Episode 1: Human Data: Beyond the Genome

The Scientist: Welcome to The Human Data Era, a special edition podcast series produced by The Scientist's Creative Services Team.

This series is brought to you by Amgen, a pioneer in the science of using living cells to make biologic medicines. They helped invent the processes and tools that built the global biotech industry and have since reached millions of patients suffering from serious illnesses around the world with their medicines.

By studying human genetics, scientists discovered mechanisms that, when defective, cause disease. While this type of data is powerful, additional information can provide more insight on the human condition. Researchers and clinicians can now go beyond genetics, combining proteomics, metabolomics, transcriptomics, and environmental factors into a broad category of human data. In this series, Ray Deshaies, senior vice president of Global Research at Amgen, explores the potential of human data and the important transition scientists and clinicians are making to incorporate this wealth of information into drug research and development.

Ray: In this episode, I talk to Rob Lenz, senior vice president of Global Development at Amgen. We review the full scope of human data, going beyond the genome to explore the challenges of using human data as well as the opportunities of applying human data to drug research and clinical trials.

Hey Rob, it's fabulous to have you with me here today. When I first came to Amgen, you were running the neurology part of Amgen, the clinical development. I'd be curious to know, how did your experiences as a neurologist treating patients inform how you think about running global clinical development at Amgen, and in particular, how you think about the potential for using human data to drive the development of our medicines?

Rob: I got involved with designing and conducting a clinical trial in my residency at UCLA and I got hooked. I loved just how rapidly I could get actionable answers from clinical research, and then bring that almost immediately to my patients. That love of clinical research is still very much alive in me today. Soon after joining the pharmaceutical industry, I was designing my first phase two industry trial. I realized that many aspects of the traditional trial design simply didn't make sense to me. When I asked people why they did things the way that they did, I often got pretty unsatisfactory responses. So, I reached out and started working with some really brilliant statisticians outside the company who were incorporating adaptive designs in oncology trials. It's that constant motivation, how do we continually improve how we design and conduct our trials, that definitely remains with me today.

Ray: At Amgen, my group makes the molecules, and then we hand the molecules over to you so that you can take those and bring them into humans. In our industry, from when a clinical candidate is nominated through FDA approval and into the market, it's roughly only one in 10 molecules that successfully negotiates that long journey. Why do so few of the molecules make it through?

Rob: Historically, there has been an entire reliance on the discovery research parts of organizations to simply make better molecules. Just get better at targeting the targets. There have been huge strides in our industry, but a fair number of drugs fail, not because it's the wrong target, but because they're not studied in the right way. A major focus for us in the clinical trial part is to think about what we can do to increase the probability of success and reduce those failures.

Ray: We're going to be talking today about using human data—human genetics, proteomics, transcriptomics, single cell sequencing, electronic health records, and so forth. In the research organization, we use human data in a very particular way, which is to identify targets that might be particularly effective to intervene with. My understanding from talking to you is that it's not just a tool for use in research, but it's a tool you could also apply in clinical development. Can you talk a little bit about that?

Rob: There's utility for all of the different types of human data. I categorize these into two broad categories. One would include traditional phenotypic data, that's things like a patient's symptoms, their medical history. Do they have depression? Do they have a history of heart attack? Do they have a rash? That's captured through a number of sources, like insurance claims data, electronic health records, and increasingly, digital data from things like wearables, smartwatches, etc. And then the other broad category of human data is molecular data. That includes things like blood-based or tissue-based lab data, imaging, and germline and tumor genomic data, as well as transcriptomic and even proteomic data. We're in a time of unprecedented access to this data, and we're in the midst of a transformation in utilizing those data to think really differently about how we generate the clinical data that we need to understand how our medicines work and, ultimately, to support their approval and the use of those medicines. The traditional drug development model since I've been in industry has been basically study a drug in as broad a patient population as possible, realizing that any medicine is very unlikely to work equally as well across all patients. But the problem is, historically, we really didn't have the tools, or the ability to define populations of patients who were more likely to develop a disease or more likely to progress more quickly, or most importantly, to predict who would respond best to our medicines. When I was seeing patients, what patients wanted to know is, what's the benefit that they specifically will experience on this medicine? And all that I can tell them is what's the average effect that was observed in a broad population from the clinical trials. It was incredibly unsatisfying to the patient, and also incredibly unsatisfying to me. 
But I see now, for the first time, we have this unprecedented access to patient-level data, both phenotypic and molecular data to better understand the disease and the potential treatment responses at an individual patient level.

Ray: What's the level of variance that you would typically see in a clinical trial and how much room is there for improvement?

Rob: In an ideal world, we'd have medicines that either cure all patients with the disease, or at least result in meaningful clinical benefit in all patients with that disease. But unfortunately, that simply isn't the case, and there's a wide range of responses that patients get in terms of efficacy and safety. When you include such a broad patient population in a clinical trial, the treatment effect at the end is averaged over the broad patient population, and this often results in the drug having relatively modest benefits. When you combine a rather modest treatment effect with a lot of variability, this means you need to run really large and really expensive trials to understand what the actual treatment effect is. That large variability, it's almost certainly driven in large part by the underlying differences in a patient's biology. So, the ultimate goal is to use these various data sources as a way to get better insights into those biological differences, and then use them to identify patients who are likely to respond the best or maybe to not develop side effects to a medicine.
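To make the link between a modest treatment effect, high variability, and trial size concrete, here is a minimal sketch based on the standard normal-approximation formula for comparing the means of two arms. The function name, the default significance level and power, and the example numbers are illustrative assumptions, not figures from the conversation.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(effect, sd, alpha=0.05, power=0.8):
    """Approximate patients needed per arm in a two-arm trial comparing means.

    Normal-approximation formula (two-sided test):
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 * (sd / effect)^2
    effect: assumed true difference between arms (hypothetical units)
    sd:     assumed standard deviation of the outcome within each arm
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * (sd / effect) ** 2)


# Hypothetical numbers: the same drug studied in a broad population
# (diluted effect) versus an enriched population (larger effect).
print(patients_per_arm(effect=2.5, sd=10))  # broad population
print(patients_per_arm(effect=5.0, sd=10))  # enriched population
```

Under these assumptions, halving the average effect or doubling the standard deviation roughly quadruples the required enrollment, which is why enriching for likely responders can shrink a trial so much.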

Ray: Variability can have different bases: it could be entirely due to their genetics, it can be entirely due to their environmental exposure, it can be a combination of the two—it could be the unique interplay of a genotype with an unusual environmental exposure. Are there different types of human data that can access those different causal bases?

Rob: Yeah, absolutely. I'll pivot a little bit to atherosclerotic cardiovascular disease because that's where a fair amount of work has been done. So, think myocardial infarction and stroke. There are, in the cardiovascular space, very large clinical data sets, and those allow us to establish which clinical factors are the most predictive of patients experiencing a first MI, or who are at highest risk of having a second MI. Clinical risk factors like age, high blood pressure, cholesterol, smoking, etc., those are all of course associated with increased risk. We know that through these large epidemiologic studies. That's qualitative, you know that patients are likely to have a higher risk if they have one or more risk factors. But it doesn't provide the doctor or the patient or the clinical trials with a quantitative assessment of what that risk is. So now we're combining many of these risk factors together in a weighted way to create what's called a risk score. That is better in predicting who is at higher risk versus lower risk, and it provides a quantitative measure of an individual's risk, say for having an MI in the next 10 years. Clinicians are already using this in the real world to change how aggressively they treat a patient. We're starting to employ this in clinical trials to enrich the trial for those who are more likely to have a second event during the course of the trial. Now, the reality is the clinical measures are okay, but they're not great at defining that risk. We can use a person's genetic makeup to further understand the risk. Each of us contains a number of variations in our genes. Those are called single nucleotide polymorphisms, or SNPs. Some of those variations confer additional risk for a particular disease or potentially protection from a disease.
Any single SNP may confer a relatively small risk, but there can be dozens, even hundreds of the SNPs, that when we add them together, they can confer quite significant risk or protection in an individual. We can use these to generate what's called a polygenic risk score. A similar risk score can also be created by measuring levels of thousands of proteins in the blood. These polygenic risk scores and proteomic risk scores can do a much better job, when used instead of just the clinical risk scores to predict who's at the highest risk of getting a disease or progressing after being diagnosed with a disease.
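The idea of adding many small SNP effects together can be sketched as a weighted sum. This is a minimal illustration only: the SNP identifiers and weights below are made up, and real polygenic risk scores take their weights from large genome-wide association studies and are usually standardized against a reference population.

```python
def polygenic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts.

    allele_counts: {snp_id: 0, 1, or 2 copies of the scored allele}
    weights:       {snp_id: per-allele effect; positive raises risk,
                    negative is protective}
    SNPs without a known weight are ignored.
    """
    return sum(
        weights[snp] * count
        for snp, count in allele_counts.items()
        if snp in weights
    )


# Hypothetical weights and genotypes, for illustration only.
weights = {"snp_a": 0.12, "snp_b": 0.05, "snp_c": -0.08}
person_1 = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
person_2 = {"snp_a": 0, "snp_b": 1, "snp_c": 2}

print(polygenic_risk_score(person_1, weights))  # higher score, more risk alleles
print(polygenic_risk_score(person_2, weights))  # lower score, protective alleles
```

The same weighted-sum shape applies to the clinical and proteomic risk scores Rob describes, with risk factors or protein levels in place of allele counts, although published scores are typically fitted with regression models rather than hand-set weights.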

Ray: Can you talk a little bit about the differential utility in terms of predictive value of a polygenic risk score versus say a polyprotein risk score that evaluates risk by measuring the level of proteins in the blood?

Rob: I like to look at them as being complementary. A critical piece of information that we need to understand is to what extent things like enriching for a particular genotype or proteotype brings additional disease risk above and beyond that conferred by more traditional clinical risk factors. We need to start with the clinical risk factors because those are extremely well established from epidemiologic studies, they are already incorporated as part of standard of care, and they're usually cheap to measure. It may be that genomic or proteomic data are more beneficial in certain instances. We can hypothesize genomic data might be most useful when understanding something like the lifetime risk of developing a disease or for use in a primary prevention trial when the disease hasn't manifested any clinical symptoms yet. One can imagine in that scenario using a clinical risk score simply wouldn't be feasible. Whereas for somebody who already has a disease, the polygenic risk score may become less important. Proteins are influenced not only by the genetic background of an individual, but they can also reflect adverse changes due to lifestyle alterations, the environment, and therapeutics that the patients are on. Another consideration as to which of these risk predictions is better may depend on the disease of interest. Certain diseases don't have as strong a genetic underpinning as others, and there the clinical and proteomic risk scores would be more helpful. One interesting example specifically in atherosclerotic cardiovascular disease is that polygenic risk scores have been demonstrated to be better predictors than the clinical risk scores at not only who's at risk of having a first MI or stroke, but they actually outperform the clinical measures in patients who already have established cardiovascular disease, they've already had a stroke or an MI, at predicting who's going to go on to have a second.

Ray: There might be a natural tendency to think, somebody who's at greatest risk may be the most likely to benefit from my therapy that I'm developing. But, as people become later stage in their cancer, they become sicker and sicker and they actually have a tendency to respond less and less to an intervention. How do you think about that from the point of view of using these multi-omic approaches to identify patients for your clinical trials?

Rob: Let's say we have a targeted therapy for a particular somatic mutation that happens in a tumor, and that mutation early on in the cancer is a critical driver mutation for tumor growth. Say we end up enriching for patients who are more likely to progress quickly. It may be that other mutations that are driving metastases of the primary tumor become much more important. In that case, we've potentially enriched for a population at higher risk of progressing, but it's also a population that is less likely to derive benefit. So, we need to enrich for patients who are not only at greater risk, but in whom the biological pathway that we're interdicting remains relevant.

Ray: There can be a distinction between what triggers the initiation of an irreversible disease process and then the events that are downstream that lead to progression of that disease. And depending on the disease, polygenic risk scores or proteomics might be more informative. How might we apply this to another difficult disease like Alzheimer's?

Rob: Absolutely. It's a nice example that underscores the importance of being thoughtful around which of these various measures you use to enrich for—either the clinical or the proteomic or the genomic. In Alzheimer's, it's often the case depending on the underlying pathway that you want to intervene early on in the disease process, but you also want to simultaneously enrich for patients who will progress quickly, so you don't have to run a 10- or 15-year trial. You could use a polygenic risk score to identify those patients who are still asymptomatic but they're at higher risk of converting in a relatively short period of time to symptomatic Alzheimer's. It may be that the plasma proteomic signature may not be that helpful in a disease that hasn't manifested significant clinical symptomatology and in an essentially brain-restricted disease. And the clinical scores really wouldn't be of benefit because those would only change once the disease is already too far progressed for your therapeutic to work.

Ray: Both of us really buy into this idea that human data has already led to major improvements and will lead to more in the future. If you and I believe in this and we run research and development at Amgen, and many other people smarter than us believe in this, what's holding us back from doing this right now at scale, transforming the whole medical system overnight?

Rob: It is still relatively early days and we have a lot to learn. So, some of the examples that we discussed for atherosclerotic cardiovascular disease, there appears to be very promising evidence that proteomic and polygenic risk scores can outperform clinical scores and be additive to those. That's simply not the case for a large number of other diseases. Without large prospective genotype, proteotype, phenotype datasets in particular disease areas we simply can't generate these polygenic and proteomic risk scores. That'll take time; that'll take money. This will not be equally as difficult for every disease. So, for cardiovascular disease, things like claims databases do a really good job of capturing something that's pretty straightforward—did the patient have a stroke or a myocardial infarction? But inflammatory diseases, lupus, others, we're interested in what's the severity of the symptoms. Claims databases don't capture those data. So, then we need to turn to things like electronic health records. Those are made up of what we call unstructured data, and that's much more difficult to work with. I'm encouraged by advances in natural language processing that can extract meaningful data from those unstructured data sets. And the molecular characterization, we're finally at a point where we can rapidly extract those data from these electronic health records. Another limitation is that there are important cost considerations. The costs of genotyping have come down to a level that we can now realistically incorporate those into clinical trials and potentially into the clinical ecosystem, depending on the disease. But broad proteomic testing is still quite expensive, even for a clinical trial, and would be simply prohibitive at this point to require that for every single patient who is being considered to receive a medicine. There are some other practical considerations about deploying genotyping and proteotyping at large scale on patients in the real world. 
This would require pretty significant changes, to have the widespread population genotyping and proteotyping. If we look by analogy in oncology, even when there are clear diagnostic tests available for somatic mutations in known driver mutations, and when we have targeted therapies available for those mutated proteins, a large percentage of patients still don't get that test in the healthcare system today, and thus can't even get the opportunity to receive the targeted therapy. This is where patients literally may die within a very short amount of time without the appropriate medicine. So, establishing this broad application for chronic diseases that may manifest over the course of a decade or more will be challenging.

Ray: You made this point about electronic health records; you need natural language processing to extract things from them. And it reminded me of this effort in the research community to take abstracts of scientific papers and reduce them into an electronic language where you could extract information more readily. Do you think that's feasible to do in the case of EHRs? Like can EHRs be reworked to make them more digital friendly?

Rob: Changing the behavior of millions of clinicians is no small feat. We have vast amounts of data today that sit in the electronic health records. Even if we could introduce this organized way of capturing data in health records, we would only be able to use what we capture prospectively. There are major efforts ongoing by all the global regulatory authorities around the concept of data standards, so we standardize the data in our clinical trials such that everyone around the world is collecting data the same way. This has happened through a major set of initiatives over the past one and a half decades or so. The same is happening with real world data. But I still think for the vast majority of diseases where we need much more nuanced clinical data that resides within the electronic health records, we're going to continue to rely more heavily on natural language processing to extract that. One of the main challenges isn't pulling the term of interest out, it's understanding the temporal relation, right? So, if I say heart attack or MI, that gives you part of the information, but am I talking about did the patient have an MI today? Did the patient have an MI 27 years ago? Did the patient's brother have an MI? It's the context around the term that has been so challenging, but we're finally getting to a point with advances in NLP that we can incorporate that context and the temporality to help us.

Ray: When do you think it will be the case when I go to my general practitioner the first thing he or she is going to do is pull up a screen on their laptop that shows a graphic that depicts my lifetime risk of various diseases based on my genotype, on my proteome, and how much it has changed since the last time I visited the office? When do you think we're going to actually see actionable changes in how each of us as private citizens experience our medical care?

Rob: The adoption of these approaches in the healthcare ecosystem will be somewhat erratic in the near term. They'll probably be more significant in sub-specialties like cardiology, where the underlying data are becoming so compelling. What we will see perhaps before the broader adoption in the clinical ecosystem is what we're doing in the clinical trial space. If you think about how the cost has come down precipitously, it seems reasonable to assume that adoption of genomic sequencing for patients who participate in trials is probably not that far away. Assessing proteomics as part of routine clinical trials is certainly further away. In parallel, but perhaps lagging what happens in the clinical trial space, will be the utilization of genomic data in combination with the clinical risk scores in helping clinicians have more informed discussions with individual patients about what their risk is of developing a disease, or their disease worsening, and what they might expect from a medicine. That, to me, is really the holy grail in clinical trials and also in the healthcare ecosystem.

Ray: Well, Rob, thank you so much for joining me today. You've sketched out in a very compelling way the opportunity but also the practical challenges that face us as we try to bring this -omic data into clinical practice and change how we do things. I'm optimistic for the future, and I'm really glad you were able to join me today and talk about where this is headed in the coming years.

Rob: It's been great speaking with you, Ray.

Thank you for listening to The Human Data Era, and thanks again to Rob Lenz, Senior Vice President of Global Development at Amgen. To dive further into this topic, please join Amgen scientists at the Human Data Q&A webinar discussion on November 16, 2022. Register for the event at the link provided in the episode notes.
To keep up to date with this podcast, follow The Scientist on Facebook and Twitter, and subscribe to The Scientist's LabTalk wherever you get your podcasts.

Human Data: New Connections Between Genetics and Human Disease with Dr. Nancy Cox, professor and director of the Vanderbilt Genetics Institute

Transcript

Episode 2: New Connections Between Genetics and Human Disease

The Scientist: Welcome to The Human Data Era, a special edition podcast series produced by The Scientist's Creative Services Team.

This series is brought to you by Amgen, a pioneer in the science of using living cells to make biologic medicines. They helped invent the processes and tools that built the global biotech industry and have since reached millions of patients suffering from serious illnesses around the world with their medicines.

By studying human genetics, scientists discovered mechanisms that, when defective, cause disease. While this type of data is powerful, additional information can provide more insight on the human condition. Researchers and clinicians can now go beyond genetics, combining proteomics, metabolomics, transcriptomics, and environmental factors into a broad category of human data. In this series, Ray Deshaies, senior vice president of Global Research at Amgen, explores the potential of human data and the important transition scientists and clinicians are making to incorporate this wealth of information into drug research and development.

Ray: Biobanks that house data from electronic health records or collect samples directly from participants are precious resources for researchers looking to understand health and disease and translate these discoveries into recommendations and treatments for patients. In this episode, I talk to Dr. Nancy Cox, professor and director of the Vanderbilt Genetics Institute, about Vanderbilt's DNA biobank, BioVU. Nancy and her fellow researchers use computational genetics to study the de-identified patient DNA stored in the bank along with corresponding electronic health records in order to discover links between genes and disease.

Hello, Dr. Cox, welcome. What drew you to become a researcher focused on human genetics?

Nancy: Thank you. Nice to be here. I was among the few high school students of my era who actually had genetics in high school, and I never wanted to do anything else. In college, I worked in mosquito genetics at the University of Notre Dame, in the third graduating class to include women, and then went to Yale in human genetics to get my PhD.

Ray: I was looking at your CV, and I noticed that you started your independent academic career back in the late 1980s at the University of Chicago. One of our other guests in this podcast series, Dr. Kari Stefansson, who's now at deCODE genetics in Iceland, was also at the University of Chicago, I think around the same time. I was curious when I saw your CV: did you overlap with Dr. Stefansson in Chicago?

Nancy: I did. I did. I didn't know him well, but I certainly knew of him and I knew him a lot better afterwards, since he went to deCODE genetics, and their director of statistical genetics for a long time, Augustine Kong, was a close colleague of mine at the University of Chicago.

Ray: So much has happened in human genetics research over the past 30 years. What was it like doing human genetics research back when you first started your lab versus today? What's the single biggest difference that comes to mind?

Nancy: The biggest difference is computer power. All of the sequencing that we do now wouldn't be possible without immense storage capacity, immense analytic capacity for all the data. And while the changes in the cost of sequencing are relatively new and a real driver of technology, unquestionably the change in computer power has been a huge engine for genetics research.

Ray: Let's fast forward to you moving to Vanderbilt to start the Genetics Institute, the VGI, in 2015. Can you tell us a little bit more about VGI and your vision for it?

Nancy: Genetics belongs in medicine at all levels, so the vision was a way to run towards that faster using the biobank. A key point of our biobank is that the phenotypic data comes as electronic health records, so we make our discoveries in the same medium in which we want to do translation. That's a huge advantage. Electronic health records are not research quality information, but if we can't detect our genetic signals in these data, if we can't use these data for discovery, we won't be able to use them for translation. We have fantastic tools for being able to use electronic health records for research purposes and to very effectively treat patients and use genetic information in that context.

Ray: So there are a number of biobanks that have been or are being set up. How do you differentiate what you're doing at BioVU from what's happening everywhere else in the world?

Nancy: There are two kinds of biobanks: biobanks that use electronic health records and biobanks that collect information from subjects as they enter and sometimes along the way. Many of the latter rely on inventories or relatively straightforward and inexpensive laboratory tests that can be measured. We have up to 30 plus years of healthcare information on our subjects. Vanderbilt built their own electronic health records more than 30 years ago, so we have this longitudinal record. I think the biobanks that collect information more directly from their subjects have the opportunity to ask other kinds of questions. So, I see the biobanks as fairly complementary. A key thing is how healthy the subjects are. Vanderbilt is a tertiary care medical center. Patients drive hours and hours to be here for their medical care. So, we over-represent rare and complicated diseases as a consequence. Whereas something like the UK Biobank was an unusually healthy cohort, ages 49 to 65, when ascertained. It's a very different cohort in terms of their health and well being than a population that is largely here for medical care. In addition, every test that gets ordered for subjects in our biobank is ordered for a reason. They do a battery of tests at the very beginning in the UK Biobank for the purposes of collecting data. We have thousands of laboratory values that are measured in our patients, but every one of those tests was ordered because some physician had some suspicion that they needed this test to follow up. Again, that's why I think these are all complementary pieces of information. We certainly use the UK Biobank extensively. We collaborate through the eMERGE network with many other US-based healthcare delivery systems, hospitals, and medical centers to replicate each other's studies. So, there are lots of opportunities to go across the biobanks for maximum utility.

Ray: So, your biobank, BioVU, has about 275,000 DNA samples from patients. How big is big enough? Do you see BioVU continuing to grow further? Is there a point at which the law of diminishing returns starts to kick in?

Nancy: We haven't found it yet. In studies in genetics, especially in the common disease space but also in the rare disease space, the more people that we have, the better we do, and the nuances matter. It's now about using what we're discovering to improve our understanding of the biology that drives disease. One of the things we're learning more about is pleiotropy, the way that genetic variation can lead to increased risk for many different phenotypes. The more samples we have, the deeper our data, the more we're empowered to learn those kinds of things. That can matter a lot in how we understand the ways that genetics can influence disease risk. It's really a harbinger of what things will be like when every medical center has genetics on its subjects, because then every place is a biobank. The more we learn how to use these data well now, the better the quality of our inferences will be when every medical center is its own biobank. We have many things to learn, we can make new discoveries, try out translation in silico, to really see how things will work.

Ray: There are a number of nationally rooted biobanks: UK Biobank, Estonia, deCODE genetics, which has the DNA sequence information on the Icelandic population. Some critics out there have noted that some of these biobanks are biased in that they draw from a very specific population. They may not accurately portray how a genetic variant might influence a specific disease process in different genetic backgrounds. So, you might have a variant that might play out differently in an Icelandic individual from an Asian individual. How do you think about population diversity and how it relates to biobanks?

Nancy: This is rooted in the fact that the collections related to genetics that governments have funded the most have been for European ancestry populations. But of course it's critical to understand genetics for the world. It is our heritage. It's a matter of catching up because many of our largest genetic studies were rooted in cohort studies that started a long time ago. There have been really good efforts over the last five years to do a better job of representing the world's populations in genetic studies. There's a lot more investment in Africa, South America, and Central America by their governments, and by world health organizations as well. I'm very heartened by how much NIH has been willing to invest in collection of population samples from additional parts of the world. I think that's a good investment for all of us. It's all part of a really important effort to make sure that we capture the diversity of all populations, as we learn to understand how genetics affects risk of disease.

Ray: Another important topic that animates people when thinking about human data and biobanks is the issue of privacy. For example, can you link somebody's human genetic information to their name and might this be used against them in some way by insurance companies and so forth? How does privacy affect what you do?

Nancy: We need to divide the privacy issue into several buckets. There are legitimate concerns that people have about the ways their data get used. That needs to be addressed transparently when doing research with specimens from patients in a hospital with their electronic health record data. People who come to Vanderbilt for their healthcare have to sign a consent that indicates that they find it acceptable that Vanderbilt would use their electronic health record data and leftover biological specimens to create genetic information, to do research to improve our understanding of disease and, ultimately, our ability to care for patients in our healthcare settings. I think that's separate from the legal issues around privacy. The European Union, for example, has very strict rules, and if we participate in research projects with people from Europe, we have to abide by those rules. I think the opportunities for new medicines, and a much accelerated understanding of disease processes when we understand the genetics, are coming quickly. People have to be willing to participate in order to make sure that their genetic information is well represented, and that we understand the information from everyone. The consequences of failing to do that will create more genetic-based health disparities that I think no one wants to see. But that is a risk if whole groups, for privacy reasons, decided not to participate in genetic research.

Ray: I'd like to zoom in. Walk me through a specific example of how you or some of the researchers at VGI have used the resources of BioVU to advance the genetic testing capabilities relevant to human health.

Nancy: One of the faculty that I recruited to Vanderbilt as part of my vision for expanding genetics and genomics in the healthcare setting has developed an algorithm based on people who underwent genetic testing here, using only the subjects that had a genetic test ordered for them, to then identify other people with all of those same phenotypic characteristics that could benefit from genetic testing as well. Within our own healthcare system, there are thousands of individuals who are quite similar in terms of the phenotypes, so we can apply that algorithm to identify people who would benefit from genetic testing. These algorithms are built just off of diagnostic codes, the billing codes that every hospital uses, so that it's possible to apply them in community health settings.

Ray: Can you give a sneak preview of something going on in your research group right now that you're super excited about?

Nancy: We're very excited about learning more about the biology underlying disease, and one of the ways that we're trying to do that is to probe the biology of things that we know something about, at the genetics level. We're combining genetic variation within a biological pathway so that we understand all of the medical disease consequences that arise from, for example, disruption in TGF-beta signaling, or disruption in the GABA-ergic pathway. We're trying to combine the large-scale data that we have in the biobank and phenotypic information to understand those pathways. There have been some recent papers with very clever approaches to identifying drugs for repurposing or multi-purposing using large-scale genetically predicted transcriptomics. The idea is that we try to understand how any given disease seems to be arising as a consequence of more subtle changes: not so much changes in the proteins that people have, but rather changes in the amount of protein, the timing of the production of those proteins, and how that gets disrupted. With this longitudinal biobank, we even have the opportunity to validate those kinds of things in silico by looking at something like Alzheimer's, for increased age at onset for people who've taken a certain drug.

Ray: One thing that's interesting to me about BioVU is that unlike some other biobanks out there, it's being developed and maintained within the context of an educational institution. As we're collecting more human omics data, that's increasing the demand for essentially a new breed of scientist who's conversant with interacting with huge datasets, which is very different from the kind of scientific training I got when I was in my youth. How are you thinking about that in terms of your educational mission at Vanderbilt? What type of students are you training and how are you training them to open up this new future of human data-driven discovery?

Nancy: Yes, it's definitely the case that computation is a bigger part of biology than it's ever been, and many more of our students are conversant in both wet laboratory science, large scale -omics generation, and the analysis of that data. Although, a lot of the large scale -omics data get generated in core facilities rather than individual labs. It's also a much more dynamic and fluid environment, in the academic to industry partnerships. Vanderbilt is also a home for people coming from industry back into academia, especially people who really want to learn about electronic health records research. The environment of the future will certainly involve computation, a lot of big data, continued data generation in things like proteomic and metabolic spaces, not just DNA sequence. Things like transcriptomics will start to have a place in biomarker development. That will become over time more a part of medicine than discovery-level research. And so, training people means making sure that they get education in all of these spheres and that they are also cognizant of opportunities outside academia, having a more exciting environment that includes direct collaborations with industry, more internship opportunities for our trainees, allows people in industry to refresh their knowledge of new things that are coming up by coming back into academia for sabbaticals.

Ray: Let's say in the year 2150 that all this genetic analysis, collecting data, and building databases has been going on for well over 100 years. We now have data on literally billions of human beings. And we have computers that make today's computers look like Tinker Toys. We could mine everything extensively. What's going to be the output of that as a result—can we expect humans are going to be living to be 200 years old because we can treat them with all this genetically-informed knowledge that we have?

Nancy: Not by that time. The goal is more modest. We want to understand disease processes enough to interrupt them in advance of them actually occurring. But because so many people already have disease, we need to learn more about the difference between the initiation of the disease and progression of disease, and how we identify and repair the consequences of disease processes. The far future will be much more about disease prevention and interruption. In the near term, we have a lot to learn about disease progression and interrupting disease processes and repairing damage that diseases have done. And that will occupy us for quite a while. I don't like to think of people living hundreds of years, but having people live their best lives while they're alive is a terrific goal to shoot for.

Ray: Nancy, it's really been terrific to have a conversation with you as one of the foremost leaders in human genetics in the world today. So, thank you so much for sharing your time with me.

Nancy: My pleasure. Thanks for inviting me.

The Scientist: Thank you for listening to The Human Data Era, and thanks again to Nancy Cox, director of the Vanderbilt Genetics Institute. To dive further into this topic, please join Amgen scientists at the Human Data Q&A webinar discussion on November 16, 2022. Register for the event at the link provided in the episode notes.

Heterogeneous disorders such as cardiovascular disease have multiple risk factors and causes. In the next episode of The Human Data Era, we'll talk to Narimon Honarpour, vice president of Global Development at Amgen, about combining various types of real-world data to understand and develop medicines for complex diseases.

To keep up to date with this podcast, follow The Scientist on Facebook and Twitter, and subscribe to The Scientist's LabTalk wherever you get your podcasts.

Human Data: Exploring Human Data in Cardiovascular Disease with Narimon Honarpour, M.D., vice president of General Medicine, Global Clinical Development at Amgen

Transcript

Episode 3: Exploring Human Data in Cardiovascular Disease

The Scientist: Welcome to The Human Data Era, a special edition podcast series produced by The Scientist's Creative Services Team.

This series is brought to you by Amgen, a pioneer in the science of using living cells to make biologic medicines. They helped invent the processes and tools that built the global biotech industry and have since reached millions of patients suffering from serious illnesses around the world with their medicines.

By studying human genetics, scientists discovered mechanisms that, when defective, cause disease. While this type of data is powerful, additional information can provide more insight on the human condition. Researchers and clinicians can now go beyond genetics, combining proteomics, metabolomics, transcriptomics, and environmental factors into a broad category of human data. In this series, Ray Deshaies, senior vice president of Global Research at Amgen, explores the potential of human data and the important transition scientists and clinicians are making to incorporate this wealth of information into drug research and development.

Ray: Heterogeneous disorders such as cardiovascular disease have multiple risk factors, causes, and manifestations. Having a holistic view of a patient's unique biology potentially leads to earlier and better treatment options. In this episode, I talk to Narimon Honarpour, vice president of Global Development at Amgen, about how human data is helping drug developers and clinicians unpack the complexities of cardiovascular disease to improve patient outcomes.

Hi Narimon, I'm so happy you are able to join me today. We've known each other way back to when you were a postdoc in my lab at Caltech. I remember you coming to me to talk about two job offers: one from Amgen and another from a university. Of course, being an academic myself, I encouraged you to take the academic position. But you'd have nothing of it and joined Amgen. The irony is that six years later I followed you to Amgen. Any regrets?

Narimon: Oh, absolutely not. Thanks so much for inviting me to your podcast today. Ray, I think it was really a pivotal moment for me in my career development, in trying to understand how I was going to apply what I knew to what I was passionate about. And you know, that for many of us in the life sciences is translating the basic science to bedside medicines. So, no regrets. I think one of the great things that's come out of it is we've still had an opportunity to work with each other very closely, albeit in a very different set of circumstances. But I reflect on that conversation myself from time to time. I was looking for mentorship, and you provided it, and you asked a very important question, you know, why make this change? Why now? And I just felt in me that there was a different set of skills, a different way that I could realize that passion.

Ray: So, you took your medical degree training as a cardiologist, and in my lab you were doing basic research. When you went to Amgen, you went into global development working on cardiology programs. What do you think has been the most significant challenge that you've experienced in developing cardiovascular medicines?

Narimon: The most significant challenge has been appreciating the heterogeneity in the disease. As a practitioner I've learned to recognize the condition on the basis of what I saw. Now in scientific parlance, we might call that the phenotype of the patient, or the phenotype that comes from the disease. But it wasn't until I got into drug development that I really understood deeply the heterogeneity in cardiovascular diseases and the difficulty that imposes on drug development. It's tempting to consider cardiovascular diseases as a plumbing problem, right, a vascular issue with the pipes. It's not until I got into drug development that I understood that so much of what we're regarding as a plumbing problem is actually driven by the interplay of multiple metabolic pathways, different environmental effects that impact those pathways in people differently. And all of this results in things like different responses to drugs and efficacy that we see in the cardiovascular space. So, it's a far more complex disease than I think many people give credit.

Ray: Do you think there's something unique that makes cardiovascular such a difficult target for drug development?

Narimon: It gets back to that point of heterogeneity, and it's not just heterogeneity as to the causal factors, but it's differences in the metabolic pathways and the systemic pathways that affect cardiovascular diseases. The management of blood pressure physiologically is one systems pathway. What drives that systems pathway is a multitude of other molecular pathways, and each of us has a different fingerprint when it comes to the drivers of hypertension, how we respond to the hypertension, and how it results in cardiovascular events. We know relatives, friends, loved ones, that seem to have high blood pressure for a long period of time, and they seem to be just fine. But then there's the opposite case: a person who has seemingly modest blood pressure issues, and they may have a lot of consequences that come from that. You can then zoom out of the blood pressure system and think about things like lipid management, and you find the same phenomenon happens. You can zoom out further and you can talk about things like obesity, and there are people that have significant problems with weight, but some of them appear comparatively well versus others that don't fare well at all and have an awful course of disease as a result of that. So, the challenge is trying to figure out who is going to develop what condition and what is their major driver of cardiovascular risk. We haven't been able to do that effectively in the field. We've developed new tools, we've characterized the diseases better, but our predictive capabilities are still very much near the beginning stages.

Ray: We've all been taught that there's two main things that contribute to diversity: nature and nurture, right? So, what's in your genes and what you've been exposed to in your environment. How do you think those play out in this heterogeneity? Given two sources of heterogeneity, both nature and nurture, what do you see as a role for using human genetics and other human data to get at this challenge?

Narimon: You're hitting on two significant sources of that challenge with the genetics: we appreciate that there's a variability in the genetics and how that impacts the incidence of disease, the start of the disease process, and how there are impacts of the environment on genetic expression. And then there's what happens once a person actually has the disease, which may not be so much driven by the genetics anymore. In other words, you may have a risk to develop a particular condition on the basis of your genes, you develop that condition, and then a whole host of other pathways come into play. These are expected physiologic responses to having that condition or that disease, and that then impacts the degree and course of disease progression. All of these things lead into a heterogeneity in response. I believe that it is absolutely necessary for us to apply human data to understand the heterogeneity of cardiovascular diseases. We can understand not only the role of particular targets and genes in the incidence and progression of disease, but once the disease is set in motion, we can characterize things like biomarkers, proteomic assessments, transcriptomic assessments, and understand how those changes are influencing that particular person's biology.

We are using genetics more and more to identify targets of interest. This helps us determine which drugs would have a higher probability of success in treating a particular condition. Genetics also has a role in understanding a person's risk, and the first detection of a particular disease that might be more driven by a person's nature composition rather than the nurture composition or environmental effects. It's understanding the risk of developing a particular disease for the first time, and also understanding what particular biologic pathways we could interdict to treat those diseases.

Ray: Okay, how about proteomics? Is that going to give me a different axis on disease?

Narimon: Absolutely. Contrasting that to genetics, where you'd have a risk of developing a particular condition, proteomics could be the channel that we use to understand what happens after you develop that disease, and those pathways are set into motion that impact the progression of our cardiovascular diseases. When it comes to rate of progression, or how severe the disease is going to be for you once you develop it, those are areas where I think proteomics will have an outsized role. And it may be that we identify novel drug targets there as well.

Ray: When you talk about cardiovascular disease and therapy for cardiovascular disease, you often break it down into primary prevention versus secondary prevention, where primary prevention is preventing you from having that first heart attack and secondary prevention would be a therapy designed to spare you from having a subsequent heart attack. Can you talk about how primary and secondary prevention relate to incidence versus progression? And what types of human data are most informative for each of those stages of disease?

Narimon: Primary prevention is predicated on this concept that I can determine who has a particular disease. Diseases don't happen categorically, and they typically don't happen overnight with the flip of a switch. These are biologic processes that have been at play and interacting with the environment and with each other for years. It takes a lot of that type of exposure and time to get to incident disease. That time period up to the incident disease, I call primary prevention. If we think about what it means for a patient once they've had the disease, we appreciate that very similar things are happening, except it may be happening at an accelerated course. A person's had a heart attack. What's going on in those particular patients before their second one are consequences or complications of that first event. Human data give us an opportunity to really rarefy what happens in both of those categories. There is a wide distribution of patient risk profiles that reside within primary prevention. We're not all the same group of people when you're thinking about a primary prevention group. And similarly, we're not all the same group of people when you're talking about a secondary prevention type of study. This gets back to one of the major challenges in addressing chronic diseases like cardiovascular disease. How can I identify the young person that is going to have a significant cardiovascular event early in her life relative to someone else who isn't going to have one until at least age 65? Really, we don't have that capability today except for a couple of clinically measured biomarkers. Lipids are an example of that. That's not the whole story of what predicates risk for a particular patient. So human data, for me, really helps us break down those groups and identify who is the high-risk patient, irrespective of this somewhat arbitrary categorization on the basis of whether they've had an event or didn't have an event.

Ray: One thing that strikes me as a limit of human data is that people may not visit their doctor until it's too late. I've known several people who started monitoring their health very closely only after they had a heart attack or started to feel other ill effects. We can look at a patient's genetics and proteomics, and measure various markers of disease at that point, but that misses the window of opportunity for primary prevention. Some risk factors, such as high cholesterol, can't be felt, so people may be reluctant to see their doctor to treat that condition until they feel the effects. How can we get around that stumbling block?

Narimon: Feeling the disease is always going to be a compelling reason to seek treatment for a disease, right? So, I think it will always be a problem. But with the right level of education and ingraining the practice of measuring these types of things into medical practice, we can overcome this. There's room to improve how we motivate people to seek wellness as opposed to treat their disease, and we need to gear our medical system toward that mindset as well.

Ray: If you can name one area in cardio-metabolic disease where human data has the best potential to transform patient care and patient outcomes, what do you think that'll be?

Narimon: There's the nearest opportunity where I think it will translate into benefit in effects. And then I think there's what I would call the biggest opportunity, and that one might be further out. There's still a lot that we need to learn about ischemic heart disease. It's not simply driven by an LDL cholesterol problem. We're starting to learn about other lipoproteins and other determinants of cardiovascular risk that may make us treat patients differently and with different care pathways. When you think about a field like oncology, we oftentimes tailor the person's treatment to what we understand uniquely about their tumor. This is a remarkable advantage that cancer therapeutics have over cardiovascular disease, where we're trying to affect a heterogeneous disease with some very broad brushstrokes, like managing your blood pressure, reducing your LDL cholesterol. There's much more to it than that. Because we have a lot of access to data and we happen to understand that disease very well comparatively, the nearest opportunity for the application is going to be in ischemic heart disease. The biggest potential impact is with obesity. This is an incredibly heterogeneous area. There are vastly different outcomes that people with obesity have, ranging from diabetes, to liver disease, to heart disease, to none of the above. And I don't think we really understand that at all. There's a lot of important biology that is driving that heterogeneity and response, and many in the scientific arena are regarding obesity as an overarching condition that needs to be managed and tied into other diseases.

I personally consider obesity a label of convenience. It's really the outward appearance of a variety of metabolic disorders. Some of those disorders will respond favorably or disproportionately to things like reduction of adiposity and weight. Others will not so much be dependent on weight reduction, but other things that we modify in patients who happen to be obese and therefore have a certain metabolic predilection that results in cardiovascular events.

Ray: A lot of people hesitate to consider it a disease, they consider it a lifestyle choice. There's clearly a lot of data showing that propensity for obesity has significant genetic roots. Trying to get people to eat less and exercise more, frankly, just doesn't work because once you change the body's set point for mass, it's almost impossible to reverse that in a long-term, stable way. We know this is an epidemic in the United States, and that's clearly a risk factor for having a cardiovascular event. How do you think this is going to evolve from both a regulatory and from a payer perspective?

Narimon: In the future, we're going to have to be far more serious about how we consider and manage obesity as a disease and how we cover obesity medicines for patients that have this condition. And I do think that we're going to be in a position to treat people more precisely with medicines that are tuned to them. So, it's not going to be a one-size-fits-all therapy. The consequences of the condition are far too great to ignore. There is no way that one can defend considering obesity as a lifestyle choice. You raise an important point about this concept of set point, and that is a biological mechanism. Our body responds perhaps initially to something in our environment, and it may result in having gained some weight. Our body responds by fixating on that set point. That's a physiologic response, and it becomes impossible to lose weight durably and to get back to where people had been before. Obesity then spawns off a number of other metabolic arrangements that will end up impacting your cardiovascular health. To affect the biology, we're going to have to have impactful medicines that interdict the different pathways that not only result in obesity, but importantly, affect the consequences that come from that obesity. Whether it's avoiding the incident diabetes, avoiding cardiovascular effects, avoiding heart failure, or avoiding liver disease, we really have to flesh that out. We're at the beginning of this story.

Ray: Zooming out a bit, back to cardiometabolic disease more broadly, what are your top findings emerging out of human data in terms of either their impact today on cardiometabolic medicine or their potential impact in the future?

Narimon: A remarkable observation was founded in human genetics: people identified variants in PCSK9 that influenced LDL cholesterol. There are patients with particular mutations that were recognized as having familial hypercholesterolemia, a genetic condition that results in high LDL cholesterol levels, and that led to the development of an entirely novel class of medications, PCSK9 inhibitors. That happens to be, I believe, the most common monogenic disorder in the population. So, those for sure had profound impacts on the development of cardiovascular therapies in the management of cardiovascular diseases. Surely, there are going to be more targets in the future. There are going to be other channels of data that we need to look through to identify other targets. The other limitation has been the sensitivity of the tools that we use to uncover those insights. We're starting to see improvements in those technologies, whether it's in proteomics, genomics, transcriptomics, or otherwise, that will help us uncover new targets. It may not always be as simple as inhibiting something at the end of the day. We may have to be more sophisticated in terms of what we do with those targets that we discovered. But I do think that there are a lot more of them out there.

Ray: Coming from the point of view of human genetics and human data, where do you hope to see the field 10 years from now?

Narimon: I hope the approach to managing a person and their cardiovascular risk will be far more personalized than it is today. If you look at the developments of the medicines that we've had, we apply them in broad brushstrokes. I'll take an example of a person who normally comes in at high cardiovascular risk. Let's say they had a stress test, and it was positive, but they aren't necessarily requiring a cardiac intervention at this stage. You put them on a classic series of medicines that will help manage their blood pressure, manage their lipids. Depending on the risk category, you may put them on antiplatelet agents as well. That's something that we apply broadly across all patients as if they're the same. I hope 10 years from now, we'll be able to do better than that based on a person's biology, their proteome, their genomic composition, as well as identify the particular channels of residual risk that are going to be the major drivers of cardiovascular events for them. Based on this holistic category of are you a primary prevention patient or a secondary prevention patient, there's so much more that we can do to verify those patient segments, and therefore tailor the therapy to them uniquely.

Ray: 10 years from now, I'll be 70 years old. So, it's 2032 and I go to the cardiologist visit or I go to my GP. How is that visit going to look different than the visit today? Do you envision they're going to pull out a chart that has my DNA sequence, and it's going to list all the variants I have and all the genes that influence cardiovascular biology? Going to my doctor 10 years from now, what's going to be different?

Narimon: First of all, there's going to be a lot more information that you're going to be armed with going into your cardiologist's office. Let's call it a risk score, or a standard test that's done. Instead of it just being a fasting lipid panel and standard blood chemistries, it will be a particular panel geared towards understanding your unique risk. And so maybe the conversation will be, based on this test result that's been validated, it looks like the three main drivers of cardiovascular risk for you, Ray, happen to be this, that, and the other thing. To best manage you, I'm going to put you on this, that, and the other medicine because that's what you're going to benefit from the most. The other medicines that had been generally applied before, maybe they're not going to be so well suited for you. So, you can dispense with the other, say, five or six medications that would have come to you in a bundle as just a generic management strategy for cardiovascular risk. That's not just going to help the interaction between the patient and the physician, it's going to benefit our healthcare delivery system as well, because this really points to more efficient management of the patient. Today, you don't have that opportunity.

Ray: This has been a lot of fun. The field is moving very fast, new targets are constantly being identified, and my group is developing medicines against them and handing them off to you. I look forward to continuing that partnership as we use human data to develop new medicines to have beneficial impact on people's cardiometabolic health. Thanks very much for joining me today.

Narimon: Thank you, Ray.

The Scientist: Thank you for listening to The Human Data Era, and thanks again to Narimon Honarpour, vice president of Global Development at Amgen. To dive further into this topic, please join Amgen scientists at the Human Data Q&A webinar discussion on November 16, 2022. Register for the event at the link provided in the episode notes.

By assessing human data, clinicians and drug developers can design therapies that address an individual's unique needs. In the next episode of The Human Data Era, we'll talk to Kári Stefánsson, founder and CEO of deCODE genetics, about the future of precision medicine. To keep up to date with this podcast, follow The Scientist on Facebook and Twitter, and subscribe to The Scientist's LabTalk wherever you get your podcasts.

Human Data: The Role of Human Diversity in Progressing Precision Medicine with Kári Stefánsson, M.D., founder, deCODE Genetics

Transcript

Episode 4: The Role of Human Diversity in Progressing Precision Medicine

The Scientist: Welcome to The Human Data Era, a special edition podcast series produced by The Scientist's Creative Services Team.

Ray: By understanding disease risk through the information found in a person's genome, scientists can develop more effective therapeutics and clinicians can treat their patients more effectively. In this episode, I talk to Kári Stefánsson, founder and CEO of deCODE Genetics, a company based in Reykjavik, Iceland, that collects and analyzes genealogical, medical, and genomic data at a national scale in order to identify variants that cause disease. We discuss his pioneering work in population-scale genetics, its applications in precision medicine and the healthcare system, and the difficult questions that access to these data raises.

Hi Kari, I'm really pleased to be with you here today. You're one of the great pioneers of using human genetics and discovering genes that influence phenotypes. Tell us a little bit about your background, and how it is that you gravitated towards becoming a human geneticist.

Kari: In my former life I was a neurologist and a neuropathologist, and besides my clinical work I was working on neurological diseases using molecular biology, protein biochemistry, and I wasn't particularly successful. Then one day, we isolated a protein from human brain, cloned the transcript, and sequenced the cDNA. When we localized the gene to a chromosome, it turned out to be in the middle of a disease gene, and that was the first step that I took towards human genetics. It became a very intense focus to figure out how we could go to the source information in the genome to figure out how to manage disease, particularly diseases of the brain.

Ray: You started out at the University of Chicago and then you moved to Harvard Medical School. When you made this decision to jump in with both feet into human genetics, you moved to Iceland, which to most people is not necessarily an obvious choice. Why don't you tell us a little bit about what drove that decision?

Kari: First of all, at the time it was really difficult to put together a large enough group to make a meaningful contribution to human genetics within a university. And secondly, I was absolutely convinced that the way to do human genetics would be to gather as much data as you could, both on the diversity in sequence and on phenotype, without having a focus on a particular disease or a particular phenotype. It turned out to be a reasonable approach because you could use data on individuals that were cases in one instance and controls in another. This was not just fairly effective, it was economical. In addition to that, Iceland is a founder population, which means that a relatively large percentage of the current population is accounted for by a relatively small number of ancestors. That means that sequence variants that were rare amongst the founders are likely to be relatively common in the current-day population. And this has turned out to be a particularly valuable attribute for us.

Ray: One of the things I've learned from you is the distinction between rare variants and common variants and why the rare variants are so useful from the point of view of target discovery. Can you give us a little bit of a primer on rare and common variants and what the particular utility of the rare variants tends to be?

Kari: The common variants are common either because they came with us out of Africa, or they have been under positive selection, or both. The rare variants are rare because they are recent (rarity and novelty are basically synonymous in this kind of human genetics), or they have been under negative selection, or both. It's interesting that the rare variants that we can find are almost invariably ones with very large effects, but that does not mean that there are not rare variants with small effects. The only ones we have power to find are the ones that have large effects. And most of the rare variants with large effect that we and others have been discovering up until now have been in the coding sequence, and when you have variants in the coding sequences, it's very likely that the effect of the variant is mediated through the protein encoded by the gene where the variant sits, but that is not without exceptions. For example, if you take the coding sequence variants in the ApoE gene, they affect levels of about 62 proteins in blood. So, it is fairly dangerous to assume that those variants necessarily mediate an effect through ApoE. We have a paper describing the whole genome sequencing of 150,000 genomes from the UK Biobank, and one of the things that we were focusing on there is to figure out whether whole genome sequencing, which is relatively expensive, makes sense when you can sequence the whole exome for just a fraction of the cost. It turns out that when you look at the 1% of the genome that is least tolerant to sequence diversity, about 87% of it is outside the coding sequence of genes. So it is in sequences that are not coding, but definitely have great functional importance.

Ray: I was flipping through an issue of Nature and saw a chart of the most productive scientists in the world. You were in the top five that they listed. You've been doing this now for 25 years, and you've made enormous numbers of findings studying human diversity. What's your favorite discovery?

Kari: The one that has me most excited is usually the one that we made most recently. But if I take a step back and ask what these 25 years of work have been telling us, the most interesting message is a definition of the challenge that is before us. There's an enormous amount of data on the sequence of the human genome available now, and there has been an absolutely astonishing number of attempts to correlate variants in the sequence with diseases. But I'm more impressed with what we have yet to discover, how little we have gotten out of this. We can sequence 14,500 whole genomes in one month, and the total capacity to sequence genomes in the world is enormous. But to figure out a way to make the information that lies in it meaningful, when it comes to shedding light on the nature of diseases and giving us an opportunity to treat them, we have to begin to bridge this gap. Human diversity does not just lie in the diversity in the sequence, our As, Cs, Gs, and Ts, it also lies in the interaction of the phenotype rooted in the sequence with our environment. The big task at hand now is to figure out systematic ways of capturing the environmental influences. And keep in mind that most common diseases are diseases of relatively late onset, and almost all of them have both genetic and environmental components. We have to put together some sort of a net to catch the environmental influences.

Ray: We've been working together to use the information you gather in human diversity to identify new drug targets, validate previously identified candidate drug targets, identify biomarkers, identify the optimal patients for clinical trials, and so on. For which of those do you think human diversity has the greatest long-term potential? Where do you see the most important application?

Kari: The most important application that is going to have a lasting impact on the industry will be stratifying patients into clinical trials and matching medicines to individuals. It will lie in the application of precision medicine. Human diversity is basically the output of an experiment that has been going on for 250,000 years since modern man arrived on the scene and has then been modified through random change and then selection. When you look at the ability that we have to read into this diversity in explicit detail, there is hardly any aspect of human biology or pathobiology that will not be elucidated in one way or another by using these methods. This approach is going to give us a lot of targets. It is also going to be incredibly important when it comes to clinical development.

Ray: When Amgen first acquired deCODE, the focus was on discovering new targets and validating targets. In cancer biology, there's been a similar approach of sequencing a lot of tumors, looking for mutant genes that might be drivers, and then developing medicines that target those driver genes. We've seen spectacular successes in oncology there. With human germline genetics, we've also identified targets that contribute to disease, but it's my impression at least that the drive is not as strong right now as what you see in oncology. Why do you think that is?

Kari: All cancer is rooted in some sort of a mutation of a large effect. Once you begin to develop a drug, the idea is to destroy the cancer. Once you find a mutation in a gene that is expressed in the heart and causes a disease there, the goal is not to destroy the heart, it is to modify it in some way or another. So, this is an unfair comparison. Cancer is always going to be ahead of other therapeutic areas when it comes to the use of mutations to direct treatment.

Ray: Human diversity can influence different aspects of human disease. One that I have in mind right now is the distinction between incidence and progression. Alzheimer's is a really interesting example of that. Faulty processing of the amyloid precursor protein underlies Alzheimer's disease: accumulation of the A-beta fragment is a driver of disease. That's disease incidence. Once that fragment is accumulating, in the realm of disease progression, targeting it has not been particularly successful. How do you think about disease incidence versus disease progression? Do you think that the same genes are involved in incidence and progression? Human genetics tends to reveal genes that control incidence. How do we go about finding progression genes, if they're going to be different?

Kari: It isn't necessarily certain that there are sequence variants that influence the progression. It is definitely possible that once the initiation has taken place, there are processes that are independent, at least of the genetics of onset, that take over, and there may not be an awful lot of sequence diversity behind differences in progression. Alzheimer's disease is an excellent example. The deposition of amyloid, which we look at as the initiation process, can be stopped. The amyloid plaque can be destroyed, but the cognitive decline of the disease continues. It is somewhat similar to what you see in chronic traumatic brain injury. The boxer who receives many blows to the head retires from boxing and receives no more blows to the head. Nevertheless, the cognitive decline continues. We have to look at the onset and the progression as distinct phenomena. We have recently been looking at it in the context of atherosclerotic cardiovascular disease, where we have been not just looking at genetic risk, so-called polygenic risk scores, but also looking at risk scores assessed with proteins in blood. The two of them seem to be somewhat uncorrelated. Once the atherosclerosis begins, there seem to be forces taking over that are independent of the onset, which is fascinating. It's going to give an awful lot of opportunities for making new discoveries about the nature of diseases that hopefully are going to lead to new methods to contain them.

Ray: What do you view as the major conceptual limitations, as opposed to technical limitations, of using human data and human genetics to drive the discovery and development of new medicines?

Kari: I would phrase the question differently: What are the conceptual limitations to understanding human biology and pathobiology? The subsequent discovery and development of drugs is a little bit of a different task. Where we have the greatest need for more data is in longitudinal data on people when they are healthy and then data as time passes and they begin to collect diseases. We are particularly in need of data like proteomic data, transcriptomic data, multi-omic data to map this path from normal function of an organ to an organ with a disease. In addition to that, we are in desperate need of more data on sequence diversity in people of other origins than European. We have very little data on people of African origin, Asian origin, etc. And we need that to be able to claim that we have the full diversity of the human genome.

Ray: When deCODE sequenced my genome close to five years ago now, you discovered a human disease susceptibility lurking in my genome. As gathering human data on people becomes more widespread, and susceptibilities are found in people's genomes as they're being sequenced, many people would prefer not to know. A poignant example is Nancy Wexler, who spearheaded a consortium to discover the Huntingtin gene. Her father had Huntington's disease, so she had a 50% chance of having it. When the gene was discovered, she did not want the test done on her DNA; she didn't want to know if she was going to get that disease. As doctors begin using human data more and more, how do we respect that right of a patient not to know if they have a particular susceptibility?

Kari: We have recently looked at about 200,000 genomes from the UK Biobank, from Iceland, from Denmark, from Northern Norway and Sweden, and about 4% of the population has a mutation that is actionable: a mutation in a gene that clinical geneticists have determined causes a serious disease that we can do something about. So, 4% of the population, according to our assessment, have a mutation that could lead to very serious disease or death if nothing is done about it. And the big question is, what do we do about it? Should we always report incidental findings of that sort? There is a law in Iceland that says that if you could save somebody's life, who has fallen, for example, into the harbor, it is your legal obligation to try to save them. It is probably much more likely that you're going to die from any of these mutations than you would die from falling into the harbor. It's a difficult dilemma. I know of one instance where an integrated delivery network in the United States offered people a chance to learn about variants in the genome that cause a serious disease, and only 9% responded to it. I personally feel that we should use these data, we should use them to make the healthcare system more effective, to decrease the burden of serious disease in society. And then the question is, is there a right not to know for the carriers of these mutations? Does that weigh heavier than the needs of the healthcare system and the desire to decrease the cost of healthcare? If it is a mutation that causes a disease that you can do something about, I am inclined to believe that it is our obligation to report.

Ray: Let's say I was a primary care physician, and my patient had a BRCA1 mutation. So, I know that they have a very high lifetime risk of getting breast cancer, but they have elected to not know anything that's embedded in their genes. Now I have to care for that patient knowing they have that risk. How do I appropriately care for that patient knowing they have this ticking time bomb that they don't want to know about? And how does that not influence my care?

Kari: We have a high-level overview of the genome of basically everyone in Iceland, and we could infer that there are about 2500 Icelanders who have a BRCA2 mutation. Women with a BRCA2 mutation have about 86% probability of developing breast cancer, ovarian cancer, or a serious cancer that may be lethal. According to the healthcare system's interpretation of the laws in Iceland, we were not allowed to report this to the carriers of the mutation. So, what we did is put up a website where people could ask whether they were carriers or not. And only about 25% of the people at risk had actually asked about their carrier status after a year and a half had passed. This is really a particularly pressing and important question. I don't think that there is only one answer to it; the answers are going to be very personal.

Ray: This highlights the importance of educating patients, scientists and providers about the role of genetic information in research and health care. What are some of the implications to consider when collecting and sharing this information?

Kari: Human genetics bridges the gap between biology and statistics and mathematics. When you generate an overlap between two disciplines, it opens up the possibility of spectacular discoveries. When you begin to put genetics or the study of human diversity into context, it clearly raises some philosophical questions. When it comes to our own health, do we have an obligation towards society? Can we be expected to know something about our weaknesses, our liabilities? Because if you don't, it's going to be costly to society. Another thing is, how should we deal with information that lies in the healthcare system and could be used to make discoveries that are being turned into methods to diagnose, treat, and prevent disease? So, when you go to a hospital with a disease, the probability that the hospital will be able to help you is solely dependent on the fact that those who came before you with the same disease allowed the information on them to be used to make discoveries. Should you have permission to take advantage of what is there because of the people who came before you, while you say, "No, you cannot use information on me to make further discoveries to advance the healthcare system"? These are complicated ethical questions that are becoming very important for society because we have started to look at access to health care as a human right. And the question is whether that right should come with obligations.

Ray: I'm going to conclude here by putting you on the spot for a couple of predictions. When do you think it will be the case, at least in developed countries, that when somebody is born, their genome will be sequenced, just as a matter of routine?

Kari: Probably within the next 10 years. We are being promised now by several companies that the cost of sequencing a genome next year will be down to $100. It is very likely, therefore, that within the next five years it will be down to $50 a genome. It is reasonable for the healthcare system to have the genome sequenced. Unfortunately, this prediction only applies to the developed parts of the world. When new technological advances come, they, in the beginning at least, increase the healthcare disparity. We have to begin to bridge that gap.

Ray: Let's say I go to the doctor tomorrow because I developed a peripheral neuropathy, for example. When do you think it will be the case that the first thing the doctor will do is consult my genomic sequence to see if there's any clue to this ailment that's lurking in my genes?

Kari: I think this will happen within the next five to 10 years. The diseases that are purely genetic are mostly the diseases of early age. The most common diseases in our society are late onset. There, you have the interplay between the environment and the genome. But still, when you look at the sequence of the entire genome, you can develop polygenic risk scores that are based on a large number of variants. And this is already beginning to happen. When patients show up in a doctor's office with a difficult diagnostic problem, it is becoming common that the doctor sends blood samples for DNA sequencing.

Ray: Kari, this has been an extremely enjoyable conversation. You are probably the person who's thought more about human diversity and its implications for human health and human phenotype than anybody else in the world. I look forward to our next conversation.

Kari: Thanks, Ray.

The Scientist: Thank you for listening to the final episode of The Human Data Era, and thanks again to Kári Stefánsson, founder and CEO of deCODE Genetics. To dive further into this topic, please join Amgen scientists at the Human Data Q&A webinar discussion on November 16, 2022. Register for the event at the link provided in the episode notes.

To keep up to date with this podcast and learn about future series, follow The Scientist on Facebook and Twitter, and subscribe to The Scientist's LabTalk wherever you get your podcasts.