Predicting Outcome In Breast Cancer: The Search For A Robust List Of Predictive Genes
Eytan Domany, The Henry J. Leir Professorial Chair, Department of Physics of Complex Systems, The Weizmann Institute of Science, Rehovot, Israel
Predicting the prognosis and metastatic potential of breast cancer at the time of discovery is a major challenge in current clinical research. Numerous recent studies have searched for gene expression signatures that outperform traditionally used clinical parameters in outcome prediction. Such a signature would spare many patients the suffering and toxicity of adjuvant chemotherapy, which current protocols prescribe even to patients who do not need it. A reliable set of predictive genes would also contribute to a better understanding of the biological mechanism of metastasis. Several groups have published ranked lists of predictive genes and reported good predictive performance based on them. However, the gene lists obtained by different groups for the same clinical types of patients differed widely and had only very few genes in common, raising doubts about the reliability and robustness of the reported lists. The main source of the problem was shown to be [1] the strongly fluctuating correlation of each single gene's expression with outcome, on which the ranking was based. The underlying biological reason is the heterogeneity of the disease; to stabilize the ranked gene list, a much larger number of samples (patients) is needed than has been used so far. We introduced [2] a novel mathematical method, PAC ranking, for evaluating the robustness of such rank-based lists. For several published datasets we calculated the number of samples needed to achieve any desired level of reproducibility. For example, to achieve a typical overlap of 50% between two predictive gene lists, breast cancer studies would need the expression profiles of several thousand early-discovery patients.
[1] L. Ein-Dor, O. Zuk and E. Domany, PNAS 103, 5923 (2006)
[2] L. Ein-Dor, I. Kela, G. Getz, D. Givol and E. Domany, Bioinformatics 21, 171 (2005)
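The instability described in the abstract can be illustrated with a small simulation. The sketch below is not the PAC ranking method and uses no real patient data: it draws synthetic Gaussian "expression" values, ranks genes by the absolute correlation of their expression with a binary outcome, and measures how many genes two independent cohorts share in their top-k lists. All names and parameter values (top_genes, mean_overlap, n_informative, max_effect, k and so on) are hypothetical choices made for illustration.

```python
import numpy as np


def top_genes(expr, outcome, k):
    """Indices of the k genes whose expression has the largest
    absolute Pearson correlation with the outcome."""
    xc = expr - expr.mean(axis=0)
    yc = outcome - outcome.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = (xc * yc[:, None]).sum(axis=0) / denom
    return set(np.argsort(-np.abs(corr))[:k].tolist())


def mean_overlap(n_patients, n_genes=1000, n_informative=200,
                 max_effect=0.5, k=70, n_trials=10, seed=0):
    """Average fraction of genes shared by the top-k lists of two
    independent synthetic cohorts of n_patients each."""
    rng = np.random.default_rng(seed)
    # Assumption: many genes carry weak, graded signal; the rest are noise.
    beta = np.zeros(n_genes)
    beta[:n_informative] = np.linspace(max_effect, 0.0, n_informative)
    overlaps = []
    for _ in range(n_trials):
        lists = []
        for _ in range(2):  # two independent cohorts
            outcome = rng.choice([-1.0, 1.0], size=n_patients)
            noise = rng.standard_normal((n_patients, n_genes))
            expr = noise + outcome[:, None] * beta
            lists.append(top_genes(expr, outcome, k))
        overlaps.append(len(lists[0] & lists[1]) / k)
    return float(np.mean(overlaps))


if __name__ == "__main__":
    for n in (50, 200, 2000):
        print(n, "patients per cohort -> mean overlap", mean_overlap(n))
```

In this toy model the overlap grows with cohort size because the sampling noise in each gene's correlation shrinks while the differences between genes stay fixed. The printed values depend entirely on the assumed effect sizes and should not be read as estimates for any real dataset.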