Methodology article
Phenotype-genotype association grid: a convenient method for summarizing multiple association analyses
Daniel Levy1,2,3,4,5, Steven R DePalma6, Emelia J Benjamin2, Christopher J O’Donnell1,2,7, Helen Parise8, Joel N Hirschhorn6,9,10, Ramachandran S Vasan2, Seigo Izumo11 and Martin G Larson2,8
1National Heart, Lung, and Blood Institute, Bethesda, MD, USA
2National Heart, Lung, and Blood Institute’s Framingham Heart Study, Framingham, MA, USA
3Cardiology Division, Beth Israel-Deaconess Medical Center, Boston, MA, USA
4Division of Cardiology
5Department of Preventive Medicine, Boston University School of Medicine, Boston, MA, USA
6Department of Genetics, Harvard Medical School and Howard Hughes Medical Institute, Boston, MA, USA
7Division of Cardiology, Massachusetts General Hospital, Boston, MA, USA
8Department of Mathematics and Statistics, Boston University, Boston, MA, USA
9Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA
10Broad Center at Harvard and MIT, Cambridge, MA, USA
11Novartis Research Institute, Cambridge, MA, USA
Background
High-throughput genotyping generates vast amounts of data for analysis; results can be difficult to summarize succinctly. A single project may involve genotyping many genes with multiple variants per gene and analyzing each variant in relation to numerous phenotypes, using several genetic models and population subgroups. Hundreds of statistical tests may be performed for a single SNP, thereby complicating interpretation of results and inhibiting identification of patterns of association.
Results
To facilitate visual display and summary of large numbers of association tests of genetic loci with multiple phenotypes, we developed a Phenotype-Genotype Association (PGA) grid display. A database-backed web server was used to create PGA grids from phenotypic and genotypic data (sample sizes, means and standard errors, and P-values for association). HTML pages were generated using Tcl scripts on an AOLserver platform with an Oracle database and the ArsDigita Community System web toolkit. The grids are interactive: a mouse click on an individual cell displays its summary data (i.e., least squares means for a given SNP and phenotype under a specified genetic model and study sample). PGA grids can be used to visually summarize results of individual SNP associations, gene-environment associations, or haplotype associations.
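The grid construction described above can be illustrated with a minimal sketch. It is written in Python rather than the Tcl/AOLserver stack used for the actual server, and it is not the authors' implementation. The function names, the P-value cutoffs, the colors, and the use of a hover tooltip in place of the click-through summary are all assumptions made for illustration.

```python
from html import escape

# Hypothetical P-value cutoffs and colors; the source does not specify a scheme.
SHADES = [(0.001, "#b30000"), (0.01, "#e34a33"), (0.05, "#fdbb84")]
DEFAULT_SHADE = "#ffffff"


def shade_for(p_value):
    """Return the color of the first (smallest) cutoff the P-value falls below."""
    for cutoff, color in SHADES:
        if p_value < cutoff:
            return color
    return DEFAULT_SHADE


def build_pga_grid(results, phenotypes, snps):
    """Render an HTML table with one row per phenotype and one column per SNP.

    `results` maps (phenotype, snp) to a dict with keys 'p', 'n', 'mean', 'se'.
    Combinations that were not tested are rendered as empty cells.
    """
    header = "".join(f"<th>{escape(s)}</th>" for s in snps)
    rows = ["<table>", f"<tr><th></th>{header}</tr>"]
    for pheno in phenotypes:
        cells = [f"<th>{escape(pheno)}</th>"]
        for snp in snps:
            r = results.get((pheno, snp))
            if r is None:
                cells.append("<td></td>")
                continue
            color = shade_for(r["p"])
            # Tooltip stands in for the interactive per-cell summary display.
            detail = escape(f"n={r['n']}, mean={r['mean']}, SE={r['se']}, P={r['p']}")
            cells.append(f'<td style="background:{color}" title="{detail}"></td>')
        rows.append("<tr>" + "".join(cells) + "</tr>")
    rows.append("</table>")
    return "\n".join(rows)
```

Each cell is shaded by the strength of the association, so patterns across phenotypes and SNPs are visible at a glance while the underlying summary statistics remain one interaction away.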
Conclusion
The PGA grid, which permits interactive exploration of large numbers of association test results, can serve as an easily adapted, common display format for large-scale genetic studies. Adopting such a common format would reduce the problem of publication bias and would simplify the task of summarizing large-scale association studies.
BMC Genetics 2006, 7:30. This is an Open Access article distributed under the terms of the Creative Commons Attribution License.