Most downloaded review Q1 2009: Target discovery from data mining approaches

Stephen Carney

The most downloaded review article in Drug Discovery Today for the first quarter of 2009 deals with target discovery from the informatics perspective. It would be difficult to overstate the value to pharma of identifying and validating the most relevant therapeutic targets. In this article, Yang, Adelstein and Kassis outline text mining, its value and limitations, and its application to target discovery. They also cover emerging and integrated data mining approaches.

One of the most critical steps in developing a new drug discovery programme is target identification and validation. History has taught us that one of the major factors responsible for failure in therapeutic programmes is inappropriate target selection [1,2,3]. In the pharma industry, ‘target’ is a relatively vague term that covers a number of classes of biological molecule, including proteins, genes, RNA and sugars, inter alia. To be useful pharmaceutically, such a target needs to be ‘druggable’: it must be accessible to putative drug molecules (whether small organic molecules or larger biological therapeutics) and bind them in such a way that a beneficial biological effect is produced. In addition to producing that beneficial effect, the bound drug molecule may also affect the levels of disease biomarkers, biological pathways and crucial ‘nodes’ on a regulatory network. The availability of such ‘surrogate biomarkers’ may be of great importance to the success of a drug discovery effort. Validation of disease biomarkers is therefore an important part of the target validation process.

Target discovery can be grouped into two categories: a system approach and a molecular approach [1]. The system approach selects targets through the study of diseases in whole organisms, using information derived from clinical trials and in vivo animal studies. The molecular approach, the mainstream of current target discovery strategies [3,4], is geared towards identifying ‘druggable’ targets whose activities can be modulated through interactions with small molecules, proteins or antibodies. Presently, the majority of ‘druggable’ targets are G-protein-coupled receptors (GPCRs) and protein kinases. Because the biological mechanisms of human diseases are complex, the crucial task in target discovery is not only to identify, prioritize and select reliable ‘druggable’ targets but also to understand the cellular interactions underlying disease phenotypes, to provide predictive models and to construct biological networks for human diseases [1]. This requires extensive gathering and filtering of a multitude of heterogeneous data and information.

An exponential increase in biomedical data and information has accompanied the so-called ‘omics era’. For instance, MEDLINE/PubMed currently contains more than 18 million literature abstracts, and more than 60,000 new abstracts are added monthly. Analogously, the number of databases warehousing chemical, genomic, proteomic and metabolic data is growing rapidly and has been estimated to double every two years. This wealth of biological data and information presents immense new opportunities for target discovery in support of the drug discovery pipeline [3]. In pace with the growth of biological databases, the flourishing of bioinformatics has changed the way target discovery is conducted, especially through data mining approaches that extract or filter valuable targets by combining biological ideas with computer tools or statistical methods. Currently, text mining of literature databases and microarray data mining are the two prevailing approaches to target discovery [4]. With the recent development of high-throughput proteomics and chemical genomics, two further data mining approaches, proteomic data mining and chemogenomic data mining, have surfaced. To keep up with new scientific discoveries, there is a clear need to develop efficient data mining methods to fuel target discovery in the post-genomics era.

Target discovery from data mining approaches.
Yang Y, Adelstein SJ, Kassis AI.

Drug Discovery Today (2009) 14 (3–4), 147–154

1 Lindsay, M.A. (2003) Target discovery. Nat. Rev. Drug Discov. 2, 831–838
2 Sams-Dodd, F. (2005) Target-based drug discovery: is something wrong? Drug Discov. Today 10, 139–147
3 Butcher, S.P. (2003) Target discovery and validation in the post-genomic era. Neurochem. Res. 28, 367–371
4 Sakharkar, M.K. and Sakharkar, K.R. (2007) Targetability of human disease genes. Curr. Drug Discov. Technol. 4, 48–58
