Wednesday, 22 October, 2014

Welcome to Orphadata

The mission of Orphadata is to provide the scientific community with a comprehensive, high-quality and freely-accessible dataset related to rare diseases and orphan drugs, in a reusable format.

For more information on the XML file format, see the user's guide.

See "How the data are produced"

Description of the freely-accessible dataset


The dataset is a partial extraction of the data stored in Orphanet, which is also accessible at www.orpha.net for consultation purposes only.
This freely-accessible dataset is available in seven languages (English, French, German, Italian, Portuguese, Spanish and Dutch). It includes:

  • An inventory of rare disorders indexed with OMIM, ICD-10, UMLS, MeSH and MedDRA.
    • What's new?
    • Types of disorders: Disorders in the database comprise a heterogeneous typology of entities of decreasing extension: groups of disorders, disorders and sub-types. A "rare disorder" in the database can be a disease, a malformation syndrome, a clinical syndrome, a morphological or biological anomaly, or a particular clinical situation (in the course of a disorder).
    • Flags of disorders: A flag is a numerical indication attached to an element of the database in order to allow information to be retrieved (e.g. to find the list of deprecated entries in Orphanet).
    • New relations between disorders: We identify "Moved to" relationships between entities in the database when a deprecated disorder is part of another.
    • Characterisation of the alignments between disorders and external terminologies or resources: OMIM, ICD-10, MeSH, UMLS and MedDRA. These alignments are further characterised, specifying whether the terms are perfectly equivalent (exact mapping) or not (other kinds of relationships: from broader to narrower, or from narrower to broader).
  • Linearisation of disorders
    • Disorders can be multi-classified in Orphanet classifications. For analysis purposes, each disorder is attributed to a preferred classification by linking it to the head of classification entity. As some decisions could be made somewhat arbitrarily, we have written a set of rules to make sure attributions are consistent. The methodology is described here.
  • Genes in Orphanet are cross-referenced with Orphanet diseases and indexed with HGNC, OMIM, GenAtlas, UniProtKB, Ensembl, IUPHAR-DB and Reactome. The relationship between a gene and a disease is qualified according to the role that the gene plays in the pathogenesis of the disease.
    • What's new?
    • New information concerning genetic entities in the database:
      • Type of genetic entity: gene with protein product, locus, or non-coding RNA
      • Their chromosomal location
      • New gene-disease relationships: gain-of-function and loss-of-function disease-causing germline mutations.
  • A classification of rare diseases established by Orphanet, based on published expert classifications
  • Epidemiology data related to rare diseases in Europe (class of prevalence, average age of onset, average age at death) extracted from the literature
    • What's new?
    • Major changes are introduced regarding epidemiological data:
      • Epidemiological figures: Point prevalence, birth prevalence, lifetime prevalence and incidence, or the number of cases/families reported will be available together with their respective intervals and their geographical area (country, continent).
      • Classes of age of onset and age at death are indicated more precisely, and more than one can be assigned (instead of only one in the former version). Inheritance categories have also been revised to include complex forms of inheritance and to fill gaps such as Y-linked inherited diseases.
  • A list of signs and symptoms associated with each disease, and their class of frequency within the disease
  • The list of Orpha signs and symptoms used to annotate the diseases, cross-referenced with other nomenclatures: HPO, PhenoDB, LDDB.

For more information, see the user's guide.
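
As an illustration of how the characterised alignments described above might be used, the following minimal Python sketch keeps only the exact mappings to one external terminology. The element names, attribute names, relation labels and codes in the sample are placeholders invented for this example; they are assumptions and do not reflect the actual Orphadata XML schema, which is documented in the user's guide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Tag names, attribute names, relation labels
# and codes are illustrative assumptions, NOT the real Orphadata schema.
SAMPLE = """
<Disorders>
  <Disorder id="D1">
    <Name>Example disorder A</Name>
    <Alignment source="ICD-10" code="X00.0" relation="exact"/>
    <Alignment source="MeSH" code="M000001" relation="narrower-to-broader"/>
  </Disorder>
  <Disorder id="D2">
    <Name>Example disorder B</Name>
    <Alignment source="ICD-10" code="X00.1" relation="broader-to-narrower"/>
  </Disorder>
</Disorders>
"""


def exact_alignments(xml_text, source):
    """Return {disorder name: code} for alignments to `source` marked exact."""
    root = ET.fromstring(xml_text)
    result = {}
    for disorder in root.iter("Disorder"):
        name = disorder.findtext("Name")
        for alignment in disorder.iter("Alignment"):
            # Keep only perfectly equivalent terms; skip broader/narrower links.
            if (alignment.get("source") == source
                    and alignment.get("relation") == "exact"):
                result[name] = alignment.get("code")
    return result


if __name__ == "__main__":
    print(exact_alignments(SAMPLE, "ICD-10"))
```

Filtering on the relation type in this way avoids treating a broader or narrower term as an equivalent code when data are exchanged between terminologies.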

Only non-nominative data are accessible in accordance with personal data protection laws.
The dataset is updated once a month. The date of the last release is indicated.

See "About Orphadata" for more information.

How to quote

When quoting Orphanet, please use the following format:

Orphanet: an online rare disease and orphan drug data base. © INSERM 1997.
Available on http://www.orpha.net. Accessed [date accessed].


When quoting Orphadata, please use the following format:

Orphadata: Free access data from Orphanet. © INSERM 1997.
Available on http://www.orphadata.org. Data version [e.g. XML data version].


If you wish to use one of our logos, please make a request via the contact form.