Disambiguate


Resolve unstructured text affiliations to GRID ids

The most prevalent source of institutional references is the author affiliations found on scientific articles. Typically, authors of original research articles and conference proceedings annotate papers with their department, organisation and address. Not only does this allow readers to recognise the origin of the research, but it also serves as a mechanism to aggregate and assess scientific output in order to provide science metrics.

However, when acquiring and processing large quantities of author affiliations, it becomes apparent that significant variation in the format and structure prevents effective aggregation and reporting. This, coupled with changes in name and institutional structure over time, makes large-scale integration of such data prohibitively expensive, given the manual effort required to properly disambiguate each affiliation.

GRID provides an automatic disambiguation service to overcome these challenges. By exploiting the wealth of data we have acquired during the processing of award and publication data, together with our extensive database of institutions, we are able to provide algorithmic matching of author affiliation strings to institutions.

Contact us at contact@grid.ac to find out more.


Examples

Location aware disambiguation

Our disambiguation algorithm recognises geographic locations in the affiliation string to select the correct institution even if the name is ambiguous.

Washington University School of Medicine, St. Louis, Missouri, USA

Department of Medicine, University of Washington, Seattle, Washington, USA
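
To illustrate how a location can break a tie between similarly named institutions, here is a minimal sketch. It is not GRID's actual algorithm, which this page does not describe. The candidate records, the placeholder ids and the scoring rule (name-token overlap plus a bonus when the candidate's city appears in the string) are all assumptions made for the example.

```python
import re

# Hypothetical candidate records; the ids are placeholders, not real GRID ids.
CANDIDATES = [
    {"id": "placeholder-A", "name": "Washington University in St. Louis", "city": "St. Louis"},
    {"id": "placeholder-B", "name": "University of Washington", "city": "Seattle"},
]

STOPWORDS = {"of", "in", "the", "and"}


def tokens(text):
    """Return the set of lowercase word tokens, with stopwords removed."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def resolve(affiliation):
    """Pick the candidate with the best combined name and location score."""
    aff_tokens = tokens(affiliation)
    best, best_score = None, 0.0
    for cand in CANDIDATES:
        name_tokens = tokens(cand["name"])
        # Fraction of the candidate's name tokens found in the affiliation.
        name_score = len(name_tokens & aff_tokens) / len(name_tokens)
        # Assumed rule: add a fixed bonus when the candidate's city is mentioned.
        city_bonus = 1.0 if cand["city"].lower() in affiliation.lower() else 0.0
        score = name_score + city_bonus
        if score > best_score:
            best, best_score = cand, score
    return best
```

For the first example above, both candidates match equally well on name tokens alone, so the city mention is what separates them in this sketch.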

Multilingual names

GRID has the ability to process affiliations in languages other than English.

Kirchhoff-Institut für Physik, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany

Departamento de Fisica, Pontificia Universidad Católica de Chile, Santiago, Chile

Dipartimento di Fisica e Astronomia, Università di Bologna, Bologna, Italy
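
One common preprocessing step for non-English affiliations is to fold case and strip diacritics, so that spellings such as "Fisica" and "Física" compare equal. The sketch below shows that step only, using Python's standard `unicodedata` module. Whether GRID applies this kind of normalisation is an assumption, and the `fold` helper is hypothetical.

```python
import unicodedata


def fold(text):
    """Case-fold the text and remove combining accents (e.g. 'ó' becomes 'o')."""
    # NFKD splits accented characters into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def same_name(a, b):
    """Compare two institution names while ignoring case and diacritics."""
    return fold(a) == fold(b)
```

A call such as `same_name("Pontificia Universidad Católica de Chile", "PONTIFICIA UNIVERSIDAD CATOLICA DE CHILE")` would treat the two spellings as the same name.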

Multiple institutes

GRID covers many institution types, allowing it to identify multiple organisations in a single affiliation string.

Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, US

University of Washington, Center for Integrative Brain Research, Children’s Hospital, Seattle, WA, 98101 USA

MRC Lifecourse Epidemiology Unit, University of Southampton and University Hospital Southampton NHS Foundation Trust, Southampton, SO16 6YD, United Kingdom
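
As a rough sketch of returning several organisations from one string, the example below scans an affiliation for every name in a small lookup table and reports the matches in the order they appear. The table contents, the placeholder ids and the simple substring matching are assumptions for illustration, not a description of GRID's service.

```python
# Hypothetical lookup of known names; the ids are placeholders, not real GRID ids.
KNOWN_NAMES = {
    "brigham and women's hospital": "placeholder-1",
    "harvard medical school": "placeholder-2",
    "mrc lifecourse epidemiology unit": "placeholder-3",
    "university of southampton": "placeholder-4",
    "university hospital southampton nhs foundation trust": "placeholder-5",
}


def find_institutions(affiliation):
    """Return every known name found in the string, in order of appearance."""
    # Lowercase and normalise the typographic apostrophe to a plain one.
    text = affiliation.lower().replace("\u2019", "'")
    hits = [(text.find(name), name) for name in KNOWN_NAMES if name in text]
    return [name for _, name in sorted(hits)]
```

A production matcher would need to handle overlapping names and partial matches, which this sketch deliberately ignores.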

Name variants

An extensive hand-curated list of mappings from real datasets ensures GRID has many name variations to match on.

The Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA

Queen Mary and Westfield College, University of London, London, E1 4NS, UK

Astrophysics Group, Cavendish Laboratory, J J Thomson Avenue, Cambridge CB3 0HE
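
A curated variant list can be pictured as a table from known name variants to a canonical institution name. The sketch below is a toy version: the three mappings, the `normalise` helper and the substring lookup are assumptions chosen to fit the examples above, not GRID's actual data or matching logic.

```python
import re

# Illustrative variant-to-canonical mappings (assumed, not taken from GRID).
VARIANTS = {
    "broad institute of harvard and mit": "Broad Institute",
    "queen mary and westfield college": "Queen Mary University of London",
    "cavendish laboratory": "University of Cambridge",
}


def normalise(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def canonical_names(affiliation):
    """Return the canonical name for each known variant found in the string."""
    norm = normalise(affiliation)
    return [canon for variant, canon in VARIANTS.items() if variant in norm]
```

Under these assumed mappings, the Cavendish Laboratory example resolves to its parent university even though the affiliation string never names the university.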