
Just catching up.

I would like to strongly support the following. If there is one repeatedly confirmed lesson from the medical community's experience with large terminologies/ontologies, it is to separate the "terms" from the "entities". There are always linguistic artefacts, and language changes more fluidly in both time and space than the underlying entities do. (In medical informatics this is sometimes quaintly phrased as using "nonsemantic identifiers".)
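As a minimal sketch of what that separation might look like in code: each entity is keyed by an opaque identifier that carries no meaning, and terms are attached to it as replaceable labels. The class names, the identifier "E0001", and the sample terms are all illustrative assumptions, not taken from any particular terminology system.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Entity:
    """An entity identified only by an opaque, meaning-free identifier."""
    entity_id: str
    # Language tag -> terms currently used for this entity in that language.
    terms: Dict[str, Set[str]] = field(default_factory=dict)


class Terminology:
    """Keeps entities and the terms that point at them as separate things."""

    def __init__(self) -> None:
        self._entities: Dict[str, Entity] = {}
        # Case-folded term -> identifiers of the entities it may denote.
        self._index: Dict[str, Set[str]] = {}

    def add_entity(self, entity_id: str) -> None:
        self._entities[entity_id] = Entity(entity_id)

    def add_term(self, entity_id: str, term: str, lang: str = "en") -> None:
        # Adding or changing a term never touches the entity identifier.
        entity = self._entities[entity_id]
        entity.terms.setdefault(lang, set()).add(term)
        self._index.setdefault(term.casefold(), set()).add(entity_id)

    def lookup(self, term: str) -> Set[str]:
        """Return the identifiers of all entities the term may refer to."""
        return set(self._index.get(term.casefold(), set()))


# Hypothetical example data: one entity, several terms across languages.
terminology = Terminology()
terminology.add_entity("E0001")
terminology.add_term("E0001", "myocardial infarction")
terminology.add_term("E0001", "heart attack")
terminology.add_term("E0001", "infarctus du myocarde", lang="fr")
```

Because the identifier says nothing about the entity, terms can be added, retired, or translated as usage shifts without disturbing anything that refers to "E0001".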



On 5 Jul 2006, at 22:43, William Bug wrote:

By the way, the "mapping" I refer to above, linking instance data wherever it may reside (primary data repositories, pooled/analyzed/interpreted data, the scientific literature) to entities in the ontologies, requires reference to the lexicon - the TERMS used to describe the ontological fundamentals by the scientists reporting them. This is true whether an algorithm or a human is trying to understand and interpret a collection of instance data in the context of the relevant knowledge framework, even if that framework resides in the head of the human researcher.

I like to think of this distinction as being very coarsely analogous to the distinction between the physical data model in an RDBMS and the many tools used to make that more abstracted, normalized collection of related entities directly useful for specific applications - e.g., SQL SELECT statements, VIEWs, and/or Materialized VIEWs. Maintaining these as distinct elements goes a long way toward ensuring the abstraction is re-usable for a large set of applications, while simultaneously being able to support each application's detailed requirements through custom de-normalization.
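The database side of this analogy can be sketched with Python's built-in sqlite3 module: two normalized base tables hold the reusable model, and a VIEW gives one application the flattened shape it wants. The table, column, and view names and the sample rows are assumptions made up for illustration only.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Normalized base tables: the reusable, application-neutral model.
CREATE TABLE entity (
    entity_id TEXT PRIMARY KEY
);
CREATE TABLE term (
    term_id   INTEGER PRIMARY KEY,
    entity_id TEXT NOT NULL REFERENCES entity(entity_id),
    label     TEXT NOT NULL,
    preferred INTEGER NOT NULL DEFAULT 0
);

-- An application-specific, de-normalized view over the same data.
CREATE VIEW entity_labels AS
    SELECT e.entity_id,
           MAX(CASE WHEN t.preferred = 1 THEN t.label END) AS preferred_label,
           COUNT(t.term_id) AS term_count
    FROM entity AS e
    LEFT JOIN term AS t ON t.entity_id = e.entity_id
    GROUP BY e.entity_id;
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO entity (entity_id) VALUES ('E0001')")
conn.executemany(
    "INSERT INTO term (entity_id, label, preferred) VALUES (?, ?, ?)",
    [("E0001", "myocardial infarction", 1), ("E0001", "heart attack", 0)],
)

# The application queries the view and never needs to know the base layout.
rows = conn.execute(
    "SELECT entity_id, preferred_label, term_count FROM entity_labels"
).fetchall()
```

Another application could define a different view over the same two tables without either the tables or the first view having to change.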

This is why I like to keep the lexicon distinct from the ontology. They are intimately linked. No ontology is free of lexical artifacts (I'm not certain it can or should be), any more than a lexical graph can be assembled without representing semantic relations. Analysis of the lexicon can inform how to adapt the semantic graph in the ontology, making it more commensurate with the current state of knowledge as expressed by domain experts. Likewise, review of term use in the context of the ontology can be a great help in creating effective, structured, controlled terminological resources. However intimately they may be linked, the two types of knowledge resource are constructed via different processes, support different use cases, and rely on different fundamental relations at their core.

Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
