deWaard, Anita (ELS) wrote:
A quick question that I was hoping this forum might have some thoughts
on: we are looking for a new editing tool for our life science thesaurus
EMTREE (proprietary, multi-faceted polyhierarchical, 260 k terms (50 k
preferred, 210 k+ synonyms), > 10,000 nodes) and I am trying to convince
the thesaurus department to go to an RDF-based editor. I was wondering
if anyone had any thoughts on
a- the best professional-grade ontology editor to use (serious
alternatives to Protege?), and
b- the best arguments to convince my company to start using RDF, both
internally and externally.
I would like to address b), i.e. the WHY question.
There are several benefits to a semantic web approach:
1) Interoperability and reuse: The use of RDF should increase
interoperability and reuse within your company. Once your data/knowledge
is in RDF/OWL, a steadily growing number of tools are available to
query, manipulate, browse, and visualize it. In the 'internal use'
scenario, the use of standards that bring "interoperability" can result
in a common vocabulary for implementers, architects, and domain experts
within the company - this is already quite something!
2) Knowledge capture: semantic web tools are self-documenting in the
sense that you are able to 'look up' the semantics of both data and
queries. Semantic web technologies can expose precisely the sort of
semantics that are often 'locked up' in the code of a programming
language. For example, a query can be hand-coded in a programming
language for speed, but it is then much harder to read than the
equivalent SPARQL.
[Note that 1) and 2) can make personnel changes less traumatic - exposed
semantics simplify reengineering and reuse.]
3) Reasoning: Reasoners can leverage the inherent semantics in a query,
for example, by 'expanding' and 'contracting' queries for you, making
use of background knowledge that is often too difficult to include in
the query itself.
4) Dissemination = robustness?: If your thesaurus is made public in an
RDF format, it will likely be used and referred to more frequently than if it
remains proprietary. Suggestions for improvements can then come from
outside as well as inside the company (as long as your company provides
a way to channel such information).
5) Clarification from formalization: I believe that the process of
formalization used to build an ontology can clarify murky issues and
improve the semantic models themselves. In the case of the life
sciences, the semantic models are often implicit in the text of a
document, or worse, in a researcher's brain. If semantic models can be
disclosed by choosing/defining the terms to describe a scientific
experiment, for example, this can *expose* the often implicit
assumptions that are necessary for the experiment to succeed.
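
To make point 3) a bit more concrete, here is a minimal sketch of query
'expansion' over a thesaurus. It is plain Python rather than a real
reasoner, and the terms, the NARROWER and SYNONYMS tables, and the
expand() function are all hypothetical illustrations (not EMTREE data or
any library's API). The idea is only that background knowledge lives in
the thesaurus, not in the query.

```python
# Hypothetical toy thesaurus (illustrative only, not EMTREE content).
# Maps a preferred term to its directly narrower preferred terms.
NARROWER = {
    "neoplasm": ["carcinoma", "sarcoma"],
    "carcinoma": ["adenocarcinoma"],
}

# Maps a preferred term to its synonyms.
SYNONYMS = {
    "neoplasm": ["tumor", "tumour"],
}


def expand(term):
    """Return the term, all transitively narrower terms, and their synonyms."""
    visited = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # guards against cycles and polyhierarchy duplicates
        visited.add(current)
        stack.extend(NARROWER.get(current, []))

    expanded = set(visited)
    for preferred in visited:
        expanded.update(SYNONYMS.get(preferred, []))
    return sorted(expanded)


if __name__ == "__main__":
    # A search for "neoplasm" is widened without the user listing subtypes.
    print(expand("neoplasm"))
```

A user who searches for "neoplasm" never has to spell out the subtypes
or spelling variants; in an RDF/OWL setting a reasoner or a SPARQL
property path would play the role of expand() here.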
-scott
p.s. The term 'semantic web' seems to mean different things to different
people. It reminds me of people using 'AI' to refer to (all or some of):
rule-based systems, logic, knowledge representation, machine learning,
theorem provers, game players, scheduling algorithms, natural language
processing, machine intelligence, machine consciousness (!?), etc.
--
M. Scott Marshall
http://staff.science.uva.nl/~marshall
http://integrativebioinformatics.nl/