Many thanks for weighing in directly on this discussion, Jack. I
should have cc'd you myself on this.
Clearly the presentation you cite, which discusses the following issue, is
one of great relevance to biomedical semantic web projects:
"Where ontological entities carry information resources that speak to
the same subject, then those entities are merged into a single
subject map entity -- a subject proxy -- regardless of conflicts
between messages conveyed."
Such a capability will be critical to merging the top-down
(ontological) and bottom-up (SW) approaches to KR. Terminologies -
and TMRM - clearly play a critical role in the intervening layers
uniting these approaches.
Cheers,
Bill
On Jun 7, 2006, at 1:21 PM, Jack Park wrote:
Brief comment regarding the topic mapping work Bill mentions here.
Doug Bowden recently participated in a workshop on "Ontology
Federation" here at SRI. We had several speakers from both
bioinformatics and the topic mapping world: Douglas Bowden and Peter
Karp, and Steve Newcomb, Patrick Durusau, and myself. Vinay
Chaudhri opened the workshop and participated. Richard Fikes spoke
from the perspective of the KR community. KR lies, of course, at
the roots of the work products we all create.
We are moving away from the XTM topic mapping specification, and
into the TMRM [1] topic maps reference model, the product of which
we now call "subject maps" to distinguish topic maps from a
slightly different paradigm. Subject maps are no longer constrained
by a preselected ontology (XTM) and can be implemented using key/
value properties of the author's choice. This permits authors to
create subject maps that can mimic any frame-like language chosen,
including, I suppose, OWL.
There exists a necessary and important tension between the use
cases of traditional KR and those for which the topic mapping
paradigm has been created and shown useful. When Bill mentions
"more semantic web compliant", I would ask questions derived from
two important use cases. The two use cases do not circumscribe the
entire field of KR, but they serve as place holders to delimit a
useful discussion between ontologists and subject mappers. I will
argue that both ontologies and subject maps are valuable, and they
can serve users together. The two use cases about which I speak are:
1- accurately answering questions according to some authority
2- understanding some universe of discourse, even where
conflicting world views exist
Use of authoritative ontologies is clearly the domain of question
answering. Understanding some universe of discourse is also rightly
the domain of ontologies, but here, subject maps offer the
opportunity to "federate" disparate world views into a unified
framework organized around subjects. Where ontological entities
carry information resources that speak to the same subject, then
those entities are merged into a single subject map entity -- a
subject proxy -- regardless of conflicts between messages conveyed.
There are benefits to be derived from such merging operations.
Patrick Durusau and I spoke to this topic in a teleconference to
the Ontolog community [2], and slides and an mp3 of the talk are
available. There will be other papers released soon on these
opportunities.
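
A minimal sketch of the merging behaviour described above, in Python. It assumes each ontological entity is a plain dict of key/value properties and that a hypothetical "subject" key says what the entity is about. The function name and the sample data are illustrative only; they do not come from the TMRM or from any real ontology, and real subject identity rules are richer than a single key match.

```python
from collections import defaultdict


def merge_proxies(entities, subject_key="subject"):
    """Merge entities that share a subject identifier into one proxy each.

    Conflicting values for the same property are all kept (as a set)
    rather than resolved in favour of one source.
    """
    proxies = defaultdict(lambda: defaultdict(set))
    for entity in entities:
        subject = entity[subject_key]
        for key, value in entity.items():
            if key != subject_key:
                proxies[subject][key].add(value)
    # Convert to plain dicts so the result is easy to inspect.
    return {subject: dict(props) for subject, props in proxies.items()}


# Hypothetical entities from two sources that disagree about one subject.
entities = [
    {"subject": "substantia-nigra", "source": "ontology-A", "part_of": "midbrain"},
    {"subject": "substantia-nigra", "source": "ontology-B", "part_of": "basal ganglia"},
    {"subject": "red-nucleus", "source": "ontology-A", "part_of": "midbrain"},
]
merged = merge_proxies(entities)
```

Here the proxy for "substantia-nigra" carries both "part_of" values, so the conflicting world views sit side by side instead of one overwriting the other.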
It was in the spirit of this federation opportunity that Doug
Bowden and I first spoke. To be "semantic web compliant", it is
always possible for our subject map portal to carry plenty of RDF
metadata. It remains to be answered whether the goal of such
metadata is to accurately answer specific questions, or to just
advertise the presence of world views.
Bioinformatics, in all of its many manifestations, I strongly
believe, will benefit from collaborations between ontologists and
subject mappers.
Jack
[1] http://www.isotopicmaps.org/tmrm/
[2] http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2006_04_27
William Bug wrote:
Hi All,
Sorry - I'd thought I'd already subscribed to this list, but
apparently not - until now.
The need for a mereotopologically-sound, neuroanatomical ontology
is quite pressing across the community of neuroscientists involved
in neuroinformatics projects, most of which include a neuroimaging
component. Generally there is only one thing neuroscientists are
interested in when analyzing images at whatever resolution from
the macromolecular (EM) on up to the macroscopic - i.e.,
identifying biologically relevant shapes. In order for these
shapes to have any meaning in a context where one attempts to pool
data and perform relevant data reduction operations, the shapes
must exist within a shared coordinate space of some sort. For
instance, if two separate labs are examining the change in the
size of the Substantia Nigra during the course of Parkinsonian
neurodegeneration, in order for them to compare their
observations, they require several data integration/semantic
frameworks:
- a shared neuroanatomical terminology
- a shared coordinate space (to place the shapes from their
images in a comparable coordinate framework)
- a shared, well-founded anatomical ontology which
encapsulates mereotopological knowledge about shapes in - at least
- 3D space.
Other knowledge resources can be helpful in supplementing this
array of tools, but, generally, these are the absolute minimum.
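
To illustrate why the shared coordinate space matters for the size comparison above, here is a rough Python sketch. It assumes each lab supplies an affine transform (a 3x3 matrix plus an offset) from its own voxel grid into the shared space, and that a structure's size is reported as a voxel count. All names and numbers are invented for illustration and are not taken from BIRN or any other project.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def apply_affine(matrix, offset, point):
    """Map a voxel-grid point into the shared coordinate space."""
    return tuple(
        sum(matrix[row][col] * point[col] for col in range(3)) + offset[row]
        for row in range(3)
    )


def volume_in_shared_space(voxel_count, matrix):
    """Volume scales by the absolute determinant of the linear part."""
    return voxel_count * abs(det3(matrix))


# Hypothetical labs with different voxel sizes (in mm per voxel).
lab_a = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
lab_b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]

volume_a = volume_in_shared_space(8000, lab_a)  # 8000 voxels of 0.125 mm^3
volume_b = volume_in_shared_space(500, lab_b)   # 500 voxels of 2.0 mm^3
mapped = apply_affine(lab_a, (10.0, 20.0, 30.0), (2, 4, 6))
```

The raw voxel counts (8000 vs. 500) look very different, but both correspond to the same volume once expressed in the shared space.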
[NOTE: the Wikipedia has a moderately clear definition of
mereotopology (http://en.wikipedia.org/wiki/Mereotopology).
Basically, it combines a formal, ontological theory of parts and
wholes (mereology) with the mathematics of topology, with the
goal of providing a computational formalism to support applying
logical operations to objects in space. As has been pointed out
by others, a great deal of the work in this field of applied
biomedical mereotopology derives from related work in the GIS
field. Use of mereotopology by geographers has been going on for
quite some time and is much more advanced. Work from GIS can be
adapted for use in the biomedical domain, but it must be done with
great care, as many of the assumptions behind the way researchers
represent space, and the kind of information being represented, can
differ significantly across these disciplines.]
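
As a small illustration of the kind of logical operation meant here, the Python sketch below covers only the mereological half: transitive part-of reasoning over a toy hierarchy. The table is a hand-written assumption, not drawn from NeuroNames or the FMA, and the topological half (connection, boundaries, adjacency) is left out entirely.

```python
# Toy part-of table: each region maps to its immediate containing region.
PART_OF = {
    "substantia nigra": "midbrain",
    "red nucleus": "midbrain",
    "midbrain": "brainstem",
    "brainstem": "brain",
    "cerebellum": "brain",
}


def ancestors(region, part_of=PART_OF):
    """Return every region that `region` is transitively part of."""
    found = []
    while region in part_of:
        region = part_of[region]
        found.append(region)
    return found


def is_part_of(part, whole, part_of=PART_OF):
    """True if `part` is a proper part of `whole`, directly or indirectly."""
    return whole in ancestors(part, part_of)


def overlaps(a, b, part_of=PART_OF):
    """In a strict tree, two regions overlap only if they are the same
    region or one contains the other."""
    return a == b or is_part_of(a, b, part_of) or is_part_of(b, a, part_of)
```

With this table, is_part_of("substantia nigra", "brain") holds through two intermediate steps, while overlaps("cerebellum", "midbrain") does not hold.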
The same is true as you scale this problem up to field-wide
projects such as BIRN or The NeuroCommons.
As several have mentioned in this thread, there are already
existing resources that can begin to fill this need.
1) NeuroNames
Kei, Olivier, Peter Mork, and others have already given sufficient
references on NeuroNames in this thread, so that others can dig in
deeper to the specifics if they like.
Having worked with Doug Bowden, Mark Dubach, and their colleagues
over the last year or so in an advisory capacity on the specific
issue of use of NeuroNames for semantically-based, neuroanatomical
data set integration, I can add a few important qualifying points:
a) Doug et al. have been working on the extremely difficult
task of unifying neuroanatomical terminologies across mammalian
species for 20 years now. Embedded in NeuroNames & BrainInfo,
there is a wealth of hard-won empirical knowledge related to how
one achieves this end. I think it would be ill-advised to try to
duplicate their effort, as the myriad scientific problems related
to this effort would surely present themselves again and only need
to be worked out once.
b) Doug et al. are extremely collegial and quite receptive to
feedback and collaboration - within the bounds of their limited
resources.
c) NeuroNames is a terminological resource - not a well-
founded, spatial ontology of brain anatomy capable of supporting
mereotopological reasoning. As with most research-based
terminologies, there are many semantically-based relations
embedded in the NeuroNames graphs, but as the primary goal of NN
is to disambiguate and integrate across the neuroanatomical
lexicon, the embedded semantic information can often lead to a
logical dead end. For instance, many neuroanatomical terms
critical to specifying location in the rodent brain have been
placed in the NN category "ancillary terms," as they don't fit
into the core hierarchy in an unambiguous way. This can make use
of NN for annotating mouse brain gene & protein expression
patterns (e.g., GENSAT, the Allen Brain Atlas, various BIRN
projects) extremely problematic.
d) The NN primary structures
(http://braininfo.rprc.washington.edu/indexabout.html) provide the closest
thing to an ontology in NN. As Peter Mork pointed out, there has
been an effort in the past to unite this core NN hierarchy with
the FMA, which does provide a mereotopologically sound framework
for anatomy. Barry Smith (formal ontologist who has worked for
over a decade on problems in biomedical ontology - most
especially, though hardly exclusively, in the area of
mereotopological reasoning) and his colleagues have worked closely
with Cornelius Rosse and his colleagues at the FMA project to
create, in association with the work started in the FMA, a
foundational ontology for biomedicine (the Ontology of Biomedical
Reality) that is becoming increasingly important to all of the
ontologies being monitored by NCBO and incorporated into the OBO
site and the emerging OBO Foundry (http://obofoundry.org/).
e) Doug and his colleagues have worked closely with Jack Park
(a consulting scientist to SRI's AI Center -
http://www.ai.sri.com/) to represent NN as a TopicMap (XTM). As many on
this list may know, there has been a moderate amount of effort to
integrate and/or reconcile XTM with RDF here at the W3C (search on
"TopicMaps" at the main RDF page - http://www.w3.org/RDF/). I'm
not certain how this effort will ultimately make NN more "semantic
web" compliant, but the bottom line is a great deal of effort has
already been expended to express NN in a semantically well-
grounded formalism.
f) As Don points out, neuroanatomical
representations are likely to evolve significantly over the coming
decades, as the number of large-scale gene & protein expression
characterization studies focused on the brain continues to
accumulate. Having said that, the "conventional" view of
neuroanatomy will likely remain relevant for a long while to come,
not only because it has been used to characterize findings in the
literature for the last 125+ years, but also because it did derive
from a wealth of empirical observation which is likely to remain
valid in many domains of neuroanatomical study. I would also
modify Don's well informed comment regarding the derivation of
"conventional" views of neuroanatomy. To a large extent they are
related to functional studies of the brain - as well as lesion
based studies of functional deficits dating back to the 19th
century (think "Broca's Area"), but they are also very much based
on a study of the morphology of the brain - both the external
surface morphology (sulci, gyri, and lobes), as well as
histological examination of internal structures. Many of these
studies of structure in space are likely to stay with us for some
time to come (and are well-founded in reality), though as Tim
Clark & Don have pointed out in this thread, nomenclature is still
a very significant problem even in this very "old" field.
g) licensing of NN - Doug et al. formerly had a completely
open policy to distributing NN. The only reason a license was
instituted was that, at some point about 5 years back, another group
sucked down the entirety of NN, reworked a lot of what was there -
probably with very practical goals directed toward making NN more
"correct" and effective in their problem domain - then
"republished" their product as "NeuroNames". This led to a great
deal of confusion. The fact they chose to do this on the sly also
meant the work they did was not necessarily compatible with the
work done by Doug et al. In order to avoid this happening again,
it was decided a license would be established to discourage this
sort of behavior. As anyone who has developed a terminology and/
or ontology knows, it is absolutely essential there remain a single
curating authority, if the value of the resource is to remain
intact. The "vetting" performed by the central authority - as is
extensively done by the curators of the Gene Ontology, for
instance - is absolutely essential to guaranteeing the
integrity of the knowledge resource. This is not a "closed" or
proprietary process, just a highly controlled one. Unfortunately,
Doug Bowden's resources are MUCH MUCH smaller than those available
to the curators/developers of GO, so the NN curation effort
necessarily moves at a slower pace.
2) Working with the Neuroscience community
As Kei, Don, and others have stated, it would be unwise to proceed
in creating an "open source" neuroanatomical ontology without
interacting with the researchers who've already put a lot of
effort into this problem over the past decade or so. With this in
mind, I have several suggestions:
a) The 5 ways of knowing neuroanatomy:
This is a pitch I've been making which I think helps to
sum up the current ways various sub-fields have attempted to
identify/label/collate brain morphology
i) Terminologies - e.g., NN, BrainLex
ii) Ontologies - e.g., Neuro-FMA (the project Peter Mork
referred to)
iii) Literature Informatics (CocoMac, BrainMap,
NeuroScholar, BAMS, ArrowSmith, etc.).
These are very mature projects. Some include their
own mereotopological reasoning systems (e.g., CocoMac and
BrainMap) in order to be able to pool and compare the relatedness
of structures and connectivity across different studies in the
literature. The goal in this category is to perform large-scale
semantic mining of the literature to confirm/refute current
knowledge and uncover new correlations - very much along the lines
of what The NeuroCommons Project expects to achieve via use of
semantic web technologies. Some researchers in this category are
actually participating in The NeuroCommons Project (i.e., Gully
Burns, who developed NeuroScholar).
iv) voxel/pixel analysis:
This approach applies computer vision algorithms to
automatically - or semi-automatically - identify 2D & 3D shapes in
digital anatomical images. This field is also extremely mature,
though there are many significant caveats to exactly how much of
this work can be effectively automated.
v) parameterized models:
Often these are derived from - or used to drive - the
voxel/pixel based analysis described in 'iv' - though the spatial
modeling is definitely a distinct approach from the pure voxel/
pixel approach.
None of the studies you'd fit into these categories exclusively focus
on their technique/tool alone without some aspect of the other
"ways of knowing neuroanatomy" playing a role in what they do.
However, it is clear much fundamental work in this area primarily
focuses on one technique over the others.
Having said that, when the neuroscience community makes use of
this work to examine a specific biological problem, they will
often draw significant tools and resources from more than one of
these domains.
b) NCBO/NCOR sponsored meeting focused on mereotopology in
neuroanatomy:
Barry Smith is working to bring together researchers
working in the 5 domains described above. There is a very
pressing need in large-scale, field-wide neuroinformatics projects
such as what is being done in the BIRN project to have these 5
domains converge and work more cooperatively. Right now, a lot of
manual effort has to be put out to bring them together. This is
something BIRN has been pursuing. In the last 6 months, we have
received a great deal of support and guidance on this effort from
NCBO. Daniel Rubin interacts directly with the BIRN Ontology Task
Force, and the work Barry Smith has been doing with FMA, OBO,
FuGO, and PATO has very much begun to create a much more well-
founded and computable path toward performing large-scale
annotation of neuroimaging data.
This meeting is on the NCBO/NCOR slate for 2007, but in
the interim I hope to see more effort invested in the coming year
across the 5 communities listed above toward the goal of
integrating across these "ways of knowing" now that the need has
been recognized.
3) Microarrays:
Just as Don, Kei, Alan R., and others have pointed out, high-
throughput assays - microarrays, BAC-based IHC, in situ studies
using the Gene Paint technology employed by the Allen Institute of
Brain Science to construct the Allen Brain Atlas of gene
expression in the brain - are going to transform our understanding
of neuroanatomy over the coming decades. This is just a given.
There is a pressing need to derive a means to integrate spatially-
mapped studies of gene & protein expression into a neuroimaging
setting. The spatial resolution may be very coarse - e.g., "whole
brain" - but they still provide sufficient spatial information to
be usable in the context of a neuroanatomical coordinate system.
We are working in the BIRN project to create a means for
researchers to integrate these distinct approaches to studying the
brain. As Alan R. pointed out, FuGO is working to put description
of microarray experiments on a solid, formal footing, and I would
expect one aspect of that will be to represent microarray data in
RDF/OWL. This is not a trivial problem, given that much of the
available data is merely MIAME-compliant - MIAME not even being a
data format, but just a collection of minimal data requirements.
One need only look at the great complexity of the data submission
process at the NCBI GEO site to get an appreciation for how
difficult this problem can be. A great deal of effort is being
invested in the microarray field to come up with a better means
to handle this issue, and the FuGO effort will be a critical
clearinghouse for this work. The important thing to remember when
it comes to field-wide data pooling and re-analysis is that it may
sometimes be necessary to get right back to the microarray primary
image files so as to reapply different criteria when performing
the statistical tests and reductions on pooled data. Given this
requirement - one we also see in the neuroimaging domain - I
believe it is very important to proceed in a well-reasoned manner
when seeking to integrate across microarray datasets using
semantic web technologies. Alan R. and I - possibly others on this
list too - are on the FuGO Coordinators Committee, so
hopefully we can help to keep those lines of communication open.
Sorry to go on so, but this is a topic on which I've labored quite
intensively over the past year. There is a lot being done on this
issue, and I think all efforts will get much further more quickly
- and in a way that will carry more street cred with practicing
neuroscientists - if we all try to work together.
Cheers,
Bill
Bill Bug
Senior Analyst/Ontological Engineer
Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)
Please Note: I now have a new email - [EMAIL PROTECTED]