Re: NeuroNames [was: slides for the UMLS presentation]

Jack Park Wed, 07 Jun 2006 20:28:04 -0700


Brief comment regarding the topic mapping work Bill mentions here.

Doug Bowden recently participated in a workshop on "Ontology Federation" here at SRI. We had several speakers from both the topic mapping world and bioinformatics: Douglas Bowden and Peter Karp, and Steve Newcomb, Patrick Durusau, and myself. Vinay Chaudhri opened the workshop and participated. Richard Fikes spoke from the perspective of the KR community. KR lies, of course, at the roots of the work products we all create.

We are moving away from the XTM topic mapping specification, and into the TMRM [1] topic maps reference model, the product of which we now call "subject maps" to distinguish topic maps from a slightly different paradigm. Subject maps are no longer constrained by a preselected ontology (XTM) and can be implemented using key/value properties of the author's choice. This permits authors to create subject maps that can mimic any frame-like language chosen, including, I suppose, OWL.

There exists a necessary and important tension between the use cases of traditional KR and those for which the topic mapping paradigm has been created and shown useful. When Bill mentions "more semantic web compliant", I would ask questions derived from two important use cases. The two use cases do not circumscribe the entire field of KR, but they serve as place holders to delimit a useful discussion between ontologists and subject mappers. I will argue that both ontologies and subject maps are valuable, and they can serve users together. The two use cases about which I speak are:

  1- accurately answering questions according to some authority

  2- understanding some universe of discourse, even where conflicting world views exist

Use of authoritative ontologies is clearly the domain of question answering. Understanding some universe of discourse is also rightly the domain of ontologies, but here, subject maps offer the opportunity to "federate" disparate world views into a unified framework organized around subjects. Where ontological entities carry information resources that speak to the same subject, then those entities are merged into a single subject map entity -- a subject proxy -- regardless of conflicts between messages conveyed. There are benefits to be derived from such merging operations. Patrick Durusau and I spoke to this topic in a teleconference to the Ontolog community [2], and slides and an mp3 of the talk are available. There will be other papers released soon on these opportunities.
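
As a rough illustration of the merging idea above, here is a minimal sketch in Python. It is not the TMRM's own machinery: the `subject_id` key, the `merge_proxies` helper, and the sample entries are all invented for this example. Each proxy is a plain bag of key/value properties, and proxies that name the same subject are folded into one entry that keeps every value, including the conflicting ones.

```python
from collections import defaultdict


def merge_proxies(proxies, identity_key="subject_id"):
    """Merge proxies sharing a subject identity, keeping all values per key.

    Hypothetical helper: `identity_key` is an assumed convention for this
    sketch, not something defined by the TMRM.
    """
    merged = {}
    for proxy in proxies:
        subject = proxy[identity_key]
        # One merged entry per subject; each property maps to a set of values
        # so that conflicting statements sit side by side.
        target = merged.setdefault(subject, defaultdict(set))
        for key, value in proxy.items():
            if key != identity_key:
                target[key].add(value)
    return {subject: dict(props) for subject, props in merged.items()}


# Made-up sample data: two sources disagree about where a structure belongs.
proxies = [
    {"subject_id": "substantia-nigra", "source": "terminology-A", "parent": "midbrain"},
    {"subject_id": "substantia-nigra", "source": "terminology-B", "parent": "basal ganglia"},
    {"subject_id": "red-nucleus", "source": "terminology-A", "parent": "midbrain"},
]

federated = merge_proxies(proxies)
```

In this sketch the conflict is not resolved; the merged proxy simply carries both `parent` values, which is the point of federating world views rather than choosing one authority.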

It was in the spirit of this federation opportunity that Doug Bowden and I first spoke. To be "semantic web compliant", it is always possible for our subject map portal to carry plenty of RDF metadata. It remains to be answered whether the goal of such metadata is to accurately answer specific questions, or to just advertise the presence of world views.

Bioinformatics, in all of its many manifestations, I strongly believe, will benefit from collaborations between ontologists and subject mappers.


Jack
[1] http://www.isotopicmaps.org/tmrm/
[2] http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2006_04_27

William Bug wrote:

Hi All,
Sorry - I'd thought I'd already subscribed to this list, but apparently not - until now.
The need for a mereotopologically-sound, neuroanatomical ontology is quite pressing across the community of neuroscientists involved in neuroinformatics projects, most of which include a neuroimaging component. Generally there is only one thing neuroscientists are interested in when analyzing images at whatever resolution from the macromolecular (EM) on up to the macroscopic - i.e., identifying biologically relevant shapes. In order for these shapes to have any meaning in a context where one attempts to pool data and perform relevant data reduction operations, the shapes must exist within a shared coordinate space of some sort. For instance, if two separate labs are examining the change in the size of the Substantia Nigra during the course of Parkinsonian neurodegeneration, in order for them to compare their observations, they require several data integration/semantic frameworks:
    - a shared neuroanatomical terminology
    - a shared coordinate space (to place the shapes from their images in a comparable coordinate framework)
    - a shared, well-founded anatomical ontology which encapsulates mereotopological knowledge about shapes in - at least - 3D space.
Other knowledge resources can be helpful in supplementing this array of tools, but, generally, these are the absolute minimum.
[NOTE: Wikipedia has a moderately clear definition of mereotopology (http://en.wikipedia.org/wiki/Mereotopology). Basically, it combines a formal, ontological theory of parts, wholes, and boundaries (mereology) with the mathematics of topology with the goal of providing a computational formalism to support applying logical operations to objects in space. As has been pointed out by others, a great deal of the work in this field of applied biomedical mereotopology derives from related work in the GIS field. Use of mereotopology by geographers has been going on for quite some time and is much more advanced. Work from GIS can be adapted for use in the biomedical domain, but it must be done with great care, as many of the assumptions behind the way researchers represent space and the manner of information being represented can differ significantly across these disciplines.]
The same is true as you scale this problem up to field-wide projects such as BIRN or The NeuroCommons.
As several have mentioned in this thread, there are already existing resources that can begin to fill this need.
1) NeuroNames
Kei, Olivier, Peter Mork, and others have already given sufficient references on NeuroNames in this thread, so that others can dig in deeper to the specifics if they like.
Having worked with Doug Bowden, Mark Dubach, and their colleagues over the last year or so in an advisory capacity on the specific issue of use of NeuroNames for semantically-based, neuroanatomical data set integration, I can add a few important qualifying points:
    a) Doug et al. have been working on the extremely difficult task of unifying neuroanatomical terminologies across mammalian species for 20 years now. Embedded in NeuroNames & BrainInfo, there is a wealth of hard won empirical knowledge related to how one achieves this end. I think it would be ill-advised to try to duplicate their effort, as the myriad scientific problems related to this effort would surely present themselves again and only need to be worked out once.
    b) Doug et al. are extremely collegial and quite receptive to feedback and collaboration - within the bounds of their limited resources.
    c) NeuroNames is a terminological resource - not a well-founded, spatial ontology of brain anatomy capable of supporting mereotopological reasoning. As with most research-based terminologies, there are many semantically-based relations embedded in the NeuroNames graphs, but as the primary goal of NN is to disambiguate and integrate across the neuroanatomical lexicon, the embedded semantic information can often lead to a logical dead end. For instance, many neuroanatomical terms critical to specifying location in the rodent brain have been placed in the NN category "ancillary terms," as they don't fit into the core hierarchy in an unambiguous way. This can make use of NN for annotating mouse brain gene & protein expression patterns (e.g., GENSAT, the Allen Brain Atlas, various BIRN projects) extremely problematic.
    d) The NN primary structures (http://braininfo.rprc.washington.edu/indexabout.html) provide the closest thing to an ontology in NN. As Peter Mork pointed out, there has been an effort in the past to unite this core NN hierarchy with the FMA, which does provide a mereotopologically sound framework for anatomy. Barry Smith (a formal ontologist who has worked for over a decade on problems in biomedical ontology - most especially, though hardly exclusively, in the area of mereotopological reasoning) and his colleagues have worked closely with Cornelius Rosse and his colleagues at the FMA project to create, in association with the work started in the FMA, a foundational ontology for biomedicine (the Ontology of Biological Reality) that is becoming increasingly important to all of the ontologies being monitored by NCBO and incorporated into the OBO site and the emerging OBO Foundry (http://obofoundry.org/).
    e) Doug and his colleagues have worked closely with Jack Park (a consulting scientist to SRI's AI Center - http://www.ai.sri.com/) to represent NN as a TopicMap (XTM). As many on this list may know, there has been a moderate amount of effort to integrate and/or reconcile XTM with RDF here at the W3C (search on "TopicMaps" at the main RDF page - http://www.w3.org/RDF/). I'm not certain how this effort will ultimately make NN more "semantic web" compliant, but the bottom line is a great deal of effort has already been expended to express NN in a semantically well-grounded formalism.
    f) As Don points out, neuroanatomical representations are likely to evolve significantly over the coming decades, as large scale gene & protein expression characterization studies focused on the brain continue to accumulate. Having said that, the "conventional" view of neuroanatomy will likely remain relevant for a long while to come, not only because it has been used to characterize findings in the literature for the last 125+ years, but also because it did derive from a wealth of empirical observation which is likely to remain valid in many domains of neuroanatomical study. I would also modify Don's well informed comment regarding the derivation of "conventional" views of neuroanatomy. To a large extent they are related to functional studies of the brain - as well as lesion based studies of functional deficits dating back to the 19th century (think "Broca's Area"), but they are also very much based on a study of the morphology of the brain - both the external surface morphology (sulci, gyri, and lobes), as well as histological examination of internal structures. Many of these studies of structure in space are likely to stay with us for some time to come (and are well-founded in reality), though as Tim Clark & Don have pointed out in this thread, nomenclature is still a very significant problem even in this very "old" field.
    g) Licensing of NN - Doug et al. formerly had a completely open policy to distributing NN. The only reason a license was instituted was that at some point about 5 years back another group sucked down the entirety of NN, reworked a lot of what was there - probably with very practical goals directed toward making NN more "correct" and effective in their problem domain - then "republished" their product as "NeuroNames". This led to a great deal of confusion. The fact they chose to do this on the sly also meant the work they did was not necessarily compatible with the work done by Doug et al. In order to avoid this happening again, it was decided a license would be established to discourage this sort of behavior. As anyone who has developed a terminology and/or ontology knows, it is absolutely essential there remain a single curating authority, if the value of the resource is to remain intact. The "vetting" performed by the central authority - as is extensively done by the curators of the Gene Ontology, for instance - is absolutely essential to guaranteeing the integrity of the knowledge resource. This is not a "closed" or proprietary process, just a highly controlled one. Unfortunately, Doug Bowden's resources are MUCH MUCH smaller than those available to the curators/developers of GO, so the NN curation effort necessarily moves at a slower pace.
2) Working with the Neuroscience community
As Kei, Don, and others have stated, it would be unwise to proceed in creating an "open source" neuroanatomical ontology without interacting with the researchers who've already put a lot of effort into this problem over the past decade or so. With this in mind, I have several suggestions:
    a) The 5 ways of knowing neuroanatomy:
    This is a pitch I've been making which I think helps to sum up the current ways various sub-fields have attempted to identify/label/collate brain morphology:
        i) Terminologies - e.g., NN, BrainLex
        ii) Ontologies - e.g., Neuro-FMA (the project Peter Mork referred to)
        iii) Literature Informatics (CocoMac, BrainMap, NeuroScholar, BAMS, ArrowSmith, etc.):
        These are very mature projects. Some include their own mereotopological reasoning systems (e.g., CocoMac and BrainMap) in order to be able to pool and compare the relatedness of structures and connectivity across different studies in the literature. The goal in this category is to perform large-scale semantic mining of the literature to confirm/refute current knowledge and uncover new correlations - very much along the lines of what The NeuroCommons Project expects to achieve via use of semantic web technologies. Some researchers in this category are actually participating in The NeuroCommons Project (e.g., Gully Burns, who developed NeuroScholar).
        iv) voxel/pixel analysis:
        This approach applies computer vision algorithms to automatically - or semi-automatically - identify 2D & 3D shapes in digital anatomical images. This field is also extremely mature, though there are many significant caveats to exactly how much of this work can be effectively automated.
        v) parameterized models:
        Often these are derived from - or used to drive - the voxel/pixel based analysis described in 'iv' - though the spatial modeling is definitely a distinct approach from the pure voxel/pixel approach.
None of the studies you'd fit into these categories focus exclusively on their technique/tool alone without some aspect of the other "ways of knowing neuroanatomy" playing a role in what they do. However, it is clear much fundamental work in this area primarily focuses on one technique over the others.
Having said that, when the neuroscience community makes use of this work to examine a specific biological problem, they will often draw significant tools and resources from more than one of these domains.
    b) NCBO/NCOR sponsored meeting focused on mereotopology in neuroanatomy:
    Barry Smith is working to bring together researchers working in the 5 domains described above. There is a very pressing need in large-scale, field-wide neuroinformatics projects such as what is being done in the BIRN project to have these 5 domains converge and work more cooperatively. Right now, a lot of manual effort has to be put out to bring them together. This is something BIRN has been pursuing. In the last 6 months, we have received a great deal of support and guidance on this effort from NCBO. Daniel Rubin interacts directly with the BIRN Ontology Task Force, and the work Barry Smith has been doing with FMA, OBO, FuGO, and PATO has very much begun to create a much more well-founded and computable path toward performing large-scale annotation of neuroimaging data.
    This meeting is on the NCBO/NCOR slate for 2007, but in the interim I hope to see more effort invested in the coming year across the 5 communities listed above toward the goal of integrating across these "ways of knowing" now that the need has been recognized.
3) Microarrays:
Just as Don, Kei, Alan R., and others have pointed out, high-throughput assays - microarrays, BAC-based IHC, in situ studies using the Gene Paint technology employed by the Allen Institute of Brain Science to construct the Allen Brain Atlas of gene expression in the brain - are going to transform our understanding of neuroanatomy over the coming decades. This is just a given. There is a pressing need to derive a means to integrate spatially-mapped studies of gene & protein expression into a neuroimaging setting. The spatial resolution may be very coarse - e.g., "whole brain" - but they still provide sufficient spatial information to be usable in the context of a neuroanatomical coordinate system.
We are working in the BIRN project to create a means for researchers to integrate these distinct approaches to studying the brain. As Alan R. pointed out, FuGO is working to put description of microarray experiments on a solid, formal footing, and I would expect one aspect of that will be to represent microarray data in RDF/OWL. This is not a trivial problem, given that much of the available data is merely MIAME-compliant - MIAME not even being a data format, but just a collection of minimal data requirements. One need only look at the great complexity of the data submission process at the NCBI GEO site to get an appreciation for how difficult this problem can be. A great deal of effort is being invested in the microarray field to come up with a better means to handle this issue, and the FuGO effort will be a critical clearinghouse for this work. The important thing to remember when it comes to field-wide data pooling and re-analysis is that it may sometimes be necessary to get right back to the microarray primary image files so as to reapply different criteria when performing the statistical tests and reductions on pooled data. Given this requirement - one we also see in the neuroimaging domain - I believe it is very important to proceed in a well-reasoned manner when seeking to integrate across microarray datasets using semantic web technologies. Alan R. and I - and possibly others on this list - are on the FuGO Coordinators Committee, so hopefully we can help to keep those lines of communication open.
Sorry to go on so, but this is a topic on which I've labored quite intensively over the past year. There is a lot being done on this issue, and I think all efforts will get much further more quickly - and in a way that will carry more street cred with practicing neuroscientists - if we all try to work together.
Cheers,
Bill

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - [EMAIL PROTECTED]
