Re: Experiment Ontology

Kei Cheung Sun, 09 Dec 2007 18:42:17 -0800

Nice summary and comments, Bill. This is the idea of open innovation and open community.

The example I gave includes a hypothesis. In addition to the ontologies you mentioned, we might also need to think about the SWAN ontology, which captures hypotheses.


Cheers,

-Kei

Bill Bug wrote:

Hi Susie,
We certainly do need an "Experiment Ontology" - or Ontology of Biomedical Investigation (OBI).
I believe Matthias, Michael, and Kei have all made exactly the points I think are most important to consider:
1) Matthias's comments
Are you following "best practices" in creating the ontology? I believe Matthias gives many instructive examples on how to adjust what is here to bring it much more in sync with the emerging "best practices" that are coming out of the community development surrounding a variety of OBO Foundry ontologies. Matthias also makes the point that it's important to seek to re-use (or directly contribute to) the emerging community ontologies to cover the required domains. In the case of this particular Experiment Ontology, the ontologies to consider are the Ontology of Biomedical Investigation (OBI), the OBO Relations Ontology, the Gene Ontology (specifically the Molecular Function and Cellular Component branches, the latter of which is designed to capture components down to the level of macromolecular complexes), the Sequence Ontology, the Protein Ontology (nascent - but proceeding rapidly), the Cell Ontology - at a minimum. As many on this list know - and I'm certain the talented folks at Lilly who invested time in assembling this ontology also learned - many of these are not fully ready for prime-time, and/or may not FULLY cover the breadth and depth of the domains a specific application requires. However, if one doesn't seek to work with these community efforts, you cannot expect to achieve the ultimate goal, which is to make your data maximally "semantically sticky", so as to ensure the least amount of custom logic and human effort will be required to get the most value from your data. Otherwise, you stand the chance of creating what may be a useful ontology that meets your specific requirements (as has been true of "investigation"-oriented ontologies that have come before such as the MAGE Ontology, ExperiBase, EXPO, myGRID KAVE, etc.), but doesn't help the community at-large to appropriately re-use your data. In each case, these ontologies or KR frameworks have been extremely useful in the local application context for which they were constructed, but they cannot be effectively employed as the basis for semantically-driven integration across data sets that may not be able to accept the constraints (or lack thereof) of this application-oriented ontology.
Would you know off-hand, Susie, whether the folks who worked on this ontology at Lilly have both reviewed the relevant community efforts cited above and/or have sought to interact with those groups to get some input on how best to meet the overall requirements that underlie this particular Experiment Ontology with the minimal required effort and in a manner that could help to ensure Lilly's sunk investment could be of benefit to us all?
2) Michael's comments
It's very helpful to know what the target is when it comes to exporting/exchanging the actual data. As Michael points out, a great deal of work has gone into the production of FuGE (and MaGE before it) to come up with the appropriate division of labor between the semantically-opaque, syntactical requirements as represented in a data model such as MaGE or FuGE and the explicit semantics as captured in the ontology. For those using FuGE, as Michael states, in the realm of syntax, the intention for FuGE is to provide a shared structure for universal elements such as biomaterials, experiment populations/pools/groups, protocol details, reagents details, etc. Built on that shared, generic foundation, any specific discipline - e.g., microarray expression, GC-MS, FISH, MRI, etc. - can sub-class FuGE components and add whatever additional detail is required in their discipline. In parallel with this effort on data structure, the OBI ontology cooperative seeks to provide that same foundation for the shared semantic domains, and a clear set of recommended practices for how to re-use entities from other OBO Foundry ontologies such as ChEBI, Sequence Ontology, Protein Ontology, OBO Cell, Organism Taxonomy (OWL versions of NCBI Tax), etc. to specify the critical biomedical entities and their complex relations. As I say above, these are works in progress. For those of us who must have something working now, the recommended practice is to actively participate in these projects with an eye toward following their practice - and replacing any "proxy" you create in the interim with the community ontology, when it is ready for use. This is what we have done in the BIRN ontology BIRNLex. We actually have an OWL module called "BIRNLex-OBI-Proxy.owl" which we fully intend to replace with OBI entities, when they are ready for use. We also have "BIRNLex-Investigation.owl" that builds on this "proxy" to cover entities BIRN researchers must capture. We expect to eventually see the contents of "BIRNLex-Investigation" in OBI in some form. We intend to "contribute" those elements from this OWL file directly to OBI, when OBI is ready for them, and we have the time to work through this migration process.
3) Kei's comments
Examples - examples - examples. This is critical. Working through the example Kei cites from the NIH Neuroscience Microarray Consortium is a wonderful way to determine:
- whether there are existing community ontologies that can meet the KR and processing requirements
- where the gaps are in those community ontologies
- whether the ontology you are creating effectively fills those gaps (if it does, that makes it very clear how the community effort can make effective use of your ontology)
In regards to Gene Lists, Kei is certainly correct. If these are captured through algorithmic means, it's critical to capture the details on that algorithm - typically both the version of the algorithm as well as the version of the data repository you ran it against.
Also - where gene entities are concerned - there is ongoing work between the GO groups, the Sequence Ontology, and the Protein Ontology that is particularly targeted toward capturing the specific relations between types of genomic sequence elements and types of biologically active protein-based molecules (e.g., macromolecular complexes composed of a collection of proteins in a variety of post-translationally modified states - e.g., GPC receptors, ion channels, transporters, pathway enzymes, etc. - i.e., Rx drug targets). These are the details we'll all require in order to do round-trip pharmacogenetics - i.e., effects of genetic constructs on target susceptibility to drugs - AND - the ways in which drugs ultimately alter macromolecular complexes by leading to changes in gene expression.
Just my $0.02 filtering on these helpful comments from Matthias, Michael, and Kei.
Cheers,
Bill

On Dec 3, 2007, at 1:00 PM, Kei Cheung wrote:
This is great!
I have a microarray experiment description (that has to do with Alzheimer Disease) extracted from NINDS microarray consortium:
http://arrayconsortium.tgen.org/np2/viewProject.do?action=viewProject&projectId=433773
I just wonder how this example would fit this experiment ontology (as well as others such as OBI). As shown in this example, we record information such as organ type, organ region, cell type (layer II pyramidal neuron), etc. The NINDS microarray consortium uses different array platforms (e.g., Agilent, Affymetrix, and cDNA) for different organisms, so one may need to divide chips into groups corresponding to different platform types. Each group can then be further divided into subgroups corresponding to different organisms.
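The two-level grouping described above (platform type first, then organism) can be sketched in a few lines of Python. This is only an illustration: the record fields `id`, `platform` and `organism` and the sample values are assumptions made up for the example, not the consortium's actual schema.

```python
from collections import defaultdict

# Hypothetical chip records; field names and values are illustrative only.
chips = [
    {"id": "chip-1", "platform": "Affymetrix", "organism": "human"},
    {"id": "chip-2", "platform": "Affymetrix", "organism": "mouse"},
    {"id": "chip-3", "platform": "Agilent", "organism": "human"},
    {"id": "chip-4", "platform": "Affymetrix", "organism": "human"},
]


def group_chips(records):
    """Group chip ids by platform type, then by organism."""
    groups = defaultdict(lambda: defaultdict(list))
    for rec in records:
        groups[rec["platform"]][rec["organism"]].append(rec["id"])
    # Convert to plain dicts so the result is easy to print and compare.
    return {platform: dict(by_org) for platform, by_org in groups.items()}


grouped = group_chips(chips)
```

In an ontology the same structure would more likely be expressed as subclasses or property values of a chip class; the nested dictionary only shows the intended partitioning.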
We also would like to capture gene lists (not the raw gene lists but the much shorter ones that indicate what genes are over/under expressed under certain experimental conditions). Such gene lists would usually be extracted from the literature. Also the analysis package (including version) that was used to generate a gene list should be identified. One possible use of these gene lists is to compare them to identify genes that are differentially expressed under the same/similar experimental conditions across different microarray experiments. This would help distinguish true signals from noise.
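A minimal sketch of such a comparison, assuming each gene list carries its analysis package and version as provenance. The class name `GeneList`, its fields, the `recurring_genes` helper, and all sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneList:
    """A short list of over/under-expressed genes plus its provenance."""
    experiment: str
    condition: str
    analysis_package: str  # hypothetical package name
    package_version: str
    genes: frozenset


def recurring_genes(gene_lists, min_support=2):
    """Return genes reported by at least min_support lists, sorted by name."""
    counts = Counter(g for gl in gene_lists for g in gl.genes)
    return sorted(g for g, n in counts.items() if n >= min_support)


# Placeholder data: three experiments run under the same condition.
lists = [
    GeneList("exp-A", "condition-1", "pkg-x", "1.0",
             frozenset({"GENE1", "GENE2", "GENE3"})),
    GeneList("exp-B", "condition-1", "pkg-y", "2.1",
             frozenset({"GENE2", "GENE3", "GENE4"})),
    GeneList("exp-C", "condition-1", "pkg-x", "1.2",
             frozenset({"GENE3", "GENE5"})),
]

shared = recurring_genes(lists)
```

Genes that recur across independent experiments are better candidates for true signal than genes seen only once. Keeping the package and version next to each list makes it possible to check later whether agreement or disagreement tracks the analysis method rather than the biology.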
Hope it helps.

Cheers,

-Kei



Matthias Samwald wrote:
Hi Susie,

Susie wrote:
It would be great if you could take a look at it and provide comments. The ontology is available at:
http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/Experiment_Ontology
* Some of the entities/properties are missing an rdfs:label or have an empty label (a string with length 0).
* Some of the entities could be taken from existing ontologies like OBI, RO or some of the OBO Foundry ontologies. This would save work and make integration with other data sources and ontologies much easier. By the way, there seem to be several groups working on ontologies for microarray experiments, or at least planning to do that. It would be great if these groups could work together.
* The class 'Chip type' should be removed and be replaced by subclasses of 'chip', e.g., 'chip (human)', 'chip (mouse)' etc.
* Some of the object properties appear like they are intended to be datatype properties (e.g., 'has proteome id').
* Many of the datatype properties could be replaced with object properties, possibly referring to third party ontologies -- of course this would require a richer ontology and more work spent on creating mappings. 'has molecular function' could refer to entities from the gene ontology, 'has associated organ' could refer to an ontology about anatomy and so on.
* Object properties and their ranges are quite redundant. Property 'has reagent' has range 'Reagent', property 'has treatment' has range 'Treatment' and so on. Maybe the ontology could be designed in such a way that there are only some generic properties such as 'has part'. This would make the ontology much easier to maintain, query and understand in the long term.
* It is unclear how 'Gene list' is intended to be used.
* 'Hardware' and 'Software' should not be subclasses of 'Protocol'.
Many of the datatype properties in this ontology look very interesting and might provide requirements for other ontologies. It would be great if some of them could be described/commented in more detail so that we know more about the requirements that motivated the creation of these properties.
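The first point in the list above (entities with a missing or zero-length rdfs:label) is easy to check mechanically. The sketch below works on plain (subject, predicate, object) tuples so that it needs no RDF library. The `http://example.org/experiment#` namespace, the sample entities, and the `entities_without_label` helper are assumptions for illustration and are not taken from the actual Experiment Ontology.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
OWL_OBJECT_PROPERTY = "http://www.w3.org/2002/07/owl#ObjectProperty"

# Hypothetical namespace and triples, invented for this example.
EX = "http://example.org/experiment#"
triples = [
    (EX + "Chip", RDF_TYPE, OWL_CLASS),
    (EX + "Chip", RDFS_LABEL, "chip"),
    (EX + "GeneList", RDF_TYPE, OWL_CLASS),
    (EX + "GeneList", RDFS_LABEL, ""),  # empty label
    (EX + "hasReagent", RDF_TYPE, OWL_OBJECT_PROPERTY),  # no label at all
]


def entities_without_label(triples):
    """Return declared entities that lack a non-empty rdfs:label."""
    declared = {s for s, p, o in triples if p == RDF_TYPE}
    labelled = {s for s, p, o in triples if p == RDFS_LABEL and o.strip()}
    return sorted(declared - labelled)


missing = entities_without_label(triples)
```

With a real OWL file the triples would come from an RDF parser rather than a hand-written list; the set difference between declared and labelled entities stays the same.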
I hope that was somewhat helpful.

cheers,
Matthias Samwald
William Bug, M.S., M.Phil.                       email: [EMAIL PROTECTED]
Ontological Engineer (Programmer Analyst III) work: (610) 457-0443
Biomedical Informatics Research Network (BIRN)
and
National Center for Microscopy & Imaging Research (NCMIR)
Dept. of Neuroscience, School of Medicine
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093

Please note my email has recently changed