RE: BioRDF Telcon

2009-09-01 Thread Miller, Michael D (Rosetta)
hi helen,

can you elaborate a bit more on this?  i'm not sure exactly what you
mean.  what makes the overhead high, and for whom, and what kind of errors
does the text mining produce?  what's the metric for performance?

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: public-semweb-lifesci-requ...@w3.org 
> [mailto:public-semweb-lifesci-requ...@w3.org] On Behalf Of 
> Helen Parkinson
> Sent: Tuesday, September 01, 2009 12:11 PM
> To: Nigam Shah
> Cc: Kei Cheung; HCLS; tomasz.adamus...@gmail.com
> Subject: Re: BioRDF Telcon
> 
> NCI is a good match also for microarray data, but the 
> overhead of using 
> it is high, and more errors are generated when text mining 
> and through 
> user error. We get better performance from EFO than from NCIT.
> 
> best
> 
> Helen
> 
> Nigam Shah wrote:
> > On Tue, Sep 1, 2009 at 6:37 AM, Helen Parkinson 
>  > > wrote:
> >
> > The performance depends largely on how well the ontology is
> > aligned with the input data, for this reason we 
> developed our own
> > ontology. 
> >
> >
> > This is very true. However, if there is already an existing 
> ontology 
> > that matches the input data well (e.g. the NCI Thesaurus in case of 
> > Tissue Microarray Annotations in TMAD (http://tma.stanford.edu/) .. 
> > then its possible to use the annotator tool at NCBO 
> > (http://bioportal.bioontology.org/annotate).
> >
> > Regards,
> > Nigam. 
> 
> 



RE: BioRDF Telcon

2009-09-01 Thread Miller, Michael D (Rosetta)
hi all,

yes, at http://bioportal.bioontology.org/ontologies/40510.  one odd
thing is that the link to the latest 1.3 is there three times and the
topmost explore link caused an error.

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: public-semweb-lifesci-requ...@w3.org 
> [mailto:public-semweb-lifesci-requ...@w3.org] On Behalf Of 
> Maryann Martone
> Sent: Tuesday, September 01, 2009 12:26 PM
> To: Helen Parkinson
> Cc: Kei Cheung; Nigam Shah; HCLS; tomasz.adamus...@gmail.com; 
> Fahim Imam
> Subject: Re: BioRDF Telcon
> 
> It is there.  I know that Fahim uploaded the latest;  I don't 
> know if  
> it's exposed yet.  The BioPortal has had some troubles with the  
> modular design, but I think most have been worked out.
> 
> On Sep 1, 2009, at 12:12 PM, Helen Parkinson wrote:
> 
> > Having the NIFSTD in BioPortal would be ideal; we will always defer  
> > to NIF for neuroscience terms.
> >
> > best
> >
> > Helen
> >
> > Kei Cheung wrote:
> >> Nigam Shah wrote:
> >>> On Tue, Sep 1, 2009 at 6:37 AM, Helen Parkinson  
> >>> mailto:parkin...@ebi.ac.uk>> wrote:
> >>>
> >>>The performance depends largely on how well the ontology is
> >>>aligned with the input data, for this reason we 
> developed our own
> >>>ontology.
> >>>
> >>> This is very true. However, if there is already an existing  
> >>> ontology that matches the input data well (e.g. the NCI 
> Thesaurus  
> >>> in case of Tissue Microarray Annotations in TMAD 
> (http://tma.stanford.edu/ 
> >>> ) .. then its possible to use the annotator tool at NCBO 
> (http://bioportal.bioontology.org/annotate 
> >>> ).
> >>>
> >>> Regards,
> >>> Nigam.
> >> I notice that EFO is accessible through BioPortal. As the 
> Neurolex  
> >> group is in the process of updating NIFSTD, perhaps the 
> new version  
> >> of NIFSTD may also be made accessible through BioPortal when it's  
> >> ready.
> >>
> >> Thanks,
> >>
> >> -Kei
> >>
> >
> 
> 
> 
> 
> Maryann Martone
> Professor-In-Residence
> Dept of Neurosciences
> University of California, San Diego
> San Diego, CA  92093-0446
> Tel:  858 822 0745
> Fax:  858 246 0644
> 
> 
> 
> 
> 
> 



RE: Can RDFa be used on XML: pharma information

2009-07-24 Thread Miller, Michael D (Rosetta)
hi kei,

MAGE-ML already has something better than RDFa tags: its OntologyEntry
tags.  Their purpose is exactly to provide the information needed to link
to the semantic web.  The examples you provided from NIH are well
annotated with those tags.
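
a minimal sketch of pulling those annotations out with the python standard
library (assuming each OntologyEntry element carries its category and value
as attributes; the file name here is invented):

    import xml.etree.ElementTree as ET

    # walk a MAGE-ML document and list its OntologyEntry annotations
    tree = ET.parse("experiment_mageml.xml")
    for entry in tree.iter("OntologyEntry"):
        print(entry.get("category"), "=", entry.get("value"))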

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: public-semweb-lifesci-requ...@w3.org 
> [mailto:public-semweb-lifesci-requ...@w3.org] On Behalf Of Kei Cheung
> Sent: Friday, July 24, 2009 10:56 AM
> To: Ivan Herman
> Cc: Ralph R. Swick; Rick Jelliffe; public-semweb-lifesci@w3.org
> Subject: Re: Can RDFa be used on XML: pharma information
> 
> This may also be an interesting way of intersecting 
> microarray (mageml) 
> and semantic web (rdfa)  ...
> 
> -Kei
> 
>  Ivan Herman wrote:
> 
> >I am sorry if I come into this thread very late. Additionally to what
> >Ralph just said, the RDFa distiller running on the W3C site:
> >
> >http://www.w3.org/2007/08/pyRdfa/
> >
> >should actually work with an arbitrary XML file, although only SVG is
> >'announced' there (which is probably my mistake). If there 
> is a problem
> >then, well... it is my bug:-(
> >
> >Ivan
> >
> >Ralph R. Swick wrote:
> >  
> >
> >>At 10:48 PM 6/23/2009 +1000, Rick Jelliffe wrote:
> >>
> >>
> >>>I see that the 2008 draft
> >>> http://www.w3.org/2006/07/SWD/RDFa/rdfa-overview
> >>>says
> >>>"RDFa itself is intended to be a technique that allows for 
> adding metadata to any (XML) markup document, including SMIL, 
> RSS, SVG, MathML, etc. Note, however, that in the current 
> state, RDFa is being defined only for the (X)HTML family of 
> languages."
> >>>  
> >>>
> >>The RDFa specification was designed with the intent that other
> >>languages than XHTML could take advantage of RDFa markup.
> >>(The terminology "host language" was used in some drafts
> >>to signal this direction.)  The charter under which the group
> >>was operating was specific to XHTML, thus the wording in
> >>the W3C Recommendation.
> >>
> >>
> >>
> >>>So I think I will go ahead and add some RDFa markup to the
> >>>XML, 
> >>>  
> >>>
> >>By all means, reuse the RDFa vocabulary if it seems appropriate
> >>for your application.
> >>
> >>
> >>
> >>
> >
> >  
> >
> 
> 
> 



RE: BioRDF Telcon

2009-07-23 Thread Miller, Michael D (Rosetta)
hi helen,

> I can probably do something more quickly with an 
> rdf export 
> of transformed data analysed for over/under expressions ...

the clinical genomics group at HL7 is looking for a good model to
represent gene-expression-based biomarkers, i.e. a set of genes and one
or more expression profiles for that set, where each profile maps
to a phenotype or factor.  although the representation is more UMLish, this
sounds like it would be helpful.

> ... plus factor 
> values and genes and we'll have a student to work on this I hope

would this also be able to include the originating BioSource annotations
that aren't necessarily factor values?

cheers,
michael

> -Original Message-
> From: Helen Parkinson [mailto:parki...@ebi.ac.uk] 
> Sent: Thursday, July 23, 2009 11:11 AM
> To: Miller, Michael D (Rosetta)
> Cc: Kei Cheung; HCLS; James Malone
> Subject: Re: BioRDF Telcon
> 
> Hi
> 
> I meant to comment on this, I would not attempt a mage-ml->RDF 
> transform, I can probably do something more quickly with an 
> rdf export n 
> of transformed data analysed for over/under expressions plus factor 
> values and genes and we'll have a student to work on this I hope
> 
> Helen
> 
> Miller, Michael D (Rosetta) wrote:
> > hi kei and helen,
> >
> > like helen, i've been following the HCLS working groups with great
> > interest.  as one of the designers, with helen, of the MAGE-ML and
> > MAGE-TAB specs i might be able to provide a little technical insight
> > into the formats.
> >
> > (from helen)
> > "This is probably as we don't have data - here's a list of human 
> > experiments with the term neuron - if any of these are 
> useful, then I 
> > can prioritize their curation and inclusion in an atlas release"
> >
> > kei, are the NIH Neuroscience Microarray Consortium experiments you've
> > cited and others like them in GEO or ArrayExpress?  a set 
> of those could
> > be a good starting point for helen.
> >
> > first, MAGE-ML is based on a DTD[1], not an XSD.  in early 
> 2002 as the
> > OMG Gene Expression specification[1] was being finalized, 
> XSD was still
> > in its infancy so we weren't comfortable at that point 
> generating a XSD.
> > the MAGE-OM UML[2], in a very early XMI format from 
> Rational Rose and
> > UniSys, was used to generate the DTD with code we wrote 
> ourselves[3]. 
> >
> > the UML model was designed to capture the flow of a microarray
> > experiment and how the resulting arrays were organized in 
> the experiment
> > based on how the samples were treated and/or on the 
> samples' phenotypes
> > for the purpose of a reviewer understanding the methodology 
> and for a
> > researcher replicating and/or re-analyzing the results.  
> >
> > some of the details of the flow may not be of much interest, i.e. it
> > might be worth simply connecting the BioSource elements 
> with their gene
> > expression data and not worrying about how the hybridization was
> > performed.  but that depends on what you want to do and you 
> know that
> > better than i.
> >
> > also, the data itself are specified in external files, 
> typically in a
> > white-space delimited format where the column headers are 
> specified in
> > the MAGE-ML file in the QuantitationTypeDimension element and the
> > identifiers of the row specified in one of the three
> > DesignElementDimension elements, Feature, Reporter, 
> CompositeSequence,
> > depending on how derived the data is.  Also the data can be 
> in a vendor
> > specific format such as the Affymetrix CEL (since the CEL file
> > internally specifies the dimensions often they are left out of the
> > MAGE-ML document).
> >
> > the ExperimentalFactor elements are certainly relevant and if you've
> > looked at some of the examples you will noticed that the BioSource
> > elements, in particular, and other elements are annotated by
> > OntologyEntry elements.  from the gene expression specification:
> >
> > "OntologyEntry
> > A single entry from an ontology or a controlled vocabulary. For
> > instance, category
> > could be 'species name,' value could be 'homo sapiens' and ontology
> > would be
> > taxonomy database, NCBI."
> >
> > for the element an ontology entry element is annotating, we 
> looked at it
> > as a way of specifying something like "the 

RE: BioRDF Telcon

2009-07-22 Thread Miller, Michael D (Rosetta)
hi kei and helen,

like helen, i've been following the HCLS working groups with great
interest.  as one of the designers, with helen, of the MAGE-ML and
MAGE-TAB specs i might be able to provide a little technical insight
into the formats.

(from helen)
"This is probably as we don't have data - here's a list of human 
experiments with the term neuron - if any of these are useful, then I 
can prioritize their curation and inclusion in an atlas release"

kei, are the NIH Neuroscience Microarray Consortium experiments you've
cited, and others like them, in GEO or ArrayExpress?  a set of those could
be a good starting point for helen.

first, MAGE-ML is based on a DTD[1], not an XSD.  in early 2002, as the
OMG Gene Expression specification[1] was being finalized, XSD was still
in its infancy, so we weren't comfortable at that point generating an XSD.
the MAGE-OM UML[2], in a very early XMI format from Rational Rose and
UniSys, was used to generate the DTD with code we wrote ourselves[3]. 

the UML model was designed to capture the flow of a microarray
experiment and how the resulting arrays were organized in the experiment
based on how the samples were treated and/or on the samples' phenotypes
for the purpose of a reviewer understanding the methodology and for a
researcher replicating and/or re-analyzing the results.  

some of the details of the flow may not be of much interest, i.e. it
might be worth simply connecting the BioSource elements with their gene
expression data and not worrying about how the hybridization was
performed.  but that depends on what you want to do and you know that
better than i.

also, the data themselves are specified in external files, typically in a
white-space-delimited format, where the column headers are specified in
the MAGE-ML file in the QuantitationTypeDimension element and the row
identifiers in one of the three DesignElementDimension elements (Feature,
Reporter, CompositeSequence), depending on how derived the data are.  the
data can also be in a vendor-specific format such as the Affymetrix CEL
(since the CEL file internally specifies the dimensions, they are often
left out of the MAGE-ML document).
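
a minimal sketch of joining such an external file back to its dimensions
(the file name, quantitation type names and reporter identifiers below are
invented; in a real document they would come from the
QuantitationTypeDimension and a DesignElementDimension):

    # read a whitespace-delimited external data file row by row
    quantitation_types = ["Signal", "PValue"]        # from QuantitationTypeDimension
    design_elements = ["reporter_1", "reporter_2"]   # from a DesignElementDimension

    data = {}
    with open("external_data.txt") as handle:
        for design_element, line in zip(design_elements, handle):
            values = [float(v) for v in line.split()]
            data[design_element] = dict(zip(quantitation_types, values))

    print(data["reporter_1"]["Signal"])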

the ExperimentalFactor elements are certainly relevant, and if you've
looked at some of the examples you will have noticed that the BioSource
elements, in particular, and other elements are annotated by
OntologyEntry elements.  from the gene expression specification:

"OntologyEntry
A single entry from an ontology or a controlled vocabulary. For
instance, category
could be 'species name,' value could be 'homo sapiens' and ontology
would be
taxonomy database, NCBI."

for the element an OntologyEntry is annotating, we looked at it as a way
of specifying something like "the object identified by the element is an
instance of the class/individual specified by the OntologyEntry."

so from "kitm-affy-droso-176167" one sees that the BioSource is an
"instance of" Drosophila, whole animal, whole head and an age of 3 days:

 

   
  
 http://mged.sourceforge.net/ontologies/MGEDontology.php#Organism";>

   

 

  
   
   
  
 http://mged.sourceforge.net/ontologies/MGEDontology.php#OrganismPar
t">

   

 
  

   
   

   

   
  
 http://mged.sourceforge.net/ontologies/MGEDontology.php#Age";>

   

 
  
  
 

   http://mged.sourceforge.net/ontologies/MGEDontology.php#has_measure
ment">
  
 
  
   


   
  
 http://mged.sourceforge.net/ontologies/MGEDontology.php#Measurement
">

   

 
  
  
 

   http://mged.sourceforge.net/ontologies/MGEDontology.php#has_value";>
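
a rough sketch of surfacing one such annotation as RDF with rdflib (the
BioSource URI and the use of rdf:type/rdfs:label are my own choices for
illustration; only the MGED Ontology namespace comes from the example):

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF, RDFS

    MGED = Namespace("http://mged.sourceforge.net/ontologies/MGEDontology.php#")
    biosource = URIRef("http://example.org/biosource/kitm-affy-droso-176167")

    g = Graph()
    g.add((biosource, RDF.type, MGED.Organism))        # "instance of" Organism
    g.add((biosource, RDFS.label, Literal("Drosophila, whole animal, whole head")))
    print(g.serialize(format="turtle"))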
  
 
  
   


 

RE: Is OWL useful at all for Quantitative Science?

2009-03-31 Thread Miller, Michael D (Rosetta)
hi all,

i agree strongly with Chimezie: there are much better methodologies than
OWL for doing quantitative science.  but once a result has been arrived at
(the measured distance between two cities, as opposed to the process of
measuring that distance), it can be captured by ontologies.
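
a minimal sketch of what i mean, capturing measured distances as plain data
values and doing the comparison in a query rather than in OWL (the URIs are
invented and the distances are rounded):

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import XSD

    EX = Namespace("http://example.org/")
    g = Graph()
    g.add((EX.NewYorkCity, EX.distanceFromBostonKm, Literal(306, datatype=XSD.integer)))
    g.add((EX.WashingtonDC, EX.distanceFromBostonKm, Literal(634, datatype=XSD.integer)))

    # "which is closer" is answered by ordinary ordering, not by inference
    closest = g.query("""
        PREFIX ex: <http://example.org/>
        SELECT ?city WHERE { ?city ex:distanceFromBostonKm ?d } ORDER BY ?d LIMIT 1
    """)
    for row in closest:
        print(row.city)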

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: public-semweb-lifesci-requ...@w3.org 
> [mailto:public-semweb-lifesci-requ...@w3.org] On Behalf Of 
> Chimezie Ogbuji
> Sent: Tuesday, March 31, 2009 5:26 AM
> To: Oliver Ruebenacker; public-semweb-lifesci
> Subject: Re: Is OWL useful at all for Quantitative Science?
> 
> My sense is that OWL (or any other 'truth'-based knowledge 
> representation
> languages) is not that useful for quantitative science (at 
> least not by
> itself).  Many of the work-arounds to this shortcoming seem 
> rudimentary at
> best:
> 
> - Modeling of 'may' in OWL
> - Direct incorporation of probability into description logics
> - Datatype reasoning
> - Increased use of external predicates and function symbols
> - Modeling compromises (such as trying to retrofit 
> quantitative concepts
> into binary concepts)
> 
> This is just my sense of things.
>  
> --
> Chimezie (chee-meh) Thomas-Ogbuji (oh-bu-gee)
> Heart and Vascular Institute (Clinical Investigations)
> Cleveland Clinic (ogbu...@ccf.org)
> Ph.D. Student Case Western Reserve University
> (chimezie.thomas-ogb...@case.edu)
> 
> 
> On 3/30/09 9:38 PM, "Oliver Ruebenacker"  wrote:
> 
> >  Hello, All,
> > 
> >   There recent discussion has made me wonder, whether OWL is at all
> > useful to do quantitative science, if we insist that it is used
> > correctly (incorrect OWL seems to be useful).
> > 
> >   Can any one give me a simple example of a useful application of
> > correct OWL in quantitative science?
> > 
> >   I have tried to come up with a simple example. Feel free 
> to come up
> > with a simpler one:
> > 
> >   Express in correct OWL: Washington DC is further away from Boston
> > than New York City
> > 
> >   Use case: I want to fly with my helicopter from Boston to 
> either DC
> > or NYC, whichever is closer.
> > 
> >  Take care
> >  Oliver
> 



RE: blog: semantic dissonance in uniprot

2009-03-24 Thread Miller, Michael D (Rosetta)
hi all,

samw...@gmx.at wrote:
>>   Can any one name a real world example of where confusion between an
>> entity and its record was issue?
>> 
>
> I would say that 80% of the RDF/OWL ontologies lingering somewhere on the web 
> are examples. They are just so ill-designed that nobody wants to use them, 
> and nobody CAN use them. The creators of these ontologies were unknowingly 
> meandering between thinking describing things-in-reality, concepts, and 
> abstract database records while creating these ontologies; a no-mans-land 
> where almost any statement is somehow valid, and where there are thousand 
> different ways to talk about a thing, because you are not really sure WHAT 
> you are talking about.  
> Design processes like these lead to the kinds of difficulties described in 
> the classic paper "Are the current ontologies in biology good ontologies?" 
> [1]. I have worked with such ontologies, but they are bordering on being 
> completely unusable -- at least for me.
>
> [1] http://dx.doi.org/10.1038/nbt0905-1095

as a member of the MGED board i can say we took this article, which appeared 
four years ago, quite seriously, since it focused on the MGED Ontology.  to be 
one of the groundbreakers, as the MGED Ontology was, is wonderful in the sense 
that it showed that people could use ontologies to annotate their experiments, 
but, as an early attempt, its design suffered from lack of experience, as the 
paper pointed out.

the effort soon shifted to the Ontology for Biomedical Investigations (OBI)[2], 
which drew broader participation than MGED.

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

[2] http://obi-ontology.org/


 

> -Original Message-
> From: public-semweb-lifesci-requ...@w3.org 
> [mailto:public-semweb-lifesci-requ...@w3.org] On Behalf Of 
> samw...@gmx.at
> Sent: Tuesday, March 24, 2009 5:43 AM
> To: Bijan Parsia; public-semweb-lifesci@w3.org
> Subject: Re: blog: semantic dissonance in uniprot
> 
> 
> > On 24 Mar 2009, at 12:20, samw...@gmx.at wrote:
> > 
> > >
> > >>   Can any one name a real world example of where 
> confusion between an
> > >> entity and its record was issue?
> > >
> > > I would say that 80% of the RDF/OWL ontologies lingering 
> somewhere  
> > > on the web are examples.
> > 
> > Such a violation of Sturgeon's Law[1] would be cause for 
> much rejoicing!
> 
> Yes, I was actually thinking about 90% first, but then the 
> OBO Foundry ontologies are becoming more widespread (and do 
> not have this issue), and DBpedia shows some awareness of 
> these issues, too...
> 
> > I'd be interested in doing a survey on this to determine how  
> > widespread the problem really is. Is there a reasonable 
> corpus that  
> > approximates your experience?
> 
> That was my experience when using the entire web as a corpus, 
> i.e., searching for existing ontologies for a certain 
> use-case via Swoogle, Sindice etc.
> 
>  -- Matthias
> 
> 



RE: blog: semantic dissonance in uniprot

2009-03-24 Thread Miller, Michael D (Rosetta)
hi eric,
 
this is probably a bit naive but i can think of two examples.
 
one is that i often do paper examples (i'm a bit of a luddite) when i'm
working out ideas so i might sketch out some object that i will then
annotate from OWL ontologies to 'see how it works.'  this might even be
in a group environment where it is done on a white board.
 
another example would be someone who is going to perform a biological
experiment (perhaps gene expression) where they will jot down in their
notebook some terms from OBI to describe the type of experiment.  the
experiment doesn't work out so it is never published.
 
by the by, i have also found the discussion useful; i do miss bill bug's
input.
 
cheers,
michael
Michael Miller 
Lead Software Developer 
Rosetta Biosoftware Business Unit 
www.rosettabio.com 




From: public-semweb-lifesci-requ...@w3.org
[mailto:public-semweb-lifesci-requ...@w3.org] On Behalf Of eric neumann
Sent: Tuesday, March 24, 2009 8:17 AM
To: Bijan Parsia
Cc: W3C HCLSIG hcls
Subject: Re: blog: semantic dissonance in uniprot


Bijan, 

I have a (possibly) naive question, but one that comes up in the
context of a digital record/rep of the protein :

Are OWL ontologies supposed to be applied to only digital
representations of real world things, or do some believe they actually
can be applied to the real-world things "even when no record of the
object exists in the digital space"?

That is, if one defines a bunch of formal assertions on classes
(based on real-world evidence/experience), do these work solely on
digital KR and data forms, or do they go beyond that? I guess it may
matter whether the "digital world" is being identified with the
"conceptual world" of the mind... and that may be opening more cans of
worms...

What I'm getting at is that if the above question is true
(ontologies only for digital forms), then the only things we can define
ontologies for are the records of things; hence why talk about explicit
record types if everything relevant is already a digital record?  

In addition, I also don't see references to any object being
fundamentally different from a digital record (sans descriptive triples
perhaps)... can someone provide me with a counterexample?

cheers,
Eric


On Tue, Mar 24, 2009 at 10:21 AM, Bijan Parsia
 wrote:


On 24 Mar 2009, at 13:49, eric neumann wrote:



I think this discussion has been quite useful
and important, since there are some remaining issues to be clarified by
this community. I think all points raised are good, but not equally
valid. Bijan and Phil's thoughts are very useful for me, and would
probably resonate within the informatics groups at pharma companies.

I think a key guidance principle here is to
ensure that whatever is proposed "makes sense and works with molecular
biologists" (scientists). Perhaps existing information resources need a
major "enhancement" in order to work in a semantic web, but then let's
make it quite clear (to all possible users) what the readily perceivable
value of all these ontological adjustments will be.



BTW, I'm perfectly happy, albeit not until summer, to do
various sorts of empirical research to help ground this discussion. I've
done surveys fo the web and user studies before. I would be interested
in knowing what sorts of questions would help people make decisions.

In this sense I *am* all about the data :)

Cheers,
Bijan.







RE: The W3C mailing lists will be limited to interest group participants.

2008-06-25 Thread Miller, Michael D (Rosetta)

hi all,

i've also been lurking, even more so than phil.  priorities prevent me
from being more active but without access to this list i would not be
the proponent of SW at my work that i am.

(if the list is restricted, will public be removed from the list name?)

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Phillip Lord
> Sent: Wednesday, June 25, 2008 1:20 AM
> To: W3C HCLSIG hcls
> Subject: Re: The W3C mailing lists will be limited to 
> interest group participants.
> 
> 
> > "MS" == Matthias Samwald <[EMAIL PROTECTED]> writes:
> 
>   MS> Jonathan wrote:
>   >>> The W3C mailing lists will be limited to interest group
>   >>> participants.
>   >> 
>   >> You mean public-semweb-lifesci@w3.org, for example?
> 
>   MS> According to the last conference call, this might also apply to
>   MS> this mailing list. How many people are subscribed to 
> this mailing
>   MS> list at the moment, and how many of these will be 'kicked out'
>   MS> when the membership policy is enforced? 
> 
> It also depends on whether "limited" means reading or posting or both.
> I've been a highly active lurker (erm...) on this list for years and
> find it very useful for this. 
> 
> I wouldn't read it through public archives. Email or bust. 
> 
> Phil
> 
> 



RE: SenseLab note: should flaws in open source ontology editors be mentioned?

2008-05-16 Thread Miller, Michael D (Rosetta)

 
hi all,

i agree with michel's second point ("Concrete suggestions welcome.") but
i also think it is important, since this document can be used as a guide
by others, to point out pitfalls and problems that a new user will face.
in that vein it might be good, as suggested, to provide examples of which
issues arose with the software, how they were worked around (i haven't read
the whole document yet, it is in my to-do stack, so perhaps you have), and
how the software could be improved.

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Michel_Dumontier
> Sent: Friday, May 16, 2008 7:06 AM
> To: public-semweb-lifesci@w3.org
> Subject: RE: SenseLab note: should flaws in open source 
> ontology editors be mentioned?
> 
> 
> While Xiaoshu brings up an important point of constructive criticism,
> it's not clear from the text that is being done. In the first 
> case, bugs
> happen, and these will get fixed, I don't think it's worth mentioning.
> In the second, I think the topic is much more relevant. However, how
> _exactly_ can the process of "editing the complex, expressive
> ontologies" be improved? Concrete suggestions welcome.
> 
> -=Michel=-
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Xiaoshu Wang
> Sent: May 16, 2008 6:54 AM
> To: Matthias Samwald
> Cc: public-semweb-lifesci@w3.org
> Subject: Re: SenseLab note: should flaws in open source 
> ontology editors
> be mentioned?
> 
> 
> 
> Matthias Samwald wrote:
> >
> > One feedback I got for the SenseLab conversion note 
> > (http://www.w3.org/2001/sw/hcls/notes/senselab/) was that 
> it might be 
> > inappropriate to mention that flaws in certain popular open source 
> > ontology editors caused problems for our work. To portions 
> of text in 
> > question are:
> I absolutely think it *is* appropriate to mention it.  People take 
> criticism too personally, which is not good for the health 
> of science. 
> Truth should be gained through intelligent but authoritarian 
> debate.
> >
> > """
> > We experienced the following problems while using RDF/OWL:
> >
> > The open-source ontology editors used for this project were 
> relatively
> 
> > unreliable. A lot of time was spent with steering around 
> software bugs
> 
> > that caused instability of the software and errors in the generated 
> > RDF/OWL. Future versions of freely available editors or currently 
> > available commercial ontology editors might be preferable. [...]
> > """
> >
> > and
> >
> > """
> > We experienced clear benefits from using Semantic Web 
> technologies for
> 
> > the integration of SenseLab data with other neuroscientific 
> data in a 
> > consistent, flexible and decentralised manner. The main obstacle in 
> > our work was the lack of mature and scalable open source 
> software for 
> > editing the complex, expressive ontologies we were dealing 
> with. Since
> 
> > the quality of these tools is rapidly improving, this will 
> cease to be
> 
> > an issue in the near future.
> > """
> >
> > In my opinion, the errors in one of the most popular OWL ontology 
> > editors were problematic enough that they need to be mentioned -- I 
> > guess most people working with non-trivial OWL ontologies 
> know what I 
> > mean. What do you think?
> Do it.  I definitely think it should.  In fact, the more popular an 
> ontology, the more stentorian the criticism should be, because the 
> potential damage a popular ontology can do is much greater than that of a 
> less popular one.  The problem is not the critics but those who are being 
> criticized.  They should take criticism as constructive advice to 
> improve their work, not as a destructive attempt to take them out of their
> job.
> 
> Xiaoshu
> 
> 
> 



RE: An argument for bridging information models and ontologies at the syntactic level

2008-03-27 Thread Miller, Michael D (Rosetta)
hi all,
 
yes, i also agree that these are great points, except for a quibble.
 
"Data models like schemas, structures, and data formats are
implementation details"
 
in Model Driven Architecture (MDA), the Platform Independent Model data
model is free of implementation details.  as a developer who works
primarily with data models, whenever i deal with ontologies it is at
their implementation level.  that is, if i have a transcript, i wish to
know which gene ontology terms it is associated with so that i can make
inferences about other transcripts in regards to their gene expression
under certain conditions.
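
a small sketch of that transcript use case with rdflib (the annotation
predicate and the transcript/GO term URIs are invented; real ones would come
from an annotation source):

    from rdflib import Graph, Namespace

    EX = Namespace("http://example.org/")
    GO = Namespace("http://example.org/go#")

    g = Graph()
    # assert which gene ontology terms a transcript is associated with
    g.add((EX.transcript_42, EX.annotatedWith, GO.GO_0006355))

    for term in g.objects(EX.transcript_42, EX.annotatedWith):
        print(term)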
 
i also thought it worth mentioning that in the development of the data
model FuGE for functional genomics, we purposely developed the ontology
package so that objects would reference into ontologies cleanly to
maintain the separation in the points below.
 
cheers,
michael
Michael Miller 
Lead Software Developer 
Rosetta Biosoftware Business Unit 
www.rosettabio.com 




From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of jim herber
Sent: Wednesday, March 26, 2008 10:22 AM
To: Booth, David (HP Software - Boston)
Cc: Ogbuji, Chimezie; [EMAIL PROTECTED];
public-semweb-lifesci@w3.org
Subject: Re: An argument for bridging information models and
ontologies at the syntactic level


Chimezie, excellent observation.  Agree with principals you are
articulating.  

I would add:

1. .
2. Concept models operate at many levels.  As an example,
concept models may represent the entire data model as a concept, or they
may point at an element within a data model as a concept.
3. Different concept models that are unrelated or loosely
related may reference the same data model.
4. Keeping the two (data models and conceptual models) separate
allows them to evolve independently.
5. Pulling out the mapping versus attempting to represent
mapping and data model in conceptual language fits a basic tenet of
engineering principles, that is "loosely coupled modules with highly
cohesive functionality".

David, do you like "data model to conceptual mapping" better?


Jim Herber
Independent Consultant
jimherber_at_ gmail.com


On Wed, Mar 26, 2008 at 11:47 AM, Booth, David (HP Software -
Boston) <[EMAIL PROTECTED]> wrote:



+1.  Except I find the term "syntactic mapping" somewhat
misleading, because to my mind, the anti-pattern you are describing
involves the encoding of syntactic-level concerns into the ontology,
which, as you point out, shouldn't be there.  So personally I would have
been more inclined to call it "semantic mapping", but maybe someone else
has a better idea.


David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  [EMAIL PROTECTED]
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do
not represent the official views of HP unless explicitly stated
otherwise.



> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On
Behalf Of
> Ogbuji, Chimezie
> Sent: Tuesday, March 25, 2008 9:07 PM
> To: [EMAIL PROTECTED];
public-semweb-lifesci@w3.org
> Subject: An argument for bridging information models
and
> ontologies at the syntactic level
>
> For some time I have had a concern about a theme in
the more
> common approaches to bridging information models and
> ontologies as a path towards bringing the advantages
of the
> Semantic Web technologies to 'legacy' healthcare
terminology systems.
>
> I wanted to speak on this topic  for some time but
have
> hesitated mostly because my thoughts were not fully
baked and
> (in addition) I thought this anti-pattern was an
anomaly, but
> today's conversation during the COI teleconference
suggested
> that I should speak up about it.
>
> To get right to the point, 1) I consider approaches
that
> attempt to perform this bridging directly between
information
> models and ontologies as examples of this
'anti-pattern.' 2)
> I think that performing this bridging at the syntactic
level
> addresses the important problem of properly separating
these
> two  in a way that emphasizes their strengths.
>
> I would like to offer an alternative view point
because I
 

RE: OMG Ontology RFI--link

2008-02-22 Thread Miller, Michael D (Rosetta)
hi all,
 
here is the link to the RFI:
 
http://www.omg.org/cgi-bin/doc?ontology/2008-02-01
 
cheers,
michael




From: Miller, Michael D (Rosetta) 
Sent: Friday, February 22, 2008 7:49 AM
To: public-semweb-lifesci@w3.org hcls
Subject: OMG Ontology RFI


hi all,
 
this may be of interest to some, it's an OMG Ontology RFI on
"Ontology Management Information" which is still in draft form but
solicits comments.
 
cheers,
michael



Michael Miller 
Lead Software Developer 
Rosetta Biosoftware Business Unit 
www.rosettabio.com 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bill Bug
Sent: Thursday, February 21, 2008 10:30 PM
To: public-semweb-lifesci@w3.org hcls
Subject: Google EPR announcement


Google to Store (and provide secure access to) Patients'
Health Records
http://apnews.myway.com/article/20080221/D8UUN0100.html

Competing with MS HealthVault:
http://www.healthvault.com/

and Revolution Health:
http://www.revolutionhealth.com/


Probably old news for some on this list, but I thought
I'd pass it around just in case. 

Cheers,
Bill



William Bug, M.S., M.Phil.
email: [EMAIL PROTECTED]
Ontological Engineer (Programmer Analyst III) work:
(610) 457-0443
Biomedical Informatics Research Network (BIRN)
and
National Center for Microscopy & Imaging Research
(NCMIR)
Dept. of Neuroscience, School of Medicine
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093

Please note my email has recently changed





OMG Ontology RFI

2008-02-22 Thread Miller, Michael D (Rosetta)
hi all,
 
this may be of interest to some, it's an OMG Ontology RFI on "Ontology
Management Information" which is still in draft form but solicits
comments.
 
cheers,
michael



Michael Miller 
Lead Software Developer 
Rosetta Biosoftware Business Unit 
www.rosettabio.com 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bill Bug
Sent: Thursday, February 21, 2008 10:30 PM
To: public-semweb-lifesci@w3.org hcls
Subject: Google EPR announcement


Google to Store (and provide secure access to) Patients' Health
Records
http://apnews.myway.com/article/20080221/D8UUN0100.html

Competing with MS HealthVault:
http://www.healthvault.com/

and Revolution Health:
http://www.revolutionhealth.com/


Probably old news for some on this list, but I thought I'd pass
it around just in case. 

Cheers,
Bill



William Bug, M.S., M.Phil.
email: [EMAIL PROTECTED]
Ontological Engineer (Programmer Analyst III) work: (610)
457-0443
Biomedical Informatics Research Network (BIRN)
and
National Center for Microscopy & Imaging Research (NCMIR)
Dept. of Neuroscience, School of Medicine
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093

Please note my email has recently changed





RE: Experiment Ontology

2007-12-11 Thread Miller, Michael D (Rosetta)

hi susie,

> The folks at Lilly who developed the ontology did review a number of
> existing ontologies, but they didn't meet our needs. 

this is the hard part of getting standardization accepted.  "but they didn't 
meet our needs" will always seem to be true, because the most expedient way to 
organize one's data is based on how it is already organized.  no standard will 
look exactly like the way a particular organization chooses to organize its 
information.

looking at the ExperimentOntology it is pretty easy to deduce how Lilly views 
experiment organization, and i can tell you from experience that it is not like 
any of the pharma or biotech ways of doing things that i've seen in gene 
expression.  in fact, there are few details that overlap amongst any of them.

but there are common themes and we've been relatively successful in mapping to 
MAGE (which is UML, not an ontology, but that's a different discussion) for all 
these different organizations in order to import and export out of our product.

the trick is not in changing your ways but in mapping to a common language and 
then unmapping back into your datastore.  it actually looks like it wouldn't 
take much to map into FuGE with ontology terms coming from OBI for the most 
part.

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Susie M Stephens
> Sent: Tuesday, December 11, 2007 9:21 AM
> To: Bill Bug
> Cc: public-semweb-lifesci@w3.org hcls
> Subject: Re: Experiment Ontology
> 
> 
> Hi Bill,
> 
> Thanks for all of your great feedback. :-)
> 
> The folks at Lilly who developed the ontology did review a number of
> existing ontologies, but they didn't meet our needs. I don't 
> have the full
> list of ontologies that they explored, but they definitely 
> took a look at
> OBI. We are very interested in working with the community to further
> develop the ontology, and are in the process of scheduling a 
> call with some
> of the OBI folks.
> 
> Cheers,
> 
> Susie
> 
> 
> From: Bill Bug <[EMAIL PROTECTED]>
> Sent: 12/06/2007 11:16 PM
> To: Susie Stephens <[EMAIL PROTECTED]>
> Cc: Matthias Samwald <[EMAIL PROTECTED]>, "public-semweb-lifesci@w3.org hcls", 
>     Kei Cheung <[EMAIL PROTECTED]>, "Karen (NIH/NIDA) [E] Skinner" 
>     <[EMAIL PROTECTED]>, Alan Ruttenberg <[EMAIL PROTECTED]>
> Subject: Re: Experiment Ontology
> 
> Hi Susie,
> 
> We certainly do need an "Experiment Ontology" - or Ontology 
> of Biomedical
> Investigation (OBI).
> 
> I believe Matthias, Michael, and Kei have all made exactly 
> the points I
> think are most important to consider:
> 1) Matthias's comments
> Are you following "best practices" in creating the ontology?  
> I believe
> Matthias gives many instructive examples on how to adjust 
> what is here to
> bring it much more in sync with the emerging "best practices" that are
> coming out of the community development surrounding a variety of OBO
> Foundry ontologies.  Matthias also makes the point that it's 
> important to
> seek to re-use (or directly contribute to) the emerging community
> ontologies to cover the required domains.  In the case of 
> this particular
> Experiment Ontology, the ontologies to consider are Ontology 
> of Biomedical
> Investigation (OBI), the OBO Relations Ontology, the Gene Ontology
> (specifically the Molecular Function and Cellular Component 
> branches

RE: Experiment Ontology

2007-12-03 Thread Miller, Michael D (Rosetta)

hi all,

i agree with much of what matthias has to say, including 

> Many of the datatype properties in this ontology look very 
> interesting and 
> might provide requirements for other ontologies.

having been one of the editors of FuGE, i'm wondering if there are any
tools that can translate UML to RDF or OWL?

the problem i see in this Experiment Ontology is one of the issues that
occurred in creating MAGE and led to refinements in FuGE.  that is, the
choice of terms in life sciences always seems directed at a particular
type of experiment: here, things like chip type (which is more than just
a species; consider the differences between the Affymetrix and Agilent ways
of creating the chip, and Illumina's bead-type reporters), wells as
opposed to spots/features, gene list, and calling out particular gene and
protein databases.  also, Protocol in the Experiment Ontology has a very
restricted usage.

what FuGE tries to do is describe the backbone of a life sciences
investigation/experiment (both overloaded terms) in a generic way.  UML
models derived from FuGE bring in more domain-specific details (such as
GelML and, still under development, MAGEv2) but still try to leave much
of the annotation to OntologyTerm elements, which are designed to come
from other ontologies.

cheers,
michael

http://www.psidev.info/index.php?q=node/125
http://fuge.sourceforge.net/
https://www.cbil.upenn.edu/magewiki/index.php/magev2

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Matthias Samwald
> Sent: Sunday, December 02, 2007 2:25 PM
> To: public-semweb-lifesci@w3.org; Susie M Stephens
> Subject: Re: Experiment Ontology
> 
> 
> Hi Susie,
> 
> Susie wrote:
> > It would be great if you could take a look at it and 
> provide comments. The
> > ontology is available at:
> > 
> http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/Experimen
> t_Ontology
> 
> * Some of the entities/properties are missing a rdfs:label or 
> have an empty 
> label (a string with lenght 0).
> * Some of the entities could be taken from existing 
> ontologies like OBI, RO 
> or some of the OBO Foundry ontologies. This would save work and makes 
> integration with other data sources and ontologies much 
> easier. By the way, 
> there seem to be several groups working on ontologies for microarray 
> experiments, or are at least planning to do that. It would be 
> great if these 
> groups could work together.
> * The class 'Chip type' should be removed and be replaced by 
> subclasses of 
> 'chip', e.g., 'chip (human)', 'chip (mouse)' etc.
> * Some of the object properties appear like they are intended 
> to be datatype 
> properties (e.g., 'has proteome id').
> * Many of the datatype properties could be replaced with 
> object properties, 
> possibly referring to third party ontologies -- of course 
> this would require 
> a richer ontology and more work spent on creating mappings. 
> 'has molecular 
> function' could refer to entities from the gene ontology, 
> 'has associated 
> organ' could refer to an ontology about anatomy and so on.
> * Object properties and their ranges are quite redundant. 
> Property 'has 
> reagent' has range 'Reagent', property 'has treatment' has 
> range'Treatment' 
> and so on. Maybe the ontology could be designed in such a way 
> that there are 
> only some generic properties such as 'has part'. This would make the 
> ontology much easier to maintain, query and understand in the 
> long term.
> * It is unclear how 'Gene list' is intended to be used.
> * 'Hardware' and 'Software' should not be subclasses of 'Protocol'.
> 
> 
> Many of the datatype properties in this ontology look very 
> interesting and 
> might provide requirements for other ontologies. It would be 
> great if some 
> of them could be described/commented in more detail so that 
> we know more 
> about the requirements that motivated the creation of these 
> properties.
> 
> I hope that was somewhat helpful.
> 
> cheers,
> Matthias Samwald
> 
> 
> 
> 
> 




RE: [Fwd: Re: identifier to use]

2007-08-26 Thread Miller, Michael D (Rosetta)

hi xiaoshu,

since LSID v1 is standardized via the OMG specification, it, as hilmar
has pointed out, is supported.

if the community (or a portion of it) using LSID decides to modify it in a
way that is incompatible with that spec (it could be modified in such a way
that it would be backwards compatible), then the prefix should change from
'URN:LSID' to 'URN:LSID2' (or something along those lines) to
distinguish which protocols to apply.
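
a toy sketch of the dispatch i have in mind (the 'urn:lsid2' prefix is
hypothetical, as above, and the example identifier is made up):

    def resolution_scheme(identifier: str) -> str:
        # pick the resolution protocol from the URN prefix
        lowered = identifier.lower()
        if lowered.startswith("urn:lsid2:"):
            return "revised protocol (hypothetical LSID v2)"
        if lowered.startswith("urn:lsid:"):
            return "OMG LSID v1 resolution spec"
        return "unknown scheme"

    print(resolution_scheme("urn:lsid:example.org:names:12345"))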

if the URI begins 'URN:LSID' then a user should have every expectation
that the methods for resolution specified in the OMG spec apply.

cheers,
michael

> -Original Message-
> From: Hilmar Lapp [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, August 26, 2007 8:40 AM
> To: [EMAIL PROTECTED]
> Cc: Miller, Michael D (Rosetta); Eric Jain; Ricardo Pereira; 
> public-semweb-lifesci; Sean Martin
> Subject: Re: [Fwd: Re: identifier to use]
> 
> 
> On Aug 26, 2007, at 9:08 AM, Xiaoshu Wang wrote:
> 
> > If cannot do it through OMG, maybe LSID should be moved out of  
> > OMG.  No matter what, there is one consensus that is LSID won't be  
> > supported as is.
> 
> Consensus by whom? There are organizations that support it already,  
> such as TDWG, IPNI, uBio, to name a few.
> 
>   -hilmar
> -- 
> ===
> : Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
> ===
> 
> 
> 
> 
> 




RE: [Fwd: Re: identifier to use]

2007-08-25 Thread Miller, Michael D (Rosetta)

hi all,

i forgot the URL (or is it a URI or URN?)

cheers,
michael

http://www.omg.org/technology/documents/formal/life_sciences.htm


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Miller, Michael D (Rosetta)
> Sent: Saturday, August 25, 2007 11:29 AM
> To: Eric Jain; Ricardo Pereira
> Cc: public-semweb-lifesci; Sean Martin
> Subject: RE: [Fwd: Re: identifier to use]
> 
> 
> hi all,
> 
> > Is there any chance that this will find it's way back into 
> > the LSID spec?
> 
> great thought but...
> 
> the spec is an OMG spec through the Life Sciences working group.  i3c
> worked on it in collaboration with this group but i3c is dead and the
> members of the Life Sciences group most interested in the LSID are no
> longer members per se (not sure quite how true this is, IBM is still a
> member of OMG and sean martin did a great deal of the (excellent) work
> on the spec and implementation but i don't know how much interest they
> would have).  the OMG revision process is quite straight-forward,
> especially for something of this nature, but there have to be OMG
> members interested in doing the work.
> 
> not to say there aren't other venues, including de facto adoption.
> 
> cheers,
> michael
> 
> Michael Miller
> Lead Software Developer
> Rosetta Biosoftware Business Unit
> www.rosettabio.com
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Eric Jain
> > Sent: Saturday, August 25, 2007 5:38 AM
> > To: Ricardo Pereira
> > Cc: public-semweb-lifesci
> > Subject: Re: [Fwd: Re: identifier to use]
> > 
> > 
> > Ricardo Pereira wrote:
> > > 
> > http://wiki.tdwg.org/twiki/bin/view/GUID/LsidHttpProxyUsageRec
> > ommendation
> > 
> > Looks like a good solution for people who are using LSID (for 
> > whatever 
> > reason) and want to make their data more accessible on the 
> > Semantic Web!
> > 
> > Together with the content negotiation mechanism described in 
> > one of the 
> > comments on this page, this could also make resolving an LSID into 
> > something useful (for normal people) as easy as resolving 
> e.g. a DOI.
> > 
> > Is there any chance that this will find it's way back into 
> > the LSID spec?
> > 
> > 
> > 
> > 
> 
> 
> 
> 




RE: [Fwd: Re: identifier to use]

2007-08-25 Thread Miller, Michael D (Rosetta)

hi all,

> Is there any chance that this will find it's way back into 
> the LSID spec?

great thought but...

the spec is an OMG spec through the Life Sciences working group.  i3c
worked on it in collaboration with this group but i3c is dead and the
members of the Life Sciences group most interested in the LSID are no
longer members per se (not sure quite how true this is, IBM is still a
member of OMG and sean martin did a great deal of the (excellent) work
on the spec and implementation but i don't know how much interest they
would have).  the OMG revision process is quite straight-forward,
especially for something of this nature, but there have to be OMG
members interested in doing the work.

not to say there aren't other venues, including de facto adoption.

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Eric Jain
> Sent: Saturday, August 25, 2007 5:38 AM
> To: Ricardo Pereira
> Cc: public-semweb-lifesci
> Subject: Re: [Fwd: Re: identifier to use]
> 
> 
> Ricardo Pereira wrote:
> > 
> http://wiki.tdwg.org/twiki/bin/view/GUID/LsidHttpProxyUsageRec
> ommendation
> 
> Looks like a good solution for people who are using LSID (for 
> whatever 
> reason) and want to make their data more accessible on the 
> Semantic Web!
> 
> Together with the content negotiation mechanism described in 
> one of the 
> comments on this page, this could also make resolving an LSID into 
> something useful (for normal people) as easy as resolving e.g. a DOI.
> 
> Is there any chance that this will find it's way back into 
> the LSID spec?
> 
> 
> 
> 




RE: [Obo-relations] FuGE, Ontologies, NINDS Microarray, GENSAT, ABA, and AD

2007-06-04 Thread Miller, Michael D (Rosetta)

hi kei,

> I think these are related subjects

well, as we've seen, everything is related to everything else, but,
specifically, there's obviously the need for people to be able to compare
not only microarray experiments for similarities but all types of
systems biology experiments.  one interesting use case (among many) is
comparing proteomic results with gene expression results for the same
set of subjects.

> Before going into further 
> detail about how to convert NINDS microarray data into OWL ...

one item i was unclear on was the scope of the conversion of microarray
data into OWL that you have in mind.  my take on this, as i said previously,
is that in MAGE, and eventually in MAGEv2, the actual performance of the
microarray experiment and the results are captured in the XML schema
generated from the UML model.  there is also the ability to annotate,
using the ontology package, all the objects in the XML document.  it's
this annotation that i believe can be easily translated into RDF or OWL,
based on the early version of the OMG Ontology Metamodel.  the rest of
the document is easily comparable with other documents at the XML or UML
level.

cheers,
michael

> -Original Message-
> From: Kei Cheung [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, June 02, 2007 7:26 AM
> To: Miller, Michael D (Rosetta)
> Cc: William Bug; Smith, Barry; Kashyap, Vipul; 
> [EMAIL PROTECTED]; public-semweb-lifesci@w3.org; 
> [EMAIL PROTECTED]; Paul Spellman; Shrikant Mane
> Subject: Re: [Obo-relations] FuGE, Ontologies, NINDS 
> Microarray, GENSAT, ABA, and AD
> 
> Hi Michael, Bill, Barry, et al.,
> 
> I think these are related subjects (see the modified title and 
> discussion below). I also copied this email to the Yale PI 
> (Dr. Shrikant 
> Mane) who is one of the PI's of the NINDS microarray 
> consortium (Yale is 
> among one of the four centers in the consortium) in the hope of 
> initiating an interaction between the consortium and 
> SWHCLS/ontology/MGED groups. I'm also trying to practice TBL's "Web 
> Science" synergizing different communities. Before going into further 
> detail about how to convert NINDS microarray data into OWL , 
> we should 
> probably step back and see what we might accomplish.
> 
> Since our HCLS paper (http://www.biomedcentral.com/1471-2105/8/S3/S2) 
> and Banff demo (http://esw.w3.org/topic/HCLS/Banff2007Demo) 
> centered on 
> a use case of Alzheimer Disease (AD), for the fun of it, I searched 
> using the keyword "Alzheimer Disease" to see if there are any AD 
> microarray data stored in the NINDS microarray consortium repository. 
> It's not a surprise that I found a number of neuroscience microarray 
> projects that have to do with the study of AD. Below is the 
> URL pointing 
> to the description of one of the AD microarray projects.
> 
> http://arrayconsortium.tgen.org/np2/viewProject.do?action=view
> Project&projectId=110
> 
> As you can see the content is structured and I think can be converted 
> into an OWL ontology. More interestingly, it seems to have some 
> annotation errors. For example, if you look at the following 
> field/value 
> pairs:
> 
> organ/tissue type: blood
> organ region: adrenal medulla
> 
> According to the description of the project (e.g., experimental 
> procedure and design), I think they should be:
> 
> organ/tissue type: brain
> organ region: entorhinal cortex
> 
> Basically, this microarray project generated differential gene 
> expression profiling data for neurons containing 
> neurofibrillary tangles 
> and normal neurons from the entorhinal cortex of AD cases (mid-stage).
> 
> Even more interestingly, one can integrate such gene expression data 
> with other types of gene expression data (e.g., image-based gene 
> expression data) stored in other repositiories. For example, 
> if you go 
> to GENSAT, you can find what genes are expressed in a given 
> brain region 
> (e.g., entorhinal cortex).
> 
> http://braininfo.rprc.washington.edu/Scripts/indexothersite.aspx?ID=150&type=h&term=entorhinal_area&thterm=Entorhinal%20Cortex&city=Bethesda&country=USA&institue=NCBI,%20National%20Library%20of%20Medicine&namesite=GENSAT&URL=http://www.ncbi.nlm.nih.gov/projects/gensat/index.cgi?cmd=search&gene_name=&gene_sym=&age=any&exp_tech=any&plane=any&region=Entorhinal+Cortex&cell=any&level=any&fmt=gene
> 
> Of course, similar integration can be performed with other 
> image-based 
> gene expression repositories such as Allen Brain Atlas.
> 
> Cheers,
> 
> -Kei
> 
> Miller, Michael D (Rosetta) wrote:
> 
> > hi bill and kei,
> >  
> > i've c

RE: [Obo-relations] FuGE and Ontologies

2007-06-01 Thread Miller, Michael D (Rosetta)
hi bill and kei,
 
i've changed the subject, since this is moving away from the original
topic.
 
"Yes - you are right, of course - right now the TGEN infrastructure for
the consortium is committed to providing MAGE-ML instances [1]."
 
that's great.
 
"the FuGE-stk [2] will provide a means to "convert" MAGE-ML to FuGE-ML"
 
not exactly, the folks (myself included) that worked on FuGE but with a
focus on microarrays are working on MAGEv2 and there is a commitment to
provide a way to translate to/from MAGEv1 <-> MAGEv2.  at the minimum
this would be a mapping but if there is time and resources available,
this would also have an implementation in the MAGEstk v2.  
 
MAGEv2 is being built on top of FuGE as an extension to add in
microarray specific classes (extending Data as ArrayDesign,
DesignElementData, etc, Material as Array, QPCRPlate, etc, and
DimensionElement as DesignElement extended by Feature, Reporter, and
CompositeElement).
 
we have not been quite as organized as the MAGEv1 effort so the work
doesn't have a high visibility yet, plus we have been working on
MAGE-TAB, a simplified, spreadsheet version of the MAGE-OM model.
 
i'm hoping we can get back on track soon, we are not that far from
completion, perhaps the NIH or BIRN microarray folks would be willing to
host a MAGEv2 meeting? (note, this would have little to do with ontology
development!)
 
"many of the experts working on FuGE .. are looking for assistance in
how to make use of ontologies when representing microarray data in a
FuGE instance"
 
yes, but note that this has nothing to do with ontology modeling per se,
but simply with the best way to model ontology annotation for the objects
of a FuGE document.  in essence, a FuGE object, such as a Material that
represents a rat or the Investigation itself, becomes either implicitly
or explicitly an Individual of the desired classes from whatever
ontologies are appropriate.  it then inherits the properties of
those classes (if there are any) and can specify slot instances.  it is
anticipated that OBI will be usable for most of the basic annotation and
then, perhaps, specialized ontologies in the domain of the particular
investigation can annotate more exactly.
 
most specifically, information and relationships of these referenced
classes would not be in the FuGE document, just the information
necessary to look them up in the ontology itself.
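 
as a very rough sketch of what i mean (the URIs below are made-up
placeholders, not the real FuGE or OBI ones), the RDF that such an
annotation boils down to might look something like this in rdflib:
 
from rdflib import Graph, Literal, Namespace, RDF, RDFS

# placeholder namespaces -- the real ones would come from FuGE and from the
# referenced ontology (OBI or something more specialized), not example.org
FUGE = Namespace("http://example.org/fuge#")
OBI = Namespace("http://example.org/obi#")
EX = Namespace("http://example.org/investigation42#")

g = Graph()

# the FuGE Material that represents the rat is an individual of the FuGE class...
g.add((EX.rat1, RDF.type, FUGE.Material))
# ...and, implicitly or explicitly, an individual of the ontology class it is
# annotated with; only the class URI travels in the document
g.add((EX.rat1, RDF.type, OBI.Rattus_norvegicus))
g.add((EX.rat1, RDFS.label, Literal("sample source, rat 1")))

print(g.serialize(format="turtle"))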
 
we modeled the FuGE ontology package from the Individual diagram of an
early draft of OMG's Ontology Definition MetaModel (ODM).  that section
was actually explicitly dropped from the final version of the ODM
because it had problems with OWL Full (and perhaps DL), but we
anticipate that the vast majority of desired ontology annotation can be
captured via this model.
 
temporal and containment/association relationships are actually intended
to be captured by the FuGE objects (the flow of Material and Data
through ProtocolApplications, the various associations between FuGE
classes).
 
"there is both an eye toward - and a need for - automatic conversion"
 
interestingly enough, if one can generate MAGEv1 to capture the details
of the microarray experiment, one could also use FuGE to export the
Material/BioSource individuals as stand-alone with the ontology
annotation and tie them together via the identifier attribute.
 
"may require a MAGE-ML import into a FuGE DDL database - then export
from the database - I'm not clear on this yet"
 
since MAGE-ML has an in-memory model and FuGE does also, it should
be just as easy to auto-generate bridging code based on a mapping
between the two in-memory models as to have to write to a database first
(which requires the same mapping!).
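 
as a toy sketch of what i mean by auto-generating from a mapping (the
classes below are made-up stand-ins for the real MAGEstk and FuGEstk
in-memory classes, which of course have many more attributes):
 
# toy stand-ins for the generated MAGEstk/FuGEstk classes
class MageBioSource:
    def __init__(self, identifier, name):
        self.identifier = identifier
        self.name = name

class FugeMaterial:
    def __init__(self, identifier, name):
        self.identifier = identifier
        self.name = name

# the mapping itself is just data: (MAGE class, FuGE class, attribute map)
MAPPING = [
    (MageBioSource, FugeMaterial, {"identifier": "identifier", "name": "name"}),
]

def bridge(mage_obj):
    """build the FuGE counterpart of a MAGE object straight from the mapping."""
    for mage_cls, fuge_cls, attrs in MAPPING:
        if isinstance(mage_obj, mage_cls):
            return fuge_cls(**{dst: getattr(mage_obj, src) for src, dst in attrs.items()})
    raise ValueError("no mapping for %s" % type(mage_obj).__name__)

print(bridge(MageBioSource("BS-1", "liver biopsy")).name)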
 
also note that the application/database doesn't have to be based on a
FuGE DDL database; it simply needs to be able to import MAGE-ML and
export FuGE.  i would be out of work if it did.
 
cheers,
michael
 
Michael Miller 
Lead Software Developer 
Rosetta Biosoftware Business Unit 
www.rosettabio.com 




From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of William Bug
Sent: Thursday, May 31, 2007 7:36 PM
To: Kei Cheung
Cc: Smith, Barry; Kashyap, Vipul; [EMAIL PROTECTED];
public-semweb-lifesci@w3.org; [EMAIL PROTECTED]
Subject: Re: [Obo-relations] Advancing translational research
with the Semantic Web (Not clear about definition of
)


Hi Kei, 

Yes - you are right, of course - right now the TGEN
infrastructure for the consortium is committed to providing MAGE-ML
instances [1]. 

My understanding from speaking with FuGE folks is that the
FuGE-stk [2] will provide a means to "convert" MAGE-ML to FuGE-ML (may
require a MAGE-ML import into a FuGE DDL database - then export from the
database - I'm not clear on this yet).  Since many of the FuGE model
developers were a part of the MGED MAGE model development project, there
is both an eye toward - and a need for - auto

RE: A question on the vocabulary for 'persons' - ACL level of granularity?

2006-09-18 Thread Miller, Michael D (Rosetta)

Hi Xiaoshu,

Getting back to an earlier point in this discussion...

> Well, how can a computer knows my intension about the parts 
> that I don't
> "use/disagree"?  But, I think, if I disagree one portion of 
> the ontology, I
> certainly would not use the other part of the ontology at all 
> since if I
> make one contradicting statement, it will invalidate the entire model.

Consider an effort that creates an ontology to wrap the English language
(or any other language) so that it could be reasoned over.  This seems a
noble objective.

Now if it truly captured the 'essence' of the language, which many
people only understand overlapping parts of, others, perhaps those in a
particular scientific domain, have a specialized knowledge of a part of
the language that others don't, different reasoners ought to be able to
be created that can duplicate this ability of humans to (mostly)
communicate together at different levels of understanding.

If we can't, I believe this points out a current weakness in how we
express ontologies and write reasoners.  It's obviously possible to do,
we do it as people all the time.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Xiaoshu Wang
> Sent: Saturday, September 16, 2006 8:24 PM
> To: public-semweb-lifesci@w3.org
> Subject: RE: A question on the vocabulary for 'persons' - ACL 
> level of granularity?
> 
> 
> 
> > > I wish it could be that simple when you handle the task to 
> > machine.  
> > > Show me how you can only import the foaf:Person without 
> > fetching the 
> > > foaf:geekcodes as well? From other perspective, can you do 
> > something 
> > > like, I only use this part of GO but not the other part? 
> > Even if you 
> > > are allowed to do so, what do we mean sharing an ontology. 
> > If someone 
> > > agress only ome portion, but others agrees to other 
> > portion, of the ontology?
> > 
> > You're right that there's no way to "dis-import" (i.e., refuse to
> > import) parts of an ontology you disagree with.  But we have 
> > to be careful to distinguish the parts you disagree with from 
> > the parts you simply don't use.  In the case of "geekcodes," 
> > I'm guessing that you don't have any opinions about them one 
> > way or the other; you just think they're not relevant.  In 
> > that case, it's harmless to import the ontology.  In 
> > practice, this happens a lot.
> 
> Well, how can a computer knows my intension about the parts 
> that I don't
> "use/disagree"?  But, I think, if I disagree one portion of 
> the ontology, I
> certainly would not use the other part of the ontology at all 
> since if I
> make one contradicting statement, it will invalidate the entire model.
> 
> Hence, even if I don't disagree but just no use certain part 
> of an ontology.
> How do I know if those who want to use my ontology but 
> disagree the imported
> other part.  For example, if I develop a ex:Patient and make it a
> rdfs:subClassOf the foaf:Person.  Personally, I don't care the
> foaf:geekcodes.  But what if other, for example, Chris Mungall like my
> ex:Patient but not the foaf:geekcodes, it will force him to not use
> ex:Patient but develop another cm:Patient, where he might 
> make a statement
> saying that "there is no such thing as foaf:geekcodes".  Now the world
> becomes messy because a simple mapping from ex:Patient to 
> cm:Patient with
> owl:equivalentClass won't be able to remove the 
> contradiction.  Then, what
> if someone disagree the online account part of foaf, or 
> Organization, or
> even Agent?  
> 
> > Another remark, which may be too obvious to be worth 
> making, but here
> > goes: You can use a namespace, and thus the symbols from an 
> > ontology, without importing it.  In some cases, one does this 
> > just to declare that you want to use that symbol to avoid 
> > making up one of your own; and you don't need the axioms that 
> > formally constrain the symbol's meaning.  In other cases, 
> > there may be only a few such axioms, and you can simply copy 
> > them.  I don't know if this is a good idea.  We're getting 
> > into a whole mess of hard questions about version control, 
> > partial importing of ontologies, etc. etc. that I wish I had 
> > answers to.  
> 
> Do you mean just use the URI without importing it? If so, I 
> am not sure how
> it will work?  One of the neat features of the web is its 
> loosely coupled
> nature.  But you need to follow your nose to know more about 
> the resource.
> Without "importing", i.e., to fetch the resource description from the
> namespace, what is the use of it?  For instance, if given a 
> Dublin Core URI
> http://purl.org/dc/elements/1.1/creator, without following 
> the URI, I won't
> even know how I should label it.
> 
> Or, did I misunderstand your "using namespace without 
> import"? If so, can
> you give me an example?
> 
> Cheers,
> 
> Xiaoshu
> 
> 
> 
> 




RE: Playing with sets in OWL...

2006-09-08 Thread Miller, Michael D (Rosetta)

Hi Alan,

What you are describing is covered by MAGE-OM/MAGE-ML, a UML model built
to capture the real-world aspects of running a microarray experiment.

Typically at the end of this process a set of genes is identified as
being interesting for some reason and one wants to know more about this
set of genes beyond the microarray experiment that has been performed.

I might be wrong but I think that is where Marco is starting, at the end
of the experiment for follow-up.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Alan Ruttenberg
> Sent: Friday, September 08, 2006 3:07 PM
> To: Marco Brandizi
> Cc: [EMAIL PROTECTED]; public-semweb-lifesci@w3.org
> Subject: Re: Playing with sets in OWL...
> 
> 
> 
> Hi Marco,
> 
> There are a number of ways to work with sets, but I don't think I'd  
> approach this problem from that point of view.
> Rather,  I would start by thinking about what my domain instances  
> are, what their properties are, and what kinds of questions I 
> want to  
> be able to ask based on the representation. I'll sketch this out a  
> bit, though the fact that I name an object or property doesn't mean  
> that you have to supply it (remember OWL is open-world) - still  
> listing these makes your intentions clearer and  
> the ontology easier for others to work with.
> 
> The heading in each of these is a class, of which you would make one  
> or more instances to represent your results.
> The indented names are properties on instances of that class.
> 
> An expression technology:
> Vendor:
> Product: e.g. array name
> Name of spots on the array
> Mappings:  (maps of spot to gene - you might use e.g. 
> affymetrix,  
> or you might compute your own)
> 
> ExpressionTechnologyMap
>SpotMapping: (each value a spot mapping)
> 
> Spot mapping:
>SpotID:
>GeneID:
> 
> An expression profile experiment (call yours exp0)
> When done:
> Who did it:
> What technology was used: (an expression technology)
> Sample: (a sample)
> Treatment: ...
> Levels: A bunch of pairs of spot name, intensity
> 
> Spot intensity
>SpotID:
>Intensity:
> 
> A  computation of which spots/genes are "expressed" (call yours c1)
> Name of the method : e.g. mas5 above threshold
> Parameter of the method: e.g. the threshold
> Experiment: exp0
> Spot Expressed: spots that were over threshold
> Gene Computed As Expressed: genes that were over threshold
> 
> And maybe:
> 
> Conclusion
> What was concluded:
> By who:
> Based on: c1
> 
> All of what you enter for your experiment are instances (so 
> there are  
> no issues of OWL Full)
> 
> Now, The gene set you wanted can be expressed as a class:
> 
> Let's define an inverse property of   
> "GeneComputedAsExpressed", call  
> it "GeneExpressedAccordingTo"
> 
> Class(Set1 partial restriction(GeneExpressedAccordingTo hasValue(c1)))
> 
> Instances of Set1 will be those genes. You may or may not want to  
> actually define this class. However I don't think that you need
> to add any properties to it. Everything you would want to say  
> probably wants to be said on one of the instances - the experiment,  
> the computation, the conclusion, etc.
> 
> Let me know if this helps/hurts - glad to discuss this some more
> 
> -Alan
> 
> 
> 
> 
> 2)
> 
> On Sep 8, 2006, at 11:58 AM, Marco Brandizi wrote:
> 
> >
> > Hi all,
> >
> > sorry for the possible triviality of my questions, or the 
> messed-up  
> > mind
> > I am possibly showing...
> >
> > I am trying to model the grouping of individuals into sets. In my
> > application domain, the gene expression, people put 
> together, let's  
> > say
> > genes, associating a meaning to the sets.
> >
> > For instance:
> >
> > Set1 := { gene1, gene2, gene3 }
> >
> > is the set of genes that are expressed in experiment0
> >
> > (genei and exp0 are OWL individuals)
> >
> >
> > I am understanding that this may be formalized in OWL by:
> >
> > - declaring Set1 as owl:subClassOf Gene
> > - using oneOf to declare the membership of g1,2,3
> > (or simpler: (g1 type Set1), (g2 type Set1), etc. )
> > - using hasValue with expressed and exp0
> >
> > (right?)
> >
> > Now, I am trying to build an application which is like a semantic  
> > wiki.
> >
> > Hence users have a quite direct contact with the underline  
> > ontology, and
> > they can write, with a simplified syntax, statements about a subject
> > they are describing (subject-centric approach).
> >
> > Committing to the very formal formalism of OWL looks a bit 
> too much...
> > formal... ;-) and hard to be handled with a semantic wiki-like  
> > application.
> >
> > Another problem is that the set could have properties on 
> its own, for
> > instance:
> >
> > Set1 hasAuthor Jhon
> >
> > meaning that John is defining it. But hasAuthor is 
> typically used for
> > individuals, and I wouldn't like to fall in OWL-Full, by 
> making an OWL
> >

RE: [BioRDF] global uniqueness requirement of LSIDs and RDF

2006-08-14 Thread Miller, Michael D (Rosetta)
Hi Sean,

Thanks for your clarification, exactly what John's e-mail brought to my
mind but much better explained.

A similar use case might be a gene expression experiment that is sent into
ArrayExpress.  At some point someone who downloads the experiment discovers
that one of the hybridizations is totally clustering with a different set
of replicates than the one it was assigned.  The original investigator
takes a look and discovers that the lab technician had grabbed the sample
aliquot from the wrong shelf and recorded the original sample's LSID.

So to update ArrayExpress, the Hybridization is still the same but it needs
a new version and needs to be associated with the proper sample LSID.  The
experiment itself needs to get a new version and have the Hybridization
moved to the proper set of replicates, and the data needs to have new
versions and the DataCubes updated with the new, recalculated replicate
DataCubes.
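
Purely as an illustration (the authority and namespace below are made up,
not real ArrayExpress identifiers), the re-versioned Hybridization might go
from something like

urn:lsid:example.org:hybridization:HYB-001:1

to

urn:lsid:example.org:hybridization:HYB-001:2

with the new version's metadata now pointing at the corrected sample LSID.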
 
cheers,
Michael

  
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sean Martin
  Sent: Monday, August 14, 2006 5:47 AM
  To: public-semweb-lifesci@w3.org
  Subject: Re: [BioRDF] global uniqueness requirement of LSIDs and RDF

  Hello John,

  > > How I've come to think about this is that some properties are intrinsic
  > > to the type of record, for a person, perhaps their SSN if American, and
  > > some are not, such as a person's age.  But even this becomes context
  > > dependent if one wishes to track the state of the person once a year.
  >
  > If I understand the uniqueness requirement of LSIDs, then a new LSID for
  > "Michael Miller" must be created every year when the age property changes.

  This is not quite how it is meant to work. You would only create a new
  LSID for Michael Miller each year if he was a data file and somehow his
  bytes changed :-)  In the case you describe Michael is more of an idea
  (sorry Michael!) with many facets, some that can be concretely represented
  as bytes (the bytes named) and some conceptual that can be described in
  metadata (that further describe the concept named) and have no associated
  unique data (that is named) bytes.

  You could use an LSID (or any kind of URI) without any directly associated
  data bytes to represent Michael as a central concept. Then a metadata
  graph associated with this conceptual URI might tell you his date of
  birth, it might also contain links to LSIDs and other URIs that contain
  separate concrete representations of Michael - for example x-ray images,
  MRIs, his DNA sequence or results for other tests that have a binary
  representation and where it makes sense to uniquely name each as a
  discrete data item. These different representations may even be made
  available in different contexts/formats (e.g. images of differing size,
  resolution or binary format like png and gif) and each with its own LSID.
  Similarly if for some reason one of these images is changed later (say a
  better algorithm for sharpening), that new image instance could be made
  available as an LSID revision by incrementing the version area of the
  LSID name.

  Kindest regards,

  Sean

  --
  Sean Martin
  IBM Corp.


RE: [BioRDF] global uniqueness requirement of LSIDs and RDF

2006-08-12 Thread Miller, Michael D (Rosetta)

Hi John,

Another version of this problem has existed in the relational world when
importing records from the outside world, which is 'when should an
existing record be updated and when should a new record be created.'
Because the record is coming from the outside world, an alternative key
must be used to see if it already exists, that is, some subset of the
record's properties.

How I've come to think about this is that some properties are intrinsic
to the type of record, for a person, perhaps their SSN if American, and
some are not, such as a person's age.  But even this becomes context
dependent if one wishes to track the state of the person once a year.
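
A toy sketch of that import check, using the SSN as the alternative key
(the records here are made up):

# decide update vs. create using an alternative key, i.e. some subset of
# the incoming record's properties (here just the SSN)
existing = {}  # alternative key -> stored record; stands in for the real table

def import_record(record, key_fields=("ssn",)):
    key = tuple(record[f] for f in key_fields)
    if key in existing:
        existing[key].update(record)   # same real-world entity: update it
        return "updated"
    existing[key] = dict(record)       # not seen before: create it
    return "created"

print(import_record({"ssn": "123-45-6789", "name": "M. Miller", "age": 45}))
print(import_record({"ssn": "123-45-6789", "name": "M. Miller", "age": 46}))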

The fact that you only have one property for each of your objects
probably oversimplifies the problem.  

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> John Barkley
> Sent: Tuesday, August 01, 2006 5:31 AM
> To: public-semweb-lifesci@w3.org
> Subject: [BioRDF] global uniqueness requirement of LSIDs and RDF
> 
> 
> 
> The global uniqueness requirement of LSIDs is clear if one is 
> talking about
> something like an image. As we discussed on yesterday's 
> telecon, if one bit
> in an image is changed, then the LSID must change. My 
> question is how does
> this work with an LSID that dereferences to RDF. Consider the 
> following
> simple examples:
> 
> 1. Suppose in my namespace I create lsid-A in an RDF file 
> where lsid-A has a
> single datatype property. Following the global uniqueness 
> requirement, if I
> change the value in that datatype property, then I have to 
> create a new
> LSID.
> 
> 2. Now suppose I create another LSID, lsid-B, that has a single object
> property whose object is the lsid-A from (1). Once again, if 
> I change the
> value of lsid-A's datatype property, then I have to create a 
> new LSID for
> lsid-A, and also, depending on the meaning I want for lsid-B, 
> a new LSID for
> lsid-B with object property the new lsid-A.
> 
> 3. Now suppose that I create lsid-C with a single object 
> property whose
> object is url-A, also in my namespace. url-A has one datatype 
> property. What
> should happen if I change the value of url-A's datatype 
> property?  Do I need
> to create a new LSID for lsid-C? I would think the answer 
> would be yes.
> 
> 4. In (3), url-A is in my namespace. What should happen if 
> url-A is not in
> my namespace and the value of its datatype property changes?
> 
> Putting these questions more generally:
> 
> 1. For RDF, does the global uniqueness requirement mean that only the
> immediate set of properties and their object names/values 
> need be unique?
> 
> 2. Does it mean that, for RDF, an LSID's closure (of any 
> kind) within the
> namespace that I control need be unique?
> 
> 3. With RDF, do I have to be concerned about an LSID's 
> closure (of any kind)
> in other peoples' namespaces?
> 
> 
> thanks,
> 
> jb
> 
> 
> 
> 
> 




RE: A precedent suggesting a compromise for the SWHCLS IG Best Practices (ARK)

2006-08-11 Thread Miller, Michael D (Rosetta)

Hi Xiaoshu,

Here's where we have a bit of a different viewpoint.

> The problem is that semantic web is intended to make machine to 
> understand.  

I would say that the purpose of the semantic web is to increase the sum
of knowledge for use by the human practitioners.  If only a machine
understood, it would only be useful to the machine.

Now the more the underlying infrastructure makes it easy for a machine
to understand, the easier the job is for people like us who are trying
to enable the semantic web to get useful results from the machine and
create applications that translate the results for the human
practitioners.

> And
> the clarity is a prerequisite to instruct machine unambiguously.

There is still no clarity in the Life Sciences as to what a gene is and,
even more so, what an instance of a particular gene is.  This, mostly, has
led to a situation in the Life Sciences where there is a welter of not
exactly consistent views on what constitutes a precise definition of the
genome for a particular species.

In a domain where clarity isn't easily obtained and where ambiguity lies,
it is up to those who program the machines to cope with that existing
ambiguity.  Even in the current state there are ways to glean extremely
useful information if the machines are programmed properly for the
information source; it is just not as clean as working against nice and
tidy ontologies.

And the true end users, the ones who use google and would use the
semantic web if they didn't have to understand how it works, would
expect that these invaluable, if tangled, gene resources are part of
what is supported.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, August 10, 2006 6:55 PM
> To: Miller, Michael D (Rosetta)
> Cc: Alan Ruttenberg; Mark Wilkinson; 
> public-semweb-lifesci@w3.org; [EMAIL PROTECTED]
> Subject: RE: A precedent suggesting a compromise for the 
> SWHCLS IG Best Practices (ARK)
> 
> 
> Quoting "Miller, Michael D (Rosetta)" <[EMAIL PROTECTED]>:
> 
> > You're correct here but it is the state of the art.  Interestingly
> > enough, I've found that in general the biology-based scientists and
> > investigators are not all that bothered by this confusion 
> and despite
> > the confusion seem to make their way through it.
> 
> The problem is that semantic web is intended to make machine to 
> understand.  And
> the clarity is a prerequisite to instruct machine unambiguously.
> 
> Xiaoshu
> 
> 
> 




RE: A precedent suggesting a compromise for the SWHCLS IG Best Practices (ARK)

2006-08-10 Thread Miller, Michael D (Rosetta)

Hi All,

> I frequently see genes, transcripts, dna and mrna and their  
> sequences, proteins, protein sequences, transcripts,  and peptides  
> all confusedly identified by overlapping identifiers. I don't 
> see how  
> any identifier scheme, in itself, lsid's included, currently fixes  
> this problem.   It is this problem that I personally want to see  
> progress on.

You're correct here but it is the state of the art.  Interestingly
enough, I've found that in general the biology-based scientists and
investigators are not all that bothered by this confusion and despite
the confusion seem to make their way through it.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Alan Ruttenberg
> Sent: Sunday, July 30, 2006 7:08 PM
> To: Mark Wilkinson
> Cc: Alan Ruttenberg; public-semweb-lifesci@w3.org; 
> [EMAIL PROTECTED]; Sean Martin; Henry S. Thompson; 
> Phillip Lord; [EMAIL PROTECTED]; Dan Connolly
> Subject: Re: A precedent suggesting a compromise for the 
> SWHCLS IG Best Practices (ARK)
> 
> 
> 
> Excellent response! I 95% heartedly agree (all but the "I stand by  
> LSIDS part" :)
> 
> I will note however that whenever there are versions of something,  
> there tends to be some concept of the thing that they are versions of.  
> So even though there are versions of the sequence, there ought to  
> still be some thing which represents the thing that all the versions  
> are of.
> 
> Back to your point, is there anyone out there who has minted LSIDs  
> for genes and for the sequences distinctly and related them? Do the  
> gene LSIDs ever get versions? Do the sequence LSIDs ever not have  
> versions? When there are different authorities for the genes and  
> sequences, what are the relations that people use to relate them?  
> Let's put these examples on the table.
> 
> If any one has done this in the context of NCBI databases in  
> particular I think it would be helpful to share the specifics of how  
> these ids were used and conceptualized.
> 
> My experience has been that there is routine confusion of the sort  
> that you describe throughout the life sciences community and that  
> this bleeds into the discussion of identifiers (as it just did,  
> though I have to admit I was baiting for exactly this discussion :)
> 
> I frequently see genes, transcripts, dna and mrna and their  
> sequences, proteins, protein sequences, transcripts,  and peptides  
> all confusedly identified by overlapping identifiers. I don't 
> see how  
> any identifier scheme, in itself, lsid's included, currently fixes  
> this problem.   It is this problem that I personally want to see  
> progress on.
> 
> LSID's contract seems more to do with persistence, mutability,  
> cacheability, and discoverability of byte sequences  - not around  
> issues of the identifiers and their relations making 
> ontological sense.
> 
> While I understand that in some contexts the issues around data  
> management are central, they aren't in all contexts. Because I think  
> that optimization of the data management issues, while in some ways  
> elegantly handled by the LSID protocol, aren't central to the issue  
> of representation in the life sciences, and because I don't see LSID  
> addressing the representation issues, I worry that  imposing the use  
> of the LSID protocol puts a burden on all, for the benefit of  
> relatively few.  And for those relatively few who are going 
> to go out  
> of their way to have internal copies of data and the like, I don't  
> see why a custom system that circumvents http for efficiency  
> reasons is too much of a burden.
> 
> How do you see things otherwise?
> 
> -Alan
> 
> (Being deliberately provocative here - my assigned role in this  
> debate :)
> 
> On Jul 30, 2006, at 9:06 PM, Mark Wilkinson wrote:
> 
> > On Sun, 30 Jul 2006 16:46:21 -0700, Alan Ruttenberg  
> > <[EMAIL PROTECTED]> wrote:
> >
> > I may be speaking out-of-turn here, and should probably let Sean  
> > answer this one since he may have (no doubt) thought-through it  
> > more deeply than I have; however I think you may be mixing up  
> > several different entities here (as so often happens in a URL  
> > world ;-) )
> >
> > In the case you cite above you are likely talking about a "gene",  
> > not a "sequence".  A "gene" will have its own LSID, and it 
> is (even  
> > by the strict genetic definition) a conceptual entity defined by  
> > complementation.  A "gene" and its "sequence" are not the same  
> > thing!  So... I don't see a problem.  When you need to 
> refer to the  
> > gene in the abstract, you can refer to the gene's LSID.  When you  
> > need to talk about a concrete sequence, you refer to *its* LSID.   
> > The metadata of the gene will (in a sensible world) include 
> triples  
> > that describe its possible sequences, and these will have versi

RE: [HCLS] RE: scientific publishing task force update

2006-08-08 Thread Miller, Michael D (Rosetta)

Hi Kei,

> It means that  things might not overlap at 
> the same level, but may overlap at different levels between different 
> ontologies (entity modeled at a higher level of granularity may be 
> mapped to one modeled at a lower level of granularity) . 

Excellent point, and I just want to add (explicitly!) that one also has
to consider that a concept in one ontology might overlap partially or
completely with two concepts in another ontology, and if you map those two
concepts back to the first ontology, they have interesting overlaps with
not only the original concept but other concepts in the first ontology,
and so on.
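
A trivial way to picture it (the concept names are abstract placeholders):

# A in ontology 1 overlaps B and C in ontology 2; mapping B and C back
# picks up further ontology 1 concepts, and so on
forward = {"ont1:A": {"ont2:B", "ont2:C"}}
backward = {"ont2:B": {"ont1:A", "ont1:D"},
            "ont2:C": {"ont1:A", "ont1:E"}}

start = "ont1:A"
round_trip = set()
for c in forward[start]:
    round_trip |= backward[c]
print(round_trip - {start})   # D and E: the interesting overlaps with other ont1 concepts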

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of kei cheung
> Sent: Monday, July 31, 2006 7:57 AM
> To: [EMAIL PROTECTED]
> Cc: 'w3c semweb hcls'
> Subject: Re: [HCLS] RE: scientific publishing task force update
> 
> 
> 
> Hi Don et. al,
> 
> I'm also catching up with all the exciting communications 
> that have been 
> going on within the HCLSIG forum. Different neuroscience 
> databases store 
> different but related types of information at possibly 
> different levels 
> of detail and granularity. It means that  things might not overlap at 
> the same level, but may overlap at different levels between different 
> ontologies (entity modeled at a higher level of granularity may be 
> mapped to one modeled at a lower level of granularity) . It would 
> therefore be important to be able to address these issues in our 
> integration framework (e.g., the one proposed by Eric).   I'm in the 
> process drafting a scenario involving integration CoCoDat and 
> NeuronDB. 
> I'll make it available to the group as soon as possible.
> 
> Cheers,
> 
> -Kei
> 
> Donald Doherty wrote:
> 
> >Kei is correct that there is overlap in the approach I think 
> we're taking to
> >and Eric's ideas. My mentor Karl Pribram wrote about 
> neuroscience as a
> >modern day "Tower of Babel" in his 1972 "Languages of the Brain."
> >
> >Not only is the situation is much the same today but I don't 
> believe that
> >will ever change (nor would it be desirable if it did...we 
> need all of the
> >ideas, viewpoints, etc. that we can get). So, there will 
> always be multiple
> >ontologies that change over time (some slowly some not).
> >
> >That is why it seems especially important to provide a way 
> to build bridges
> >between ontologies that enable individuals and organizations 
> to contemplate
> >more than one semantic view of any given dataset.
> >
> >[Please ignore the above if this has been covered 
> already...I'm currently
> >trying to catch up with about one and a half months of 
> email! I had to
> >finish a prototype that is now in debug hell...but that's 
> another story.]
> >
> >Don
> >
> >-
> >Donald Doherty, Ph.D.
> >Brainstage Research, Inc.
> >www.brainstage.com
> >[EMAIL PROTECTED]
> >412-478-4552
> >
> >
> >-Original Message-
> >From: [EMAIL PROTECTED]
> >[mailto:[EMAIL PROTECTED] On Behalf Of kei cheung
> >Sent: Thursday, June 15, 2006 1:04 PM
> >To: Eric Neumann
> >Cc: Phillip Lord; w3c semweb hcls
> >Subject: Re: scientific publishing task force update
> >
> >
> >Hi Eric et al,
> >
> >The more I think of, would your OntologyCovering task relate to Don 
> >Doherty's Bridging Ontology task 
> >(http://esw.w3.org/topic/HCLS/OntologyTaskForce/Create_Bridgi
> ng_Ontology_bet
> >ween_NeuronDB_and_CoCoDat_databases_and_UMLS_Common_Vocabular
> y#preview)?
> >
> >In other words, can your Ontology Covering technique potentially be 
> >applied to mapping between NeuronDB and CoCoDat OWL ontologies?
> >
> >Just my 2-cent observation.
> >
> >Cheers,
> >
> >-Kei
> >
> >Eric Neumann wrote:
> >
> >  
> >
> >>Following up to Phil's point, an alternative to building upper 
> >>ontologies (UO) first, is to consider constructing a "Covering Map" 
> >>between apparent overlapping sets of "related" ontologies. 
> These are 
> >>light weight, RDF associations that can help "pin-down" potentially 
> >>related items/classes from different ontologies. I also agree the 
> >>notion of "guides" is very powerful when dealing with a diverse 
> >>community, yet trying to get things up and running sooner 
> than later...
> >>
> >>I've written this up on the HCLS/OntologyTaskForce wiki:
> >>http://esw.w3.org/topic/HCLS/OntologyTaskForce/OntologyCovering
> >>
> >>As BioRDF progresses in making more life sciences data available as 
> >>RDF, we will have to deal with such ontological issues more 
> >>frequently, so it's very useful for everyone to be discussing these 
> >>issues at this point.
> >>
> >>cheers,
> >>Eric
> >>
> >>
> >>
> >>
> >>
> >>--- Phillip Lord <[EMAIL PROTECTED]> wrote:
> >>
> >>
> >>
> >>>"SC" == Steve Chervitz <[EMAIL PROTECTED]> writes:
> >>>  
> >>>
> They also wrote an interesting paper on the state of
> bio-ontologies.
> 
> Nature Biotechnology 23, 1095 - 1098 (2005)
> doi:10.1038/nbt0905-

RE: Semantic content negotiation (was Re: expectations of vocabulary)

2006-07-28 Thread Miller, Michael D (Rosetta)

Hi Henry,

> I spend a lot of time introducing the Semantic Web to beginners. The  
> best way to learn is to teach :-)

Ah, but my point is that just as the average person doesn't have to
understand the heuristics behind how Google returns results, the average
joe clinician or researcher also doesn't care about the heuristics
behind the semantic web.

they either have a patient and are curious whether the symptoms indicate
early onset of Alzheimer's, or they have a gene expression experiment on a
promising compound and liver cancer and are curious whether other
experiments had the same set of genes upregulated.

I would like to see that a good number of use cases all begin and end
with this average joe clinician or researcher.

they are unlikely to care about http headers, rdf graphs and
particularly about triples, tho all of those are going to be involved in
getting the results.

cheers,
Michael  

(but the blog is interesting!)

> -Original Message-
> From: Henry Story [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, July 26, 2006 8:50 AM
> To: Miller, Michael D (Rosetta)
> Cc: w3c hcls semweb; Semantic Web
> Subject: Re: Semantic content negotiation (was Re: 
> expectations of vocabulary)
> 
> 
> 
> On 26 Jul 2006, at 17:45, Miller, Michael D (Rosetta) wrote:
> > I've seen relatively little discussion that targets this 80% that is
> > available right now, warts and all.
> 
> Have a look at my blog:
> 
> http://blogs.sun.com/bblfish/
> 
> I spend a lot of time introducing the Semantic Web to beginners. The  
> best way to learn is to teach :-)
> 
> Henry
> 
> > cheers,
> > Michael
> 
> 
> 




RE: Semantic content negotiation (was Re: expectations of vocabulary)

2006-07-26 Thread Miller, Michael D (Rosetta)

Hi Xiaoshu,

I think many excellent points and discussions are being made but I'm
feeling frustrated because, in the 80/20 paradigm (80% is easy to
implement the last 20% is much harder), these discussions are in the
20%, I might even venture that they are in the top 5%.

The vast majority of the potential consumers (the 80%) of the semantic
web are just the group I was pointing out, normal researchers who don't
care how google or http works; they just use them.  What they would want
to get from the semantic web is probably out there already, and if the
infrastructure of the semantic web could be set up to reach the already
existing resources (GO, MO, NCI metathesaurus, etc.) in even an
admittedly limited fashion, adoption would grow and additional resources
would become available for the semantic web.

There seems lately in these discussions to be an emphasis on making the
semantic web usable for the 5% who care about transitive closure and
about perfect modularity.  These are great things, but they will come
faster, I believe, if the semantic web becomes available with what we
have now.

I've seen relatively little discussion that targets this 80% that is
available right now, warts and all.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Xiaoshu Wang
> Sent: Monday, July 24, 2006 7:10 PM
> To: public-semweb-lifesci@w3.org; 'Semantic Web'
> Subject: RE: Semantic content negotiation (was Re: 
> expectations of vocabulary)
> 
> 
> 
> --Michael,
> 
> > This group seems to have forgotten that for the semantic web 
> > to be used by more than a handful of hardcore researchers, it 
> > will need have tools that are easy to use for the average joe 
> > researcher.  It feels like there are a lot of levels that 
> > have gotten mixed up in the recent discussions.
> 
> When I said consumer, I meant it to be those who wrote the 
> software agent.
> Not the consumer who actually use the agent. Because no 
> matter what, it is
> the consumer who sends the request info to the provider.  What kind of
> closure always comes from the consumer.  The issue here is 
> who should be
> handle it.
> 
> Xiaoshu
> 
> 
> 
> 




RE: Semantic content negotiation (was Re: expectations of vocabulary)

2006-07-24 Thread Miller, Michael D (Rosetta)

Hi Xiaoshu,

> because the
> designer of one ontology would have no idea how it is going 
> to be used.

I agree with you here but ...

> And I think how to treat a set of RDF
> triples, i.e., in what kind of closure is a consumer's issue 

... I believe this is a semantic web issue also, but at a different
level--the consumer should likewise have no sense of this; the average
consumer should have some piece of information they are curious about
(for example, their patient has a cold with these interesting symptoms)
and wish to know if there is interesting information about it.

This group seems to have forgotten that for the semantic web to be used
by more than a handful of hardcore researchers, it will need to have tools
that are easy to use for the average joe researcher.  It feels like
there are a lot of levels that have gotten mixed up in the recent
discussions.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Xiaoshu Wang
> Sent: Monday, July 24, 2006 11:02 AM
> To: public-semweb-lifesci@w3.org; 'Semantic Web'
> Subject: RE: Semantic content negotiation (was Re: 
> expectations of vocabulary)
> 
> 
> 
> --Danny,
> 
> > In a manner, this approach is already flying. An RDF store 
> > may contain billions of triples, but by using a query 
> > protocol (i.e. SPARQL) it's possible to deliver only the 
> > subset of interest to the client. This is less burdensome 
> > that directly serving a document containing the billions of triples.
> 
> But querying is a totally different matter from web closure.  
> Because a
> query implies the result returns to meet certain criteria.  Content
> negotiation is a different representation of the same 
> resource, but not the
> different part of the same resource.  It is wrong in a 
> fundamental way.  
> 
> Let me put it in this way, if I have a resource R that is 
> composed with two
> parts A and B.  uri(R) should always return the 
> representation of R, ie.,
> (A+B) right?  If as you suggested, the uri(R) would have 
> three possible
> results:
> (1) A
> (2) B
> (3) A+B
> 
> It fundamentally breaks the purpose of URI, don't you think? 
> 
> By the same logic of this semantic cookies proposal, we can also using
> cookies for XML path to XML document, or even worse to ask 
> the HTTP URI
> fragment identifier should be handled at the server side but 
> not the client
> side?  
> 
> How to partition content is the concern of content provider.  
> How to use the
> content is a consumer's issue.  And I think how to treat a set of RDF
> triples, i.e., in what kind of closure is a consumer's issue 
> because the
> designer of one ontology would have no idea how it is going 
> to be used.
> 
> > But it seems to me there should be less demanding approaches. 
> > If, for example, the client is only interested in say Geo 
> > data, that hint could be passed (somehow) in a HTTP header. 
> > For a server that recognised that header when an RDF-bearing 
> > URI was gotten, it would be possible to redirect the client 
> > to a more appropriate resource, such as one with a static 
> > file containing only Geo-related data.
> 
> What you described is a query engine but not the transport 
> engine like HTTP.
> A URI of http://www.foo.com/?x=y is not the same as 
> http://www.foo.com,
> right?
> 
> Xiaoshu
> 
> 
> 
> 




RE: BioRDF [Telcon]--RDF->XML

2006-07-21 Thread Miller, Michael D (Rosetta)

Hi Tim,

"We wrote a note to Michael Miller about this, but he wasn't very
interested"

Not so much not interested as 1) I barely have the time to keep up with
the e-mail threads (which are very useful and thought provoking) and 2)
it's not germane to my priorities here.

What I hope to peck away at in my not so copious free time is the notion
of creating triples that consist of something like "MAGE
class.identifier:MAGE association:MAGE Ontology Entry", submitting that to
a (hypothetical at this point) semantic web app, and getting back a set of
triples "MAGE Ontology Entry:MAGE association:MAGE class.identifier",
which is, more or less, MAGE-OM UML->RDF.
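
Something like the following gives the flavor of it (the namespaces and
association names are placeholders, not an agreed mapping):

from rdflib import Graph, Namespace, RDF

# placeholder namespaces -- a real mapping would mint URIs from the MAGE-OM
# package/class names and from the referenced ontology entries
MAGE = Namespace("http://example.org/mage-om#")
MO = Namespace("http://example.org/mged-ontology#")
EX = Namespace("http://example.org/experiment#")

g = Graph()
# "MAGE class.identifier : MAGE association : MAGE Ontology Entry" as a triple
g.add((EX["BioSource.BS-1"], RDF.type, MAGE.BioSource))
g.add((EX["BioSource.BS-1"], MAGE.characteristics, MO.HeLa))

# ...and the direction a semantic web app could hand back
for s, p, o in g.triples((None, MAGE.characteristics, None)):
    print(o, "annotates", s)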

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Susie Stephens
> Sent: Friday, July 21, 2006 4:29 PM
> To: 'public-semweb-lifesci'
> Subject: Re: BioRDF [Telcon]
> 
> 
> 
> The minutes from the July 17 BioRDF call are now posted:
> http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Meetings/2006-0
> 7-17_Conference_Call
> 
> Thanks Joanne for being the scribe.
> 
> Susie
> 
> 
> 
> 




RE: BioRDF: URI Best Practices

2006-07-21 Thread Miller, Michael D (Rosetta)
Hi All,

(from Sean)

"The issues of broken links is a difficult one because once the primary
source at a particular location disappears you have nothing left to go on
to find a copy of the thing named besides what you can find in the WayBack
machine or perhaps a Google cache."

A great example of this is the www.i3c.org link where the primary work on
implementing the LSID specification was documented originally.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

  
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sean Martin
  Sent: Friday, July 21, 2006 8:10 AM
  To: public-semweb-lifesci@w3.org
  Subject: Re: BioRDF: URI Best Practices

  Hi Alan,

  > So my proposal suggests a class that defines ways of transforming the
  > URI you find in a SW document into URLs that get specific types of
  > information. The fact that a transform to URL is provided means you get
  > the transport (because it is part of the transformed URL). Different
  > properties of the class let you retrieve different patterns for
  > different sorts of information (1.). The representation 2., is not
  > explicitly represented, it should instead be part of the definitions of
  > the properties. We typically want to know *before* we dereference, what
  > we would get back.
  >
  > There's more to elaborate about such a proposal and details to work
  > out, but I think, for instance, that it can handle the LSID use cases.

  This sounds interesting. Please could you elaborate a little so that we
  can think it through to see what exactly it does address and what it
  would entail. It seems to me some of it may well work in the situations
  where the web is current and you actually have a SW document (an LSID is
  also intended as a persistent independent reference which can be used as
  the key to a 3rd party annotation for example), but as a long term
  naming/dereferencing solution it breaks down as the web backing it ages
  and bit rot sets in.

  Perhaps this is one of the key problems with using URLs as names for
  things that have a digital existence. The issues of broken links is a
  difficult one because once the primary source at a particular location
  disappears you have nothing left to go on to find a copy of the thing
  named besides what you can find in the WayBack machine or perhaps a
  Google cache. As I suggested in my last post, have a look at your emails
  from a year or two back and see what percentage of URL links still work.
  The fact is that organizations change direction, people move, machines
  break or are reorganized and so too does the web that echoes this.  I
  have always found that the web reflecting what is current is usually ok,
  but the web reflecting the past state of things is much much less so. Of
  course sometimes it is even hard to figure out what is actually current
  too! Unfortunately repeatable science and of course legal obligations
  require us to have decent answers here too, or am I missing something?

  Kindest regards,

  Sean

  --
  Sean Martin
  IBM Corp


RE: LSIDs and ontology segmentation

2006-07-13 Thread Miller, Michael D (Rosetta)

Hi All,

> I would think that an author of an ontology of this size 
> would want to consider fragmenting the ontology (perhaps by 
> sub-domains) and linking them with owl:imports.  In such a 
> scenario, the 
> terms could simply be identifiers asserted within each 
> ontology fragment 
> and only the ontology fragments would need URLs for dynamic 
> resolution.

My impression of the GO ontology (the example given) is that it can definitely 
be divided into three partitions, Molecular Function, Biological Process and 
Cellular Component, but beyond that, any partitioning would be entirely 
arbitrary.  It and the Taxon ontology are essentially a DAG and a simple Tree 
respectively, so the only thing of interest for the huge majority of current 
use cases is traversing these paths, which can be easily done by incremental 
fetches given a starting term from a gene, without reading in the entire 
ontologies.
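
For instance, walking up from a starting term only needs a term-to-parents 
lookup, something like this (the hard-coded fragment stands in for whatever 
incremental fetch is actually used, a web service or a local copy):

# tiny hard-coded GO fragment for illustration; parents() would really do an
# incremental fetch against a service or local store, not load the whole file
PARENTS = {
    "GO:0008150": [],              # biological_process
    "GO:0009987": ["GO:0008150"],  # cellular process
    "GO:0007154": ["GO:0009987"],  # cell communication
}

def parents(term):
    return PARENTS.get(term, [])

def ancestors(term):
    seen, todo = set(), [term]
    while todo:
        for p in parents(todo.pop()):
            if p not in seen:
                seen.add(p)
                todo.append(p)
    return seen

print(ancestors("GO:0007154"))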

The other point I would want to make is that many times one doesn't want to do 
any reasoning; the search is simply for objects that have been annotated with 
individuals of a subclass of a class in some ontology, and the goal is to 
display the definitions from the ontology to the users. It is also of interest, 
for a particular object one is interested in, to go out and ask what objects 
out there in the wide world are also annotated with individuals of the same 
classes.  

So if I have an ExperimentDesign annotated with the MGED Ontology individual 
cellular_modification_design of the class ExperimentDesignType, I might like to 
find similar ExperimentDesigns.  How far I would want a tool to go in 
traversing what is similar is a function of time and resources: I might want a 
tool that did exact matches, a tool that searched for linked terms in other 
ontologies and objects associated with those terms, a tool that understood that 
a linked term in another ontology leads through a relationship in that ontology 
to other terms, and so on, but in the end I am only interested in like 
ExperimentDesigns, not in the ontologies themselves.
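
The exact-match version of that tool is already just a query, along the lines 
of the following (the URIs are placeholders for whatever the real MGED Ontology 
and repository identifiers would be):

from rdflib import Graph

g = Graph()
# g.parse("annotations_exported_from_the_repositories.rdf")

# find other designs annotated with the same MGED Ontology individual
q = """
PREFIX ex: <http://example.org/mage#>
PREFIX mo: <http://example.org/mged-ontology#>
SELECT ?design WHERE {
    ?design a ex:ExperimentDesign ;
            ex:hasDesignType mo:cellular_modification_design .
}
"""
for row in g.query(q):
    print(row.design)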

This use case, which is what is happening ad hoc now, is what I would love to 
see the semantic web support initially.  I don't think the life sciences need 
complex reasoning enabled as of yet because they don't even have the simple 
cases hooked to the semantic web yet.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Chimezie Ogbuji
> Sent: Thursday, July 13, 2006 11:44 AM
> To: Mark Wilkinson
> Cc: public-semweb-lifesci@w3.org
> Subject: Re: LSIDs and ontology segmentation
> 
> 
> 
> 
> > In a publication that will be available soon [1] we 
> (briefly) discuss
> > the problem of actually *using* the currently available 
> ontologies in a
> > "real" Semantic Web setting - i.e. dynamically downloading whatever
> > ontologies are necessary given the predicates that you find in some
> > discovered RDF instance document.
> >  The OWL representation of GO is over
> > 10 Meg... for heavens sake!... and GO is a small ontology 
> compared to
> > things like the NCI Metathesaurus.
> >
> > The problem with using document#fragment URLs to identify 
> ontology nodes
> > is that the defined behaviour for resolving such an identifier is to
> > drop the fragment (since that isn't available server-side 
> anyway) and to
> > return the entire document... all 10Meg's of GO... each time...  We
> > would argue, therefore, that the URL (if you adopt its default
> > behaviour) is not only a bit of a nuisance, it is a blocker 
> in some/many
> > cases.
> 
> I don't think this particular case has much to do with URLs 
> themselves but 
> as to how an ontology author wishes to distribute his/her 
> ontology.  The 
> behavior you mention is only the case if the ontology terms 
> are URLs - 
> i.e., they are locators as well as identifiers.  Even for 
> ontologies of 
> small size, I would consider this a bad practice for ontology 
> distribution.  There are many consequences for resolving 
> terms from an 
> ontology out of context, the primary one being that in doing 
> so you may 
> not have enough closure to facilitate reasoning.
> 
> Automatically attempting to dereference vocabulary terms in 
> an instance 
> graph in order to tie them in with their defining ontology is 
> one of many options. 
> In an earlier thread, it's been pointed out that more 'controlled' 
> mechanisms can be used to do this.  For one thing 
> interpreting a Semantic 
> Web in this way assumes that the terms are URLs 
> specifically - which
> is not practical (for reasons you've pointed out as well as 
> the issues 
> with reasoning).
> 
> I would think that an author of an ontology of this size 
> would want to consider fragmenting the ontology (perhaps by 
> sub-domains) and linking them with owl:imports.  In such a 
> scenario, the 
> terms could simply b

RE: ontology specs for self-publishing experiment

2006-07-10 Thread Miller, Michael D (Rosetta)

Hi All,

> Yes, but put another way, you have refactored the problem of  
> "incommensurateness" into two more tractable pieces - one about the  
> data structures to convey meaning, the other about the meanings  
> conveyed.  You have also removed the risk of conflating the two ... 

Thanks, Alan, this conveys much better what I was trying to say.

What I would add is that "the data structures to convey meaning" are
mostly those objects that are unique to an investigation.  These
instances would likely, in themselves, not have much worth in the
translation to RDF and analysis by semantic web tools.

But their annotations, on the other hand, would.  If I understand what
Marco is up to at the EBI (and I'm likely getting the terminology
wrong), it is, for a particular gene expression experiment deposited in
ArrayExpress, forming triples from the MAGE class to the ontology
annotations and going from there.
On another related thread from Phil,

"My point is that XSLT is not
good for operating on RDF because there are many syntactic ways of
representing the same thing. In general, I wouldn't use XSLT at all as
I hate it, but that's a different issue."

In general, we've found importing the contents of the MAGEv1 documents
into our application and then operating on the contents within the
application to be much preferable to dealing with the XML directly.

The good news for the semantic web effort is that there are many
applications that are making active use of the GO ontology, currently in
an ad hoc manner for the most part.  In terms of an import from a MAGE
document, one can find the genes of interest and, based on their
identifiers, get their associated GO terms (the GO terms did not have to
be in the MAGE document itself, only the GENBANK or similar identifier).
Then one can take a GO-annotated version of BioPathways and map these
genes of interest onto the pathways via the GO terms, and so on.
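
The ad hoc version of that chain is already about this simple (the lookups
below are tiny made-up stand-ins for the GENBANK-to-GO association files
and a GO-annotated pathway resource):

# stand-in lookups keyed by GENBANK-style identifiers and GO terms
GO_FOR_GENE = {"NM_000546": {"GO:0006915"},
               "NM_005228": {"GO:0007169"}}
PATHWAYS_FOR_GO = {"GO:0006915": {"apoptosis pathway"},
                   "GO:0007169": {"EGFR signaling pathway"}}

def pathways_for_genes(gene_ids):
    hits = {}
    for gene in gene_ids:
        for go_term in GO_FOR_GENE.get(gene, ()):
            for pathway in PATHWAYS_FOR_GO.get(go_term, ()):
                hits.setdefault(pathway, set()).add(gene)
    return hits

print(pathways_for_genes(["NM_000546", "NM_005228"]))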

And, if one looks at the experiments at ArrayExpress, there is a lot of
annotation for the BioMaterials and for the Experiments that isn't being
exploited yet but could easily be used to look for interesting matches
today between ArrayExpress, GEO, NCI and other repositories.

cheers,
Michael

> -Original Message-
> From: Alan Rector [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, July 08, 2006 11:57 AM
> To: William Bug
> Cc: Miller, Michael D (Rosetta); Tim Clark; w3c semweb hcls; 
> SWAN Team; Trish Whetzel; chris mungall
> Subject: Re: ontology specs for self-publishing experiment
> 
> 
> 
> On 6 Jul 2006, at 19:22, William Bug wrote:
> 
> >
> > 2) Doesn't this lead down a road similar to that of 
> MIAME, only  
> > now you've shifted the border of incommensurateness beyond the  
> > level for data format and into the semantic domain?
> 
> Yes, but put another way, you have refactored the problem of  
> "incommensurateness" into two more tractable pieces - one about the  
> data structures to convey meaning, the other about the meanings  
> conveyed.  You have also removed the risk of conflating the two  
> problems thereby making both harder.  The UML/XML models are about  
> conveying meanings; the ontologies are about the meanings conveyed.   
> The constraints in the UML/XML models ensure that software can  
> process the data structures correctly.  Violating such a constraint  
> means that the structure is invalid.  The constraints in the 
> ontology  
> are about what we understand about the biology.  Violating a  
> constraint in the ontology means that the meaning is incorrect or  
> even inconsistent.  Getting that relationship between the data  
> structures and meanings clearly defined is a key issue for many  
> standardisation efforts.
> 
> In practice, the ontologies/terminologies/vocabularies are often  
> maintained by different groups than the data structures/exchange  
> formats and there are often requirements to use the same exchange  
> format with different ontologies/terminologies and vice versa.  
> (Analogous problems are common in the medical community.)
> 
> However, factoring the problem in this way does mean that you don't  
> get full interoperability unless you agree on _both_ the data  
> structures/exchange formats and the ontologies/terminologies.  (Or  
> define  mappings and equivalences between them)
> 
> > What I mean is, won't there still be difficulty determining even  
> > approximate semantic equivalency for all of the details of data  
> > provenance - many of which absolutely must be resolved in order to  
> > perform large-scale re-pooling of related observations made in the  
> > context of different studies - even if nearly identical assays/ 
> > instruments/reagents are used?

RE: how to deal with different requirements for experiment self-publishing

2006-07-07 Thread Miller, Michael D (Rosetta)
Hi All,

"On one end, some researchers want a quick and easy way to share an
experiment, e.g. simply decompose an experiment to hypothesis, data,
results, procedure, protocols used, who did it, what project it belongs
to, etc."

Even stronger, at a high level this is what a researcher wants to see and
can comprehend rapidly.  This is why in FuGE this is what is pre-eminent,
and the tools in this domain typically allow visualization of this level
of detail.  The other thing to note is that an Experiment is a series of
events and products that will only occur once, that is, the production of
a sample from a cell line is unique, the hybridization of a sample to a
chip will be unique, the scan; they are all single instances that will
never occur again.

"On the other end of the spectrum, some researchers want to describe it
with domain-specific terms as detailed as possible, e.g using FuGO or
BioPAX terms.  In the middle of the spectrum, one may want to describe an
experiment in general terms but with great details, e.g. using the terms
Bill Bug provided from BIRN."

But then, as AJ says, the question becomes what is this experiment like,
who used similar tissue, what other experiment found a set of genes that
jumped out like in this experiment.  Different labs have a varying number
of resources to go further into this.  This is why in FuGE there is no
preconception of what annotation and how much will be used, because, as AJ
points out, there are varying views and abilities to do the annotation.
If the annotation is available, it can be exported into a FuGE document at
whatever level is available.

We early on in MAGEv1 found ourselves trying to decide what
attributes/associations a BioMaterial like a Cell Line should have in the
UML model.  We quickly listed 50+, cut it back to 20, argued over the ones
left out, the ones left in, then punted by creating an association to
OntologyEntry called Characteristics.  In other words, we decided it would
be better to let groups like this work out what and how Characteristics
could be annotated so that MAGE, and now FuGE, could concentrate on
allowing the sharing of experiments.

One of my interests in this list is to track annotation tools and
ontologies.  As they mature, if our users are interested, we will use
these tools and ontologies to enhance our product so that our users can
easily annotate their data.  This will then be reflected in our export and
available for others to import.  Then, I envision that semantic web tools
will be able to take this annotation (which I see as separate and
independent of the semantic web tools but available to them) and search
for interesting correlations.

my further 2c,
Michael

  
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of AJ Chen
  Sent: Friday, July 07, 2006 12:43 AM
  To: public-semweb-lifesci@w3.org
  Subject: how to deal with different requirements for experiment self-publishing

  All,

  From the discussions so far, I see a whole spectrum of needs for
  publishing experiment information.  On one end, some researchers want
  a quick and easy way to share an experiment, e.g. simply decompose an
  experiment to hypothesis, data, results, procedure, protocols used,
  who did it, what project it belongs to, etc.  On the other end of the
  spectrum, some researchers want to describe it with domain-specific
  terms as detailed as possible, e.g. using FuGO or BioPAX terms.  In
  the middle of the spectrum, one may want to describe an experiment in
  general terms but with great detail, e.g. using the terms Bill Bug
  provided from BIRN.

  Because of this diversity of requirements, I think it is not realistic
  to expect one huge ontology to fit all.  I would suggest we think of
  this task in terms of multiple phases so that incremental progress can
  be made within a short time frame.  In the first phase (the current
  phase), we focus on a small ontology that can be used to develop quick
  and easy tools for self-publishing.  In the next phase, we can add
  more granularity to it.  In the third phase, we may figure out how to
  bridge this general-purpose ontology to domain-specific ontologies
  that are developed by other groups.  An alternative approach is to
  have separate tasks to meet different requirements at the same time.
  What do you think?

  If we take the multi-phase approach, I would suggest further
  discussions be focused on the objective of the current phase, i.e. a
  small and simple ontology.  If anyone likes the multi-task approach,
  please consider proposing a new task.

  AJ


RE: ontology specs for self-publishing experiment

2006-07-06 Thread Miller, Michael D (Rosetta)



Hi Tim,

Essentially, the idea of FuGE-OM is that it will be complete in itself
as a Platform Independent Model (in OMG/MDA terms) and will have a
FuGE-ML XML schema (a Platform Specific Model--PSM) generated by
AndroMDA.  MGED will most likely provide support in the form of a
FuGEstk, with Java and Perl PSM support.

It is possible that some group may want an RDF PSM version of FuGE!

It will soon be vetted through a process by PSI, MGED and any interested
parties and be available for extending into whatever life sciences
domains.  PSI has extended it for GEL-OM
(http://psidev.sourceforge.net/gps/index.html), as a great example, and
work has started to extend it as MAGEv2.

FuGE provides the underpinnings for describing the flow of material and
data as protocols are applied, including annotation.  One thing to
remember is that the ontology support in FuGE is entirely neutral as to
what ontologies these ontology individuals are referencing--no
information about particular ontologies or how ontology classes are
related belongs in a FuGE-derived document beyond the URI to get to the
referenced class, if it is in an existing ontology.  It is expected that
applications importing FuGE documents will either have, or look up, the
information on these referenced ontologies after import if the
application wishes to support knowledge-based tools.  Use of FuGE does
not mandate that an application be ontology aware; FuGE is a data and
annotation exchange specification.

It is hoped that, in the different domains of life sciences that have a
need to describe experiments/studies/investigations, FuGE provides a
good core model to extend into the domain-specific
data/material/protocols.  It is actually a mistake to say that FuGE
development and ontology development need to go together.  The only real
feedback the FuGE model needs is how well the Ontology Individual
support is modeled in UML.

Then, I do believe, for best use of the FuGE model and its extensions,
great ontologies are needed, along with tools to take the references in
a FuGE document, go out to the semantic web and make connections, and
give researchers ontologies with which to annotate the experiments they
export.  But FuGE development itself doesn't need awareness of this
ontology development effort.

I am always reminded of two observations: if one has a hammer,
everything looks like a nail, and anything can be programmed in COBOL.
Not everything, I believe, is best modeled as an ontology; in
particular, as I have said, the real-life flow of a life science
experiment/investigation.  Yes, it can be done, but it is an awkward
stretch.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

  
  -Original Message-
  From: William Bug [mailto:[EMAIL PROTECTED]
  Sent: Thursday, July 06, 2006 8:20 AM
  To: Tim Clark
  Cc: Miller, Michael D (Rosetta); Eric Neumann; AJ Chen; w3c semweb hcls; SWAN Team
  Subject: Re: ontology specs for self-publishing experiment

  Dear Tim,

  I think this is an excellent idea - and comes at a very propitious time.
  I would suggest including participants on the FuGO, PaTO, and EXPO
  projects as well.

  Cheers,
  Bill


  On Jul 6, 2006, at 9:23 AM, Tim Clark wrote:
  Michael

The FuGE project may have some interesting overlaps with SWAN.  
Current phase of SWAN is focused on construction of annotation and 
publishing tools for semantically characterized hypotheses, claims, 
findings, counterclaims, etc on digital resources in neuromedicine, at the 
community level.  This is planned to be followed by a complementary 
phase involving management and characterization of laboratory results using 
an extension of the same ontology.  

I propose we arrange mutual presentations and discussions to see if any 
synergies exist such that we might take advantage of each other's 
work. 

Best

Tim



--
Tim Clark 617-947-7098 (mobile)

Director of Research Programs
Harvard University Initiative in Innovative 
Computing
60 Oxford Street, Cambridge, MA 02138
http://iic.harvard.edu

Director of Informatics
MassGeneral Institute for Neurodegenerative 
Disease
114 16th Street, Charlestown, MA 02129
http://www.mindinformatics.org
--


On Jul 5, 2006, at 7:38 PM, Miller, Michael D (Rosetta) wrote:

  Hi Eric,
   
  Just wanted to point out how this overlaps with the current FuGE 
  (http://fuge.sourceforge.net/) 
  and FUGO (http://fugo.sourceforge.net/) 
  efforts.  These are focused on systems biology and are intended to 
  p

RE: ontology specs for self-publishing experiment

2006-07-05 Thread Miller, Michael D (Rosetta)



Hi Eric,

Just wanted to point out how this overlaps with the current FuGE
(http://fuge.sourceforge.net/) and FUGO (http://fugo.sourceforge.net/)
efforts.  These are focused on systems biology and are intended to
provide the underpinnings of reporting gene expression, gel, mass spec,
and -omics experiments/investigations.

The goal of FuGE (Functional Genomic Experiments) is for the most part
to provide:

"a. Publishing Protocols
b. Publishing Reagents and Products
c. Stating the Hypothesis (and model using RDF) that is being tested by
the experiment; this includes which citations are supportive or
alternative to one's hypothesis
d. Publishing Experimental Data (possibly as RDF-OWL aggregates and
tables)
e. Articulating the Results and Conclusions; specifically, whether the
experiment refutes or supports the central Hypothesis (most of us agree
we cannot 'prove' a hypothesis, only disprove it)"

But it is a UML-based model that will then have an equivalent XML Schema
generated.  The advantage, I think, that this approach has over a pure
ontology representation is that it better captures the actual work-flow
of these experiments for the interchange of data and annotation.  That
being said, the UML model incorporates a way to annotate the class
objects with ontology Individuals, with a reference to the Individual's
RDF class and its ontology.  The UML model adds the additional semantics
of identifiers (typically expressed as LSIDs) that allow tying reference
elements generated in the XML Schema to the full definition of an
object.  So a biological sample can be fully described in one document
and then referenced by a treatment that incorporates it into a prep.

So, for instance, typically a hypothesis is specific to the particular
experiment/investigation.  In FuGE, it is simply a Description class
with a text attribute, associated by a Hypothesis association to the
Investigation class.  But in the XML document, this specific Description
can be annotated by references to ontologies that allow the hypothesis
to be translated to RDF upon import.  We used the OMG Ontology
Definition Metamodel specification's mapping of Individuals from OWL/RDF
to UML so that these could then be mapped back to an OWL/RDF
representation for reasoning
(http://www.omg.org/ontology/ontology_info.htm#RFIs,RFPs).
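
A minimal sketch of what that import-time mapping could look like, using
rdflib; the property and instance URIs below are invented purely for
illustration (FuGE itself does not define them), and the MGED Ontology
term URI is the one cited elsewhere in this thread:

from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/fuge-import/")  # hypothetical namespace for imported objects

g = Graph()
investigation = EX["investigation-001"]
hypothesis = EX["hypothesis-001"]

# The Description's text attribute becomes a plain literal on an RDF node,
# and the Hypothesis association becomes a triple linking it to the Investigation.
g.add((hypothesis, RDFS.comment,
       Literal("Gene X is up-regulated in treated samples")))
g.add((investigation, EX.hasHypothesis, hypothesis))

# An ontology Individual reference in the document becomes a link to the
# referenced ontology class, so a reasoner can pick it up after import.
g.add((investigation, EX.designType, URIRef(
    "http://mged.sourceforge.net/ontologies/MGEDontology.php#ExperimentDesignType")))

print(g.serialize(format="turtle"))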
 

FUGO is intended to become part of the OBO ontologies, and FUGO's goal
is to provide general annotation terms for these types of experiments.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Eric Neumann
Sent: Monday, July 03, 2006 6:57 AM
To: AJ Chen
Cc: w3c semweb hcls
Subject: Re: ontology specs for self-publishing experiment

  AJ,

  This is a great start, and thanks for taking this on!  I would like to
  see this task force propose a conceptual framework within the two
  months.  It does not have to be final, but I think we need to have
  others on the list review the ontologies
  (http://esw.w3.org/topic/HCLS/ScientificPublishingTaskForce) and
  requirements (http://esw.w3.org/topic/HCLS/SciPubSPERequirements) you
  have proposed, ask questions about them, and adjust/expand as needed.

  I think there have been good discussions on this topic in the past,
  and I would also refer folks to the SWAN paper by Gao et al.
  (http://www.websemanticsjournal.org/ps/pub/2006-17).  This work is in
  line with what Tim Clark has been proposing to the group, and I think
  it is a useful model to consider.  Perhaps we can combine these
  efforts and propose something workable (demo anyone?) by the end of
  summer...

  In terms of gathering more Scientific Publishing of Experiments (SPE)
  requirements, I wanted to list some items that appear to be
  inter-related and relevant:

  1. By publishing experiments, one must also consider (i.e., include in
  the ontology):
  a. Publishing Protocols
  b. Publishing Reagents and Products
  c. Stating the Hypothesis (and model using RDF) that is being tested
  by the experiment; this includes which citations are supportive or
  alternative to one's hypothesis
  d. Publishing Experimental Data (possibly as RDF-OWL aggregates and
  tables)
  e. Articulating the Results and Conclusions; specifically, whether the
  experiment refutes or supports the central Hypothesis (most of us
  agree we cannot 'prove' a hypothesis, only disprove it)

  2. Hypotheses should be defined in terms of authorship (ala DC), what
  the proposed new concept is, and what (experimental) fact (or claim)
  is required to support it.  It should also refer to earlier hypotheses
  either by:
  a. extension of an earlier tested and supported hypothesis: refinement
  b. similarity or congruence with another untested hypothesis:
  supportive
  c. being an alternative to another hypothesi

RE: [rdf] Re: URIs

2006-06-19 Thread Miller, Michael D (Rosetta)

Hi All,

Just to add a quick 2c to what Bill and others have to say about
practical use cases.

Since we at Rosetta Biosoftware develop applications for end users, we
are rather neutral as to how we get annotation; we're happy with just
about anything.  The key for us is that our users want the
annotation--then we figure out how to get it.  That includes the
sequence annotation from the public sources, using whatever parsers we
can find that read their one-off formats; GO, where we have a custom
import because we didn't know enough to deal directly with ontologies;
and so on.

The great part of this effort is to regularize how that annotation can
be obtained, regardless of the type of annotation.  Once there is a
systematic way to get to whatever annotation exists, when our users want
a new source of annotation, we can stop worrying about how to get it and
simply tell them to point our application at the annotation using the
(now or future) standard access tools.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of William Bug
> Sent: Monday, June 19, 2006 8:20 AM
> To: John Madden
> Cc: Alan Ruttenberg; w3c semweb hcls
> Subject: Re: [rdf] Re: URIs
> 
> 
> 
> I think this is an excellent reference to work from, when dealing  
> with the issue of URIs in RDF generation & processing.
> 
> As I have always seen it (this is admittedly the view of an RDF  
> naif), DOIs and LSIDs both seek to fulfill the role one would expect  
> to be played by URIs in the STM literature and biomedical object  
> domains, respectively.
> 
> For those who had the chance to read the paper, I would specifically  
> point to the discussion of the CrossRef & OpenURL projects.  Both  
> relate to how you resolve a DOI tied to very practical Use Cases.   
> One is very much focussed on the commercial issue of dis-ambiguating  
> which journals a given library system has a subscription to.  The  
> goal for this (OpenURL - http://www.exlibrisgroup.com/sfx_openurl.htm)  
> was to create an infrastructure for publishers (and  
> aggregators) to resolve this issue in a way that is transparent for  
> the user as they click on a link to an article (HTML or PDF).  The  
> SFX system many may be familiar with seeing in the search engines  
> hosted by their library systems.
> 
> CrossRef (http://www.crossref.org/) is more designed to address the  
> core issue on the article of how you both maintain stable 
> pointers to  
> inherently unstable online resources, and also providing a URI-like  
> generic resource pointer which can be resolved to the actual 
> resource  
> the moment a reader clicks on the reference in a bibliography.   
> CrossRef is much more focussed on dealing with the many different  
> scenarios related to the latter task and coming up with a way that -  
> again - from the user's point of view transparently gets them to the correct  
> resource.  CrossRef the organization seems to pitch 
> themselves as the  
> service designed to de-reference DOIs - which obviously makes the  
> work they've done very relevant to this conversation.
> 
> Clearly, both of these issues are ones the folks from BMC & PLoS can  
> give us some very practical insight into.
> 
> The one major project related to the topic in the article that the  
> author seemed to neglect is the Internet Archive (http://archive.org).  
> This is a long standing project (in Internet time,  
> anyway - going back to 1996).  They trawl the entire public net and  
> back it up as often as possible.  They have massive, 
> robotics-based  
> tape drive systems working round the clock.  The original archive  
> took almost a year to crawl the entire "public" net (it still takes  
> about 2 months to cover everything, though there is a lot of effort  
> they've put into categorizing the rate at which content 
> changes -  
> with content having a more rapid turnover getting more frequent  
> observation).  After the end of the 1996 presidential campaign -  
> within weeks, the only source for historians to analyze use of the  
> web in the election was the Internet Archive.  This has continued to  
> be the case for many research projects focussed on the use and  
> evolution of web content.  The IA has set up to donate 
> periodic dumps  
> to the Library of Congress.  Their technology has greatly improved  
> over the years (they now have PetaByte storage racks and a much  
> more mature software layer).  Though IA doesn't solve the issue of  
> the "hidden"/dynamic web all that much better than the other search  
> engines (which is the space in which most if not all scientific  
> literature lives), they clearly provide a great utility for the  
> difficult to manage mess the HTML web often devolves into.  IA is also  
> intimately involved in the discussions in the library science  
> community on this iss

RE: scientific publishing task force update

2006-06-12 Thread Miller, Michael D (Rosetta)

Hi Kei,

Once an OWL ontology is published there are no typos!  If one wants to
use the ontology, one must conform to the definitions.

cheers,
Michael

> -Original Message-
> From: kei cheung [mailto:[EMAIL PROTECTED] 
> Sent: Monday, June 12, 2006 9:33 AM
> To: Miller, Michael D (Rosetta)
> Cc: public-semweb-lifesci@w3.org
> Subject: Re: scientific publishing task force update
> 
> 
> "Eperimental" looks to me like a typo, it should be "Experimental"?
> 
> Cheers,
> 
> -Kei
> 
> Miller, Michael D (Rosetta) wrote:
> 
> >Hi All,
> >
> >  
> >
> >>It might be interesting to compare this with FUGE.
> >>
> >>
> >
> >Yes, there's definite overlap.
> >
> >Also wanted to pass along one of the top-level classes:
> >
> >"<owl:Class rdf:about="http://www.hozo.jp/owl/EXPOApr19.xml/EperimentalDesignTask">"
> >
> >I'm not exactly sure what "Eperimental" means!
> >
> >cheers,
> >Michael
> >
> >  
> >
> >>-Original Message-
> >>From: [EMAIL PROTECTED] 
> >>[mailto:[EMAIL PROTECTED] On Behalf Of 
> kei cheung
> >>Sent: Monday, June 12, 2006 8:07 AM
> >>To: William Bug
> >>Cc: Mark Musen; public-semweb-lifesci@w3.org
> >>Subject: Re: scientific publishing task force update
> >>
> >>
> >>
> >>Hi Bill and Mark et al.,
> >>
> >>I also went to the EXPO site (http://sourceforge.net/projects/expo/) and 
> >>found the EXPO ontology in OWL format (I agree that it's 
> >>quite hidden). 
> >>I have unzipped it and made it available at:
> >>
> >>http://twiki.med.yale.edu/kei_web/sw_group/EXPO04-19-06.owl
> >>
> >>It might be interesting to compare this with FUGE.
> >>
> >>Cheers,
> >>
> >>-Kei
> >>
> >>William Bug wrote:
> >>
> >>
> >>
> >>>This was a new one on me too, Mark.  It was posted to 
> Slashdot the  
> >>>other day, and the Sourceforge site the article points to is  
> >>>essentially empty.
> >>>
> >>>http://sourceforge.net/projects/expo/
> >>>
> >>>As you might gather, EXPO is not a very good term to 
> search in all  
> >>>the usual suspect search engines - INSPEC, PubMed, IEEE XPlore,  
> >>>CiteSeer.IST, and Google/Google Scholar.  Only a very few 
> specific  
> >>>studies using EXPO in the title came up in:
> >>>
> >>>PubMed:
> >>>
> >>>CT-expo--a novel program for dose evaluation in CT
> >>>Rofo. 2002 Dec;174(12):1570-6.
> >>>
> >>>
> >>>
> >>> INSPEC:
> >>>
> >>>The extended Poincare generating function type (EXPO)
> >>>
> >>>Extrasolar Planet Observatory (ExPO)
> >>>
> >>>EXPO is the integration of two programs, EXTRA and 
> >>>  
> >>>
> >>SIRPOW.92 and is a  
> >>
> >>
> >>>program for full powder decomposition and crystal structure 
> >>>  
> >>>
> >>solution.
> >>
> >>
> >>>
> >>>ACL Anthology of research papers in Comp. Linguistics
> >>>
> >>>A FORMAL GRAMMAR OF EXPRESSIVENESS FOR SACRED LEGENDS
> >>>acl.ldc.upenn.edu/C/C80/C80-1023.pdf
> >>>
> >>>(an absolutely fascinating manuscript in no way related to this  
> >>>research project)
> >>>
> >>>
> >>>There is certainly much interesting and relevant research 
> >>>  
> >>>
> >>going on in  
> >>
> >>
> >>>this center at the University of Aberystwyth 
> >>>  
> >>>
> >>(http://www.aber.ac.uk/ 
> >>
> >>
> >>>compsci/Research/bio/grants.shtml), but I wasn't able to find a  
> >>>specific reference to EXPO anywhere, though clearly it 
> >>>  
> >>>
> >>could be the  
> >>
> >>
> >>>result of research in any one of several of the projects listed.
> >>>
> >>>In the end, I just gave up.
> >>>
> >>>Cheers,
> >>>Bill
> >>>
> >>>
> >>>On Jun 9, 2006, at 1:29 PM, Mark Musen wrote:
> >>>
> >>>  
> >>>

RE: scientific publishing task force update

2006-06-12 Thread Miller, Michael D (Rosetta)

Hi All,

> It might be interesting to compare this with FUGE.

Yes, there's definite overlap.

Also wanted to pass along one of the top-level classes:

"http://www.hozo.jp/owl/EXPOApr19.xml/EperimentalDesignTask";>"

I'm not exactly sure what "Eperimental" means!

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of kei cheung
> Sent: Monday, June 12, 2006 8:07 AM
> To: William Bug
> Cc: Mark Musen; public-semweb-lifesci@w3.org
> Subject: Re: scientific publishing task force update
> 
> 
> 
> Hi Bill and Mark et al.,
> 
> I also went to the EXPO site (http://sourceforge.net/projects/expo/) and 
> found the EXPO ontology in OWL format (I agree that it's 
> quite hidden). 
> I have unzipped it and made it available at:
> 
> http://twiki.med.yale.edu/kei_web/sw_group/EXPO04-19-06.owl
> 
> It might be interesting to compare this with FUGE.
> 
> Cheers,
> 
> -Kei
> 
> William Bug wrote:
> 
> >
> > This was a new one on me too, Mark.  It was posted to Slashdot the  
> > other day, and the Sourceforge site the article points to is  
> > essentially empty.
> >
> > http://sourceforge.net/projects/expo/
> >
> > As you might gather, EXPO is not a very good term to search in all  
> > the usual suspect search engines - INSPEC, PubMed, IEEE XPlore,  
> > CiteSeer.IST, and Google/Google Scholar.  Only a very few specific  
> > studies using EXPO in the title came up in:
> >
> > PubMed:
> >
> > CT-expo--a novel program for dose evaluation in CT
> > Rofo. 2002 Dec;174(12):1570-6.
> >
> >
> >
> >  INSPEC:
> >
> > The extended Poincare generating function type (EXPO)
> >
> > Extrasolar Planet Observatory (ExPO)
> >
> > EXPO is the integration of two programs, EXTRA and 
> SIRPOW.92 and is a  
> > program for full powder decomposition and crystal structure 
> solution.
> >
> >
> >
> > ACL Anthology of research papers in Comp. Linguistics
> >
> > A FORMAL GRAMMAR OF EXPRESSIVENESS FOR SACRED LEGENDS
> > acl.ldc.upenn.edu/C/C80/C80-1023.pdf
> >
> > (an absolutely fascinating manuscript in no way related to this  
> > research project)
> >
> >
> > There is certainly much interesting and relevant research 
> going on in  
> > this center at the University of Aberystwyth 
> (http://www.aber.ac.uk/ 
> > compsci/Research/bio/grants.shtml), but I wasn't able to find a  
> > specific reference to EXPO anywhere, though clearly it 
> could be the  
> > result of research in any one of several of the projects listed.
> >
> > In the end, I just gave up.
> >
> > Cheers,
> > Bill
> >
> >
> > On Jun 9, 2006, at 1:29 PM, Mark Musen wrote:
> >
> >>
> >> On Jun 8, 2006, at 10:09 PM, AJ Chen wrote:
> >>
> >>> The first task is to develop an ontology for self-publishing of  
> >>> experiment. I have proposed a list of objects and properties  
> >>> related to self-publishing experiment. Please download 
> the  attached 
> >>> file under Task Status and review the proposal. Your  
> feedback and 
> >>> comments will be greatly appreciated.  You may also  edit 
> the file 
> >>> directly and email me the edited file.
> >>>
> >>
> >> A colleague just pointed me to this (rather vacuous) 
> article.  Does  
> >> anyone know more about this work?
> >>
> >> http://www.newscientisttech.com/article/dn9288-translator-lets- 
> >> computers-understand-experiments-.html
> >>
> >> Mark
> >>
> >
> > Bill Bug
> > Senior Analyst/Ontological Engineer
> >
> > Laboratory for Bioimaging  & Anatomical Informatics
> > www.neuroterrain.org
> > Department of Neurobiology & Anatomy
> > Drexel University College of Medicine
> > 2900 Queen Lane
> > Philadelphia, PA19129
> > 215 991 8430 (ph)
> > 610 457 0443 (mobile)
> > 215 843 9367 (fax)
> >
> >
> > Please Note: I now have a new email - [EMAIL PROTECTED]
> >
> >
> >
> >
> >
> >
> >
> > This email and any accompany attachments are confidential. This 
> > information is intended solely for the use of the 
> individual to whom 
> > it is addressed. Any review, disclosure, copying, 
> distribution, or use 
> > of this email communication by others is strictly 
> prohibited. If you 
> > are not the intended recipient please notify us immediately by 
> > returning this message to the sender and delete all copies. 
> Thank you 
> > for your cooperation.
> >
> 
> 
> 
> 
> 




RE: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-12 Thread Miller, Michael D (Rosetta)

Hi Marco,

> Moreover, a single chip is related, for instance, to an array design 
> or an extraction protocol; you could talk about an array design, for 
> instance in which studies it has been most used, or the conditions under 
> which the protocol performs best.

Yes, a chip is related to other objects, but it's not particularly
related to another chip, except by its context and relationship to other
objects (like the ExperimentDesigns it's in).  My argument is that
creating a class 'Array' in some ontology won't give you much more than
the annotations of the objects in the MAGE-ML.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Marco Brandizi
> Sent: Sunday, June 11, 2006 1:26 PM
> To: public-semweb-lifesci@w3.org
> Cc: public-semweb-lifesci@w3.org
> Subject: Re: BioRDF [Telcon]: slides for the UMLS presentation
> 
> 
> 
> Miller, Michael D (Rosetta) wrote:
> 
> > One other issue is that the actual objects in a microarray 
> experiment
> >  are for the most part one offs--i.e. a chip is a chip, it is only 
> > hybridized once, the current investigation is only performed once
> > with a set of chips and so on.  Even the genes that are being
> > investigated, nothing new is being added to them except the
> > interpretation of the data from the investigation.
> > 
> 
> Hi Michael,
> 
> I am not sure I got it: attaching the interpretation of data 
> should be 
> valuable, as well as finding a good way to model such interpretation. 
> Moreover, a single chip is related, for instance, to an array design 
> or an extraction protocol; you could talk about an array design, for 
> instance in which studies it has been most used, or the conditions under 
> which the protocol performs best.
> 
> Cheers.
> 
> 
> -- 
> 
> ==
> =
> Marco Brandizi <[EMAIL PROTECTED]>
> http://gca.btbs.unimib.it/brandizi
> 
> 
> 
> 




RE: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-09 Thread Miller, Michael D (Rosetta)

Hi Marco,

I'm hopeful of putting together a use case for how the annotation of
MAGE-ML documents with ontology terms makes sense for the semantic
web--I usually end by saying that once the ontology information is
loaded into the local data store, knowledge-based tools can be used on
it.

It looks like that's where you pick up the ball.  I'm hopeful that Bill
will help round out the use case but I believe your ideas will be
valuable also. 

> - I think that converting the whole MAGE or FUGE into RDF is hard in 
> practice and maybe not so useful. The problems are:

One other issue is that the actual objects in a microarray experiment
are for the most part one offs--i.e. a chip is a chip, it is only
hybridized once, the current investigation is only performed once with a
set of chips and so on.  Even the genes that are being investigated,
nothing new is being added to them except the interpretation of the data
from the investigation.

But as you say, it is, however, useful to annotate these one off objects
with where they fit into the larger world.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Marco Brandizi
> Sent: Friday, June 09, 2006 8:39 AM
> To: public-semweb-lifesci@w3.org
> Subject: Re: BioRDF [Telcon]: slides for the UMLS presentation
> 
> 
> 
> Hi all,
> 
> some notes on the discussion of MAGE/FUGE/FUGO and RDF.
> 
> - I think that converting the whole MAGE or FUGE into RDF is hard in 
> practice and maybe not so useful. The problems are:
> 
>- FUGE and MAGE are object models and should be reviewed in order to 
> provide an OWL modelling.  For instance, the use of OntologyEntry as a 
> pointer to an ontology term doesn't make much sense in OWL.
> 
>- FUGE/MAGE are used to represent huge quantities of data (an average 
> MAGE-ML file is some hundreds of MBs in size) and I am not sure 
> that current technologies would support such a requirement.
> 
> - Maybe only some aspects of a Functional Genomics model are really 
> needed in the context of the Semantic Web.  For instance, telling in RDF that 
> an experiment has been performed to study a given disease would be 
> useful; telling the whole web the concentration value of the 
> application of an extraction protocol is maybe more 
> implementation specific.
> 
> 
> 
> 
> - I am modelling something about microarrays, although my intent is not 
> to convert MAGE and to face its degree of detail.  I am more interested 
> in a less detailed knowledge representation about Microarrays, and in 
> the management of the knowledge that is achieved from the study of Gene 
> Expression.
> 
> Here an introduction about that:
> http://gca.btbs.unimib.it/brandizi/mysite/phdintro
> 
> My latest version of the ontology (very draft actually), plus 
> some notes 
> about the user interface I am developing:
> 
>http://gca.btbs.unimib.it/brandizi/mysite/phdv1
> 
> 
> Cheers.
> 
> -- 
> 
> ==
> =
> Marco Brandizi <[EMAIL PROTECTED]>
> http://gca.btbs.unimib.it/brandizi
> 
> 
> 
> 




RE: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi Kei,

> My question is that can we use FuGO or FuGO-OM or other ontologies ..

Quick note, there is no FuGO-OM, just FuGO.  There is FuGE-OM (and its
generated FuGE-ML) which is a UML model, quite a different animal.

If you go to the "Available Actions" box there is a "View MAGE-ML"
button.  Click that and you will see a MAGE-ML document that has already
been rather nicely annotated using the MGED Ontology.  What MAGE-OM
does, and this is reflected in the MAGE-ML of course, is allows
references into an ontology to associate a UML class instance in the
MAGE-ML with annotations from an ontology.  You'll see the
 elements for the BioMaterial classes and
ExperimentDesign classes that use  and its nested
elements to reference into an Ontology.  The power of references is that
the entire ontology doesn't have to be used, only the relevant terms
referenced in the XML, and any appropriate ontology can be used, not
just the MGED Ontology.


  
 
  <Types_assnlist>
    <OntologyEntry category="ExperimentDesignType" value="...">
      <OntologyReference_assn>
        <DatabaseEntry URI="http://mged.sourceforge.net/ontologies/MGEDontology.php#ExperimentDesignType"/>
      </OntologyReference_assn>
    </OntologyEntry>
  </Types_assnlist>

After the information has been downloaded into a local datastore
(MAGE-ML is intended to only exchange information, not be used
directly), a user can run whatever applications they want on the
information, including ontology inference machines that can go out to
the referenced ontologies.
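
As a rough sketch of that last step, an importing application could
follow the reference with something like the code below, using rdflib
and the MGED Ontology OWL file cited elsewhere in this archive
(http://mged.sourceforge.net/ontologies/MGEDOntology.owl); whether that
URL still resolves, and whether the #ExperimentDesignType anchor above
matches the URI used inside the OWL file, are assumptions here:

from rdflib import Graph, URIRef
from rdflib.namespace import RDFS

# Load the referenced ontology after import (network access assumed).
onto = Graph()
onto.parse("http://mged.sourceforge.net/ontologies/MGEDOntology.owl", format="xml")

# The term URI taken from the OntologyEntry reference in the MAGE-ML document.
term = URIRef("http://mged.sourceforge.net/ontologies/MGEDontology.php#ExperimentDesignType")

# Pull back whatever the ontology says about the referenced class, e.g.
# labels and superclasses, so knowledge-based tools can use the annotation.
for label in onto.objects(term, RDFS.label):
    print("label:", label)
for parent in onto.objects(term, RDFS.subClassOf):
    print("subClassOf:", parent)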

cheers,
Michael

> -Original Message-
> From: kei cheung [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 06, 2006 10:34 AM
> To: Miller, Michael D (Rosetta)
> Cc: Alan Ruttenberg; [EMAIL PROTECTED]; 
> [EMAIL PROTECTED]; Daniel Rubin; 
> public-semweb-lifesci@w3.org; Matthew Cockerill
> Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> the UMLS presentation
> 
> 
> Hi Michael et al.,
> 
> Karen Skinner (of NIDA) once pointed me to the following public 
> microarray experiment related to the study of Parkinson's Disease
> 
> http://arrayconsortium.tgen.org/np2/viewProject.do?action=view
> Project&projectId=61
> 
> 
> My question is that can we use FuGO or FuGO-OM or other ontologies 
> (MGED-OWL) to describe this experiment (or part of it)? This 
> might also 
> serve as a Parkinson use case study for the ontology working group.
> 
> Cheers,
> 
> -Kei
> 
> Miller, Michael D (Rosetta) wrote:
> 
> >Hi All,
> >
> >  
> >
> >>Thanks for pointing us to FuGO. To me, it seems like that the FuGO 
> >>community is currently defining an upper ontology that can be 
> >>universally used to describe different types of genomic/proteomic 
> >>experiments including microarray experiments. 
> >>
> >>
> >
> >This, I believe, is their goal.
> >
> >  
> >
> >>but 
> >>there is no 
> >>example use for microarray experiments. 
> >>
> >>
> >
> >To actually describe microarray experiments (and proteomics, 
> etc.), it
> >is expected that this will be done through extensions of 
> FuGE-OM, a UML
> >model that takes experiences from MAGEv1 and PEDRo (and others) to
> >create an object model that describes what is basic to all biological
> >based experiments.  MAGEv2 will extend this model.
> >
> >One goal of FuGO is:
> >
> >"The purpose of this ontology is to support the consistent 
> annotation of
> >functional genomics experiments, regardless of the 
> particular field of
> >study."
> >
> >So it is unlikely at this point that there will be a class
> >"Hybridization" or "FeatureExtraction" in the ontology, but there are
> >likely to be terms that describe particular ways of doing a
> >hybridization or feature extraction.
> >
> >FuGO is still at an early stage, see http://fugo.sourceforge.net/ for
> >how someone can contribute.
> >
> >cheers,
> >Michael
> >
> >  
> >
> >>-Original Message-
> >>From: [EMAIL PROTECTED] 
> >>[mailto:[EMAIL PROTECTED] On Behalf Of 
> kei cheung
> >>Sent: Tuesday, June 06, 2006 9:52 AM
> >>To: Alan Ruttenberg
> >>Cc: [EMAIL PROTECTED]; 
> >>[EMAIL PROTECTED]; 'Daniel Rubin'; 
> >>public-semweb-lifesci@w3.org; 'Matthew Cockerill'
> >>Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> >>the UMLS presentation
> >>
> >>
> >>
> >>Hi Alan,
> >>
> >>Thanks for pointing us to FuGO. To me, it seems like that the FuGO 
> >>community is currently defining an upper ontology that can be 
> >>universally used to d

RE: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi Kei,

Not surprising that Gavin is involved there.  I'm not sure of his main
priorities but he might be interested.  You can give it a shot.

cheers,
Michael

> -Original Message-
> From: kei cheung [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 06, 2006 10:18 AM
> To: Miller, Michael D (Rosetta)
> Cc: Alan Ruttenberg; public-semweb-lifesci@w3.org
> Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> the UMLS presentation
> 
> 
> Hi Michael and Larry,
> 
> As I understand it, Gavin Sherlock, who is a member of MGED, is involved 
> in the ontological/informatic aspect of the NIH Neuroscience Microarray 
> Consortium.  I'm also involved in the informatic aspect of this 
> Consortium.  Should we contact Gavin to see if he is also interested in 
> this?  Since all of us are very busy and I'm a believer in 
> incrementality, we probably should aim at something small and simple to 
> begin with and work our way up.
> 
> Cheers,
> 
> -Kei
> 
> Miller, Michael D (Rosetta) wrote:
> 
> >Hi Kei and Larry,
> >
> >Well, since I'm on the MGED board and principal editor of the OMG Gene
> >Expression (MAGE) specification, I'm probably as good as anyone as a point
> >person.
> >
> >What exactly do you envision that the MGED community can do 
> and how we
> >can work together?  It's a grass-roots organization that in 
> the last few
> >years became an official non-profit, so consortium is a bit 
> too strong a
> >word.  It's an open community so anyone can participate, it 
> is a matter
> >of having the time and desire.
> >
> >One of my problems is that I'm typically very constrained 
> for time, my
> >employees actually expect me to write code from time to 
> time.  As such
> >I've typically just been listening in.  When I do participate, as
> >recently, it is as someone whose main focus is NOT ontologies or the
> >underpinnings of the semantic web but as someone for whom 
> ontologies and
> >the semantic web are but two (important) considerations of the many
> >considerations for the application I work on.
> >
> >cheers,
> >Michael
> >
> >Michael Miller
> >Lead Software Developer
> >Rosetta Biosoftware Business Unit
> >www.rosettabio.com
> >
> >  
> >
> >>-Original Message-
> >>From: kei cheung [mailto:[EMAIL PROTECTED] 
> >>Sent: Tuesday, June 06, 2006 9:41 AM
> >>To: Miller, Michael D (Rosetta)
> >>Cc: Alan Ruttenberg; public-semweb-lifesci@w3.org
> >>Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> >>the UMLS presentation
> >>
> >>
> >>Hi Michael,
> >>
> >>Thanks for sharing your thoughts.  MGED still has room for ontological 
> >>improvement.  As Larry suggested, we would have better luck if we can 
> >>work with the MGED consortium.
> >>
> >>Cheers,
> >>
> >>-Kei
> >>
> >>Miller, Michael D (Rosetta) wrote:
> >>
> >>
> >>
> >>>Hi Kei,
> >>>
> >>> 
> >>>
> >>>  
> >>>
> >>>>Is there a converter available which can take existing 
> datasets in 
> >>>>mage-ml format and convert them into mged-owl format?
> >>>>   
> >>>>
> >>>>
> >>>>
> >>>You're asking the wrong person here, but not that I know of.
> >>>
> >>>The reason I am the wrong person is that I don't believe 
> that MAGE-OM
> >>>(from which MAGE-ML, MAGEJava and MAGEPerl is generated) is best
> >>>represented as an ontology.  I believe there is much that 
> >>>  
> >>>
> >>ontologies can
> >>
> >>
> >>>do but the best way to capture the process of performing microarray
> >>>experiment is to call out the actual pipeline process of wet lab
> >>>biologist processing samples, bench technicians performing the
> >>>hybridization and scans, the bioinformaticists 
> interpreting the scan
> >>>data and the overall design of the experiment.  The way 
> >>>  
> >>>
> >>these different
> >>
> >>
> >>>steps relate to each other does not, to my mind, fit best into an
> >>>ontology model, that by calling them out as first class 
> >>>  
> >>>
> >>objects in the

RE: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi All,

> Thanks for pointing us to FuGO. To me, it seems like that the FuGO 
> community is currently defining an upper ontology that can be 
> universally used to describe different types of genomic/proteomic 
> experiments including microarray experiments. 

This, I believe, is their goal.

> but 
> there is no 
> example use for microarray experiments. 

To actually describe microarray experiments (and proteomics, etc.), it
is expected that this will be done through extensions of FuGE-OM, a UML
model that takes experiences from MAGEv1 and PEDRo (and others) to
create an object model that describes what is basic to all biological
based experiments.  MAGEv2 will extend this model.

One goal of FuGO is:

"The purpose of this ontology is to support the consistent annotation of
functional genomics experiments, regardless of the particular field of
study."

So it is unlikely at this point that there will be a class
"Hybridization" or "FeatureExtraction" in the ontology, but there are
likely to be terms that describe particular ways of doing a
hybridization or feature extraction.

FuGO is still at an early stage, see http://fugo.sourceforge.net/ for
how someone can contribute.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of kei cheung
> Sent: Tuesday, June 06, 2006 9:52 AM
> To: Alan Ruttenberg
> Cc: [EMAIL PROTECTED]; 
> [EMAIL PROTECTED]; 'Daniel Rubin'; 
> public-semweb-lifesci@w3.org; 'Matthew Cockerill'
> Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> the UMLS presentation
> 
> 
> 
> Hi Alan,
> 
> Thanks for pointing us to FuGO. To me, it seems like that the FuGO 
> community is currently defining an upper ontology that can be 
> universally used to describe different types of genomic/proteomic 
> experiments including microarray experiments. There is a draft OWL 
> version of FuGO 
> (http://fugo.sourceforge.net/ontology/FuGO.owl). A list 
> of use case uses is also shown at: 
> http://fugo.sourceforge.net/ontologyInfo/ontology.php, but 
> there is no 
> example use for microarray experiments. So it might be worthwhile to 
> take a look at how FuGO can be used to describe microarray 
> experiments 
> (at least at a high level).
> 
> Cheers,
> 
> -Kei
> 
> Alan Ruttenberg wrote:
> 
> >
> > On Jun 5, 2006, at 9:15 PM, kc28 wrote:
> >
> >> It might be time to think about how to convert mged ontology or 
> >> mage-ml into RDF/OWL. The following are two related articles:
> >>
> >> http://www.nature.com/msb/journal/v2/n1/full/msb4100052.html
> >> http://www.nature.com/nbt/journal/v23/n9/full/nbt0905-1095.html
> >>
> >> Cheers,
> >>
> >> -Kei
> >
> >
> > As I understand it, this is the nature of the FuGO project: 
> > http://fugo.sourceforge.net/
> > They have an upcoming workshop
> > http://www.ebi.ac.uk/microarray/General/Events/FuGO2006/index.html
> >
> > -Alan
> >
> >
> 
> 
> 
> 
> 




RE: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi Kei and Larry,

Well, since I'm on the MGED board and principal editor of the OMG Gene
Expression (MAGE) specification, I'm probably as good as anyone as a point
person.

What exactly do you envision that the MGED community can do and how we
can work together?  It's a grass-roots organization that in the last few
years became an official non-profit, so consortium is a bit too strong a
word.  It's an open community so anyone can participate, it is a matter
of having the time and desire.

One of my problems is that I'm typically very constrained for time, my
employees actually expect me to write code from time to time.  As such
I've typically just been listening in.  When I do participate, as
recently, it is as someone whose main focus is NOT ontologies or the
underpinnings of the semantic web but as someone for whom ontologies and
the semantic web are but two (important) considerations of the many
considerations for the application I work on.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: kei cheung [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 06, 2006 9:41 AM
> To: Miller, Michael D (Rosetta)
> Cc: Alan Ruttenberg; public-semweb-lifesci@w3.org
> Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> the UMLS presentation
> 
> 
> Hi Michael,
> 
> Thanks for sharing your thoughts.  MGED still has room for ontological 
> improvement.  As Larry suggested, we would have better luck if we can 
> work with the MGED consortium.
> 
> Cheers,
> 
> -Kei
> 
> Miller, Michael D (Rosetta) wrote:
> 
> >Hi Kei,
> >
> >  
> >
> >>Is there a converter available which can take existing datasets in 
> >>mage-ml format and convert them into mged-owl format?
> >>
> >>
> >
> >You're asking the wrong person here, but not that I know of.
> >
> >The reason I am the wrong person is that I don't believe that MAGE-OM
> >(from which MAGE-ML, MAGEJava and MAGEPerl is generated) is best
> >represented as an ontology.  I believe there is much that 
> ontologies can
> >do, but the best way to capture the process of performing a microarray
> >experiment is to call out the actual pipeline process of wet lab
> >biologists processing samples, bench technicians performing the
> >hybridization and scans, the bioinformaticists interpreting the scan
> >data, and the overall design of the experiment.  The way these different
> >steps relate to each other does not, to my mind, fit best into an
> >ontology model; it is better served by calling them out as first class
> >objects in the UML model and modeling their specific
> >associations/relationships, all different from each other and specific
> >to the object.
> >
> >To be able to annotate all these objects with ontology terms is
> >definitely needed but to me a separate piece.
> >
> >Granted, if you, or anyone, wish to generate an OWL model 
> from MAGE-OM,
> >it should be possible and I would certainly be interested in 
> the result.
> >The current code to generate the MAGE-ML, MAGEJava and 
> MAGEPerl provide
> >good examples how to do this
> >(http://mged.sourceforge.net/software/index.php).
> >
> >But one could also generate a MAGECobol implementation.
> >
> >cheers,
> >Michael
> >
> >  
> >
> >>-Original Message-
> >>From: kei cheung [mailto:[EMAIL PROTECTED] 
> >>Sent: Tuesday, June 06, 2006 7:54 AM
> >>To: Miller, Michael D (Rosetta)
> >>Cc: Alan Ruttenberg; public-semweb-lifesci@w3.org
> >>Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> >>the UMLS presentation
> >>
> >>
> >>Hi Michael et al,
> >>
> >>Is there a converter available which can take existing datasets in 
> >>mage-ml format and convert them into mged-owl format?
> >>
> >>Thanks,
> >>
> >>-Kei
> >>
> >>Miller, Michael D (Rosetta) wrote:
> >>
> >>
> >>
> >>>Hi Alan and All,
> >>>
> >>>The MGED Ontology is now available as OWL.  There has been a recent
> >>>revision to correct some of the formal problems such an early
> >>>implementation has had.
> >>>
> >>>http://mged.sourceforge.net/ontologies/MGEDOntology.owl
> >>>
> >>>Also, the FuGO project would love any feedback, thanks for 
> >>>  
> >>>
> >>pointing out
> >>
> >>
> >>>the up

RE: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi Kei,

> Is there a converter available which can take existing datasets in 
> mage-ml format and convert them into mged-owl format?

You're asking the wrong person here, but not that I know of.

The reason I am the wrong person is that I don't believe that MAGE-OM
(from which MAGE-ML, MAGEJava and MAGEPerl is generated) is best
represented as an ontology.  I believe there is much that ontologies can
do, but the best way to capture the process of performing a microarray
experiment is to call out the actual pipeline process of wet lab
biologists processing samples, bench technicians performing the
hybridization and scans, the bioinformaticists interpreting the scan
data, and the overall design of the experiment.  The way these different
steps relate to each other does not, to my mind, fit best into an
ontology model; it is better served by calling them out as first class
objects in the UML model and modeling their specific
associations/relationships, all different from each other and specific
to the object.

To be able to annotate all these objects with ontology terms is
definitely needed but to me a separate piece.

Granted, if you, or anyone, wish to generate an OWL model from MAGE-OM,
it should be possible and I would certainly be interested in the result.
The current code to generate the MAGE-ML, MAGEJava and MAGEPerl provide
good examples how to do this
(http://mged.sourceforge.net/software/index.php).

But one could also generate a MAGECobol implementation.

cheers,
Michael

> -Original Message-
> From: kei cheung [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 06, 2006 7:54 AM
> To: Miller, Michael D (Rosetta)
> Cc: Alan Ruttenberg; public-semweb-lifesci@w3.org
> Subject: Re: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for 
> the UMLS presentation
> 
> 
> Hi Michael et al,
> 
> Is there a converter available which can take existing datasets in 
> mage-ml format and convert them into mged-owl format?
> 
> Thanks,
> 
> -Kei
> 
> Miller, Michael D (Rosetta) wrote:
> 
> >Hi Alan and All,
> >
> >The MGED Ontology is now available as OWL.  There has been a recent
> >revision to correct some of the formal problems such an early
> >implementation has had.
> >
> >http://mged.sourceforge.net/ontologies/MGEDOntology.owl
> >
> >Also, the FuGO project would love any feedback, thanks for 
> pointing out
> >the upcoming workshop.
> >
> >cheers,
> >Michael
> >
> >  
> >
> >>-Original Message-
> >>From: [EMAIL PROTECTED] 
> >>[mailto:[EMAIL PROTECTED] On Behalf Of 
> >>Alan Ruttenberg
> >>Sent: Tuesday, June 06, 2006 6:40 AM
> >>To: kc28
> >>Cc: [EMAIL PROTECTED]; 
> >>[EMAIL PROTECTED]; 'Daniel Rubin'; 
> >>public-semweb-lifesci@w3.org; 'Matthew Cockerill'
> >>Subject: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the 
> >>UMLS presentation
> >>
> >>
> >>
> >>On Jun 5, 2006, at 9:15 PM, kc28 wrote:
> >>
> >>
> >>
> >>>It might be time to think about how to convert mged ontology or 
> >>>mage-ml into RDF/OWL. The following are two related articles:
> >>>
> >>>http://www.nature.com/msb/journal/v2/n1/full/msb4100052.html
> >>>http://www.nature.com/nbt/journal/v23/n9/full/nbt0905-1095.html
> >>>
> >>>Cheers,
> >>>
> >>>-Kei
> >>>  
> >>>
> >>As I understand it, this is the nature of the FuGO project: 
> >>http://fugo.sourceforge.net/
> >>They have an upcoming workshop
> >>http://www.ebi.ac.uk/microarray/General/Events/FuGO2006/index.html
> >>
> >>-Alan
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
> >  
> >
> 
> 
> 
> 




RE: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi Ben,

Thanks for your thoughtful reply.

> 1) Accessible URIs for UMLS terms would make it possible (easier..)  
> for any one to annotate their data through reference to this 
> expertly  
> curated common vocabulary.  This makes it much more likely that  
> people would actually do so - thus enabling serendipitous/automatic  
> data integration down the road.

LSIDs provide this without the need to translate to an imperfect OWL.
Even if there is no true LSID resolver service for UMLS terms, one can
decompose the LSID and use the various existing URIs to access the
information on UMLS terms directly.
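
As a sketch of that decomposition (the LSID below is made up for
illustration; only the standard urn:lsid:authority:namespace:object[:revision]
layout is assumed):

def split_lsid(lsid: str):
    # LSIDs have the form urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    parts = lsid.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID: " + lsid)
    authority, namespace, obj = parts[2], parts[3], parts[4]
    revision = parts[5] if len(parts) > 5 else None
    return authority, namespace, obj, revision

# From the authority and namespace one can then pick whichever existing
# web interface serves that source and fetch the term directly.
print(split_lsid("urn:lsid:example.nlm.nih.gov:umls:C0027051"))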

> 2) Even without using the semantics of the relationships, the simple  
> fact that certain concepts are somehow associated with each other is  
> excellent fodder for computation and exploration.

Yes, I agree.  This might be the most compelling reason.  But how much
can ontology reasoning tools use the specialized relationships in the
OWL representation to draw inferences?  If it is purely statistical on
the standard OWL relationships, many additional linkages will be missed,
and even for those that are found, the significance will only be apparent
by going back to the source.

> 3) By providing RDF, it would be possible to automatically merge RDF  
> generated in other projects with the UMLS.

True, and I agree the statistical possibilities of integration expands. 

My main point is: what role have these specialized relationships for UMLS
OWL added in the above scenarios?  Why not stick with a straight OWL
representation and do any further analysis from the UMLS source?  My
complaint is that the expansion of the OWL representations to include these
special-case relationships obscures the benefits of OWL.

cheers,
Michael

> -Original Message-
> From: Benjamin Good [mailto:[EMAIL PROTECTED] 
> Sent: Monday, June 05, 2006 10:45 PM
> To: Miller, Michael D (Rosetta)
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; public-semweb-lifesci
> Subject: Re: BioRDF [Telcon]: slides for the UMLS presentation
> 
> 
> 
> On Jun 5, 2006, at 1:04 PM, Miller, Michael D (Rosetta) wrote:
> 
> > Hi Ben and Matt,
> >
> >> An RDF version of the
> >> UMLS knowledge sources would seem to be very useful - at
> >> least for
> >> bioinformatics research purposes ...
> >
> > What I was trying to discover was, from a purely RDF/ontology
> > standpoint, what is gained without any other knowledge of UMLS?
> 
> 1) Accessible URIs for UMLS terms would make it possible (easier..)  
> for any one to annotate their data through reference to this 
> expertly  
> curated common vocabulary.  This makes it much more likely that  
> people would actually do so - thus enabling serendipitous/automatic  
> data integration down the road.
> 2) Even without using the semantics of the relationships, the simple  
> fact that certain concepts are somehow associated with each other is  
> excellent fodder for computation and exploration.
> 3) By providing RDF, it would be possible to automatically merge RDF  
> generated in other projects with the UMLS.
> 
> Did I miss any ?
> 
> >
> > This question comes more from my ignorance, probably, but when  
> > knowledge
> > gets buried in specialized tags the tendency is that 
> without going to
> > the source to understand the specialization, what's left is 
> not all  
> > that
> > useful.
> 
> There is no question that you would get maximum value out of a  
> thorough understanding of the meaning of all of the relationships in  
> the RDF and that the farther we can push this understanding into an  
> ontology (or other reasonable representation) the better; 
> but, moving  
> things like the UMLS into RDF and onto the Web IMHO increases their  
> value to the community tremendously and is a necessary step towards  
> actually realizing the semantic web for the life sciences (not A  
> semantic web for the life sciences).
> 
> -Ben
> 
> 
> 
> 
> 




RE: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-06 Thread Miller, Michael D (Rosetta)

Hi Alan and All,

The MGED Ontology is now available as OWL.  There has been a recent
revision to correct some of the formal problems such an early
implementation has had.

http://mged.sourceforge.net/ontologies/MGEDOntology.owl

Also, the FuGO project would love any feedback, thanks for pointing out
the upcoming workshop.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Alan Ruttenberg
> Sent: Tuesday, June 06, 2006 6:40 AM
> To: kc28
> Cc: [EMAIL PROTECTED]; 
> [EMAIL PROTECTED]; 'Daniel Rubin'; 
> public-semweb-lifesci@w3.org; 'Matthew Cockerill'
> Subject: MGED/FuGO. was: Re: BioRDF [Telcon]: slides for the 
> UMLS presentation
> 
> 
> 
> On Jun 5, 2006, at 9:15 PM, kc28 wrote:
> 
> > It might be time to think about how to convert mged ontology or 
> > mage-ml into RDF/OWL. The following are two related articles:
> >
> > http://www.nature.com/msb/journal/v2/n1/full/msb4100052.html
> > http://www.nature.com/nbt/journal/v23/n9/full/nbt0905-1095.html
> >
> > Cheers,
> >
> > -Kei
> 
> As I understand it, this is the nature of the FuGO project: 
> http://fugo.sourceforge.net/
> They have an upcoming workshop
> http://www.ebi.ac.uk/microarray/General/Events/FuGO2006/index.html
> 
> -Alan
> 
> 
> 
> 




RE: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-05 Thread Miller, Michael D (Rosetta)

Hi Ben and Matt,

> An RDF version of the  
> UMLS knowledge sources would seem to be very useful - at 
> least for  
> bioinformatics research purposes ...  

What I was trying to discover was, from a purely RDF/ontology
standpoint, what is gained without any other knowledge of UMLS?  

This question comes more from my ignorance, probably, but when knowledge
gets buried in specialized tags the tendency is that without going to
the source to understand the specialization, what's left is not all that
useful.

cheers,
Michael  

> -Original Message-
> From: Benjamin Good [mailto:[EMAIL PROTECTED] 
> Sent: Monday, June 05, 2006 12:52 PM
> To: Miller, Michael D (Rosetta)
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; 
> public-semweb-lifesci@w3.org
> Subject: Re: BioRDF [Telcon]: slides for the UMLS presentation
> 
> 
> Hi,
> 
> I would tend to agree with Matt on this one.  An RDF version of the  
> UMLS knowledge sources would seem to be very useful - at 
> least for  
> bioinformatics research purposes - without the benefits of the  
> "correct" OWL ontology with which to describe the relationships  
> included.
> 
> Though there is no doubt that some ontologies are better than others  
> and that there is clear value in investing in building good ones  
> (e.g. a good OWL representation of the UMLS S.N.), the idea that, as  
> a community,  we will ever reach a consensus for the one 
> right way to  
> interpret the relationships in a resource as broad as the UMLS seems  
> unlikely.  It seems to me that if we really want to see the Semantic  
> Web take off in biology, then the first thing is to get as much RDF  
> data accessible online as we can.  Waiting for the perfect 
> ontologies  
> to emerge seems like a non-starter - we have to be able to handle  
> change as well as multiple perspectives in ontologies, so we 
> might as  
> well get started with something.
> 
> 2 cents
> -Ben
> 
> 
> 
> On Jun 5, 2006, at 10:21 AM, Miller, Michael D (Rosetta) wrote:
> 
> > Hi All,
> >
> >> But presumably the relations which characterize the structure
> >> of UMLS could be given their own URIs, no?
> >> Along with the concepts themselves.
> >>
> >> And then UMLS could then be expressed in RDF, using UMLS
> >> specific relations, rather than standard OWL relations.
> >
> > This, of course, works to a certain extent but brings up an issue  
> > that I
> > think will significantly impact end users and the adoption of
> > ontologies.
> >
> > One can get a rudimentary idea of an annotation source using  
> > standard RDF
> > tools through this sort of representation but that will miss the  
> > actual
> > semantic value that is embedded in domain-specific relations.
> >
> >> It would then at least be URI-ified and so connected into the
> >> RDF-universe, and different implementors could experiment
> >> with different approximative mappings to OWL relationships,
> >> for some or all of UMLS, according to their particular needs.
> >
> > The end user is now back to having to go to the source 
> itself, so why
> > bother setting up this approximate mapping?
> >
> > I've noticed several RDF representations creeping toward using these
> > domain-specific relations, which means needing a one-off parsing of
> > each one to get the true semantics.
> >
> > This seems to defeat the purpose.
> >
> > cheers,
> > Michael
> >
> > Michael Miller
> > Lead Software Developer
> > Rosetta Biosoftware Business Unit
> > www.rosettabio.com
> >
> >> -Original Message-
> >> From: [EMAIL PROTECTED]
> >> [mailto:[EMAIL PROTECTED] On Behalf Of
> >> [EMAIL PROTECTED]
> >> Sent: Monday, June 05, 2006 10:06 AM
> >> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> >> Cc: public-semweb-lifesci@w3.org
> >> Subject: RE: BioRDF [Telcon]: slides for the UMLS presentation
> >>
> >>
> >>
> >> But presumably the relations which characterize the structure
> >> of UMLS could be given their own URIs, no?
> >> Along with the concepts themselves.
> >>
> >> And then UMLS could then be expressed in RDF, using UMLS
> >> specific relations, rather than standard OWL relations.
> >>
> >> It would then at least be URI-ified and so connected into the
> >> RDF-universe, and different implementors could experiment
> >> with different approximative mappings to OWL relationships,
> >> for some or all of UMLS, according to their particular needs.

RE: BioRDF [Telcon]: slides for the UMLS presentation

2006-06-05 Thread Miller, Michael D (Rosetta)

Hi All,

> But presumably the relations which characterize the structure 
> of UMLS could be given their own URIs, no?
> Along with the concepts themselves.
> 
> And then UMLS could then be expressed in RDF, using UMLS 
> specific relations, rather than standard OWL relations.

This, of course, works to a certain extent but brings up an issue that I
think will significantly impact end users and the adoption of
ontologies.

One can get a rudimentary idea of an annotation source using standard RDF
tools through this sort of representation but that will miss the actual
semantic value that is embedded in domain-specific relations.

> It would then at least be URI-ified and so connected into the 
> RDF-universe, and different implementors could experiment 
> with different approximative mappings to OWL relationships, 
> for some or all of UMLS, according to their particular needs.

The end user is now back to having to go to the source itself, so why
bother setting up this approximate mapping?

I've noticed several RDF representations creeping toward using these
domain-specific relations, which means needing a one-off parsing of each
one to get the true semantics.

This seems to defeat the purpose.
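
To make the trade-off concrete, here is a minimal Python/rdflib sketch of
the kind of representation being discussed; the URIs, the CUI-style
identifiers and the child_of relation name are placeholders, not the real
UMLS namespaces:

from rdflib import Graph, Namespace
from rdflib.namespace import RDFS

# Placeholder namespaces -- illustrative only, not the actual UMLS URIs.
UMLS = Namespace("http://example.org/umls/concept/")
UMLSR = Namespace("http://example.org/umls/relation/")

g = Graph()

# A UMLS-style assertion using a UMLS-specific relation that has been
# given its own URI.
g.add((UMLS["C0000001"], UMLSR["child_of"], UMLS["C0000002"]))

# One possible approximate mapping a consumer might layer on top:
# treat child_of as a specialization of rdfs:subClassOf.  Whether that
# reading is justified is exactly the question raised above.
g.add((UMLSR["child_of"], RDFS.subPropertyOf, RDFS.subClassOf))

print(g.serialize(format="turtle"))

The data is URI-ified either way; the semantics only appear once someone
decides what child_of is allowed to mean.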

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> [EMAIL PROTECTED]
> Sent: Monday, June 05, 2006 10:06 AM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Cc: public-semweb-lifesci@w3.org
> Subject: RE: BioRDF [Telcon]: slides for the UMLS presentation
> 
> 
> 
> But presumably the relations which characterize the structure 
> of UMLS could be given their own URIs, no?
> Along with the concepts themselves.
> 
> And then UMLS could then be expressed in RDF, using UMLS 
> specific relations, rather than standard OWL relations.
> 
> It would then at least be URI-ified and so connected into the 
> RDF-universe, and different implementors could experiment 
> with different approximative mappings to OWL relationships, 
> for some or all of UMLS, according to their particular needs.
> 
> There might, for example, be some well defined bits of UMLS 
> where the relations can be reasonably mapped to is_a and 
> part_of relations.
> 
> Matt
> 
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] Behalf Of Olivier
> > Bodenreider
> > Sent: 05 June 2006 18:00
> > To: Benjamin Good
> > Cc: 'public-semweb-lifesci'
> > Subject: Re: BioRDF [Telcon]: slides for the UMLS presentation
> > 
> > 
> > 
> > Benjamin Good wrote:
> > > Are there any plans to release the UMLS or parts thereof as 
> > RDF / OWL ?
> > Not to my knowledge, Ben. And I certainly would be very 
> > cautious of any 
> > attempt to do it. The main reason is that many relations 
> used for 
> > creating hierarchies in biomedical vocabularies are not true 
> > hierarchical relations (isa, part_of), but simply reflect 
> the purpose 
> > for which these terminologies were created. For example, it 
> > makes sense 
> > in MeSH (i.e., for information retrieval) to have "accident 
> > prevention" 
> > listed as a child of "accidents". It would be wrong to assume 
> > that all 
> > child_of relations can be represented by subclassof 
> relations. And an 
> > accurate representation of MeSH in OWL would be difficult to obtain.
> > 
> > -- Olivier
> > 
> > 
> 
> 
> 
> 
> 




RE: Use of LSID's "in the wild"

2006-06-05 Thread Miller, Michael D (Rosetta)

Hi Mark,

Just to add a bit to Steve's reply...

> > If the only thing that comes from the LSID spec is a notion of an  
> > identifier
> > syntax that becomes widely adopted by bioinformatics data 
> providers, it
> > would be a huge success, ...

In the gene expression domain, best practice states that within MAGE-ML
documents an LSID-like syntax be used for identifiers.  Although it
is still ad hoc to resolve these, for well-known data sources such as
Affymetrix, Agilent, GenBank, ArrayExpress, etc., when parsing through a
MAGE document, one can at least decompose the NamingAuthority, Namespace
and ObjectID, and based on the NamingAuthority convert the information
into the 'native' URI for that NamingAuthority.

This has proved very useful to a number of consumers of MAGE-ML.
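
As a rough illustration of that decomposition step, a small Python
sketch; the identifier is the Affymetrix example quoted below, and the
per-authority URI patterns are placeholders, not the providers' real URL
schemes:

from urllib.parse import quote

# Placeholder patterns for turning (NamingAuthority, Namespace, ObjectID)
# into a 'native' URI -- invented for illustration.
NATIVE_URI = {
    "affymetrix.com": "http://example.org/affymetrix/{namespace}/{object_id}",
    "ebi.ac.uk": "http://example.org/arrayexpress/{namespace}/{object_id}",
}

def decompose(identifier):
    """Split an LSID-like identifier into NamingAuthority, Namespace, ObjectID."""
    authority, namespace, object_id = identifier.split(":", 2)
    return authority.lower(), namespace, object_id

def to_native_uri(identifier):
    authority, namespace, object_id = decompose(identifier)
    pattern = NATIVE_URI[authority]
    return pattern.format(namespace=quote(namespace), object_id=quote(object_id))

print(to_native_uri("Affymetrix.com:Transcript:HG-U133A.1007_s_at"))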

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Mark Wilkinson
> Sent: Friday, June 02, 2006 6:42 PM
> To: Steve Chervitz; public-semweb-lifesci@w3.org
> Subject: Re: Use of LSID's "in the wild"
> 
> 
> 
> Excellent!  Thanks Steve!
> 
> M
> 
> 
> On Fri, 02 Jun 2006 18:34:19 -0700, Steve Chervitz  
> <[EMAIL PROTECTED]> wrote:
> 
> >
> >
> > Hi Mark:
> >
> >> I'm writing a manuscript at the moment where I discuss 
> LSIDs, and I'm
> >> trying to get a sense of how many people are using them 
> "in the wild".
> >> I know that biopathways has set up a lot of "proxy" LSID 
> resolvers, but
> >> that's kinda cheating :-)  I'm wondering who is actually 
> using the LSID
> >> standard in a production environment.  I know that BioMOBY and
> >> myGrid/Tverna both use LSIDs, but who else?
> >
> > Here's a paper that describes the use of LSIDs by three 
> other early  
> > adopters
> > besides myGrid and BioMoby. It's co-authored by one of the 
> authors of the
> > LSID spec (Sean Martin):
> >
> > The impact of Life Science Identifiers on Informatics data
> > 
> > http://www.broad.mit.edu/cgi-bin/cancer/publications/pub_paper.cgi?mode=view&paper_id=126
> >
> > When you ask, "who is using the LSID standard?" you might want to
> > differentiate between organizations who have adopted LSID 
> syntax vs those
> > who are also hosting and maintaining LSID resolution services. The  
> > latter is
> > a much bigger cookie to swallow, but the former is still a 
> step forward.
> >
> > If the only thing that comes from the LSID spec is a notion of an  
> > identifier
> > syntax that becomes widely adopted by bioinformatics data 
> providers, it
> > would be a huge success, which was also noted in the 
> conclusion of an IBM
> > article you may have seen:
> > http://www-128.ibm.com/developerworks/webservices/library/os-lsid2/
> >
> > Another useful distinction to make when considering who is 
> using LSIDs is
> > whether they are data providers or application providers 
> (or both).  
> > Ideally,
> > you'd like to see the application providers using LSIDs 
> that are created  
> > and
> > managed by the data providers, rather than used only 
> internally within  
> > the
> > application.
> >
> > Here are some LSID users that would fall into the data 
> provider category:
> >
> > * The HapMap project. I don't know if they provide a 
> resolution service.
> > Search for 'LSID' on this page:
> > http://www.hapmap.org/downloads/index.html.en
> >
> > * Affymetrix uses an LSID-like syntax in its MAGE-ML formatted files
> > containing NetAffx annotations of the sequences used in 
> array designs.
> > Here's a snippet from one such file:
> >
> > [MAGE-ML element markup stripped in the archive; the snippet showed an
> > element with identifier="Affymetrix.com:Transcript:HG-U133A.1007_s_at"]
> >
> > It's not a true LSID since it doesn't begin with 
> 'urn:lsid'. Affy doesn't
> > host an LSID resolver or provide any sort of lookup service 
> using these  
> > ids.
> > I summarized some of the issues we ran into here (see the 
> section titled
> > 'LSIDs and Content Negotiation' in particular):
> >
> > 
> http://lists.w3.org/Archives/Public/public-swls-ws/2004Nov/att-/Affy_SemWeb-LifeSci_position_paper.pdf
>
> * Pseudogene.org. Don't know if they offer a resolution service:
>
> http://www.pseudogene.org/cgi-bin/set-results.cgi?tax_id=9606&all=View+All+Sets&criterion0=&operator0=&searchValue0=&sort=0&output=html
>
> The following would fall in the application provider category of LSID
> users.
> While these may not fully qualify as "in the wild", one could ask: How
> widely are these apps being used by other parties, either in an R&D or
> production setting?
>
> * BioPathways Consortium (as you mention above).
> http://lsid.biopathways.org/authorities.shtml
>
> * Intellidimensions's RDF Gateway and Eric Jain's UniProt RDF project:
> http://labs.intellidimension.com/uniprot/query.rsp?q=10
> http://expasy3.isb-sib.ch/~ejain//rdf/migration.html
>
> * KIM. See Sean Martin's presentation from the W3C meeting. It describes
> a system that ma

RE: [BiONT][BioRDF] Mussels

2006-05-04 Thread Miller, Michael D (Rosetta)

Hi Alan,

Interesting use case.

> are often matches at this level of description. The downside is that  
> you lose the superclass relations that you have in taxonomy, 

One would think that intersecting/merging the results of a query to
Wikipedia with the results of a query to the taxonomy might recover this?
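
A minimal sketch of that kind of merge, with entirely made-up lookup
tables keyed by the Wikipedia URL:

# Hypothetical results: Wikipedia-page matches for some samples, and a
# toy taxonomy lookup giving the lineage for the same organisms.
wikipedia_matches = {
    "sample-1": "http://en.wikipedia.org/wiki/Mussel",
    "sample-2": "http://en.wikipedia.org/wiki/House_mouse",
}
taxonomy_lineage = {
    "http://en.wikipedia.org/wiki/Mussel": ["Animalia", "Mollusca", "Bivalvia"],
    "http://en.wikipedia.org/wiki/House_mouse": ["Animalia", "Chordata", "Mammalia"],
}

# Merging the two result sets regains superclass-style queries,
# e.g. "which samples are mammals?"
mammals = [s for s, page in wikipedia_matches.items()
           if "Mammalia" in taxonomy_lineage.get(page, [])]
print(mammals)  # ['sample-2']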

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Alan Ruttenberg
> Sent: Wednesday, May 03, 2006 10:42 PM
> To: public-semweb-lifesci@w3.org
> Subject: Re: [BiONT][BioRDF] Mussels
> 
> 
> 
> Another thought is to use  wikipedia URL's as the identifier - there  
> are often matches at this level of description. The downside is that  
> you lose the superclass relations that you have in taxonomy, 
> e.g. the  
> ability to query for mammal, and get back all the primates, mice,  
> rats, etc.
> 
> e.g. http://en.wikipedia.org/wiki/Mussel
> 
> -Alan
> 
> On May 4, 2006, at 1:29 AM, Alan Ruttenberg wrote:
> >
> > I am inclined to create a class which is the union of all these  
> > classes and then annotate the antibody with that class.
> 
> 
> 
> 




RE: 44-52 That';s the Number

2006-04-11 Thread Miller, Michael D (Rosetta)

Hi All,

> 2. Squeezing legacy identifiers into LSIDs can be tricky; some life 
> sciences databases use colons in their identifiers (GO and MGD), or 
> separate version numbers with dots (EMBL).

One can use the URI escape sequence syntax for these.
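
For instance, a small Python sketch of percent-encoding a colon-containing
GO identifier before embedding it in an LSID; the authority and namespace
parts are placeholders:

from urllib.parse import quote

go_id = "GO:0008150"  # a GO identifier containing a colon

# Percent-encode the reserved characters so the identifier can be
# embedded as the object part of an LSID-like URN.
escaped = quote(go_id, safe="")
print("urn:lsid:example.org:go:" + escaped)
# -> urn:lsid:example.org:go:GO%3A0008150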

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Eric Jain
> Sent: Tuesday, April 11, 2006 5:37 AM
> To: Phillip Lord
> Cc: public-semweb-lifesci@w3.org
> Subject: Re: 44-52 That';s the Number
> 
> 
> 
> Phillip Lord wrote:
> >   TH> Background: The "info" URI scheme is a means of grandfathering
> >   TH> legacy namespaces onto the Web in their own right (e.g. PubMed
> >   TH> identifiers, ADS bibcodes, etc., etc.). Many Web applications
> >   TH> expect identifiers to be packaged as URIs (Uniform Resource
> >   TH> Identifiers) and "info" fulfils that need.
> > 
> > So do LSID's. Would you like to comment and 
> advantages/disadvantages? 
> 
> Three reasons I can think of:
> 
> 1. People may be reluctant to use something called "life sciences 
> identifiers" in a non-life-sciences context (even if they 
> could see that 
> there isn't anything life-sciences-specific about these identifiers).
> 
> 2. Squeezing legacy identifiers into LSIDs can be tricky; some life 
> sciences databases use colons in their identifiers (GO and MGD), or 
> separate version numbers with dots (EMBL).
> 
> 3. LSIDs are a bit verbose, and once you provide an LSID you may be 
> expected to implement the entire WS resolution stack, which 
> some people may 
> not consider worth the trouble.
> 
> 
> 
> 




RE: [BioRDF] Scalability

2006-04-04 Thread Miller, Michael D (Rosetta)

Hi Roger,

I believe I can provide some comfort for the scalability issue with our
experience with MAGE-ML.

One thing that greatly alleviates the problem is to use compressing
writers/readers (Java provides nice ones); for regularly formatted XML
this can compress to 2-10% of the original size.
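
The Java classes alluded to are presumably the java.util.zip stream
wrappers; an equivalent sketch in Python, assuming a hypothetical
exported MAGE-ML file named experiment.xml:

import gzip

# Write the (hypothetical) MAGE-ML export through a compressing stream...
with open("experiment.xml", "rb") as src, \
     gzip.open("experiment.xml.gz", "wb") as dst:
    dst.writelines(src)

# ...and read it back transparently, as if it were plain XML.
with gzip.open("experiment.xml.gz", "rt", encoding="utf-8") as xml_in:
    print(xml_in.readline())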

> 3 - Limit the amount of information that is actually put into RDF to
> some sort of descriptive metadata and keep pointers to the real data,
> which is in some other format.

MAGE-ML has the ability to reference the external data from the
microarray feature extractor software and this has worked well.  Also,
the information can be broken into several files, with references in
one file to the actual definition in another file.

But, surprisingly, even uncompressed large XML files have not been all
that much of an issue.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Cutler, Roger (RogerCutler)
> Sent: Tuesday, April 04, 2006 9:35 AM
> To: public-semweb-lifesci@w3.org
> Subject: [BioRDF] Scalability
> 
> 
> 
> Somewhere down near the bottom of the lengthy thread that 
> started with a
> query about ontology editors, someone casually mentioned that 
> 53 Mby of
> data that was "imported" -- from which I infer it was not binary,
> compressed data but in some sort of text format -- turned 
> into over 800
> Mby of RDF.  Frankly, a factor of 15 in size, possibly from a format
> that is fairly large to start out with, worries me.  There have since
> been some comments that sound like people think that they are going to
> deal with this by generating RDF only on-the-fly, as needed.  It seems
> to me, given the networked nature of RDF, that this is likely to have
> its own problems.  None of the solutions of which I am aware that
> actually are in operation work this way, but I will freely 
> admit that my
> experience level here is pretty low.
> 
> It seems to me that there are at least three ways that one 
> might try to
> cope with this issue:
> 
> 1 - Generate the RDF on-the-fly (as I said, I'm personally 
> dubious about
> this one).
> 
> 2 - Make the RDF smaller somehow (maybe by making the URI's shorter, a
> la tinyurl???)
> 
> 3 - Limit the amount of information that is actually put into RDF to
> some sort of descriptive metadata and keep pointers to the real data,
> which is in some other format.
> 
> I think that the third approach is what I have seen done, but 
> I get the
> impression that people may not be thinking in this way in this group.
> 
> I've prefaced this [BioRDF] because there has already been some
> discussion of scalability in that context and I believe that 
> this issue
> has recently been upgraded in the deliverables of this subgroup.
> 
> Incidentally, what happened to the BioRDF telcons on Monday?  I was on
> vacation for a while and when I came back it didn't seem to be there.
> 
> 
> 
> 




RE: Apply Ontology Automatically (was: Ontology editor + why RDF?)

2006-04-03 Thread Miller, Michael D (Rosetta)
Hi All,

I've also (as I think I've said) found the OMG proposed specification
"Ontology Definition Metamodel" very useful for thinking of practical
ways of dealing with ontologies, especially for referencing from the
non-ontology UML FuGE (Functional Genomics) model.  I've included an
image of the class diagram.

All the classes in FuGE derive from Describable so have a 0..n
association to OntologyTerm.  The desired design is to associate Objects
in FuGE with Individuals, not directly with Ontology Classes.

So, for a rather contrived example, a garage object might be associated
with two cars described in the following example:

Imagine a vehicle ontology with the class Car; a Car has has_roof, which
points to a class Roof, has_engine, which points to a class Engine, and
has_hemi, which is true or false but has no default value and is
optional.

In this example the ontology itself defines no Individuals.

So an application might define an Individual of Mustang whose has_roof
slot points to a Roof Individual of Canvas/Retractable, a has_engine
slot that points to an Engine Individual of 425cc and has_hemi which
takes the value true.

It also might define an Individual of Camry whose has_roof slot points
to a Roof Individual of Metal, a has_engine slot that points to an
Engine Individual of 245cc and does not define a has_hemi value.

Based on the XML that would be generated for these classes, it becomes:

[The XML example did not survive the archive: the element markup was
stripped, leaving only whitespace.]

(FuGE has an Identifiable class from which the identifier attribute is
inherited; the *_ref attributes are typically URI/LSID-like constructions
which have the added semantics that they are references to the actual
object, which may or may not be in the current document but should be
resolvable somewhere in the real world.)
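
Since the XML itself did not survive, here is a rough equivalent of the
two individuals in RDF using rdflib; the namespace, class and slot URIs
are invented for illustration and are not the actual FuGE or
vehicle-ontology names:

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# Invented namespace for the contrived vehicle ontology.
VEH = Namespace("http://example.org/vehicle#")

g = Graph()
g.bind("veh", VEH)

# Mustang: has_roof -> Canvas/Retractable, has_engine -> 425cc, has_hemi -> true
g.add((VEH.Mustang, RDF.type, VEH.Car))
g.add((VEH.CanvasRetractable, RDF.type, VEH.Roof))
g.add((VEH.Engine425cc, RDF.type, VEH.Engine))
g.add((VEH.Mustang, VEH.has_roof, VEH.CanvasRetractable))
g.add((VEH.Mustang, VEH.has_engine, VEH.Engine425cc))
g.add((VEH.Mustang, VEH.has_hemi, Literal(True)))

# Camry: has_roof -> Metal, has_engine -> 245cc, has_hemi left unset
g.add((VEH.Camry, RDF.type, VEH.Car))
g.add((VEH.MetalRoof, RDF.type, VEH.Roof))
g.add((VEH.Engine245cc, RDF.type, VEH.Engine))
g.add((VEH.Camry, VEH.has_roof, VEH.MetalRoof))
g.add((VEH.Camry, VEH.has_engine, VEH.Engine245cc))

print(g.serialize(format="turtle"))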

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> [EMAIL PROTECTED]
> Sent: Monday, April 03, 2006 8:52 AM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Cc: public-semweb-lifesci@w3.org
> Subject: RE: Apply Ontology Automatically (was: Ontology 
> editor + why RDF?)
> 
> 
> 
> Here's my favorite example of useful automated ontology 
> application, achieved by combining two readily available technologies:
> 
> http://www.hackdiary.com/archives/70.html
> 
> Matt
> 
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] Behalf Of Internet
> > Business Logic
> > Sent: 03 April 2006 16:44
> > To: Phillip Lord
> > Cc: public-semweb-lifesci@w3.org
> > Subject: Re: Apply Ontology Automatically (was: Ontology 
> editor + why
> > RDF?)
> > 
> > 
> > 
> > Phillip --
> > 
> > You wrote (below) "ability ... to be able to apply the 
> > ontology automatically in some circumstances"
> > 
> > This could be the major selling point.  Otherwise, the value 
> > of the ontology depends on how well programmers read, 
> > understand, and use it.  And, if they did that well, was it 
> > their value-add, not that of the ontology?
> > 
> > Do you have examples in which an ontology has been applied 
> > automatically to do a significant real world task?
> > 
> > (Questions intended constructively).
> > 
> >  Thanks-- Adrian Walker
> > 
> > -- 
> > 
> > Internet Business Logic (R)
> > Executable open vocabulary English
> > Online at www.reengineeringllc.com
> > Shared use is free
> > 
> > Reengineering,  PO Box 1412,  Bristol,  CT 06011-1412,  USA
> > 
> > Phone 860 583 9677 Mobile 860 830 2085 Fax 860 314 1029
> > 
> > 
> > 
> > 
> > Phillip Lord wrote:
> > 
> > >>"Anita" == deWaard, Anita (ELS) 
> <[EMAIL PROTECTED]> writes:
> > >>
> > >>
> > > 
> > >  Anita> I am reminded of a saying on a Dutch proverb calendar: "If
> > >  Anita> love is the answer, could you please repeat the 
> > question?" If
> > >  Anita> semantics are the answer - what is the problem 
> that is being
> > >  Anita> solved, in a way no other technology lets you? b
> > >
> > >To be honest, I think that this is a recipe of despair; I 
> don't think
> > >that there is any one thing that SW enables you do to that 
> could not
> > >do in another way. It's a question of whether you can do 
> things more
> > >conveniently, or with more commonality than other wise; 
> > after all, XML
> > >is just an extensible syntax and, indeed, could do exactly nothing
> > >that SGML could not do (when it came out -- XML standards 
> exceed SGML
> > >ones now). XML has still been successful. 
> > >
> > >It's more a question of whether, RDF or OWL provides a 
> combination of
> > >things that we would not get otherwise. With OWL (DL and lite), I
> > >rather like the ability to check my model with a reasoner, 
> and to be
> > >able to apply the ontology automatically in some 
> circums

RE: Ontology editor + why RDF?

2006-03-31 Thread Miller, Michael D (Rosetta)

Hi Jim and Vipul,

"If all three of us published to the Web, and used common URIs (or a
third party expressed equivalences) then the system as a whole would
have the information..."

Just to echo what Jim is saying, the Life Science Identifier (LSID), a
type of URI, basically came to be based on the experience and need in
the MAGE specification for such a common identifier.  This has helped
tremendously already in linking information between gene expression
experiments, particularly in referencing the MGED Ontology and GO.
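
A minimal sketch of the point being echoed here, with two toy datasets
(the URIs are invented): because both graphs use the same URI for the
same thing, merging them answers a question neither could answer alone.

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")

# Dataset 1: an experiment annotated with a shared term URI.
g1 = Graph()
g1.add((EX.experiment42, EX.annotatedWith, EX.GO_0008150))

# Dataset 2: published independently, it labels that same term URI.
g2 = Graph()
g2.add((EX.GO_0008150, EX.label, Literal("biological_process")))

# Merging is just graph union; the common URI is what links the two.
merged = g1 + g2

q = """
PREFIX ex: <http://example.org/>
SELECT ?label WHERE {
  ex:experiment42 ex:annotatedWith ?term .
  ?term ex:label ?label .
}
"""
for row in merged.query(q):
    print(row.label)  # -> biological_process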
 
cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

  
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Jim Hendler
  Sent: Friday, March 31, 2006 9:46 AM
  To: Kashyap, Vipul; Danny Ayers
  Cc: public-semweb-lifesci@w3.org
  Subject: RE: Ontology editor + why RDF?
   Vipul - not sure this is best thread for this whole discussion, but
   here's a quick answer and if you want longer, I can point you to
   various things starting from the Scientific American article [1] and
   also an article on integrating applications on the Web that Tim, Eric
   and I did, originally published in Japanese but available in English
   at [2].  More detailed technical stuff is also available if you want,
   but you should be familiar with that literature...
    The point you're missing, which I've been making for years, is that
   on the Web "a little semantics goes a long way" -- here's a simple
   example.  If you have a Database that says you live in Boston, I have
   one that says I live near BWI airport, and Danny has one that has
   tables of distances between airports and cities, but these all use
   their own terminology and live in their own boxes, then none of the
   three of us (nor any third party) would know how far apart you and I
   live.  If all three of us published to the Web, and used common URIs
   (or a third party expressed equivalences) then the system as a whole
   would have the information - so there would be Semantics available on
   the Web that would not be available before -- the "sameas" type
   information (expressed through same URI names) is very powerful, even
   before you start worrying about the next levels of semantics.
    I'm not arguing that expressive semantics is bad, it is very valuable
   in some applications, but the traditional AI community often ignores
   the importance of breadth, despite Google rubbing our noses in it
   every day.  Even more important, once the data is on the Web in RDF,
   it can be INCREMENTALLY extended, by the original provider or by third
   parties, in ways that do add the expressivity - something not doable
   when the datasources are not Web accessible.
    Here's another way to think about it - on the Web my documents can
   point to your documents.  However, my databases (or their schemas)
   cannot point at elements in your databases, my thesaurus cannot point
   to words in your thesaurus, etc.  The Web showed us that the network
   effect is unbelievably powerful, and we need to be able to use that
   power for data, terminologies, ontologies and the rest.
    -Jim H.
   p.s. You might also want to check out my article "Knowledge is power:
   the view from the Semantic Web" which appeared in AI Magazine in the
   January 06 issue.  It's not available on line free to non-AAAI
   members, but I can send you a preprint version if you'd like - it's
   aimed at explaining the value of the linking stuff to the applied AI
   audience.

   [1] http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
   [2] http://www.w3.org/2002/07/swint
  
  
  At 8:08 -0500 3/31/06, Kashyap, Vipul wrote:
   >> I saw a quote not long ago, not sure of the source (recognise this
   >> Jim?), approximately: "what's new about the Semantic Web isn't the
   >> semantics but the web".
   >
   > [VK] This is a great quote and expresses clearly that the value
   > proposition in representing and linking vocabularies using URIs
   > stems from the Web more than "semantics"
   >
   >> I take VK's point that this in itself isn't going to convince many
   >> IT folks. I think the big persuader there is data integration, even
   >> on a sub-enterprise kind of scale.
   >
   > [VK] Agreed, one of the clearer value propositions is data integration.
   >
   >> Being able to use ontologies to infer new information is a massive
   >> plus (I imagine especially in the lifesciences). Bigger still are
   >> the (anticipated) benefits of the Semantic Web when the network
   >> effect kicks in. But the ability to use RDF to simply merge data
   >> from multiple sources consistently (and query across it), without
   >> needing complete up-front schema design is a very immediate,
   >> tangible gain. The work done around SKOS (and specific tasks like
   >> expressing WordNet in RDF) does suggest RDF/OWL is a particularly
   >> good technology choi

RE: [BIONT] Parkinson's Disease Use Cases

2006-03-21 Thread Miller, Michael D (Rosetta)

Hi Vipul,

Here's a paper at EBI that illustrates some of what they are thinking
about in terms of the semantic web:

http://www.ebi.ac.uk/mygrid/mygrid.pdf

The myGrid home page (which I'm sure people are familiar with):

http://www.mygrid.org.uk/

When I was at an MGED MAGE programming jamboree a while back, one of the
people there wrote, in a few hours, a simple application that he hooked
up to myGrid that took an MGED Ontology class and returned the
individuals.  He wrote a client and voila.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


  
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Kashyap, Vipul
  Sent: Tuesday, March 21, 2006 5:53 AM
  To: public-semweb-lifesci
  Subject: FW: [BIONT] Parkinson's Disease Use Cases

  We will be discussing Don Doherty's use case in the BIONT
  Teleconference today.
  (attached with this e-mail)

  SW_HCLS (Bio-Ont WG)
  SW Life Sciences IG
  Tuesdays 11:00am-12:00pm/16:00-17:00 UTC
  Zakim Bridge +1.617.761.6200, conference 24668 ("BIONT")

  Agenda:

  Donald Doherty and Xiaoshu Wang will be presenting focused use
  case on the information needs of a biomedical researcher and clinical
  practitioner in the neuroscience domain.

  ---Vipul

  ===
  Vipul Kashyap, Ph.D.
  Senior Medical Informatician
  Clinical Informatics R&D, Partners HealthCare System
  Phone: (781)416-9254
  Cell: (617)943-7120
  http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik

  To keep up you need the right answers; to get ahead you need the right
  questions
  ---John Browning and Spencer Reiss, Wired 6.04.95


  From: Donald Doherty [mailto:[EMAIL PROTECTED]
  Sent: Monday, March 20, 2006 11:45 PM
  To: Kashyap, Vipul
  Subject: [BIONT] Parkinson's Disease Use Cases

  Here are the first baby steps…

  Don

  -
  Donald Doherty, Ph.D.
  Brainstage Research, Inc.
  412-478-4552
   


RE: [BioRDF] UML/RDF [Was: Meeting Notes Feb 27, 2006]

2006-03-02 Thread Miller, Michael D (Rosetta)

Hi All,

For me, my interest was in the transformation of an RDF XML document to
the equivalent (to some measure of equivalent) UML XML document and back
to describe/define an ontology.

The practical problem we ran into with the RDF/XML notation for our use
is that many of the XML element names are dependent on the class and/or
property names in the ontology, so every ontology would need to have its
own XML Schema to validate the document.
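
A small sketch of the issue, using rdflib just to produce the
serialization; the vocabulary is invented.  In the RDF/XML output, the
invented class and property names (Car, has_engine) appear directly as
XML element names, which is why a fixed, ontology-independent XML Schema
cannot validate such documents.

from rdflib import Graph, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/vocab#")

g = Graph()
g.bind("ex", EX)
g.add((EX.item1, RDF.type, EX.Car))
g.add((EX.item1, EX.has_engine, EX.engine1))

# The 'pretty' RDF/XML serialization typically yields elements such as
# <ex:Car rdf:about="..."> containing <ex:has_engine rdf:resource="..."/>,
# i.e. element names drawn from the ontology itself.
print(g.serialize(format="pretty-xml"))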

Our use case for MAGE-ML (generated from the MAGE-OM UML model) is to
simply allow creating and/or referencing individuals based on arbitrary
ontology classes (http://mged.sourceforge.net/).  Not arbitrary to the
user but arbitrary as far as applications based on MAGE.  So we needed a
formulation in our model that was not dependent on RDF specific XML
schemas.

Since we have no interest in incorporating entire ontologies, only
referencing, our needs are a bit simpler than probably a lot of other
folks.  In fact, it was only the Individual diagram from an earlier
draft, which has been deprecated but still described in "17.2.4
InstanceSpecification", that is the basis for what will be used in the
FuGE-OM UML model follow-up to MAGE-OM (http://fuge.sourceforge.net/).

The specification is also useful for thinking how arbitrary ontologies
can be referenced from relational or object databases without needing
foreknowledge of particular ontologies.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Cutler, Roger (RogerCutler)
> Sent: Thursday, March 02, 2006 7:34 AM
> To: [EMAIL PROTECTED]; public-semweb-lifesci
> Subject: [BioRDF] UML/RDF [Was: Meeting Notes Feb 27, 2006]
> 
> 
> 
> This lengthy and dense, but generally very well written, document goes
> into extreme detail as to the similarities and differences between UML
> and "RDF" (which I take to be a blanket term encompassing 
> other Semantic
> Web specs).  I'm not by any means an expert, but it looked to me like
> there are very significant overlaps in both detail and general
> philosophy of approach and implementation.  I got the impression, but
> this is by no means spelled out, that this document is an artifact of
> the process of the UML folk getting "on board" the SW train.  It
> certainly represents a very impressive amount of effort in 
> preparation. 
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Xiaoshu Wang
> Sent: Wednesday, March 01, 2006 1:24 PM
> To: 'public-semweb-lifesci'
> Subject: RE: [BioRDF] Meeting Notes Feb 27, 2006
> 
> 
> - Michael,
> 
> > http://www.omg.org/docs/ad/05-09-08.pdf
> > 
> > Because it is still being evaluated, this might only be 
> available from
> 
> > the OMG website to OMG members.
> 
> Thanks.  I was able to get the document.  Though haven't read through
> yet, it is too long ( closed to 300 pages ) but I spotted one heading
> from the
> document:
> 
> 8.2 Why Not Simply Use or Extend the UML 2.0 Metamodel?
> 
> To me, it implies that 
> 
> RDF != UML 4.0? 
> 
> Xiaoshu
> 
> 
> 
> 
> 
> 
> 




RE: [BioRDF] Meeting Notes Feb 27, 2006

2006-03-01 Thread Miller, Michael D (Rosetta)

Hi Xiaoshu,

http://www.omg.org/docs/ad/05-09-08.pdf

Because it is still being evaluated, this might only be available from
the OMG website to OMG members.

It is a very thorough document, well thought out.  They worked with many
of the W3C OWL and RDF experts.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Xiaoshu Wang
> Sent: Wednesday, March 01, 2006 9:17 AM
> To: 'public-semweb-lifesci'
> Subject: RE: [BioRDF] Meeting Notes Feb 27, 2006
> 
> 
> 
> > > So RDF = UML 4.0? :-)
> > 
> > At the OMG there is a proposed Ontology Definition Metamodel 
> > that IBM and SandPiper are submitting that takes an excellent 
> > look at this issue.
> 
> Do you have the URI for that.  I am actually kind of 
> suspicious of, but of
> course anxious to see, the validity of the approach.  
> Although OMG is trying
> to avoid being tied with Objected-Oriented.  But UML is based 
> on OO, which
> has a closed-world semantics.  I wonder how it can handle the 
> open-world
> semantics without breaking existing OO paradigm.
> 
> Xiaoshu 
> 
> 
> 
> 




RE: [BioRDF] Meeting Notes Feb 27, 2006

2006-03-01 Thread Miller, Michael D (Rosetta)

Hi All,

> So RDF = UML 4.0? :-)

At the OMG there is a proposed Ontology Definition Metamodel that IBM
and SandPiper are submitting that takes an excellent look at this issue.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Eric Jain
> Sent: Wednesday, March 01, 2006 6:38 AM
> To: public-semweb-lifesci
> Subject: Re: [BioRDF] Meeting Notes Feb 27, 2006
> 
> 
> 
> Tom Stambaugh wrote:
> > It seems to me that RDF helps us describe and model the 
> structure of our 
> > data. In my view, we'll then *use* this RDF-derived 
> description and model to 
> > build relational databases that hold said data. In this 
> worldview, the 
> > existence of the RDF description then helps us keep the 
> dynamic models --  
> > written in Java, Python or whatever -- in synch with the underlying 
> > relational descriptions, kept in relational DB's like MySql 
> and Oracle.
> 
> So RDF = UML 4.0? :-)
> 
> But beware the "impedance mismatch"...
> 
> I have to admit that most of the code I work with is still 
> "static". This 
> is inefficient, especially if your data model has some complexity and 
> changes frequently, but generic data models can be rather 
> difficult to work 
> with (even if there is generated code to ease the pain). 
> Nevertheless I 
> hope to gradually replace code that is tied to the data model 
> with generic 
> code, especially for lower-level infrastructure such as 
> database storage, 
> querying, serialization etc.
> 
> 
> 
> 




RE: Unstructured vs. Structured (was: HL7 and patient records in RDF/OWL?)

2006-02-15 Thread Miller, Michael D (Rosetta)

Hi All,

my quick 2c

> I'd argue that most information resources are indeed 
> semi-structured. The
> human brain is only able to meta-categorize resources based on its
> structured aspects (markup and structural metadata), its informational
> content (its aboutness), and context (environmental metadata).

although many people don't admit it or don't realize it, the human brain
is also able to hold unstructured data and work at it until it finds
structure or it's replaced by the next load of sensory input.

there's a place in use cases for the semantic web for this type of
activity.

cheers,
Michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Christopher Cavnor
> Sent: Tuesday, February 14, 2006 1:54 PM
> To: public-semweb-lifesci@w3.org
> Subject: Re: Unstructured vs. Structured (was: HL7 and 
> patient records in RDF/OWL?)
> 
> 
> 
> I'd argue that most information resources are indeed 
> semi-structured. The
> human brain is only able to meta-categorize resources based on its
> structured aspects (markup and structural metadata), its informational
> content (its aboutness), and context (environmental metadata).
> 
> "Structured" data is only structured once we have a common 
> understanding of
> its meaning. In this regard, data is never "raw" (except for randomly
> generated data) - as even structured database tables have 
> metadata to add
> meaning. So the term "semi-structured" is always adequate as 
> far as I am
> concerned. You'd have to prove that there is any other type 
> of data to me ;)
> 
> 
> -- 
> Christopher Cavnor
>  
> 
> On 2/14/06 10:54 AM, "Cutler, Roger (RogerCutler)" 
> <[EMAIL PROTECTED]>
> wrote:
> 
> > 
> > OK, then is there a preferred term for what we call "semi-structured
> > data"?  That is, information that is structured but where 
> the structure
> > is not easily determined and perhaps has not been 
> formalized at all, but
> > for which a formalized structure could be defined?  For 
> example, tables
> > in a spreadsheet?  We really care about this kind of thing, 
> but I don't
> > want to confuse the issue by using terms that most people understand
> > differently.
> > 
> > Incidentally, from my personal experience the usage of the term
> > semi-structured, that is, binary blobs in structured 
> databases, is not
> > very common.  Frankly, this is the first I have heard the 
> term used in
> > that sense, but maybe I just don't run in the right circles.
> > 
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of 
> Jim Hendler
> > Sent: Monday, February 13, 2006 3:43 PM
> > To: Pat Hayes; Gao, Yong
> > Cc: public-semweb-lifesci@w3.org
> > Subject: Re: Unstructured vs. Structured (was: HL7 and 
> patient records
> > in RDF/OWL?)
> > 
> > 
> > At 14:46 -0600 2/13/06, Pat Hayes wrote:
> >>> 
> >>> The point I'm trying to make is this: The concept of 
> "structuredness"
> >>> is relative and context-sensitive.
> >> 
> >> Hear, hear. Well said.
> >> 
> >> Pat Hayes
> >> 
> > 
> > 
> > FWIW, Structured, unstructured and semi-structured, 
> although non-precise
> > concepts in common language and (esp) philosophy, have 
> well-defined and
> > precise meanings in database jargon" -- most database books 
> have decent
> > definitions that are consistent with:
> >   unstructured - NL text
> >   semi-structured - unstructured fields within a structured 
> DB context
> >   structured - relational model (or similar) (those papers with
> > technical definitions tend to get ugly and recourse to relational
> > calculus, so these overly simplified definitions should 
> suffice for now)
> > that said, in the spirit of this particular thread, I think 
> we should be
> > careful and, if we mean to use it in a DB context, make it 
> clear in any
> > document that uses the term (i.e. "structured database" v.
> > "structured data" which are very different in some contexts)
> > -JH
> 
> 
> 
> 
> 




RE: Ontology Working Group Proposal Draft

2006-02-06 Thread Miller, Michael D (Rosetta)

Hi Vipul and Adrian,

> However, in  order for 
> these ontological
> artifacts to be useful to practitioners, we have to adopt the
> "model of use" perspective as well.

From my (admittedly limited) point of view, for microarray experiments,
our use case for ontologies is that we wish to attach annotation from
standardized ontologies so that what we mine are common microarray
experiments, i.e. an investigator may be working on liver cancer and is
interested in a particular cellular pathway with a set of gene
biomarkers.  This investigator would like to go out via the semantic web
and find other microarray experiments, possibly proteomic experiments
and also related literature.

cheers,
Michael

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Kashyap, Vipul
> Sent: Sunday, February 05, 2006 9:59 AM
> To: Adrian Walker
> Cc: public-semweb-lifesci; John Madden; Vinay K. Chaudhri; 
> [EMAIL PROTECTED]; [EMAIL PROTECTED]; Robert Stevens; 
> Amit Sheth @ LSDIS; Alfredo Morales; Ullman-Cullere, Mollie; 
> [EMAIL PROTECTED]; Mark Musen
> Subject: RE: Ontology Working Group Proposal Draft
> 
> 
> 
> 
> > "The current definition of an ontology as enunciated by the 
> W3C needs to
> > be
> > examined and extended if required. Ontology as a model of 
> use needs to be
> > emphasized in contrast to ontology as a model of meaning."
> > 
> > In admittedly limited reading of the ontology literature, I 
> have formed
> > the
> > impression that "ontology as a model of meaning" is what 
> OWL is about,
> > while "ontology as a model of use" often seems to require 
> tools that are
> > built on top of OWL.
> 
> [VK] We clearly recognize the fact that the current 
> discussions around ontology
> have adopted the "model of meaning". However, in  order for 
> these ontological
> artifacts to be useful to practitioners, we have to adopt the
> "model of use" perspective as well.
> 
> In fact that model of use perspective is what has lead to 
> development of
> vocabularies, database schemas, terminologies, etc. We have 
> to "assimilate and
> extend"
> 
> It may be noted that the two perspectives are likely to 
> overlap to a large
> extent. Now whether we need to extend the current RDF, OWL, 
> SWRL standards to
> accommodate this perspective or come up with tools, 
> techniques and best
> practices based on the current standards is for the group as 
> a whole to explore.
> 
> > To try to ground this a bit in something that a healthcare 
> practitioner or
> > a clinical researcher
> > might in future find useful, there are some pointers below to some
> > examples
> > that one can run using a browser.
> 
> [VK] Thanks for the examples. Will take a look and get back to you.
> 
> Cheers,
> 
> ---Vipul
> 
> 
> 
>