Hi Michael,
Thanks for sharing your experience with your GWAS project. The thing
with marker identification and prediction is that it involves a series
of possibly iterative analyses. The resulting gene lists are really the
products of these analyses. Therefore, it is important to capture the
context of these gene lists. See you and others at the F2F.
Cheers,
-Kei
mdmiller wrote:
hi kei,
yes, the proposal for specifying the dataset seemed to allow rich
annotation, one reason i liked it.
the piece i find missing is this. not being a biologist or
statistician i can't offer much help (my strength is software).
once a large group of individuals have been part of a GWAS (perhaps as
part of a clinical trial) and a marker based on a gene expression
signature over n genes has been determined, presumably that would be
published as the set of n genes and some averaged measure of up
regulation or down regulation per gene based on averaging that group
of individuals and an outcome such as bad prognosis or good prognosis
associated with the marker. now if a new individual has a gene
expression profile, how will the expression of this individual's genes
be compared against the marker to determine which group the individual
falls into?
(there are other scenarios, such as multiple markers associated with
different outcomes but the above seems the simplest case.)
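One possible answer to the question above, offered only as a sketch: treat the published marker as a vector of signed per-gene values (for example, mean log fold change) and correlate the new individual's expression values with it over the shared genes. The function names, the `threshold` default, and the gene identifiers below are hypothetical, and a real study would define its own normalization and cutoff.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def classify(profile, signature, threshold=0.0,
             marker_outcome="bad prognosis", other_outcome="good prognosis"):
    """Assign an individual to a group by correlation with a signature.

    profile:   gene -> expression value for the new individual
               (assumed to be on the same scale as the signature).
    signature: gene -> averaged up/down regulation from the study group.
    threshold: hypothetical cutoff; a study would calibrate this.
    """
    shared = sorted(g for g in signature if g in profile)
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    r = pearson([profile[g] for g in shared], [signature[g] for g in shared])
    label = marker_outcome if r > threshold else other_outcome
    return label, r


# Illustrative values only; GENE_A etc. are placeholders, not real results.
signature = {"GENE_A": 1.5, "GENE_B": -2.0, "GENE_C": 0.8}
individual = {"GENE_A": 1.2, "GENE_B": -1.7, "GENE_C": 0.5, "GENE_D": 3.0}
label, score = classify(individual, signature)
```

Genes measured in the individual but absent from the signature (GENE_D here) are ignored. For this comparison to be repeatable, the published marker would need to carry the per-gene direction and magnitude, not only the gene names.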
look forward to meeting you and the others at the F2F next week.
cheers,
michael
Michael Miller
mdmille...@comcast.net
----- Original Message ----- From: "Kei Cheung" <kei.che...@yale.edu>
To: "mdmiller" <mdmille...@comcast.net>
Cc: "Helen Parkinson" <parkin...@ebi.ac.uk>; "HCLS"
<public-semweb-lifesci@w3.org>; "Tony Burdett" <tburd...@ebi.ac.uk>
Sent: Thursday, October 29, 2009 1:43 PM
Subject: Re: Action Items from call today
Hi Michael et al,
Thanks for pointing to this generic approach of RDF representation of
datasets. A list of differentially expressed genes may be associated
with values such as P-values, fold-change, gene symbols, etc. Also,
it's important to capture metadata/provenance associated with the
gene list. This may include the type of statistical test (e.g.,
ANOVA) and the array platform employed (e.g., Affymetrix U133A). This
may be an interesting discussion topic at the F2F meeting.
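To make the point about values and provenance concrete, here is a minimal sketch of a gene list that carries its statistical test and array platform along with the per-gene values. The class and field names are hypothetical, and the numbers are made up for illustration; they are not results from any dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class GeneResult:
    symbol: str          # gene symbol
    p_value: float       # P-value from the statistical test
    fold_change: float   # fold-change between the compared groups


@dataclass
class GeneList:
    # Provenance/metadata that travels with the list.
    statistical_test: str   # e.g. "ANOVA"
    array_platform: str     # e.g. "Affymetrix U133A"
    comparison: str         # free-text description of the contrast
    genes: List[GeneResult] = field(default_factory=list)

    def significant(self, alpha: float = 0.05) -> List[GeneResult]:
        """Return the genes whose P-value is below alpha."""
        return [g for g in self.genes if g.p_value < alpha]


# Illustrative values only.
example = GeneList(
    statistical_test="ANOVA",
    array_platform="Affymetrix U133A",
    comparison="hypothetical contrast A vs B",
    genes=[
        GeneResult("GENE_A", 0.001, 2.4),
        GeneResult("GENE_B", 0.20, 1.1),
    ],
)
```

Keeping the test and platform on the list itself means that two lists are only compared or merged with their context in view.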
Cheers,
-Kei
mdmiller wrote:
hi all,
on a different HCLS thread i saw this proposal from jeni tennison
for specifying a generic dataset, it might be a useful way to encode
a list of differentially expressed genes. it looks like one could
do this encoding on the fly, so that the data itself at the source
could be in whatever format is natural.
http://sw.joanneum.at/scovo/schema.html
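As a rough sketch of what "encoding on the fly" might look like, the function below turns rows of (gene symbol, measure, value) into Turtle using the SCOVO terms scovo:Dataset, scovo:Item, scovo:dataset and scovo:dimension, with rdf:value for the number. The `ex:` namespace, the URI naming scheme, and the function name are assumptions for illustration, and gene symbols are assumed to be safe to use in a prefixed name.

```python
PREFIXES = (
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
    "@prefix scovo: <http://purl.org/NET/scovo#> .\n"
    "@prefix ex: <http://example.org/genelist/> .\n"  # hypothetical namespace
)


def rows_to_scovo(rows, dataset="ex:dataset1"):
    """Serialize (symbol, measure, value) rows as SCOVO items in Turtle.

    Each value becomes one scovo:Item with two dimensions: the gene and
    the kind of measure (e.g. foldChange, pValue).
    """
    parts = [PREFIXES, f"{dataset} a scovo:Dataset .\n"]
    for i, (symbol, measure, value) in enumerate(rows):
        parts.append(
            f"ex:item{i} a scovo:Item ;\n"
            f'    rdf:value "{value}"^^xsd:double ;\n'
            f"    scovo:dataset {dataset} ;\n"
            f"    scovo:dimension ex:gene_{symbol} ;\n"
            f"    scovo:dimension ex:{measure} .\n"
        )
    return "\n".join(parts)


# Illustrative rows only; the source data could be in any native format.
turtle = rows_to_scovo([
    ("GENE_A", "foldChange", 2.4),
    ("GENE_A", "pValue", 0.001),
])
```

Because the function only reads rows, the data at the source can stay in whatever format is natural and be converted when requested.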
cheers,
michael
Michael Miller
mdmille...@comcast.net
----- Original Message ----- From: "Kei Cheung" <kei.che...@yale.edu>
To: "Helen Parkinson" <parkin...@ebi.ac.uk>
Cc: "HCLS" <public-semweb-lifesci@w3.org>; "Tony Burdett"
<tburd...@ebi.ac.uk>
Sent: Wednesday, October 14, 2009 7:03 PM
Subject: Re: Action Items from call today
Thanks, Helen.
To make it more concrete, I've been thinking about some example
queries that I hope can be answered by the RDF data once converted.
I wonder if the following example queries can be answered:
1. Retrieve a list of differentially expressed genes between different
brain regions (e.g., hippocampus and entorhinal cortex) for
normally aged human subjects.
2. Retrieve a list of differentially expressed genes for the same
brain region of normal human subjects and AD patients.
Using these lists of genes one can issue (federated) queries to
retrieve additional information about the genes for various types of
analyses (e.g., GO term enrichment).
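To show what the converted data would need to carry for these queries to be answerable, here is a small sketch over plain in-memory records rather than RDF. The field names (`region_a`, `group_b`, and so on) and the gene identifiers are hypothetical placeholders; an actual implementation would presumably be a SPARQL query over whatever vocabulary is chosen.

```python
def genes_for(records, **criteria):
    """Return sorted gene symbols from records matching all criteria."""
    return sorted({
        r["gene"]
        for r in records
        if all(r.get(key) == wanted for key, wanted in criteria.items())
    })


# Placeholder records: each says a gene was differentially expressed
# between (region_a, group_a) and (region_b, group_b).
records = [
    {"gene": "GENE_1", "region_a": "hippocampus",
     "region_b": "entorhinal cortex",
     "group_a": "normal aged", "group_b": "normal aged"},
    {"gene": "GENE_2", "region_a": "hippocampus",
     "region_b": "hippocampus",
     "group_a": "normal", "group_b": "AD"},
    {"gene": "GENE_3", "region_a": "hippocampus",
     "region_b": "hippocampus",
     "group_a": "normal", "group_b": "AD"},
]

# First example query: two regions, normally aged subjects.
between_regions = genes_for(
    records,
    region_a="hippocampus", region_b="entorhinal cortex",
    group_a="normal aged", group_b="normal aged",
)

# Second example query: same region, normal subjects vs AD patients.
normal_vs_ad = genes_for(
    records,
    region_a="hippocampus", region_b="hippocampus",
    group_a="normal", group_b="AD",
)
```

The resulting symbol lists are what a follow-up (federated) query would take as input, so the region and subject group of each comparison have to be explicit in the converted data.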
Just a thought.
Cheers,
-Kei
Helen Parkinson wrote:
Hi
here are my action items from the call today
1. MAGE-TAB->RDF, Lena requested details.
Code here: https://sourceforge.net/projects/limpopo/
Java parser for MAGE-TAB developed by EBI, used by several groups.
Contact Tony Burdett tburd...@ebi.ac.uk for details. Tony
estimates a few days' work for a simple RDF dump. Lena, if you are
interested in working on this Java code, please contact Tony, as
he has already designed it with RDF export in mind.
2. MAGE-TAB->MAGE-ML - code from Junmin Liu at UPenn
https://sourceforge.net/projects/tab2mage/files/ - see mage2tab
Pretty much all public MAGE-ML comes from AE and is already available
from the ArrayExpress FTP directories as MAGE-TAB. Exceptions are
Rosetta's MAGE-ML importer and non-public data.
3. EBI experimental factor ontology (EFO) slides, attached
see also www.ebi.ac.uk/efo
4. Noted that an RDF dump of Atlas data and triple store access
will be useful; we'll announce when these are available
thanks
Helen