On Jun 13, 2007, at 12:33 PM, SATYA SANKET SAHOO wrote:
On Jun 13, 2007, at 3:42 AM, [EMAIL PROTECTED] wrote:
I am following part of this thread and feel like popping in.
Maybe it helps.
In clinical trials and 'evidence' based medicine the word evidence
is strictly defined and may not be compatible with the word
'evidence' used in logic: if <evidence> then <conclusion>. I
support the idea of connecting the interpretation of the raw data
(the source data) with the data itself. Pixels cannot be evidence
on their own, without knowing what the pixels mean. So, an
important factor is the trust in the interpreter.
Satya: Connecting source data with result data along with the
processing information used to derive the results and 'trust' sound
very similar to 'provenance' information. Can we differentiate
between 'evidence' and 'provenance', or not?
This is especially pertinent in case of experimental data and
results derived from it. For example, when a list of peptides is
derived from a 'biochemical sample' using mass spectrometry (ms)
the evidence that accompanies these results is:
1. The details of the original sample (organism, type of cells,
cleavage enzyme used etc.)
2. The ms instrument used, the settings of the instrument and, as
pointed out earlier in this discussion, the algorithms used in
processing the ms data - these entail a lot of contextual
information regarding how the results are processed or interpreted
(measure of confidence etc.)
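
A minimal sketch of how the two groups of information listed above could be carried along with a peptide list. All class names, field names and example values here are hypothetical illustrations, not part of any ontology or tool mentioned in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SampleDetails:
    """Item 1: details of the original biochemical sample (hypothetical fields)."""
    organism: str
    cell_type: str
    cleavage_enzyme: str


@dataclass
class ProcessingDetails:
    """Item 2: instrument, its settings, and the processing algorithm (hypothetical fields)."""
    instrument: str
    algorithm: str
    confidence: float  # a measure of confidence reported by the algorithm
    settings: Dict[str, str] = field(default_factory=dict)


@dataclass
class PeptideResult:
    """A peptide list kept together with the context it was derived in."""
    peptides: List[str]
    sample: SampleDetails
    processing: ProcessingDetails

    def describe(self) -> str:
        # Summarise the result together with its accompanying context.
        return (
            f"{len(self.peptides)} peptides from {self.sample.organism} "
            f"({self.sample.cleavage_enzyme}), via {self.processing.algorithm} "
            f"at confidence {self.processing.confidence}"
        )
```

Whether such a record counts as 'evidence' or as 'provenance' is exactly the question raised above; the sketch only shows that both kinds of context can travel with the result.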
As you recall, the demo used the Evidence Code Ontology from OBO as
its basis. This is a funny artifact - on the one hand, it's been
useful, in some form, to the researchers that have used the Gene
ontology. On the other hand, it is clearly a mixture of various kinds
of things that don't really go well together - one of them being the
mixture of provenance versus experimental information.
I've thought that it would be a useful exercise to start with the
current ECO and try to refactor it, perhaps making explicit where
appropriate the various components that Dirk mentions as being
elements of evidence. (I think the proxy idea is quite related to his
view of things, btw).
As an example of what's there now, we see things like "Traceable
Author Statement", (no definition) which I take to mean that someone
read a paper where the author said it was so, and here is the PMID.
TAS is generally applicable, and was what we used when all we had was
some otherwise unexplained citation of a paper. Really it is more
like provenance than evidence.
OTOH, there are things like: "inferred from curated BLAST match to
protein" (no definition), which is a justification for moving GO
annotations on proteins in one species to proteins in another. So
this is much more specific, has an underlying theory along with which
comes a standard set of caveats. It can also be put into some sort of
proxy relationships - similarity of sequence of protein is a proxy
for similarity of function of protein. (on Dirk's scale of 1-4 this
would probably be considered a 0)
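
One way to picture the refactoring discussed above is to tag each evidence code with the kind of thing it is, and to record a proxy relationship where one exists. This is only a sketch: the `Kind` categories, the `Proxy` and `EvidenceCode` classes, and the classification of the two codes simply mirror the reading given in this message and are not taken from ECO itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    PROVENANCE = "provenance"   # says where a claim came from
    INFERENCE = "inference"     # says why a claim is believed


@dataclass(frozen=True)
class Proxy:
    """'observed' is treated as standing in for 'stands_for'."""
    observed: str
    stands_for: str


@dataclass(frozen=True)
class EvidenceCode:
    label: str
    kind: Kind
    proxy: Optional[Proxy] = None


# Hypothetical classification of the two codes discussed in the text.
CODES: List[EvidenceCode] = [
    EvidenceCode("Traceable Author Statement", Kind.PROVENANCE),
    EvidenceCode(
        "inferred from curated BLAST match to protein",
        Kind.INFERENCE,
        Proxy(
            observed="similarity of sequence of protein",
            stands_for="similarity of function of protein",
        ),
    ),
]


def labels_of(kind: Kind) -> List[str]:
    """Return the labels of all codes of the given kind."""
    return [code.label for code in CODES if code.kind is kind]
```

Separating the two kinds this way would let a consumer ask for provenance-only or inference-only codes without changing the labels researchers already use.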
There is some overlap of the discussion of evidence with the OBI
protocol application branch's work. I'd say the effort there is on
determining the ingredients and their relationships, rather than on
evaluating how much to believe the evidence. There's also some
overlap of OBI with Satya's ontology, so maybe there's a chance for
more concentrated effort being put into a merge of these various
independent efforts.
-Alan