I think there are two issues behind the suspicion of distributed curation:

1. Wanting curation to be good before publication, rather than "it will become good eventually with enough people looking at it".
2. The perceived need to control what appears as the community view. Feedback from the community is fine, but it has to be filtered by the people in control.

I've seen many cases of mistrust in, for instance, mappings from one DB to another. Many people would much rather do it themselves than trust someone else's mappings. In contrast, however, many of the public resources have a degree of trust: I know what I'm getting when I use resource X. So, giving up trust is a hard thing to do. SWISS- is trusted, but I think the trust would go if people thought there was a lack of rigour.

With community curation, adding in the provenance so that I could, for instance, filter out idiots is all possible.

Robert.

At 11:30 04/08/2006, Joanne Luciano wrote:

I would like to get some feedback on the feasibility of distributed
curation.

Me too!

PIs who have years of experience in managing curation
projects are not that enthusiastic about its role. It seems the CS
community is all for it but the actual *users* haven't really bought
in.

Are there any lurkers out there who can comment on why?
If not, do others think this would be something worth seeking out to
understand why? Have the folks at CBioC/ASU any idea why it's not
taking off? Does the word need to be spread, or is there something wrong
with the tool? The concept? Trust of the data?

For example, there is a great tool developed by Chitta Baral's
group at ASU called CBioC

http://cbioc.eas.asu.edu/

When you search the PubMed database and display a particular abstract,
CBioC will automatically display the interactions found in the CBioC
database related to the abstract you are viewing. If the abstract has
not been processed by CBioC before, the automatic extraction system
will run "on the fly".

CBioC runs as a web browser extension, not as a stand-alone
application. When you visit the Entrez (PubMed) web site, CBioC
automatically opens within a "web band" at the bottom of the main
browser window in either IE or Firefox.

To me it appears to be a great tool, something that can actually
exploit the wisdom of the crowds without much effort required (other
than saying yes/no to an automatically extracted interaction).

What do others on the list think about such projects?

I agree - and think it would be good to learn more about why it's not
taking off.

Third, with semantic web technologies such as description
logics and rules, it should be possible to infer when two data
sets are really talking about the same biological object, even
if they use different identifiers to describe the thing.
To that end, I have been working with Alan Ruttenberg and
others at York University, UCSD and SRI to develop an
OWL/Description-logic based method to automate the integration
of two E. coli databases.

And Manchester :-)

I think with SW technologies it should be possible to go beyond
integration. At Stanford, we did a test project for integrating
EcoCyc, Reactome and KEGG using BioPAX to create the Pathway Knowledge
Base, PKB, at http://pkb.stanford.edu [it's currently down because we are
moving machines].

Can you say more about the problems you ran into and how you resolved
them?
The kind of integration Jeremy is talking about is integration based on
having descriptions of reactions, for example, such that a reasoner can
infer from the description that two reactions are the same, even if
they have different identifiers, or if the left and right sides are
reversed in the database.
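
To make the idea concrete, here is a minimal sketch in Python. It is not
the OWL/description-logic method itself, and it is not anyone's actual
code: the Reaction type, the database prefixes and the identifiers are
all made up for illustration. It also assumes the two databases already
agree on participant names, which in practice is a large part of the
problem. It only shows the principle that identity is decided from the
description of the reaction, not from its identifier or its direction.

```python
from typing import FrozenSet, NamedTuple


class Reaction(NamedTuple):
    """Hypothetical record: one reaction as stored in one database."""
    identifier: str        # database-specific ID, ignored when comparing
    left: FrozenSet[str]   # participants on the left-hand side
    right: FrozenSet[str]  # participants on the right-hand side


def canonical_form(reaction: Reaction) -> FrozenSet[FrozenSet[str]]:
    """Describe a reaction by its two sides only.

    Using an unordered pair of sides means that a reaction and its
    reversed form get the same description.
    """
    return frozenset({reaction.left, reaction.right})


def same_reaction(a: Reaction, b: Reaction) -> bool:
    """True if two records describe the same reaction."""
    return canonical_form(a) == canonical_form(b)


# Made-up records: same reaction, different IDs, sides swapped in r2.
r1 = Reaction("DB_A:0001",
              frozenset({"ATP", "glucose"}),
              frozenset({"ADP", "glucose-6-phosphate"}))
r2 = Reaction("DB_B:rxn42",
              frozenset({"ADP", "glucose-6-phosphate"}),
              frozenset({"ATP", "glucose"}))
# A different reaction, for contrast.
r3 = Reaction("DB_B:rxn43",
              frozenset({"ATP", "fructose"}),
              frozenset({"ADP", "fructose-6-phosphate"}))

print(same_reaction(r1, r2))  # True
print(same_reaction(r1, r3))  # False
```

A reasoner working from richer descriptions can of course conclude much
more than this set comparison can; the sketch only covers the two cases
mentioned above (different identifiers, reversed sides).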

Moreover, the reasoners can be used to find inconsistencies within
(and across) the databases. This is work explored by Jeremy and
Alan Ruttenberg.

Joanne


Dr. Robert Stevens
Senior Lecturer
School of Computer Science
University of Manchester
Oxford Road
Manchester
M13 9PL
+44(0)1561

