Jim Hendler said the following on 2/8/2006 2:29 PM:
I love this idea, but I would go a bit further - be even nicer for us non-biologists if it also included some example queries to run (and maybe even the correct answer sets) - I think if that existed, we could push some of the triple store developers to use it as a benchmark, which would help both communities...


Agreed. The Oracle paper provided an outline for six different queries, which is a good starting point. It would be ideal to incorporate all of this into a test harness, though. Similar efforts are underway at the SIMILE project, which I have been loosely involved with through Vineet Sinha:

http://simile.mit.edu/repository/shootout/trunk/shootout/
http://simile.mit.edu/repository/shootout/trunk/shootout-core/

Another similar project, which I haven't seen mentioned before but have found useful, is here:
http://tripletest.sourceforge.net/
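
To make the harness idea a bit more concrete, here is a rough Python sketch of the kind of thing I have in mind. It is not taken from SIMILE or tripletest; the names BenchmarkQuery, Result, run_benchmark and fake_store are all made up for illustration, and the store is assumed to be reachable through a plain function that takes query text and returns rows. Each query can optionally carry its correct answer set, so the same run reports both timing and correctness.

```python
import time
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Optional, Tuple

Row = Tuple[str, ...]


@dataclass
class BenchmarkQuery:
    """Hypothetical container for one benchmark query."""
    name: str                                   # e.g. "Q1"
    text: str                                   # query in the store's own language
    expected: Optional[FrozenSet[Row]] = None   # correct answer set, if known


@dataclass
class Result:
    name: str
    seconds: float            # wall-clock time for the query
    rows: int                 # number of distinct rows returned
    correct: Optional[bool]   # None when no answer set was supplied


def run_benchmark(execute: Callable[[str], Iterable[Row]],
                  queries: List[BenchmarkQuery]) -> List[Result]:
    """Run each query through `execute`, timing it and checking the answers."""
    results = []
    for q in queries:
        start = time.perf_counter()
        # Materialise inside the timed region so lazy result sets are counted.
        rows = frozenset(tuple(r) for r in execute(q.text))
        elapsed = time.perf_counter() - start
        correct = None if q.expected is None else rows == q.expected
        results.append(Result(q.name, elapsed, len(rows), correct))
    return results


def fake_store(query_text: str) -> List[Row]:
    """Stand-in for a real triple store; always returns the same two rows."""
    return [("P1", "10", "30"), ("P2", "5", "25")]


if __name__ == "__main__":
    queries = [
        BenchmarkQuery("Q1", "placeholder query text",
                       frozenset({("P1", "10", "30"), ("P2", "5", "25")})),
        BenchmarkQuery("Q2", "placeholder query text"),  # no answer set yet
    ]
    for r in run_benchmark(fake_store, queries):
        print(r.name, r.rows, r.correct)
```

Comparing answer sets as sets ignores row order and duplicates, which seems right for queries without ORDER BY but would need revisiting for queries where duplicates matter.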

For anyone who has not read the Oracle paper, I copied their query table into an ASCII-friendly format below:

Description               | Pattern       | Projection | Result limit
--------------------------+---------------+------------+-------------
Q1: Display the ranges of | 6 triples     | 3 vars     | 15000 rows
transmembrane regions     | 5 vars        |            |

Q2: List proteins with    | 5 triples     | 3 vars     | 10 rows
publications by authors   | 5 vars        |            |
with matching names       | 1 LIKE pred.  |            |

Q3: Count the number of   | 3 triples     | 0 vars     | 32 rows
times a publication by a  | 2 vars        |            |
specific author is cited  |               |            |

Q4: List resources that   | 3 triples     | 1 var      | 3000 rows
are related to proteins   | 2 vars        |            |
annotated with a specific |               |            |
keyword                   |               |            |

Q5: List genes associated | 7 triples     | 3 vars     | 750 rows
with human diseases       | 5 vars        |            |

Q6: List recently         | 2 triples     | 2 vars     | 8000 rows
modified entries          | 2 vars        |            |
                          | 1 range pred. |            |

---------------------------------------
Q1 (the only actual query provided)
---------------------------------------
SELECT AVG(LENGTH(protein)), AVG(LENGTH(begin)),
       AVG(LENGTH(end))
FROM TABLE(RDF_MATCH(
   '(?p       rdf:type       up:Protein)
    (?p       up:annotation  ?a)
    (?a       rdf:type       up:Transmembrane_Annotation)
    (?a       up:range       ?range)
    (?range   up:begin       ?begin)
    (?range   up:end         ?end)',
   RDFModels('UniProt'), NULL, NULL))
WHERE rownum <= 15000;
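
As a quick sanity check that the Q1 pattern above lines up with its row in the table (6 triples, 5 vars), here is a small Python sketch that counts the triples and distinct variables in a pattern string. The helper name pattern_stats is my own, and it assumes the simple parenthesised "(subject predicate object)" syntax used in the query, with variables marked by a leading "?".

```python
import re
from typing import Tuple

# The graph pattern from Q1, copied from the query text.
Q1_PATTERN = """
(?p      rdf:type       up:Protein)
(?p      up:annotation  ?a)
(?a      rdf:type       up:Transmembrane_Annotation)
(?a      up:range       ?range)
(?range  up:begin       ?begin)
(?range  up:end         ?end)
"""


def pattern_stats(pattern: str) -> Tuple[int, int]:
    """Return (number of triples, number of distinct variables)."""
    triples = re.findall(r"\(([^()]*)\)", pattern)
    variables = set()
    for triple in triples:
        terms = triple.split()
        if len(terms) != 3:
            raise ValueError(f"not a triple: ({triple})")
        # Variables are the terms that start with '?'.
        variables.update(t for t in terms if t.startswith("?"))
    return len(triples), len(variables)


if __name__ == "__main__":
    print(pattern_stats(Q1_PATTERN))  # (6, 5)
```

The same check could be run against the other five patterns once someone writes them out, which would catch transcription slips before they end up in a benchmark.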


Ian
