Hello!

We are experimenting with Virtuoso Open Source at the BBC to see if it could be used as a backend for some of our applications. However, we are running into two main issues.

First, we noticed that on an average dataset (a couple of million triples), the following query is really fast:

PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?type WHERE {
?artist mbz:firstLetter "a" .
}


However, the following query is really, really slow (it takes a couple of minutes to answer):

PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?type WHERE {
?artist mbz:firstLetter "a" .
?artist a ?type
}

I am guessing the optimiser must see the rdf:type predicate, figure that it should use an index on type first, and end up going through every resource in the dataset.

Another major issue we're running into is the deadlocking mechanism. We have a constant flow of updates going in through SPARQL/Update. Our dataset is a collection of fairly small graphs (around 30 triples each). When we do a query like the above, going through all these graphs, we're almost sure to reach a deadlock at some point. At almost any point in time, there is an update going on in one of the graphs.

Is there a way to work around that? For example, could the query just wait for the lock on the problematic graphs to be released and then return the SPARQL results?
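
For illustration, the client-side version of "wait and retry" would look roughly like the sketch below. It assumes the client library reports a deadlock as an exception; DeadlockError and run_query are made-up names for this example, not part of Virtuoso or of any particular SPARQL client.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for whatever the client raises on a deadlock."""


def run_with_retry(run_query, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call run_query(); on DeadlockError, wait and try again.

    run_query is any zero-argument callable that executes the SPARQL query
    and returns its results (an assumption made for this sketch).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except DeadlockError:
            if attempt == max_attempts:
                # Give up and let the caller see the deadlock.
                raise
            # Exponential backoff with jitter, so that concurrent readers
            # do not all retry at the same moment.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

This only hides the problem from the caller: the query is still re-run from scratch on every attempt, which is why a server-side option would be preferable.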

Cheers,
y
