Hello Nathan,

I'll inspect what can be improved tonight. Single-element IN differs from
"longer" IN because it becomes an equality and then the query is rewritten
to, say,

select distinct (<http://dbpedia.org/resource/BBC> as ?s) ?p ?o WHERE {
  <http://dbpedia.org/resource/BBC> ?p ?o .
}

so these cases differ.

Best Regards,
Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

On Fri, 2010-02-12 at 19:58 +0000, Nathan wrote:
> Further test cases:
>
> It seems to be tied to the number of arguments inside the in() clause, e.g.:
>
> this works:
> select distinct ?s ?p ?o WHERE {
>   ?s ?p ?o . FILTER( ?s in (<http://dbpedia.org/resource/BBC>) )
> }
>
> this doesn't (occasionally does but is exceptionally slow):
> select distinct ?s ?p ?o WHERE {
>   ?s ?p ?o . FILTER( ?s in
>     (<http://dbpedia.org/resource/BBC>,<http://dbpedia.org/resource/Quantum_tunnelling>)
>   )
> }
>
> and to ensure it's not a size-of-graph thing, the following queries
> create a small result set which is then filtered using the second IN
>
> this works:
> select distinct ?s ?p ?o WHERE {
>   ?s foaf:page ?page .
>   FILTER( ?s in
>     (<http://dbpedia.org/resource/BBC>,<http://dbpedia.org/resource/Quantum_tunnelling>)
>   ) .
>   ?s ?p ?o . FILTER( isURI(?o) )
>   FILTER( ?o in( <http://dbpedia.org/resource/United_Kingdom>) )
> }
>
> this doesn't:
>
> select distinct ?s ?p ?o WHERE {
>   ?s foaf:page ?page .
>   FILTER( ?s in
>     (<http://dbpedia.org/resource/BBC>,<http://dbpedia.org/resource/Quantum_tunnelling>)
>   ) .
>   ?s ?p ?o . FILTER( isURI(?o) )
>   FILTER( ?o in(
>     <http://dbpedia.org/resource/United_Kingdom>,<http://dbpedia.org/resource/Category:Physics>
>   ) )
> }
>
> regards!
>
> Nathan wrote:
> > Hi Ivan / All,
> >
> > I've checked, checked and checked a hundred+ times; there's definitely a
> > performance issue with *only* the following query (where a,b,c can be
> > any URIs, and where * can be * or ?s ?p ?o ):
> >
> > SELECT * WHERE {
> >   ?s ?p ?o . FILTER( ?s in( a,b,c ) )
> > }
> >
> > I can skirt around the performance issue by rewriting the query as a
> > UNION or by either of the following (which requires me to know at least
> > one predicate that all of the subjects share):
> >
> > SELECT ?s ?p ?o WHERE {
> >   ?s foaf:page ?skip ; ?p ?o .
> >   FILTER( ?s in( a,b,c ) )
> > }
> >
> > SELECT ?s ?p ?o WHERE {
> >   ?s foaf:page ?skip . FILTER( ?s in( a,b,c ) ) . ?s ?p ?o
> > }
> >
> > I'm also sure that it's not down to server performance, as every other
> > query I run (many of them very complex) runs without any problems; in
> > fact they run very quickly (which is great! good work).
> >
> > Please do see and test for yourself at the following address:
> >
> > http://webr3.org:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+*+WHERE+{%0D%0A+%3Fs+%3Fp+%3Fo+.+FILTER%28+%3Fs+in%28%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FRDFa%3E%2C%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FSemantic_Web%3E+%29+%29%0D%0A}&format=text%2Fhtml&debug=on&timeout=
> >
> > Many Regards,
> >
> > Nathan
> >
> > Ivan Mikhailov wrote:
> >> Hello Nathan,
> >>
> >> I've checked the compilation of the query and it seems the same for old
> >> and new versions, so I have no clue. I can't say anything about the
> >> speed of the dbpedia.org endpoint, but it runs Virtuoso Cluster Edition,
> >> which is not yet released, and it has no relation to your local
> >> installation.
> >>
> >> What about the speed of the same queries but with the graph specified?
> >> BTW how many graphs does the database contain? What is the database size,
> >> number of buffers and the hardware (most important, amount of RAM)?
> >>
> >> Best Regards,
> >> Ivan Mikhailov
> >> http://virtuoso.openlinksw.com
> >>
> >> On Tue, 2010-02-09 at 22:14 +0000, Nathan wrote:
> >>> Hi,
> >>>
> >>> I'm pretty sure that the performance of in() has degraded somewhat
> >>> with the new 6.1.0.3126 release of virtuoso; I'm struggling to
> >>> run even very simple queries such as the following without them timing
> >>> out, on both my own 6.1 and dbpedia.org's instance:
> >>>
> >>> SELECT * WHERE {
> >>>   ?s ?p ?o .
> >>>   FILTER( ?s in(<http://dbpedia.org/resource/RDFa>,
> >>>     <http://dbpedia.org/resource/Semantic_Web> ) )
> >>> }
> >>>
> >>> Pretty sure this is a new issue as I can run similar queries (albeit
> >>> on a smaller dataset) on 6.0.1-pre1.3124 without any problems, and
> >>> likewise I've tested on the bio2rdf.org sparql endpoint and performance
> >>> is nice and fast as expected.
> >>>
> >>> Regardless, even if the above versioning detail is incorrect, the in()
> >>> performance definitely appears to be an issue on 6.1.0.3126
> >>>
> >>> Many Regards,
> >>>
> >>> Nathan
> >
> > _______________________________________________
> > Virtuoso-users mailing list
> > https://lists.sourceforge.net/lists/listinfo/virtuoso-users
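
For reference, the UNION workaround mentioned in the quoted thread can be sketched as a small helper that emits one branch per URI, each with a single equality filter (the form that, per the reply at the top, gets rewritten to a constant subject). This is only a sketch in Python: the function name in_filter_to_union is hypothetical and not part of Virtuoso, and whether the generated query actually runs faster on a given installation is an assumption to be checked there.

```python
def in_filter_to_union(iris):
    """Build a SPARQL query with one UNION branch per IRI.

    Hypothetical helper: stands in for FILTER( ?s in (a, b, c) ) by
    giving each IRI its own branch with a single equality filter.
    """
    if not iris:
        raise ValueError("at least one IRI is required")
    # One branch per IRI; each branch only ever compares ?s to one constant.
    branches = [
        f"  {{ ?s ?p ?o . FILTER( ?s = <{iri}> ) }}" for iri in iris
    ]
    body = "\n  UNION\n".join(branches)
    return "SELECT DISTINCT ?s ?p ?o WHERE {\n" + body + "\n}"


if __name__ == "__main__":
    print(in_filter_to_union([
        "http://dbpedia.org/resource/BBC",
        "http://dbpedia.org/resource/Quantum_tunnelling",
    ]))
```

With a single IRI the helper produces no UNION at all, which matches the single-element case that already works in the tests above.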
