Mitko Iliev wrote:
Hi Martin,
The default indexes should be sufficient; I would not expect a wrong plan or anything similar. It would be best to send us the execution plan so we can see how this query compiles at your site. To get it, wrap the query in explain ('sparql ...query here ..') and execute it with the command-line ISQL tool.
When using explain, note that the query text must begin with the SPARQL keyword.
Also, what are the INI settings in the Parameters section, specifically NumberOfBuffers, and how large is the working set? Check the statistics with the status () command, again using the ISQL tool.
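As a rough sketch of the two ISQL checks described above: the statements below use the query from the original message in its FROM variant without DISTINCT (an arbitrary choice for illustration), and assume that the rdf: and rdfs: prefixes resolve on the server as they do in that query. The explain statement is kept on a single line purely as a precaution.

```sql
-- Show the compiled execution plan; the quoted text must start with the SPARQL keyword.
explain ('sparql select ?conflict ?metadata from <http://example.com/mygraph> where { ?conflict a <http://dbpedia.org/ontology/MilitaryConflict> . ?conflict ?p <http://dbpedia.org/ontology/Event> . ?p rdfs:subPropertyOf rdf:type . ?p <http://example.com/mymetadata> ?metadata . }');

-- Print server statistics, including buffer usage, for comparison with NumberOfBuffers.
status ();
```

Running status () once before and once after executing the query should make it easier to see how much of the working set fits in the configured buffers.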
In addition, if this is all based on the DBpedia dataset, you can simply test
for performance disparity (relative to your instance) across:
1. http://dbpedia.org/sparql
2. http://lod.openlinksw.com/sparql -- this has 8.5 billion+ triples.
Then we can complete the analysis of your query using our existing live
instances.
Kingsley
Best Regards,
Mitko
On Apr 22, 2010, at 12:45 PM, Martin Gerlach wrote:
Hi,
The following SPARQL Query returns approx. 10000 results out of 35
million triples.
select [distinct] ?conflict ?metadata [from
<http://example.com/mygraph>] where {
?conflict a <http://dbpedia.org/ontology/MilitaryConflict> .
?conflict ?p <http://dbpedia.org/ontology/Event> .
?p rdfs:subPropertyOf rdf:type .
?p <http://example.com/mymetadata> ?metadata .
}
There are only 4 graphs including the Virtuoso default graphs and all
triples concerned are in <http://example.com/mygraph>. RDF_QUAD has the
following default indexes: GS, OP, POGS, SP.
Now the given query seems to be really slow. With FROM clause it is even
slower than without. Using DISTINCT or not does not seem to have much
impact.
Which indexes could help? I would think the query triple "?conflict ?p
<Event>" could be the cause for the delay - shouldn't an index on just
"O", "O,S", or "G,O,S" help? I tried each of the three without much of
an effect.
Are there other tables which need to be indexed in order to increase
SPARQL performance?
(Background: We have subclassed rdf:type dynamically for each ?s a
<Event> triple in order to annotate metadata, e.g. provenance, source,
time, ...)
Any ideas?
Thanks and cheers,
Martin
--
------------------------------------------------------------------------
Martin Gerlach
Research
Neofonie
Technologieentwicklung und
Informationsmanagement GmbH
Robert-Koch-Platz 4
10115 Berlin
fon: +49.30 24627 413
fax: +49.30 24627 120
[email protected] <mailto:[email protected]>
http://www.neofonie.de
Commercial register
Berlin-Charlottenburg: HRB 67460
Management
Helmut Hoffer von Ankershoffen
(Spokesperson of the Management)
Nurhan Yildirim
------------------------------------------------------------------------------
_______________________________________________
Virtuoso-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
--
Mitko Iliev
Developer Virtuoso Team
OpenLink Software
http://www.openlinksw.com/virtuoso
Cross Platform Web Services Middleware
--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen