Hi Martin,
From the statistics it seems to me that the DB cache is not warmed up: the disk
cache is cold, as only a few of the disk buffers are in use (38877 of
2621440). What happens if you run the query a few more times? You should see a
difference in execution time; how does it compare with the first run? If the
time is still large, you can check the disk statistics with:
select top 20 * from sys_d_stat order by reads desc;
This could give some idea of which index is being read so often.
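To put a number on how few buffers are in use, here is a minimal Python sketch that parses the buffer line of the status() report quoted further down. The function name buffer_usage and the regular expression are assumptions made for illustration and are not part of Virtuoso; the 8192-byte page size is inferred from the reported file size divided by the reported page count.

```python
import re

# Inferred from the status() report: 6257901568 bytes / 763904 pages = 8192.
PAGE_SIZE = 8192


def buffer_usage(status_text):
    """Return (total_buffers, used_buffers, used_fraction) from status() text."""
    match = re.search(r"(\d+)\s+buffers,\s+(\d+)\s+used", status_text)
    if match is None:
        raise ValueError("no 'N buffers, M used' line found")
    total, used = int(match.group(1)), int(match.group(2))
    return total, used, used / total


# Two lines copied from the status() report in this thread.
report = """File size 6257901568, 763904 pages, 348771 free.
2621440 buffers, 38877 used, 1 dirty 0 wired down, repl age 0 0 w. io"""

total, used, fraction = buffer_usage(report)
print(f"{used} of {total} buffers in use ({fraction:.2%})")
print(f"cached data: about {used * PAGE_SIZE / 2**20:.0f} MiB")
```

If the used count stays this low after several runs of the query, the working set is small; if it keeps growing from run to run, the cache was probably still warming up.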
Another note: are you using a striped setup for the DB files, with separate IO
queues? Also check FDsPerFile in the Parameters section of the INI file; you
could set it to 4 if you have many reads. This should not affect a single
query, though, since everything should be in memory once the query has been
executed a few times.
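The warm-up check can be sketched as a small Python timing harness. This is only an illustration under stated assumptions: time_runs, looks_cold, and the factor of 2 are hypothetical names and thresholds, and run_query stands for whatever function sends the query to the server with the client you already use.

```python
import time


def time_runs(run_query, repeats=5):
    """Call run_query() `repeats` times; return wall-clock seconds per run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def looks_cold(durations, factor=2.0):
    """True if the first run took at least `factor` times the fastest later run."""
    if len(durations) < 2:
        raise ValueError("need at least two runs to compare")
    return durations[0] >= factor * min(durations[1:])


if __name__ == "__main__":
    def stand_in_query():
        # Placeholder only; replace with a real call that executes the query.
        time.sleep(0.01)

    runs = time_runs(stand_in_query, repeats=3)
    print([round(d, 3) for d in runs], "cold start:", looks_cold(runs))
```

If the first run is much slower than the later ones, the delay is mostly disk reads into a cold cache; if all runs are equally slow, the plan or the data layout is the more likely cause.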
Best Regards,
Mitko
On Apr 22, 2010, at 6:18 PM, Martin Gerlach wrote:
> Hi Mitko,
>
> The exact query is:
>
> sparql select ?conflict ?dateBegin where { ?conflict a
> <http://dbpedia.org/ontology/MilitaryConflict> ; ?p
> <http://dbpedia.org/ontology/Event> . ?p rdfs:subPropertyOf rdf:type ;
> <http://www.w3.org/2006/time#hasBeginning> ?dateBegin . }
>
> EXPLAIN gives me this... do you need me to execute the explain() with
> different parameters?
>
> SQL> explain('sparql select ?conflict ?dateBegin where { ?conflict a
> <http://dbpedia.org/ontology/MilitaryConflict> ; ?p
> <http://dbpedia.org/ontology/Event> . ?p rdfs:subPropertyOf rdf:type ;
> <http://www.w3.org/2006/time#hasBeginning> ?dateBegin . }');
> REPORT
> VARCHAR
> _______________________________________________________________________________
>
> {
>
> Precode:
> 0: $20 "callret" := Call min_bnode_iri_id ()
> 5: $21 "callret" := Call __i2idn (<constant (#i868)>)
> 10: $22 "callret" := Call __i2idn (<constant (#i1)>)
> 15: $23 "callret" := Call __i2idn (<constant (#i1030009)>)
> 20: $24 "callret" := Call __i2idn (<constant (#i1021055)>)
> 25: $25 "callret" := Call __i2idn (<constant (#i1021199)>)
> 30: BReturn 0
> from DB.DBA.RDF_QUAD by RDF_QUAD_POGS 0.21 rows
> Key RDF_QUAD_POGS ASC ($27 "s-1-1-t2.S")
> <col=551 P = $21 "callret"> , <col=552 O = $22 "callret">
> row specs: <col=552 O LIKE <constant (T�)>> , <col=550 S < $20 "callret">
>
> from DB.DBA.RDF_QUAD by RDF_QUAD_POGS 0.22 rows
> Key RDF_QUAD_POGS ASC ($32 "s-1-1-t1.P", $31 "s-1-1-t1.S")
> <col=551 P = $27 "s-1-1-t2.S"> , <col=552 O = $24 "callret">
> row specs: <col=552 O LIKE <constant (T�)>>
>
> from DB.DBA.RDF_QUAD by RDF_QUAD 0.16 rows
> Key RDF_QUAD ASC ($36 "s-1-1-t0.S")
> inlined <col=551 P = $22 "callret"> , <col=550 S = $31 "s-1-1-t1.S"> ,
> <col=552 O = $25 "callret">
> row specs: <col=552 O LIKE <constant (T�)>>
>
> from DB.DBA.RDF_QUAD by RDF_QUAD 0.22 rows
> Key RDF_QUAD ASC ($40 "s-1-1-t3.O")
> inlined <col=551 P = $23 "callret"> , <col=550 S = $32 "s-1-1-t1.P">
> row specs: <col=550 S < $20 "callret">
>
>
> After code:
> 0: $43 "conflict" := Call __id2i ($36 "s-1-1-t0.S")
> 5: $44 "dateBegin" := Call __ro2sq ($40 "s-1-1-t3.O")
> 10: BReturn 0
> Select ($43 "conflict", $44 "dateBegin", <$42 "<DB.DBA.RDF_QUAD
> s-1-1-t3>" spec 5>, <$38 "<DB.DBA.RDF_QUAD s-1-1-t0>" spec 5>, <$34
> "<DB.DBA.RDF_QUAD s-1-1-t1>" spec 5>, <$29 "<DB.DBA.RDF_QUAD s-1-1-t2>"
> spec 5>)
> }
>
> STATUS after executing the query:
>
> SQL> status();
> REPORT
> VARCHAR
> _______________________________________________________________________________
>
> OpenLink Virtuoso Server
> Version 06.01.3127-pthreads for Linux as of Apr 19 2010
> Started on: 2010/04/22 12:17 GMT+120
>
> Database Status:
> File size 6257901568, 763904 pages, 348771 free.
> 2621440 buffers, 38877 used, 1 dirty 0 wired down, repl age 0 0 w. io
> 0 w/crsr.
> Disk Usage: 39070 reads avg 0 msec, 0% r 0% w last 81 s, 480 writes,
> 107 read ahead, batch = 164. Autocompact 0 in 0 out, 0% saved.
> Gate: 488 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
> Log = data/narysubproperty.trx, 4359 bytes
> 415060 pages have been changed since last backup (in checkpoint state)
> Current backup timestamp: 0x0000-0x00-0x00
> Last backup date: unknown
> Clients: 4 connects, max 1 concurrent
> RPC: 166 calls, -1 pending, 1 max until now, 0 queued, 270 burst reads
> (162%), 1 second brk=21564690432
> Checkpoint Remap 19 pages, 0 mapped back. 9 s atomic time.
> DB master 763904 total 348771 free 19 remap 0 mapped back
> temp 256 total 251 free
>
> Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
> Currently 1 threads running 0 threads waiting 0 threads in vdb.
> Pending:
>
> Client 1111:4: Account: dba, 1845 bytes in, 356618 bytes out, 1 stmts.
> PID: 6351, OS: unix, Application: unknown, IP#: 127.0.0.1
> Transaction status: PENDING, 1 threads.
> Locks:
>
>
> Running Statements:
> Time (msec) Text
> 88 status()
>
>
> Replication Status: Server db-ALX-DEV03.
> db-ALX-DEV03 db-ALX-DEV03 0 OFF.
>
>
>
>
> Hash indexes
>
>
> 43 Rows. -- 88 msec.
>
>
> Am 22.04.2010 15:13, schrieb Mitko Iliev:
>> Hi Martin,
>>
>> The default indexes should do; I don't see a wrong plan or similar. So
>> it's better to send us the execution plan so we can see how this
>> query compiles on your site; to do this, put the query in explain
>> ('sparql ...query here ..') and execute it via the command-line ISQL
>> tool. When using explain, note that the query should begin with the
>> SPARQL keyword. Also, what are the INI settings in the Parameters
>> section, specifically NumberOfBuffers, and how large is the working
>> set? Check the statistics with the status () command, again with the
>> ISQL tool.
>>
>> Best Regards, Mitko
>>
>> On Apr 22, 2010, at 12:45 PM, Martin Gerlach wrote:
>>
>>> Hi,
>>>
>>> The following SPARQL Query returns approx. 10000 results out of 35
>>> million triples.
>>>
>>> select [distinct] ?conflict ?metadata [from
>>> <http://example.com/mygraph>] where { ?conflict a
>>> <http://dbpedia.org/ontology/MilitaryConflict> . ?conflict ?p
>>> <http://dbpedia.org/ontology/Event> . ?p rdfs:subPropertyOf
>>> rdf:type . ?p <http://example.com/mymetadata> ?metadata . }
>>>
>>> There are only 4 graphs including the Virtuoso default graphs and
>>> all triples concerned are in <http://example.com/mygraph>. RDF_QUAD
>>> has the following default indexes: GS, OP, POGS, SP.
>>>
>>> Now the given query seems to be really slow. With FROM clause it is
>>> even slower than without. Using DISTINCT or not does not seem to
>>> have much impact.
>>>
>>> Which indexes could help? I would think the query triple "?conflict
>>> ?p <Event>" could be the cause for the delay - shouldn't an index
>>> on just "O", "O,S", or "G,O,S" help? I tried each of the three
>>> without much of an effect. Are there other tables which need to be
>>> indexed in order to increase SPARQL performance?
>>>
>>> (Background: We have subclassed rdf:type dynamically for each ?s a
>>> <Event> triple in order to annotate metadata, e.g. provenance,
>>> source, time, ...)
>>>
>>> Any ideas?
>>>
>>> Thanks and cheers, Martin
>>>
>>>
>>> --
>>> ------------------------------------------------------------------------
>>> Martin Gerlach
>>> Research
>>>
>>> Neofonie Technologieentwicklung und Informationsmanagement GmbH
>>> Robert-Koch-Platz 4 10115 Berlin fon: +49.30 24627 413 fax: +49.30
>>> 24627 120 [email protected]
>>> <mailto:[email protected]> http://www.neofonie.de
>>>
>>> Commercial register Berlin-Charlottenburg: HRB 67460
>>>
>>> Management: Helmut Hoffer von Ankershoffen (spokesman of the
>>> management), Nurhan Yildirim
>>>
>>> ------------------------------------------------------------------------
>>>
>>> The WePad is a latest-generation tablet. It offers users fast
>>> access to the Internet, a complete world of ready-to-use
>>> applications, and easy access to books, photos, and magazines and
>>> daily newspapers from various publishers, produced with the
>>> WeMagazine ePublishing Eco System. More about the WePad at
>>> www.wepad.mobi <http://www.wepad.mobi> or at
>>> www.facebook.com/WePad <http://www.facebook.com/WePad>.
>>>
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Virtuoso-users mailing list [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>
>>
>> -- Mitko Iliev Developer Virtuoso Team OpenLink Software
>> http://www.openlinksw.com/virtuoso Cross Platform Web Services
>> Middleware
>
--
Mitko Iliev
Developer Virtuoso Team
OpenLink Software
http://www.openlinksw.com/virtuoso
Cross Platform Web Services Middleware