Re: [Virtuoso-users] Inference performance

2011-05-10 Thread Hugh Williams
Hi Roberto,

Can you please provide the content of the [SPARQL] section of the 
“virtuoso.ini” file so we can see what settings are in place?

Are your ontology and data available for public download? If so, please provide 
the location they can be downloaded from and which datasets to load, and we can 
attempt a recreation in-house.

Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
Twitter: http://twitter.com/OpenLink

On 9 May 2011, at 23:56, Roberto García wrote:

 Please provide the command used for creating the rule sets and confirm they 
 exist in the “sys_rdf_schema” table.
 
 Yes, I used 
 rdfs_rule_set('http://semantic.eurobau.com/data/schema/rules/','http://semantic.eurobau.com/data/schema/');
 
 And they exist in that table.
 
 Based on the details below it would appear you have about 10 triples 
 loaded in total and the status output indicates that only about 16 of the 
 50 buffers are allocated, with no locks etc., so the system seems pretty 
 much idle and thus should be performant. To see how performant the system 
 is, can you try counting all the triples in the store with the command 
 “select * from where {?s ?p ?o}” and see how long it takes? How long do 
 these queries below take to run if the inference rule is removed? It would 
 also be interesting to see the compiler execution plan for these queries 
 using the Virtuoso explain() function as detailed at:
 
 I counted all triples in virtuoso with select count(*) where {?s ?p
 ?o}. The result is 117550498 and it took more or less 15 seconds.
 For the Eurobau data graph (http://semantic.eurobau.com/data/) the
 count is 89911 and 1 or 2 secs.
 For the Eurobau schema graph
 (http://semantic.eurobau.com/data/schema/) 79544 triples and also 1 or
 2 seconds.
 
 If there is no inference, the query is almost instantaneous and
 correctly responds 0:
 SELECT count(*)
 FROM <http://semantic.eurobau.com/data/>
 WHERE
 { ?x a <http://purl.org/goodrelations/v1#ProductOrService> }
 
 And here is what I get with fn_explain (inference enabled):
 
 explain('sparql define input:inference
 <http://semantic.eurobau.com/data/schema/rules/>; SELECT  count(?x)
 FROM <http://semantic.eurobau.com/data/> FROM
 <http://semantic.eurobau.com/data/schema/> WHERE { ?x a
 <http://purl.org/goodrelations/v1#ProductOrService> }');
 
 {
 Fork 34
 {
 RDF Inference subproperty iterates $28 inferred
  o= none p= #type
 RDF Inference subclass iterates $31 inferred
  o= #ProductOrService p= none
 from DB.DBA.RDF_QUAD by RDF_QUAD 1.2 rows
 Key RDF_QUAD ASC ($22 s-1-1-t0.S, $21 s-1-1-t0.G)
  inlined col=551 P = $28 inferred
 0
  Local Test
  0: $25 one_of_these := Call one_of_these ($21 s-1-1-t0.G, #data/
 , #data/schema/ )
  5: if ( 0 2() $25 one_of_these) then 8 else 9 unkn 9
  8: BReturn 1
  9: BReturn 0
  Local Code
  0: if ($22 s-1-1-t0.S 16( IS NULL ) 0 ) then 3 else 8 unkn 8
  3: $32 callretSearchedCASE := := artm 0
  7: Jump 12 (level=0)
  8: $32 callretSearchedCASE := := artm 1
  12: $33 one_of_these := Call one_of_these ($21 s-1-1-t0.G,
 #data/ , #data/schema/ )
  17: if ($32 callretSearchedCASE 16( IS NULL ) none ) then 32
 else 20 unkn 32
  20: if ($36 callret-0 16( IS NULL ) none ) then 23 else 28 unkn 32
  23: $36 callret-0 := := artm $32 callretSearchedCASE
  27: Jump 32 (level=0)
  28: $36 callret-0 := artm $36 callret-0 + $32 callretSearchedCASE
  32: BReturn 0
 
 }
 Select ($36 callret-0, $24 DB.DBA.RDF_QUAD s-1-1-t0 spec 5)
 }
 
 
 Best,
 
 Roberto
 




[Virtuoso-users] Inference performance

2011-05-09 Thread Roberto García
Dear all,

I'm using OpenSource Virtuoso 6.1.3 with data from the Semantic
Eurobau initiative.

I've first loaded the ontologies used in this dataset. There is
GoodRelations, a small Eurobau-specific ontology and FreeClass, the
biggest one with around 5000 classes:

DB.DBA.RDF_LOAD_RDFXML_MT(file_open('eurobau/freeclass_v1.owl'),
'http://semantic.eurobau.com/data/schema/',
'http://semantic.eurobau.com/data/schema/', 255);
DB.DBA.RDF_LOAD_RDFXML_MT(file_open('eurobau/goodrelations_v1.owl'),
'http://semantic.eurobau.com/data/schema/',
'http://semantic.eurobau.com/data/schema/', 255);
DB.DBA.RDF_LOAD_RDFXML_MT(file_open('eurobau/eurobau-utility.owl'),
'http://semantic.eurobau.com/data/schema/',
'http://semantic.eurobau.com/data/schema/', 255);

Then, I've loaded 6 of the dump files totalling 9 triples:

DB.DBA.TTLP_MT(gz_file_open('eurobau/at_ardex_uk_dump.nt.gz'),
'http://semantic.eurobau.com/data/',
'http://semantic.eurobau.com/data/', 255);
DB.DBA.TTLP_MT(gz_file_open('eurobau/at_rehau_com_dump.nt.gz'),
'http://semantic.eurobau.com/data/',
'http://semantic.eurobau.com/data/', 255);
DB.DBA.TTLP_MT(gz_file_open('eurobau/at_senftenbach_at_dump.nt.gz'),
'http://semantic.eurobau.com/data/',
'http://semantic.eurobau.com/data/', 255);
DB.DBA.TTLP_MT(gz_file_open('eurobau/at_swisspor_at_dump.nt.gz'),
'http://semantic.eurobau.com/data/',
'http://semantic.eurobau.com/data/', 255);
DB.DBA.TTLP_MT(gz_file_open('eurobau/at_tesa_at_dump.nt.gz'),
'http://semantic.eurobau.com/data/',
'http://semantic.eurobau.com/data/', 255);
DB.DBA.TTLP_MT(gz_file_open('eurobau/hu_ardex_hu_dump.nt.gz'),
'http://semantic.eurobau.com/data/',
'http://semantic.eurobau.com/data/', 255);

All fine till now, but when I try to get a count of the instances of
ProductOrService with the following query:
define input:inference <http://semantic.eurobau.com/data/schema/rules/>;
SELECT  ?x
FROM <http://semantic.eurobau.com/data/>
WHERE
  { ?x a <http://purl.org/goodrelations/v1#ProductOrService> }

All I get is a timeout, even with timeout settings greater than 100
seconds; in some cases I get an empty result page instead of the
timeout exception.
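
For reference, a minimal sketch of how such a request could be built with an explicit timeout. It assumes the query is sent over HTTP to a Virtuoso SPARQL endpoint, whose `timeout` parameter is expressed in milliseconds. The host and port in `ENDPOINT` and the helper name `build_request_url` are assumptions for illustration and do not come from the message; the query text is the one above with IRIs in angle brackets.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a local Virtuoso instance on its default HTTP port.
ENDPOINT = "http://localhost:8890/sparql"

# Query text as given in the message, with IRIs written in angle brackets.
QUERY = (
    "define input:inference <http://semantic.eurobau.com/data/schema/rules/>;\n"
    "SELECT ?x\n"
    "FROM <http://semantic.eurobau.com/data/>\n"
    "WHERE { ?x a <http://purl.org/goodrelations/v1#ProductOrService> }"
)


def build_request_url(query, timeout_ms):
    """Return a GET URL carrying the query and a timeout in milliseconds."""
    params = {"query": query, "timeout": str(timeout_ms)}
    return ENDPOINT + "?" + urlencode(params)


# 100 seconds, the order of magnitude mentioned above.
url = build_request_url(QUERY, 100_000)

# Decode the URL again to check that nothing was lost in the encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["timeout"][0])
```

The URL can then be fetched with any HTTP client; building it separately makes it easy to confirm that the angle brackets and the `#` in the class IRI survive URL encoding.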

I've checked that there are 2616 subclasses of gr:ProductOrService:

define input:inference <http://semantic.eurobau.com/data/schema/rules/>;
SELECT  count(?x)
FROM <http://semantic.eurobau.com/data/>
FROM <http://semantic.eurobau.com/data/schema/>
WHERE
  { ?x rdfs:subClassOf <http://purl.org/goodrelations/v1#ProductOrService> }

And that many of these classes have direct instances.
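
To make the expectation concrete, here is a toy sketch of what RDFS subclass inference means for this query: the class in the pattern is expanded to itself plus all of its transitive subclasses, and instances of any of them are returned. The class and instance names below are hypothetical, and this only illustrates the semantics; it is not a description of how Virtuoso evaluates the query internally.

```python
from collections import defaultdict


def subclass_closure(root, sub_class_of):
    """Return root plus every class reachable through subClassOf edges."""
    children = defaultdict(set)
    for child, parent in sub_class_of:
        children[parent].add(child)
    seen = {root}
    stack = [root]
    while stack:
        cls = stack.pop()
        for child in children[cls]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def instances_with_inference(root, sub_class_of, type_triples):
    """Return subjects typed with root or with any of its subclasses."""
    classes = subclass_closure(root, sub_class_of)
    return {subject for subject, cls in type_triples if cls in classes}


# Hypothetical toy data: (subclass, superclass) and (instance, class) pairs.
SUB_CLASS_OF = [
    ("fc:Adhesive", "gr:ProductOrService"),
    ("fc:TileAdhesive", "fc:Adhesive"),
    ("fc:Pipe", "gr:ProductOrService"),
]
TYPES = [
    ("ex:item1", "fc:TileAdhesive"),
    ("ex:item2", "fc:Pipe"),
    ("ex:shop", "gr:BusinessEntity"),
]

direct = {s for s, c in TYPES if c == "gr:ProductOrService"}
inferred = instances_with_inference("gr:ProductOrService", SUB_CLASS_OF, TYPES)
print(len(direct), len(inferred))  # 0 2
```

In the toy data no instance is typed directly as gr:ProductOrService, so only the expanded class set produces matches.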

I would expect to get a result even with a shorter timeout, so I'm not
sure if I have loaded the data incorrectly or if the queries are not
appropriate. The machine hosting Virtuoso has a single Xeon E5520 and 8GB
of RAM, and Virtuoso is allotted more or less half of the RAM (50 buffers).

This is the output for the status() command:

---

OpenLink Virtuoso  Server
Version 06.01.3127-pthreads for Linux as of Mar 30 2011
Started on: 2011/05/07 01:32 GMT+120

Database Status:
  File size 14841544704, 1811712 pages, 360173 free.
  50 buffers, 167119 used, 1 dirty 0 wired down, repl age 0 0 w.
io 0 w/crsr.
  Disk Usage: 170762 reads avg 0 msec, 0% r 0% w last  0 s, 9950 writes,
781 read ahead, batch = 205.  Autocompact 0 in 0 out, 0% saved.
Gate:  366 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = virtuoso.trx, 2494 bytes
1450938 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x-0x00-0x00
Last backup date: unknown
Clients: 69 connects, max 10 concurrent
RPC: 17583 calls, -58 pending, 1 max until now, 0 queued, 50 burst
reads (0%), 1 second brk=4323667968
Checkpoint Remap 0 pages, 0 mapped back. 666 s atomic time.
DB master 1811712 total 360173 free 0 remap 0 mapped back
   temp  1280 total 1275 free

Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
   Currently 1 threads running 0 threads waiting 0 threads in vdb.

---
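
The figures in the status output can be cross-checked with a little arithmetic. The sketch below derives the page size from the reported file size and page count, then estimates how much memory the used buffers represent. The last step assumes each buffer caches exactly one database page, which is an assumption made here rather than something the output states.

```python
# Values copied from the "Database Status" section above.
FILE_SIZE_BYTES = 14841544704
TOTAL_PAGES = 1811712
FREE_PAGES = 360173
USED_BUFFERS = 167119

# The file size divides evenly by the page count, giving the page size.
page_size = FILE_SIZE_BYTES // TOTAL_PAGES
remainder = FILE_SIZE_BYTES % TOTAL_PAGES

# Pages that actually hold data.
occupied_pages = TOTAL_PAGES - FREE_PAGES

# Assumption: one buffer caches one page of page_size bytes.
used_buffer_bytes = USED_BUFFERS * page_size

print(page_size)       # 8192
print(remainder)       # 0
print(occupied_pages)  # 1451539
print(used_buffer_bytes / 2**20, "MiB in used buffers")
```

Under that assumption, the used buffers cover only a fraction of the occupied pages, so part of the database has to be read from disk when a query touches data that is not cached.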

Any help appreciated.

Best,

Roberto García
http://rhizomik.net/~roberto