On 08/07/14 16:56, Pearson, Stephen (HP Cloud, Bristol) wrote:
I'm working with a medium-sized dataset of around 8 million triples, and
I'm using Fuseki to query it via an inference model (either RDFS or OWL
Micro). This works, but I'm looking to boost performance by pre-computing
the inferences, storing them in a named graph, and using
tdb:unionDefaultGraph to merge them at run time. I'll then have the
option of recomputing the inferences from scratch whenever the schema
changes. The code below takes under two minutes to run, which is fine for
my use case provided I don't have to do it every time I restart the
server. I'm therefore looking for a way to take a reasoner and extract
just the new inferences from the resulting InfModel.
Code:
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

// Assume tdbModel loaded from TDB
Model schema = ModelFactory.createDefaultModel();
schema.read("schema.ttl", "TURTLE");

// Union the data with the schema, then wrap the result in an OWL Micro
// inference model.
Model unionModel = ModelFactory.createUnion(tdbModel, schema);
OntModel ont = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
ont.add(unionModel);

// ont.write(System.out, "TURTLE");   // writes only the base model
ont.writeAll(System.out, "TURTLE");   // writes base + inferred triples
System.out.println("ont triples: " + ont.size());
I suppose I could write out the entire model plus inferences, but that can
take a while. The Jena API must know which triples are inferred in order
for ont.write() to behave differently from ont.writeAll(), but I can't see
from the Javadocs how to filter them out.
Just write out the whole model; there will be a lot more inferred triples
than base triples, so you won't save much by omitting the base ones.
The issue is that the reasoners in general, and OWL_Micro specifically,
use a mix of forward and backward deductions.
The forward deductions are indeed stored separately and can be obtained
via getDeductionsModel().
However, the backward deductions are only computed on demand in response
to queries. Some of those are indirectly cached in the backward
reasoner's tabled predicates, but others are never cached. So the only
way to obtain all deductions is to ask the most general query.
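For example, a minimal sketch using the ont model from the code above
(note that this captures only the forward deductions, not anything the
backward engine would derive on demand):

// Forward-chained results only; triples derived by the backward
// engine will not appear here.
Model forwardDeductions = ont.getDeductionsModel();
System.out.println("forward deductions: " + forwardDeductions.size());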
There are a few things you can do which might help performance.
First, you could materialize all the triples before you write them out.
The writer makes a lot of separate calls, so anything that isn't
cached may be recomputed. So try something like:
Model myMaterializedModel = ModelFactory.createDefaultModel();
myMaterializedModel.add( ont );
Then you can write out myMaterializedModel, or if you really want to you
could remove the base models before doing so:
myMaterializedModel.remove( schema );
myMaterializedModel.remove( tdbModel );
Second, given that the reasoning is being done in memory, you may find it
more efficient to copy tdbModel into a memory model first and then wrap
the reasoner round that.
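A minimal sketch of that approach, reusing the names from the code above:

// Copy the TDB-backed model into memory so the rule engine's many
// lookups hit an in-memory graph rather than the disk-backed store.
Model memModel = ModelFactory.createDefaultModel();
memModel.add( tdbModel );

Model unionModel = ModelFactory.createUnion(memModel, schema);
OntModel ont = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
ont.add(unionModel);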
Third, if for your purposes there are only certain types of queries you
need to run, you may choose to materialize only some of the inferences.
For example, if you only care about inferred types you could perform a
more restricted materialization such as:
myMaterializedModel.add( ont.listStatements(null, RDF.type, (RDFNode) null) );
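Either way, once materialized the model can be written out and loaded
into your named graph. A minimal sketch (the file name is just for
illustration, and exception handling is omitted):

// Serialize the materialized inferences once; the resulting file can
// then be loaded into a named graph in TDB.
try (java.io.OutputStream out = new java.io.FileOutputStream("inferences.ttl")) {
    myMaterializedModel.write(out, "TURTLE");
}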
Dave