On 06/10/14 09:39, Claude Warren wrote:
I have the following code:
data = TDBFactory.createDataset( dir.getAbsolutePath() );
data.begin( ReadWrite.WRITE);
// read 2 files into the model (approx 3.5K triples)
Model dataM = loadDir( ctxt, "/WEB-INF/resources/rdf/documents");
// read http://www.w3.org/2009/08/skos-reference/skos.rdf
Model schemaM = loadDir( ctxt, "/WEB-INF/resources/rdf/schemas" );
Reasoner reasoner = ReasonerRegistry.getOWLMiniReasoner();
reasoner = reasoner.bindSchema(schemaM);
infModel = ModelFactory.createInfModel(reasoner, dataM);
data.commit();
// DEBUGGING CODE
String realPath = ctxt.getRealPath("/WEB-INF/resources/");
File f2 = new File(realPath, "data.rdf");
// OOM occurs here
infModel.write(new FileOutputStream(f2));
So my questions are:
1) Should the inferencing be done within a write transaction, and if so,
how does one ensure that all inferencing will be done within a write
transaction? In this case I would expect that all the inferencing would
be done in the infModel.write() call, but in the general case I may not
be making that call.
Inferencing does not write to the store, so there is no need for a write
transaction.
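If dataM is held in the transactional TDB dataset, the read does still
need to sit inside a transaction, but a READ one is enough. Untested
sketch, assuming the data has already been loaded into the dataset's
default model:

    data.begin( ReadWrite.READ );
    try {
        // building the InfModel and serializing it only reads from the store
        Model dataM = data.getDefaultModel();
        InfModel infModel = ModelFactory.createInfModel(reasoner, dataM);
        infModel.write(new FileOutputStream(f2));
    } finally {
        data.end();
    }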
2) Shouldn't the inferencing be writing to the TDB datastore?
No, or at least that's not the current design.
The rule engines were designed (many years ago) for purely in-memory
use. They exploit specialist data structures (a RETE network for the
forward rules, goal tables for the backward rules, though the latter are
pretty crude). The results of inference are not necessarily
materialized as triples, and when they are they don't go into the base
graph (the forward engine has a separate deductions graph for this purpose).
Running the rule engines over a TDB store just makes them go slower and
doesn't offer any improved scaling.
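If what you are after is just the materialized conclusions, you can get
at the forward engine's deductions separately, e.g. (sketch):

    // only the statements the forward rules have added, not the base data
    Model deductions = infModel.getDeductionsModel();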
The general pattern with Jena is to do the inference in-memory, then
store the (selected) results in persistent storage.
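Roughly (untested sketch, reusing schemaM and dir from your code; what
you choose to persist is up to you):

    // reason over plain in-memory models ...
    Model dataM = ModelFactory.createDefaultModel();
    // ... read the RDF files into dataM here ...
    Reasoner reasoner = ReasonerRegistry.getOWLMiniReasoner().bindSchema(schemaM);
    InfModel infModel = ModelFactory.createInfModel(reasoner, dataM);

    // ... then persist the results into TDB inside a write transaction
    Dataset ds = TDBFactory.createDataset(dir.getAbsolutePath());
    ds.begin( ReadWrite.WRITE );
    try {
        ds.getDefaultModel().add(infModel);   // or add only the statements you need
        ds.commit();
    } finally {
        ds.end();
    }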
It is, of course, possible to develop reasoners that work at scale
directly over persistent storage. There is a long history of research in
deductive databases as well as various techniques to stretch in-memory
inference further with more memory efficiency and spill-to-disk.
However, those would be new developments, not retrofits to the existing
engines.
Dave