Reasoners for RDFS + owl:sameAs: performance, stability & best practices

Andreas Kahl Mon, 19 Feb 2018 23:17:38 -0800

Hello everyone, 

I am currently developing a small Jena Model that should support RDFS
inferencing plus owl:sameAs. From the documentation I learned that the
minimal Reasoner for that is OWLMini. During development I ran into
severe performance bottlenecks when a runtime model contains too many
owl:sameAs links, and generally for nearly all models exceeding 1000
Statements. Most of the tests simply freeze at some point when these
bottlenecks occur; sometimes selecting a Statement with a SimpleSelector
consisting of a subject URI, a predicate URI and a null Object takes
20 seconds.
Thread blocking should not be the cause, as I run my integration tests
single-threaded, especially when I am experiencing failures.


I was able to mitigate this by using models without inferencing while
collecting and adding data spidered from the web, and by adding
Ontologies last, only where absolutely needed. I also use an internal
whitelist of domains my spider is allowed to fetch data from, and I
remove all owl:sameAs Statements whose object URIs are not on this
whitelist. Finally, in my querying methods, I copy that base model with
the collected data into a fresh model and wrap it in an InfModel:

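    // Imports assume Jena 3.x package names (org.apache.jena.*).
    import org.apache.jena.rdf.model.InfModel;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.reasoner.ReasonerRegistry;

    protected static Model getInfModelFrom(Model model) {
        LOG.debug("getInfModelFrom: Input size: " + model.size());
        // Copy only the raw (asserted) statements into a fresh model.
        final Model copy = ModelFactory.createDefaultModel();
        copy.add(model instanceof InfModel
                ? ((InfModel) model).getRawModel()
                : model);
        // Wrap the copy in an inference model backed by the OWL Mini reasoner.
        return ModelFactory.createInfModel(
                ReasonerRegistry.getOWLMiniReasoner(), copy);
    }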
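
The whitelist check mentioned above could be sketched roughly as
follows. This is only an illustration under assumptions: the class name
SameAsWhitelist, the method isAllowed, and the idea that the whitelist
is a set of lower-case host names are all hypothetical and not taken
from the actual project. The sketch uses only the Java standard library
and decides whether a single object URI may be kept.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper; class and method names are illustrative only. */
final class SameAsWhitelist {

    private SameAsWhitelist() {}

    /**
     * Returns true if the host of the given URI is contained in allowedHosts.
     * allowedHosts is assumed to hold lower-case host names.
     * Malformed URIs and URIs without a host (e.g. urn:...) are rejected.
     */
    static boolean isAllowed(String uri, Set<String> allowedHosts) {
        try {
            String host = new URI(uri).getHost();
            return host != null
                    && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

With Jena, such a check would presumably be applied to the object of
each statement returned by
model.listStatements(null, OWL.sameAs, (RDFNode) null), collecting the
statements that fail the check into a list and passing that list to
model.remove(...) after the iteration has finished.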

The only Ontology I am using is
http://d-nb.info/standards/elementset/gnd# . 

I suppose that the Reasoner I use is far too heavyweight for the
seemingly simple owl:sameAs. Is there a more basic option that
understands owl:sameAs in addition to RDFS? None of the other OWL axioms
are needed.
Are there any best practices for inferencing over relatively small
in-memory models of <10,000 Statements (most <5,000 Statements)? I
found some information on the web that a simple 'Equality Reasoner' is
in the works. Would that be a good choice? Will it be available any time
soon?

Thanks for any hints
Andreas
