On Wed, Oct 23, 2013 at 7:03 AM, Dibyanshu Jaiswal <[email protected]> wrote:
> Just tell me whether what I understand from your reply is correct or not.
Dave's not around at the moment, but I'll try to answer your question.
> By the solution provided by you, you mean to:
> First, create a model by reading it from a file with inferencing enabled.
> Second, store the inferred model in the dataset during the creation of
> the dataset.
> Finally, read the model back from the dataset in OWL_MEM mode, which
> results in an OntModel with the inferences, but without inferencing enabled.
> Right?
I'm slightly confused by your description, but one of these steps is not necessary.
To explain: inference in RDF models means using an algorithm to derive additional triples beyond those that are asserted in the original ('base') model. For example,

  C rdfs:subClassOf B
  B rdfs:subClassOf A

allows an algorithm to infer the additional triple:

  C rdfs:subClassOf A
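In Jena terms, a little sketch like this (untested, with made-up example.org URIs) shows the difference between the asserted triples and the inferred one:

import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDFS;

public class SubClassExample {
    public static void main(String[] args) {
        // Base model containing only the two asserted triples
        Model base = ModelFactory.createDefaultModel();
        Resource a = base.createResource("http://example.org/A");
        Resource b = base.createResource("http://example.org/B");
        Resource c = base.createResource("http://example.org/C");
        base.add(c, RDFS.subClassOf, b);
        base.add(b, RDFS.subClassOf, a);

        // Wrap the base model with an RDFS reasoner; C rdfs:subClassOf A is
        // derived on demand, it is never added to 'base' itself
        InfModel inf = ModelFactory.createRDFSModel(base);
        System.out.println(base.contains(c, RDFS.subClassOf, a)); // false - not asserted
        System.out.println(inf.contains(c, RDFS.subClassOf, a));  // true  - inferred
    }
}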
We call those additional triples the inference closure. Depending on the model, there can be quite a lot of them. When you construct an inference model, the inference closure is computed dynamically (basically; there is some caching, but we'll leave that to one side) in response to a query. So if you ask, via the API or SPARQL, "is C rdfs:subClassOf A?", the inference algorithm will do work - including pulling triples from the base model - to answer that question.

The upside of this is that (i) the inference algorithm only has to do work in response to questions that are asked (if no-one cares about C's relationship to A, that inferred triple need never be calculated), and (ii) the results will adapt to updates to the model. The downside is that every time a question is asked, the algorithm typically has to make a lot of queries against the base model to test for the presence or absence of asserted triples. That's OK for memory models, which are efficient to access, but for any model that is stored on disk that process is very slow.

There are two basic ways to get around this problem:
1. When your application starts, make an in-memory copy of the model that's stored on disk, and then query that copy via the inference engine (see the first sketch below). Pros: can be more responsive to updates, and does not require the entire inference closure to be calculated. Cons: the model may be too large to fit in memory, and persisting updates needs care.
2. Before your application starts, load the model from disk into an in-memory inference model, then save the entire inference closure to a new disk-based model (see the second sketch below). Then, when your application runs, run queries directly against that new model. Pros: you'll get results from the inference closure as well as the base model without having to run inference algorithms when queries are asked, and you can cope with larger models. Cons: needs more disk space, and updating the model is much harder.
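For option 1, the first sketch would look roughly like this (untested; the TDB location and the choice of OWL_MEM_MICRO_RULE_INF are just placeholders you'd adapt to your setup):

import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.query.*;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.tdb.TDBFactory;

public class QueryInMemoryWithInference {
    public static void main(String[] args) {
        // Copy the persistent model into memory at start-up
        Dataset dataset = TDBFactory.createDataset("/path/to/tdb");          // placeholder location
        Model inMemory = ModelFactory.createDefaultModel()
                                     .add(dataset.getDefaultModel());

        // Wrap the copy in an inferencing OntModel; the closure is computed
        // dynamically as queries are asked
        OntModel ont = ModelFactory.createOntologyModel(
                OntModelSpec.OWL_MEM_MICRO_RULE_INF, inMemory);

        String q = "SELECT ?c WHERE { ?c <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?a }";
        QueryExecution qexec = QueryExecutionFactory.create(q, ont);
        try {
            ResultSetFormatter.out(qexec.execSelect());
        } finally {
            qexec.close();
        }
    }
}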
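And for option 2, the second sketch is the one-off step that materialises the closure into a new persistent model (again untested; the input file name and TDB location are placeholders):

import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.util.FileManager;

public class MaterialiseClosure {
    public static void main(String[] args) {
        // Run once, before the application starts: compute the closure in memory ...
        Model base = FileManager.get().loadModel("file:ontology.owl");       // placeholder file
        OntModel inf = ModelFactory.createOntologyModel(
                OntModelSpec.OWL_MEM_MICRO_RULE_INF, base);

        // ... then copy base + inferred triples into a fresh persistent dataset.
        // Model.add(Model) enumerates the statements of 'inf', which include the
        // whole inference closure, so everything ends up on disk.
        Dataset dataset = TDBFactory.createDataset("/path/to/tdb-closure");  // placeholder location
        dataset.getDefaultModel().add(inf);
        dataset.close();
    }
}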
> If this is so, then yes, this can be useful to me. But I guess you missed a
> point in my question: I directly want to query the dataset (e.g.
> QueryExecution qexec = QueryExecutionFactory.create(query, dataset); )
> and not the OntModel read from the dataset.
See above. You have to make choices among different trade-offs, and those choices will depend on the needs of your application and your users.
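If you take option 2, then once the closure has been materialised you can query the dataset directly, exactly in the style you quoted, and still see the inferred triples - something like this (untested, same placeholder TDB location as above):

import com.hp.hpl.jena.query.*;
import com.hp.hpl.jena.tdb.TDBFactory;

public class QueryDatasetDirectly {
    public static void main(String[] args) {
        // No reasoner here: the inferred triples are already stored in the dataset
        Dataset dataset = TDBFactory.createDataset("/path/to/tdb-closure");  // placeholder location
        String q = "SELECT ?c WHERE { ?c <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?a }";
        QueryExecution qexec = QueryExecutionFactory.create(q, dataset);
        try {
            ResultSetFormatter.out(qexec.execSelect());
        } finally {
            qexec.close();
        }
    }
}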
Ian