Re: Precomputing OWL inferences

Simon Gray Mon, 12 Jul 2021 03:46:44 -0700

I see, thank you for the detailed description. 

I’ll experiment a bit and try to come up with the best possible solution.
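
As a starting point, the dual setup described below (an in-memory InfModel seeded from TDB once at start-up, with updates written to both copies) could be sketched roughly as follows. The class name `DualStore` and its methods are hypothetical. The OWL Micro reasoner is an assumption, chosen because the thread mentions OWL_MEM_MICRO_RULE_INF. The sketch only touches the default graph of the dataset.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.reasoner.ReasonerRegistry;
import org.apache.jena.system.Txn;

/** Hypothetical helper: curated triples live in a Dataset, inference runs in memory. */
public class DualStore {
    // Persistent copy of the curated triples, e.g. TDB2Factory.connectDataset("some/dir").
    private final Dataset persistent;
    // In-memory copy of the same curated triples; the reasoner reads from this one.
    private final Model memoryBase = ModelFactory.createDefaultModel();
    // Expanded view: curated triples plus whatever the reasoner can derive.
    private final InfModel expanded;

    public DualStore(Dataset persistent) {
        this.persistent = persistent;
        // Copy the persisted default graph into memory once, inside a read transaction.
        Txn.executeRead(persistent, () -> memoryBase.add(persistent.getDefaultModel()));
        // Assumption: OWL Micro rule set; swap in another reasoner as needed.
        expanded = ModelFactory.createInfModel(ReasonerRegistry.getOWLMicroReasoner(), memoryBase);
        // Run the forward rules now rather than on the first query.
        expanded.prepare();
    }

    /** Writes a curated statement to both copies. */
    public void add(Statement s) {
        Txn.executeWrite(persistent, () -> persistent.getDefaultModel().add(s));
        // Adding through the InfModel updates the in-memory base and the inferences.
        expanded.add(s);
    }

    /** The expanded (curated + inferred) view; inferred triples are never persisted. */
    public InfModel expanded() {
        return expanded;
    }
}
```

Queries would go against `expanded()`, while the dataset passed to the constructor only ever receives the curated statements. The backward rules are still evaluated on demand, so the first queries may remain slow even after `prepare()`.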


> On 9 Jul 2021, at 15:37, Dave Reynolds <[email protected]> wrote:
> 
> On 08/07/2021 10:05, Simon Gray wrote:
>> So I have a follow-up question...
>> 
>> What I really want is an updatable graph that persists on disk as TDB and 
>> then an expanded view that contains all of the inferred triples too - this 
>> may very well be an in-memory graph. Basically, I want to be able to add to 
>> the underlying TDB graph and then rely on inference to create additional 
>> triples, keeping a separation. I am not interested in persisting any 
>> inferred triples to a new TDB like some of the replies here assume. To me, 
>> the advantage of inference is having the flexibility of expanding the 
>> dataset on-demand while having a separation between man-made, curated 
>> triples and some varying set of inferred triples.
>> 
>> Is this at all possible to do with Jena?
> 
> Possible but not necessarily performant or convenient.
> 
> A major limitation of the Jena inference support is that it is in-memory
> only. There's no mechanism to persist/reload the internal state of the
> inference engines, you can only query for the resulting materialized
> triples and persist those as discussed on this thread. And the inference
> scaling is limited to memory, whether or not the base data is held on
> scalable persistent storage.
> 
> So you *can* create an inference model over a TDB model and updates made
> through the inference model will be persisted by the TDB base model, and
> also result in new inferences. However, the inference engine will be
> querying TDB for every query made by the rules. The performance of that
> will be much worse than performance of a purely in-memory configuration.
> When you first start up your service the first query (or any explicit
> initial prepare() call) will be very slow. After that, once the forward
> inferences have been completed, performance should be better but still
> significantly slower than a purely in-memory solution.
> 
> Depending on your application structure and scale of data you may be
> able to run with a dual in-memory-with-reasoning and
> copied-to-TDB-for-persistence architecture, where on start-up you copy
> the TDB data to the in-memory InfModel once and updates are written to
> both copies. That would still have high start-up latency, but not as high.
> 
> What you want to do is entirely reasonable but not well supported by
> Jena inference as it stands.
> 
> Dave
> 
>>> On 5 Jul 2021, at 10:38, Dave Reynolds <[email protected]> wrote:
>>> 
>>> On 05/07/2021 08:03, Simon Gray wrote:
>>>> Thank you for that answer, Dave! I think this provides the missing link in 
>>>> my understanding of the matter.
>>>> Is there a single method call to use when copying the inference model to a 
>>>> plain model or do I need to make copies of every triple myself and add 
>>>> them to a new model?
>>> 
>>> Model.add does it for you, so you should just need something like:
>>> 
>>>    plain.add( infModel );
>>> 
>>> and it will enumerate all triples and add them to the new model. 
>>> Potentially taking some time!
>>> 
>>> Dave
>>> 
>>>>> On 3 Jul 2021, at 18:34, Dave Reynolds 
>>>>> <[email protected]> wrote:
>>>>> 
>>>>> 
>>>>> On 02/07/2021 13:29, Simon Gray wrote:
>>>>>> Hmm… I am not sure how my rules are modeled. I just use the built-in 
>>>>>> OWL_MEM_MICRO_RULE_INF OntModelSpec.
>>>>>> Anyway, my question is still this: how do I get all of those inferences 
>>>>>> computed *before* I start querying the Model. It’s great if I can just 
>>>>>> store them later, but I still need to *compute* them before I can think 
>>>>>> about persisting anything. Running a single query doesn’t seem to 
>>>>>> compute them all, just relevant ones to that specific query… I think?
>>>>> 
>>>>> Short answer is there's no built in way to precompute everything that's 
>>>>> precomputable for the OWL reasoners other than that which the others have 
>>>>> pointed out - copy the inferred model.
>>>>> 
>>>>> The OWL rules use a mix of forward and backward reasoning. The forward 
>>>>> reasoning can all be invoked in one go via prepare() but the backward 
>>>>> reasoning is mostly done on demand. Some of the backward rules are 
>>>>> tabled/memoized so once they've been run once future runs are supposed to 
>>>>> be quicker. Others are always run on demand.
>>>>> 
>>>>> If you have a few particular query patterns then to warm up the relevant 
>>>>> memoization run those queries.
>>>>> 
>>>>> The most comprehensive way to ensure everything has been computed is to 
>>>>> copy the model to a plain model (in memory or persistent). That copy is 
>>>>> essentially running the query (?s ?p ?o) and will compute everything the 
>>>>> rules can reach. After that the inference model is as warm as it's going 
>>>>> to get. But since at that point you've already materialized everything, 
>>>>> you might as well keep the materialized copy, as the others have said.
>>>>> 
>>>>> There'd be nothing to stop you doing the general query (e.g. via an 
>>>>> unbounded listStatements() call) and throwing the results away. That *could* be 
>>>>> beneficial if the materialized model is too big but the 
>>>>> tabling/memoization is proving useful and smaller - but no guarantees.
>>>>> 
>>>>> Dave
>>>>> 
>>>>>>> On 2 Jul 2021, at 14:06, Lorenz Buehmann 
>>>>>>> <[email protected]> wrote:
>>>>>>> 
>>>>>>> But can't you do this inference just once and then somewhere store 
>>>>>>> those inferences? Next time you can simply load the inferred model 
>>>>>>> instead of the raw dataset. It is not specific to TDB, you can load 
>>>>>>> dataset A, compute the inferred model in a slow process once, 
>>>>>>> materialize it as dataset B, and later on always work on dataset B - 
>>>>>>> this is standard forward chaining with writing the data back to disk or 
>>>>>>> database. Can you try this procedure, maybe it works for you?
>>>>>>> 
>>>>>>> Indeed this won't work if your rules are currently modeled as backward 
>>>>>>> chaining rules as those are computed at query time always.
>>>>>>> 
>>>>>>> 
>>>>>>> On 02.07.21 13:37, Simon Gray wrote:
>>>>>>>> Thank you Lorenz, although this seems to be a reply to my side comment 
>>>>>>>> about TDB rather than the question I had, right?
>>>>>>>> 
>>>>>>>> The main issue right now is that I would like to use inferencing to 
>>>>>>>> get e.g. inverse relations, but doing this is very slow the first time 
>>>>>>>> a query is run, likely due to some preprocessing step that needs to 
>>>>>>>> run first. I would like to run the preprocessing step in advance 
>>>>>>>> rather than running it implicitly.
>>>>>>>> 
>>>>>>>>> On 2 Jul 2021, at 13:30, Lorenz Buehmann 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> You can just add the inferred model to the dataset, i.e. add all 
>>>>>>>>> triples to your TDB. Then you can disable the reasoner afterwards or 
>>>>>>>>> just omit the rules that you do not need anymore.
>>>>>>>>> 
>>>>>>>>> On 02.07.21 13:13, Simon Gray wrote:
>>>>>>>>>> Hi there,
>>>>>>>>>> 
>>>>>>>>>> I’m using Apache Jena from Clojure to create a new home for the Danish 
>>>>>>>>>> WordNet. I use the Arachne Aristotle library + some additional Java 
>>>>>>>>>> interop code of my own.
>>>>>>>>>> 
>>>>>>>>>> I would like to use OWL inferencing to query e.g. transitive or 
>>>>>>>>>> inverse relations. This does seem to work fine although I’ve only 
>>>>>>>>>> tried using the supplied in-memory model for now (and it looks like 
>>>>>>>>>> I will have to create my own instance of a ModelMaker to integrate 
>>>>>>>>>> with TDB 1 or 2).
>>>>>>>>>> 
>>>>>>>>>> However, the first query always seems to run really, really slow. Is 
>>>>>>>>>> there any way to precompute inferred relations so that I don’t have 
>>>>>>>>>> to wait? I’ve tried calling `rebind` and `prepare`, but they don’t 
>>>>>>>>>> seem to do anything.
>>>>>>>>>> 
>>>>>>>>>> Kind regards,
>>>>>>>>>> 
>>>>>>>>>> Simon Gray
>>>>>>>>>> Research Officer
>>>>>>>>>> Centre for Language Technology, University of Copenhagen
>>>>>>>>>> 
>>>>>>>> 
>>
