I see, thank you for the detailed description. I’ll experiment a bit and try to come up with the best possible solution.
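
For reference, the separation discussed in the quoted thread below (curated triples in one store, inferred triples held only in memory, every update written through so new inferences appear) can be sketched as a toy in plain Java. This is deliberately not the Jena API: `DualStore`, `declareInverse`, `curated`, `inferred` and `expandedView` are hypothetical names, triples are plain string lists, and the only rule is a single inverse-property rule. In a real setup the curated set would presumably correspond to the TDB-backed model and the inferred set to whatever the in-memory reasoner derives.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in, NOT the Jena API: a triple is a (subject, predicate, object) list.
class DualStore {
    // Curated triples; stands in for the persistent, hand-maintained graph.
    private final Set<List<String>> curated = new LinkedHashSet<>();
    // Inferred triples; kept in memory only and never mixed into the curated set.
    private final Set<List<String>> inferred = new LinkedHashSet<>();
    // Hypothetical rule table mapping a predicate to its inverse predicate.
    private final Map<String, String> inverseOf = new HashMap<>();

    // Registers p and q as inverses and re-applies the rule to existing data.
    void declareInverse(String p, String q) {
        inverseOf.put(p, q);
        inverseOf.put(q, p);
        for (List<String> t : curated) {
            infer(t);
        }
    }

    // Adds a curated triple and eagerly (forward) derives its inverse.
    void add(String s, String p, String o) {
        List<String> t = List.of(s, p, o);
        if (curated.add(t)) {
            inferred.remove(t); // a triple stated explicitly is no longer "inferred"
            infer(t);
        }
    }

    // Applies the inverse rule to one triple, skipping anything already curated.
    private void infer(List<String> t) {
        String inv = inverseOf.get(t.get(1));
        if (inv == null) {
            return;
        }
        List<String> derived = List.of(t.get(2), inv, t.get(0));
        if (!curated.contains(derived)) {
            inferred.add(derived);
        }
    }

    Set<List<String>> curated() { return Set.copyOf(curated); }

    Set<List<String>> inferred() { return Set.copyOf(inferred); }

    // Expanded view: curated plus inferred triples, computed on request.
    Set<List<String>> expandedView() {
        Set<List<String>> all = new LinkedHashSet<>(curated);
        all.addAll(inferred);
        return all;
    }

    public static void main(String[] args) {
        DualStore store = new DualStore();
        store.declareInverse("hypernym", "hyponym");
        store.add("dog", "hypernym", "animal");
        System.out.println("curated:  " + store.curated());
        System.out.println("inferred: " + store.inferred());
    }
}
```

The point of the design is that dropping or recomputing the inferred set never touches the curated data, which is the property asked for in the thread. It says nothing about the start-up cost or scaling limits of a real reasoner.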

> On 9 Jul 2021, at 15:37, Dave Reynolds <[email protected]> wrote:
>
> On 08/07/2021 10:05, Simon Gray wrote:
>> So I have a follow-up question...
>>
>> What I really want is an updatable graph that persists on disk as TDB and
>> then an expanded view that contains all of the inferred triples too - this
>> may very well be an in-memory graph. Basically, I want to be able to add to
>> the underlying TDB graph and then rely on inference to create additional
>> triples, keeping a separation. I am not interested in persisting any
>> inferred triples to a new TDB like some of the replies here assume. To me,
>> the advantage of inference is having the flexibility of expanding the
>> dataset on-demand while having a separation between man-made, curated
>> triples and some varying set of inferred triples.
>>
>> Is this at all possible to do with Jena?
>
> Possible but not necessarily performant or convenient.
>
> A major limitation of the Jena inference support is that it is in-memory
> only. There's no mechanism to persist/reload the internal state of the
> inference engines; you can only query for the resulting materialized
> triples and persist those as discussed on this thread. And the inference
> scaling is limited to memory, whether or not the base data is held on
> scalable persistent storage.
>
> So you *can* create an inference model over a TDB model and updates made
> through the inference model will be persisted by the TDB base model, and
> also result in new inferences. However, the inference engine will be
> querying TDB for every query made by the rules. The performance of that
> will be much worse than the performance of a purely in-memory configuration.
> When you first start up your service the first query (or any explicit
> initial prepare() call) will be very slow.
> After that, once the forward inferences have been completed, performance
> should be better but still significantly slower than a purely in-memory
> solution.
>
> Depending on your application structure and scale of data you may be
> able to run with a dual in-memory-with-reasoning and
> copied-to-TDB-for-persistence architecture, where on start-up you copy
> the TDB data to the memory InfModel once and updates are written to both
> copies. That would still have high start-up latency but not as high.
>
> What you want to do is entirely reasonable but not well supported by
> Jena inference as it stands.
>
> Dave
>
>>> On 5 Jul 2021, at 10:38, Dave Reynolds <[email protected]> wrote:
>>>
>>> On 05/07/2021 08:03, Simon Gray wrote:
>>>> Thank you for that answer, Dave! I think this provides the missing link in
>>>> my understanding of the matter.
>>>> Is there a single method call to use when copying the inference model to a
>>>> plain model or do I need to make copies of every triple myself and add
>>>> them to a new model?
>>>
>>> Model.add does it for you, so you should just need something like:
>>>
>>>     plain.add( infModel );
>>>
>>> and it will enumerate all triples and add them to the new model.
>>> Potentially taking some time!
>>>
>>> Dave
>>>
>>>>> On 3 Jul 2021, at 18:34, Dave Reynolds <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On 02/07/2021 13:29, Simon Gray wrote:
>>>>>> Hmm… I am not sure how my rules are modeled. I just use the built-in
>>>>>> OWL_MEM_MICRO_RULE_INF OntModelSpec.
>>>>>> Anyway, my question is still this: how do I get all of those inferences
>>>>>> computed *before* I start querying the Model. It’s great if I can just
>>>>>> store them later, but I still need to *compute* them before I can think
>>>>>> about persisting anything. Running a single query doesn’t seem to
>>>>>> compute them all, just relevant ones to that specific query… I think?
>>>>>
>>>>> Short answer is there's no built-in way to precompute everything that's
>>>>> precomputable for the OWL reasoners other than that which the others have
>>>>> pointed out - copy the inferred model.
>>>>>
>>>>> The OWL rules use a mix of forward and backward reasoning. The forward
>>>>> reasoning can all be invoked in one go via prepare() but the backward
>>>>> reasoning is mostly done on demand. Some of the backward rules are
>>>>> tabled/memoized so once they've been run once future runs are supposed to
>>>>> be quicker. Others are always run on demand.
>>>>>
>>>>> If you have a few particular query patterns then to warm up the relevant
>>>>> memoization run those queries.
>>>>>
>>>>> The most comprehensive way to ensure everything has been computed is to
>>>>> copy the model to a plain model (in memory or persistent). That copy is
>>>>> essentially running the query (?s ?p ?o) and will compute everything the
>>>>> rules can reach. After that the inference model is as warm as it's going
>>>>> to get. But since at that point you've already materialized everything,
>>>>> you might as well keep the materialized copy, as the others have said.
>>>>>
>>>>> There'd be nothing to stop you doing the general query (e.g. via an
>>>>> unbounded listStatements() call) and throwing the results away. That
>>>>> *could* be beneficial if the materialized model is too big but the
>>>>> tabling/memoization is proving useful and smaller - but no guarantees.
>>>>>
>>>>> Dave
>>>>>
>>>>>>> On 2 Jul 2021, at 14:06, Lorenz Buehmann
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> But can't you do this inference just once and then somewhere store
>>>>>>> those inferences? Next time you can simply load the inferred model
>>>>>>> instead of the raw dataset.
>>>>>>> It is not specific to TDB: you can load
>>>>>>> dataset A, compute the inferred model in a slow process once,
>>>>>>> materialize it as dataset B, and later on always work on dataset B -
>>>>>>> this is standard forward chaining with writing the data back to disk or
>>>>>>> database. Can you try this procedure, maybe it works for you?
>>>>>>>
>>>>>>> Indeed this won't work if your rules are currently modeled as backward
>>>>>>> chaining rules as those are computed at query time always.
>>>>>>>
>>>>>>>
>>>>>>> On 02.07.21 13:37, Simon Gray wrote:
>>>>>>>> Thank you Lorenz, although this seems to be a reply to my side comment
>>>>>>>> about TDB rather than the question I had, right?
>>>>>>>>
>>>>>>>> The main issue right now is that I would like to use inferencing to
>>>>>>>> get e.g. inverse relations, but doing this is very slow the first time
>>>>>>>> a query is run, likely due to some preprocessing step that needs to
>>>>>>>> run first. I would like to run the preprocessing step in advance
>>>>>>>> rather than running it implicitly.
>>>>>>>>
>>>>>>>>> On 2 Jul 2021, at 13:30, Lorenz Buehmann
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> You can just add the inferred model to the dataset, i.e. add all
>>>>>>>>> triples to your TDB. Then you can disable the reasoner afterwards or
>>>>>>>>> just omit the rules that you do not need anymore.
>>>>>>>>>
>>>>>>>>> On 02.07.21 13:13, Simon Gray wrote:
>>>>>>>>>> Hi there,
>>>>>>>>>>
>>>>>>>>>> I’m using Apache Jena from Clojure to create a new home for the Danish
>>>>>>>>>> WordNet. I use the Arachne Aristotle library + some additional Java
>>>>>>>>>> interop code of my own.
>>>>>>>>>>
>>>>>>>>>> I would like to use OWL inferencing to query e.g. transitive or
>>>>>>>>>> inverse relations.
>>>>>>>>>> This does seem to work fine although I’ve only
>>>>>>>>>> tried using the supplied in-memory model for now (and it looks like
>>>>>>>>>> I will have to create my own instance of a ModelMaker to integrate
>>>>>>>>>> with TDB 1 or 2).
>>>>>>>>>>
>>>>>>>>>> However, the first query always seems to run really, really slow. Is
>>>>>>>>>> there any way to precompute inferred relations so that I don’t have
>>>>>>>>>> to wait? I’ve tried calling `rebind` and `prepare`, but they don’t
>>>>>>>>>> seem to do anything.
>>>>>>>>>>
>>>>>>>>>> Kind regards,
>>>>>>>>>>
>>>>>>>>>> Simon Gray
>>>>>>>>>> Research Officer
>>>>>>>>>> Centre for Language Technology, University of Copenhagen
