On 30/08/17 15:10, baran...@gmail.com wrote:

PS: I wonder why Dave doesn't comment in this thread. Perhaps because he thinks Lorenz is handling it fine, or that he cannot stand the low level of knowledge of the users in this thread, or that, no matter what you do, an app with InfModel would hang anyway on heavy data input? Lorenz is of course fine, but I guess Jena users are also very curious about Dave's comments...

I didn't comment on this thread because, as Andy has already pointed out, this seems to be a repeat of a recent similar thread (on which I did comment). That in turn was a near repeat of another similar thread. All from the same group.

Also I think Lorenz has covered it all, with admirable patience.

However, in an attempt to clarify the trade-offs in more depth, maybe the following would be helpful:

When comparing a rule system against a set of SPARQL Update queries there are several factors that affect the trade-offs, including (1) the specific nature of the rules/queries, (2) the data flow, and (3) preferences on syntax and machinery.

1a. For a single (forward) Jena rule you can always achieve the same with a SPARQL Update. For a single set of input data, SPARQL has the benefit of being a standard [1] and of offering better performance over a store like TDB. Conversely, SPARQL is much richer than Jena rules, so there are things that you could achieve with a single SPARQL Update query using, say, property paths that would require multiple Jena rules.

1b. If you have a set of rules, but they don't create loops/recursion, then you can "stratify" them into groups of rules that can be run one after the other. In that case, for a single set of input data, you can again implement it as a sequence of SPARQL Updates with similar benefits.

1c. If your rules can't be stratified, i.e. one rule can indirectly trigger itself, then it's more complex. In that case you would have to e.g. run the set of SPARQL Updates repeatedly until nothing new is deduced. Depending on the specifics of the rules and the data, that may be quite expensive and you would be better off with something like Jena rules. However, in some cases you may be able to use things like SPARQL property paths to achieve the desired effect without having to recurse.
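
To make the difference between 1b and 1c concrete, here is a minimal Python sketch. It does not use Jena or SPARQL at all: triples are plain tuples and each "rule" is a function standing in for one SPARQL Update. All names (run_once, run_to_fixpoint, the two rules) are made up for illustration.

```python
# Triples are (subject, predicate, object) tuples; a "rule" is a function
# that maps the current set of triples to the triples it can derive.

def parent_to_ancestor(triples):
    # Non-recursive: (?a parent ?b) -> (?a ancestor ?b)
    return {(s, "ancestor", o) for (s, p, o) in triples if p == "parent"}

def ancestor_transitive(triples):
    # Recursive: (?a ancestor ?b), (?b ancestor ?c) -> (?a ancestor ?c)
    anc = [(s, o) for (s, p, o) in triples if p == "ancestor"]
    return {(a, "ancestor", c) for (a, b) in anc for (b2, c) in anc if b == b2}

def run_once(triples, rules):
    """One pass in order, like a fixed sequence of updates (case 1b)."""
    result = set(triples)
    for rule in rules:
        result |= rule(result)
    return result

def run_to_fixpoint(triples, rules):
    """Repeat the whole sequence until nothing new is deduced (case 1c)."""
    result = set(triples)
    passes = 0
    while True:
        passes += 1
        updated = run_once(result, rules)
        if updated == result:
            return result, passes
        result = updated

data = {("a", "parent", "b"), ("b", "parent", "c"),
        ("c", "parent", "d"), ("d", "parent", "e")}
rules = [parent_to_ancestor, ancestor_transitive]

single = run_once(data, rules)                  # misses the longer chains
closed, passes = run_to_fixpoint(data, rules)   # complete, but needs extra passes
```

A single pass is enough for the non-recursive rule, but the transitive rule only reaches ("a", "ancestor", "e") when the sequence is re-run, and the final pass does a full evaluation just to discover that nothing changed. That wasted work is the cost referred to in 1c.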

2. If you have a single data set and just want to run your rules on it then the above applies. If you are repeatedly adding new data and want to keep your deductions up to date, then the Jena forward rules engine has the advantage that it keeps all the partial matches around. So adding one more triple may cause a rule to fire without the engine having to search for all the other triples in the body. This is also why "recursive" rules work relatively efficiently.

This doesn't apply if you delete data. In that case Jena rules have to start over and can't reuse state across data deletions.
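
The idea of keeping partial matches around can be sketched in a few lines of Python. This is a toy illustration of the principle for one hard-coded two-pattern rule, not a description of how Jena's engine is implemented; the class name, the predicates and the rebuild method are all invented for the example.

```python
from collections import defaultdict

class IncrementalJoinRule:
    """(?x worksFor ?o), (?o locatedIn ?c) -> (?x basedIn ?c), kept current on add."""

    def __init__(self):
        self._reset()

    def _reset(self):
        self.people_by_org = defaultdict(set)  # partial matches of the first pattern
        self.cities_by_org = defaultdict(set)  # partial matches of the second pattern
        self.deductions = set()

    def add(self, s, p, o):
        """Add one triple and return only the deductions it newly causes."""
        if p == "worksFor":
            self.people_by_org[o].add(s)
            new = {(s, "basedIn", city) for city in self.cities_by_org[o]}
        elif p == "locatedIn":
            self.cities_by_org[s].add(o)
            new = {(person, "basedIn", o) for person in self.people_by_org[s]}
        else:
            new = set()
        new -= self.deductions
        self.deductions |= new
        return new

    def rebuild(self, triples):
        """After a deletion this sketch throws its state away and replays the data."""
        self._reset()
        for triple in triples:
            self.add(*triple)
        return self.deductions

rule = IncrementalJoinRule()
rule.add("alice", "worksFor", "acme")   # nothing fires yet, but the match is remembered
rule.add("bob", "worksFor", "acme")
fired = rule.add("acme", "locatedIn", "Bristol")  # joins against remembered matches only
```

Each add only looks up the index for the other half of the rule body, so its cost depends on the number of matching partial results rather than on the size of the whole data set. Deletion has no such shortcut here, hence the replay in rebuild.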

If you keep changing your data but very rarely ask questions of it, and then only limited questions, then Jena backward rules have advantages. The backward engine will only run the rules needed for the specific query. If that's a lot fewer than the overall rules then that should be cheaper than running a full forward deduction using SPARQL Updates. In this situation it may be possible to achieve the same effects through SPARQL query (not update) by query rewriting, but that's a whole different ball game.
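
As a rough picture of the goal-directed behaviour, the following Python sketch evaluates a rule only when a query asks for the predicate in its head. It is a deliberately naive stand-in: the function names and data are hypothetical, and there is no loop detection, so a rule that queries its own head would recurse forever here.

```python
def make_backward_engine(facts, rules):
    """rules maps a derived predicate to a function that computes it on demand."""
    fired = []  # records which rules actually ran, for illustration only

    def query(predicate):
        results = {t for t in facts if t[1] == predicate}
        if predicate in rules:
            fired.append(predicate)
            results |= rules[predicate](query)
        return results

    return query, fired

def grandparent_rule(query):
    # (?a parent ?b), (?b parent ?c) -> (?a grandparent ?c)
    parents = query("parent")
    return {(a, "grandparent", c)
            for (a, _, b) in parents for (b2, _, c) in parents if b == b2}

def sibling_rule(query):
    # (?p parent ?a), (?p parent ?b), ?a != ?b -> (?a sibling ?b)
    parents = query("parent")
    return {(a, "sibling", b)
            for (p, _, a) in parents for (p2, _, b) in parents
            if p == p2 and a != b}

facts = {("ann", "parent", "bob"), ("bob", "parent", "cat"), ("bob", "parent", "dan")}
query, fired = make_backward_engine(
    facts, {"grandparent": grandparent_rule, "sibling": sibling_rule})

grandparents = query("grandparent")  # the sibling rule is never evaluated
```

Nothing is computed when the data changes, and a query for grandparents never touches the sibling rule. With many rules and few, narrow queries, that is where the saving comes from.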

3. With Jena rules you have some prebuilt machinery for running the rules (InfGraphs and all that) and some support for externalizing the rules in separate files. With SPARQL you have to create all that yourself (though it's easy), but you get a nicer syntax.

So fundamentally, like all "X vs Y" questions, it depends on the specifics of what you are trying to do.
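
To give a feel for how little machinery the SPARQL route needs, here is a hedged Python sketch that keeps each update in its own file and runs them in file-name order. The ".ru" extension, the directory layout and the execute callback are assumptions for the example; execute stands for whatever sends one update string to your store.

```python
from pathlib import Path

def load_updates(directory):
    """Return (name, text) pairs for every *.ru file, sorted by file name."""
    return [(path.name, path.read_text(encoding="utf-8"))
            for path in sorted(Path(directory).glob("*.ru"))]

def run_updates(directory, execute):
    """Run each update in order; `execute` is a caller-supplied function
    (hypothetical here) that applies one SPARQL Update string to the store."""
    names = []
    for name, text in load_updates(directory):
        execute(text)
        names.append(name)
    return names
```

Naming the files with a numeric prefix (01-..., 02-...) gives you the stratified ordering from 1b for free, and wrapping run_updates in a repeat-until-no-change loop covers 1c.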

Dave

[1] There is a standard for rules, RIF, but it is not aimed particularly at RDF processing and post-dates Jena rules.
