Hi all, I ran two simple, equivalent queries that show a big performance
difference on a large dataset:
1. First, matching two alternative predicates using the pipe operator:
SELECT (count(*) as ?total) WHERE {
  { ?s <http://someURI1> | <http://someURI2> ?o . }
}
This one is very slow, and the query plan shows the following matching
pattern:
(path ?subject (alt <http://someURI1> <http://someURI2> ) ?object)))))
2. If I use the UNION operator instead of the pipe, the query becomes fast:
SELECT (count(*) as ?total) WHERE {
  { ?s <http://someURI1> ?o . } UNION { ?s <http://someURI2> ?o . }
}
The query plan here is different and shows a UNION of two BGP matches:
(union (bgp (triple ?s <http://someURI1> ?o )) (bgp (triple ?s <http://someURI2> ?o ))))))
The documentation at
https://jena.apache.org/documentation/query/property_paths.html says:
1. "Paths are “simple” if they involve only operators / (sequence), ^
(reverse, unary or binary) and the form {n}, for some single integer n."
2. "A path is “complex” if it involves one or more of the operators
*,?, + and {}."
These statements do not define the implications of |. It should act like a
union, but the query plan is different. Is this a bug or a feature? Is there a
general recommendation to use UNION instead of the pipe?
Thanks for your help!