Re: puzzling performance issue

2021-10-19 Thread Rob Vesse
Because it isn't a valid semantics-preserving optimization. ARQ only applies optimizations that preserve the semantics of the query. The fact that this is ultimately a CONSTRUCT query doesn't change the semantics of the query evaluation itself, merely the final RDF produced. Making the rewrite

Re: puzzling performance issue

2021-10-19 Thread Élie Roux
> As others pointed out, semantically the evaluation of the two query forms yields very different intermediate results. It's only the presence of the post-processing CONSTRUCT stage that happens to strip out the duplicates/unusable results. Any optimizations MUST preserve the overall s

Re: puzzling performance issue

2021-10-19 Thread Rob Vesse
Not really. This pattern of unconnected BGPs has legitimate use cases. A common one is doing similarity calculations where you use unconnected BGPs to create every possible combination of results and then use BIND and/or FILTER to compute some metric and use that to filter/rank the combinations
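The similarity use case described above can be pictured with a minimal Python sketch. It is not ARQ code: the `items` data, the `jaccard` helper and the choice of Jaccard overlap as the metric are all assumptions made for illustration. `product` plays the role of the unconnected BGPs (every combination), the metric stands in for BIND, and the threshold for FILTER.

```python
from itertools import product

# Hypothetical data: each item has a set of features.
items = {"a": {"x", "y"}, "b": {"y", "z"}, "c": {"q"}}

def jaccard(left, right):
    """Overlap metric used here purely as an example of a BIND expression."""
    return len(left & right) / len(left | right)

# Cross product of the two "unconnected" sides, keeping each unordered pair once.
pairs = [
    (m, n, jaccard(items[m], items[n]))
    for m, n in product(items, items)
    if m < n
]

# FILTER out pairs with no overlap, then rank by the metric (highest first).
ranked = sorted((p for p in pairs if p[2] > 0), key=lambda p: -p[2])
```

Here the cross product is the point of the query, which is why an engine cannot simply assume that unconnected patterns are a mistake.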

Re: puzzling performance issue

2021-10-07 Thread Élie Roux
> Overall, it is whether the WHERE answer is 16*26*2636 rows (all one BGP) or 16+26+2636 rows (union).
Yes, I understand better now, thanks! Do you think there might be some optimization at some point for that case? I suspect this is very common in SPARQL queries out there... Best, -- Elie
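For scale, the two row counts quoted above can be worked out directly. This is just the arithmetic from the quoted figures (16, 26 and 2636 matches for the three sub-patterns); the variable names are made up for the example.

```python
import math

# Number of matches for each of the three sub-patterns, as quoted in the thread.
sizes = [16, 26, 2636]

join_rows = math.prod(sizes)   # all patterns in one group: rows multiply
union_rows = sum(sizes)        # patterns in separate UNION branches: rows add

print(join_rows, union_rows)   # 1096576 2678
```

So the single-group form produces roughly 400 times as many intermediate rows as the UNION form before the CONSTRUCT template is ever applied.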

Re: puzzling performance issue

2021-10-07 Thread Andy Seaborne
On 07/10/2021 12:30, Élie Roux wrote:
if you take this expression
WHERE {
  {
    bdr:MW23703_1183 ?instp ?insto . # 200ms alone
  } union {
    bdr:MW23703_1183 :hasTitle ?t . ?t ?tp ?to . #245ms alone
  } union {
    bdr:MW23703_1183 :partOf+ ?ancestor . ?ancestor :hasPart ?ancestorPart . #

Re: puzzling performance issue

2021-10-07 Thread Andy Seaborne
When there are different parts of the pattern going to make up different parts of the CONSTRUCT template, splitting it up into UNION makes sense. It is using the fact that in a CONSTRUCT template, if variables are unbound, the triple pattern isn't instantiated but the rest of the triples from the

Re: puzzling performance issue

2021-10-07 Thread Élie Roux
> if you take this expression
>
> WHERE
> {
> {
> bdr:MW23703_1183 ?instp ?insto . # 200ms alone
> } union {
> bdr:MW23703_1183 :hasTitle ?t . ?t ?tp ?to . #245ms alone
> } union {
> bdr:MW23703_1183 :partOf+ ?ancestor . ?ancestor :hasPart ?ancestorPart . # 200ms alone
> }
> }
>
> se

Re: puzzling performance issue

2021-10-07 Thread Élie Roux
Thanks a lot for your very informative answer, Richard, it's really helpful to know when writing queries! It seems this is a case where some optimizations might be implemented? (I'm afraid this isn't something I could contribute though, sorry) Best, -- Elie

Re: puzzling performance issue

2021-10-07 Thread Richard Cyganiak
Queries of the form CONSTRUCT {...} WHERE {...} are evaluated with a three-stage pipeline. First, the query SELECT * WHERE {...} is executed. Second, the CONSTRUCT template is applied to each result row (producing no triple for any triple pattern that has a variable without value in t
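The pipeline described above can be mimicked with a small Python sketch. This is a toy model, not how ARQ is implemented: the row and template representations, the function names and the sample data are all hypothetical. The last step (collecting triples into a set so that duplicates collapse) is an assumption based on an RDF graph being a set of triples, since the snippet above is cut off before its third stage.

```python
from itertools import product

def instantiate(template, row):
    """Yield the template triples whose variables are all bound in `row`."""
    for triple in template:
        terms = []
        for term in triple:
            if term.startswith("?"):
                if term not in row:
                    break              # unbound variable: skip this triple only
                terms.append(row[term])
            else:
                terms.append(term)
        else:
            yield tuple(terms)

def construct(template, rows):
    """Apply the template to every result row; the set drops duplicates."""
    graph = set()
    for row in rows:
        graph.update(instantiate(template, row))
    return graph

# Hypothetical WHERE results for two unconnected patterns.
a_rows = [{"?a": "a1"}, {"?a": "a2"}]
b_rows = [{"?b": f"b{i}"} for i in range(3)]
template = [("s", "p", "?a"), ("s", "q", "?b")]

joined_rows = [{**a, **b} for a, b in product(a_rows, b_rows)]  # one group
union_rows = a_rows + b_rows                                    # UNION branches

same_graph = construct(template, joined_rows) == construct(template, union_rows)
```

In this toy case both forms end up with the same five triples, but the single-group form pushes six rows through the template while the UNION form pushes five; with the sizes seen in the thread the gap is far larger.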

Re: puzzling performance issue

2021-10-06 Thread Élie Roux
After long hours of anxiety, I discovered that using unions as in
CONSTRUCT {
  bdr:MW23703_1183 ?instp ?insto .
  ?t ?tp ?to .
  ?ancestor :hasPart ?ancestorPart .
} WHERE {
  {
    bdr:MW23703_1183 ?instp ?insto . # 200ms alone
  } union {
    bdr:MW23703_1183 :hasTitle ?t . ?t ?tp ?to . #2

puzzling performance issue

2021-10-06 Thread Élie Roux
Dear all, I'm experiencing a performance issue that I can't understand... I'm using:
- Jena 3.14.0, Fuseki (I'm testing in the web interface)
- TDB1
- none.opt
- this configuration: https://github.com/buda-base/buda-base/blob/master/conf/fuseki/ttl.erb (with some variable substitutions)
- the rel