[
https://issues.apache.org/jira/browse/JENA-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138251#comment-17138251
]
Rob Vesse edited comment on JENA-1918 at 6/17/20, 9:53 AM:
-----------------------------------------------------------
I just ran the same query, but without the OPTIONAL joins:
{code:java}
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
SELECT ?item
WHERE {
?item wdt:P31/wdt:P279* wd:Q23397.
}
ORDER BY ?item LIMIT 1 OFFSET 0
{code}
It has been running for more than an hour, so the joins are not the reason for
the bad performance.
was (Author: yolpsoftware):
I just ran the same query, but without the OPTIONAL joins:
{code:java}
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
SELECT ?item
WHERE {
?item wdt:P31/wdt:P279* wd:Q23397.
}
ORDER BY ?item LIMIT 1 OFFSET 0
{code}
It has been running for more than an hour, so the joins are not the reason for
the bad performance.
> Bad performance when using ORDER BY
> -----------------------------------
>
> Key: JENA-1918
> URL: https://issues.apache.org/jira/browse/JENA-1918
> Project: Apache Jena
> Issue Type: Bug
> Components: Jena
> Affects Versions: Jena 3.15.0
> Reporter: Jonas Sourlier
> Priority: Major
>
> I want to execute the following SPARQL query against my local Apache Jena
> instance (with a Wikidata dump preloaded into TDB2):
> {code:java}
> PREFIX wd: <http://www.wikidata.org/entity/>
> PREFIX wdt: <http://www.wikidata.org/prop/direct/>
> PREFIX wikibase: <http://wikiba.se/ontology#>
> PREFIX p: <http://www.wikidata.org/prop/>
> PREFIX ps: <http://www.wikidata.org/prop/statement/>
> PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
> SELECT ?item ?outflow ?drainageBasin ?coordinates ?elevation ?country
>
> WHERE {
> ?item wdt:P31/wdt:P279* wd:Q23397.
>
> OPTIONAL { ?item wdt:P201 ?outflow. }
> OPTIONAL { ?item wdt:P4614 ?drainageBasin. }
> OPTIONAL { ?item wdt:P625 ?coordinates. }
> OPTIONAL { ?item wdt:P2044 ?elevation. }
> OPTIONAL { ?item wdt:P17 ?country. }
> }
>
> ORDER BY ?item LIMIT 1 OFFSET 0
> {code}
> When run on query.wikidata.org (which uses Blazegraph), this query takes 26
> seconds to complete. Other queries take about the same time on my local
> Jena as on query.wikidata.org.
> For this query, Apache Jena runs for several hours, using one CPU core and
> 3-4 GB of memory. Then it hits a timeout (the timeout could be increased,
> but that's not the issue here).
> My question is: why is this so much slower than Blazegraph? Can this SPARQL
> query be optimized for better performance? Can the query optimizer be
> tweaked to run it more efficiently?
> If not, then I consider this a bug, because the query itself should not
> generate such a heavy workload. If the query optimizer evaluated the
> {code:java}
> wdt:P31/wdt:P279*{code}
> property path first and then limited the result via the
> {code:java}
> ORDER BY ?item LIMIT 1 OFFSET 0{code}
> clause, there would be just one item for which it needs to execute the
> {code:java}
> OPTIONAL { ?item ... }{code}
> joins.
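
The evaluation order described above can be written out explicitly with a SPARQL 1.1 subquery. This is only a sketch and has not been run against Jena or TDB2. It assumes that returning every row for the single selected item is acceptable: the original query applies LIMIT 1 after the OPTIONAL joins and so returns at most one row, whereas this form may return several rows if the item has multiple values for a property. Note also that the comment above reports the path pattern with ORDER BY being slow even without the OPTIONAL joins, so this rewrite may well not improve the running time.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?item ?outflow ?drainageBasin ?coordinates ?elevation ?country
WHERE {
  # Inner query: evaluate the property path, sort, and keep a single item.
  {
    SELECT ?item
    WHERE { ?item wdt:P31/wdt:P279* wd:Q23397 . }
    ORDER BY ?item
    LIMIT 1
  }
  # Outer query: the OPTIONAL joins now apply to that one item only.
  OPTIONAL { ?item wdt:P201 ?outflow . }
  OPTIONAL { ?item wdt:P4614 ?drainageBasin . }
  OPTIONAL { ?item wdt:P625 ?coordinates . }
  OPTIONAL { ?item wdt:P2044 ?elevation . }
  OPTIONAL { ?item wdt:P17 ?country . }
}
```

The subquery still has to enumerate and sort every match of the path before LIMIT can discard the rest, which is the part the comment above identifies as slow.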
--
This message was sent by Atlassian Jira
(v8.3.4#803005)