Something like this could work, if there are no cycles:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?class where {
<http://example.com/concept> rdfs:subClassOf* ?mid .
?mid rdfs:subClassOf* ?class .
}
group by ?class
order by count(?mid)
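To make the ordering visible, a variant of the query above can project the count as well. This is only a sketch: the variable name ?depth is my own choice, and it assumes an acyclic hierarchy. With single inheritance the count grows by one per step away from the concept; with multiple inheritance it is the number of classes lying between the concept and ?class, not a path length.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# For every ancestor ?class of the concept (the concept itself included,
# because '*' also matches the zero-length path), count the nodes ?mid
# that sit between the concept and ?class.
SELECT ?class (COUNT(DISTINCT ?mid) AS ?depth)   # ?depth is an illustrative name
WHERE {
  <http://example.com/concept> rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
}
GROUP BY ?class
ORDER BY ?depth   # nearest ancestors first, the root last
```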
On 01.02.2016 16:29, Andy Seaborne wrote:
On 01/02/16 13:05, Joël Kuiper wrote:
Concretely I’m using this to get the path to root for a specific
concept, e.g.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?parent
WHERE {
GRAPH ?g {
<http://example.com/concept> rdfs:subClassOf+ ?parent .
}}
Would return all the (distinct) intermediates
So this is a rather different query to the first example!
The first one did not return ?concept but was searching over all
?concept which suggested to me that it was the cause of the expense.
Also, rdfs:subClassOf* and rdfs:subClassOf+ are different.
The * form also matches zero-length paths, so
?anything <randomProperty>* ?anything
matches every node in the graph, and it looked like there was a huge amount of work going on.
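To see the difference between the two operators on a single concept, something along these lines can be run. It is a sketch only; the ?form labels "star" and "plus" are made up for illustration.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Run both path forms from the same start node and tag each row
# with the form that produced it.
SELECT ?form ?parent
WHERE {
  {
    # zero or more steps: also returns the concept itself
    <http://example.com/concept> rdfs:subClassOf* ?parent .
    BIND("star" AS ?form)
  }
  UNION
  {
    # one or more steps: returns the concept itself only if a cycle leads back to it
    <http://example.com/concept> rdfs:subClassOf+ ?parent .
    BIND("plus" AS ?form)
  }
}
```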
DISTINCT isn't needed if ?g is just one graph.
What do you use named graphs for in your data?
Can you put the data online somewhere?
Andy
On 01 Feb 2016, at 13:49, Andy Seaborne <a...@apache.org> wrote:
On 01/02/16 12:11, Joël Kuiper wrote:
Hey all,
I’m trying to run a query to find the path to the root concept of a
graph.
The entries are defined as rdfs:subClassOf
Currently I’m using
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?parent ?label
WHERE {
GRAPH ?g {
?concept rdfs:subClassOf* ?parent
}}
However, on a moderately sized data set (< 1M triples), this query
sometimes takes /minutes/.
I suspect it has to do with the disk-based TDB (since I hear my HDD
spin a lot), but still.
Is there a way to optimise this query, maybe by using a different
reasoner? And if so, how would that reasoner be used?
Thanks in advance,
Joël
If you are using a reasoner and doing rdfs:subClassOf* then that is
doing excessive redundant work.
(Otherwise, with TDB it materialises more nodes than it needs to.)
The root is a node without a parent so this might help, without a
reasoner:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?top
WHERE {
GRAPH ?g {
?concept rdfs:subClassOf ?top
FILTER NOT EXISTS { ?top rdfs:subClassOf ?x }
}
}
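If only the root(s) above one specific concept are wanted, the same "no parent" test can be combined with a path from that concept. This is a sketch under assumptions, not a tested answer for this data set: it assumes no reasoner is adding reflexive rdfs:subClassOf triples, and it uses + rather than * so that graphs in which the concept has no parent do not report the concept as its own root.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Walk upwards from one concept and keep only the ancestors
# that have no parent of their own within the same graph.
SELECT DISTINCT ?top
WHERE {
  GRAPH ?g {
    <http://example.com/concept> rdfs:subClassOf+ ?top .
    FILTER NOT EXISTS { ?top rdfs:subClassOf ?x }
  }
}
```

With multiple inheritance this can return more than one ?top.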