Re: Distinct graphs

Robert Vesse Thu, 08 Mar 2012 14:00:56 -0800

I don't know much about the internals of TDB, but it may be that the two
queries are broadly equivalent in cost. To determine which graphs are in the
dataset, TDB may still have to do a full scan, because AFAIK it stores only
quads and does not necessarily keep any record of which named graphs are
present independent of the quads. Am I correct in this assumption, Andy?


If that is the case, the only reason my suggested query is faster is that it
doesn't have to store the ?s ?p ?o solutions that the first method generates.
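
To make that reasoning concrete, here is a rough Python sketch. It is only a
toy model, not how TDB actually works: Quad and both function names are
hypothetical, and I am assuming the first method was a query of the shape
GRAPH ?g { ?s ?p ?o }. Both functions walk every quad; the difference is only
in how much each one keeps. A real engine may well stream solutions rather
than hold them all.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Quad(NamedTuple):
    """One stored statement: graph name plus subject, predicate, object."""
    g: str
    s: str
    p: str
    o: str


def graphs_via_triple_pattern(quads: Iterable[Quad]) -> Tuple[Set[str], int, int]:
    """Assumed 'first method': bind ?g ?s ?p ?o, then project DISTINCT ?g.

    Returns (graph names, rows scanned, rows retained).
    """
    # Full scan, and every solution row is kept before projection.
    solutions = [(q.g, q.s, q.p, q.o) for q in quads]
    graphs = {g for g, _s, _p, _o in solutions}
    return graphs, len(solutions), len(solutions)


def graphs_via_empty_pattern(quads: Iterable[Quad]) -> Tuple[Set[str], int, int]:
    """Suggested query: still a full scan, but only graph names are kept.

    Returns (graph names, rows scanned, rows retained).
    """
    graphs: Set[str] = set()
    scanned = 0
    for q in quads:
        scanned += 1
        graphs.add(q.g)  # nothing else from the quad is retained
    return graphs, scanned, len(graphs)


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    data = [
        Quad("g1", "s1", "p", "o1"),
        Quad("g1", "s2", "p", "o2"),
        Quad("g2", "s3", "p", "o3"),
    ]
    print(graphs_via_triple_pattern(data))
    print(graphs_via_empty_pattern(data))
```

In this model both calls scan the same number of rows and return the same set
of graph names; only the retained count differs, which is the saving I am
guessing at above.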

Rob

On Mar 8, 2012, at 12:54 PM, Sarven Capadisli wrote:

> On 12-03-08 02:47 PM, Paolo Castagna wrote:
>> Rob Vesse wrote:
>>> Yes, one possibility that Andy and I raised in that discussion was the
>>> use of the following:
>>> 
>>> SELECT DISTINCT ?g WHERE { GRAPH ?g { } }
>>> 
>>> Since GRAPH ?g is defined as an iteration over all graphs in the dataset
>>> (which may of course be modified by the presence of FROM and FROM NAMED),
>>> and the empty graph pattern returns a single empty solution (i.e. always
>>> matches), on paper at least this query should do the same job and be
>>> much faster.  Whether this query works may vary depending on how
>>> accurately an engine implements the SPARQL spec, because the whole
>>> dataset/GRAPH interaction is one of the areas prone to ambiguities in
>>> the spec and to differences of opinion between implementers.
>> 
>> Indeed, the optimization might already be there... Sarven, could you try
>> to see if SELECT DISTINCT ?g { GRAPH ?g { } } gives you what you want,
>> faster?
> 
> First of all, that worked! It took about 10-15 minutes the first time I tried 
> it. I just ran it again.. and 30 minutes in, still waiting for a response. 
> Odd.
> 
> -Sarven
