-----Original Message-----
From: Barry Bishop [mailto:[email protected]]
Sent: Wednesday, 05 September 2012 19:49
To: Polleres, Axel
Cc: [email protected]
Subject: Re: Querying only the default graph from the data store
Hello Axel,
Thanks for taking the time to reply. I realise this thread is
somewhat out of place given the status/progress of the WG.
Your reply does address my initial post. It does not resolve
it, but this is perhaps not the time. However, for the
purpose of clarity I will make further comments inline:
On 05/09/12 04:11, Polleres, Axel wrote:
Hi Barry,
This is in response to
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2012Aug/0011.html
The working draft does not specify how the RDF dataset is
constructed
when no FROM and FROM NAMED clauses are present in the
SPARQL query.
Implementations are therefore able to construct the dataset
differently, e.g.
a. dataset default graph contains only the data store's
default graph
b. dataset default graph contains the RDF merge of all
graphs in the
data store
It is correct that how the concrete default dataset of a
SPARQL endpoint is constructed is left open to
implementations. Since different endpoints and
implementations support different behaviours in this regard
(e.g. in some implementations the default graph of the
default dataset is the union of all named graphs whereas in
others this is not the case), the working group does not feel
that there is a unique standard behavior to be advocated this
time around.
I feel this is a shame, as two different implementations can
produce different output from the simplest of queries, e.g.
SELECT * { ?s ?p ?o }
However, this is a separate issue.
As soon as a single FROM or FROM NAMED clause is used then
the data
store's default graph is excluded from the query's dataset.
Which means that there is no portable way to define a
SPARQL query so
that it executes only against the default graph in the
data store -
or even against a combination of the default graph and one or more
named graphs.
Please note that a) querying the default graph in the
datastore is the standard behavior when no explicit FROM or
FROM NAMED clauses are given. b) the combination of querying
named graphs and the default graph of the endpoint's default
dataset is supported via GRAPH graph patterns.
a) This is rather inconsistent. Above you say that the
construction of the default RDF dataset (when no FROM/FROM
NAMED clauses are given) is not defined, but here you say
constructing it using the default graph only is the 'standard
behaviour'. One of the motivations for this post is that
there are good reasons not to have only the default graph in
the 'default dataset', e.g. you wouldn't be able to do this
to find out the graph names when presented with an unknown endpoint:
SELECT DISTINCT ?g WHERE { GRAPH ?g {?s ?p ?o } }
Anyway, the point here is that there is no *portable* way to
query just the default graph.
b) yes, but you can't query the RDF merge of the default
graph and a named graph in the same way as you can with two named
graphs, e.g. FROM ex:g1 FROM ex:g2. Instead one would need to
use a triple and graph pattern union, which for complex
queries becomes cumbersome. Put another way, any combination
of named graphs can be merged and explored with query triple
patterns, but this can't be done with any combination of
named graphs and the default graph.
See also examples below.
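To illustrate point (b), here is a rough sketch of the two cases. The foaf vocabulary, the graph IRIs and the queries themselves are assumptions made up for this example. The second query only behaves as described on endpoints whose default dataset exposes the data store's default graph as its default graph and g1 as a named graph.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Query 1: two named graphs merged into the query's default graph.
# Plain triple patterns are enough, and a join may span both graphs.
SELECT ?name ?mbox
FROM <http://example.com#g1>
FROM <http://example.com#g2>
WHERE {
  ?person foaf:name ?name .
  ?person foaf:mbox ?mbox .
}
```

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Query 2: default graph plus g1, written without any FROM clause.
# Each triple pattern needs its own UNION so that a join can still span both graphs.
SELECT ?name ?mbox
WHERE {
  { ?person foaf:name ?name }
  UNION
  { GRAPH <http://example.com#g1> { ?person foaf:name ?name } }

  { ?person foaf:mbox ?mbox }
  UNION
  { GRAPH <http://example.com#g1> { ?person foaf:mbox ?mbox } }
}
```

Even then the rewrite only approximates an RDF merge: a triple present in both graphs yields duplicate solutions in the second query, and every additional triple pattern adds another UNION.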
This is a problem that often confuses users of RDF data
stores and is
likely to lead to implementations that provide their own specific
means to achieve this, e.g.
http://www.openrdf.org/issues/browse/SES-850
Inspired by the update language's use of the 'DEFAULT' keyword for
graph manipulation, I suggest an extension to the query
language that
allows "FROM DEFAULT" to be used, e.g.
SELECT *
FROM DEFAULT
WHERE { ..... }
=> dataset contains a default graph made up of the data store's
default graph only
Please note that this is the standard behaviour when no FROM clause is
given, i.e. this corresponds to
SELECT *
WHERE { ..... } <--- (no use of GRAPH keyword)
I don't think this is "standard behaviour", rather it is
common behaviour. It cannot be standard when the
construction of the dataset is implementation dependent when
no FROM clause is given.
This construct can be used with any number of FROM <uri>
or FROM NAMED
<uri> clauses, e.g.
SELECT *
FROM DEFAULT
FROM <http://example.com#g1>
WHERE { ..... }
=> dataset contains a default graph made up of the data
store's default
graph merged with the contents of the data store's g1 graph
This would be a fairly trivial change for existing SPARQL
processor
implementations, but would provide a big improvement in
functionality/flexibility by allowing a data store's
default graph to be
used/queried/merged in the same way as any of its named graphs.
Note that similar to the example above, you can query the
default graph and named graphs within the default dataset in
a data store side by side by using GRAPH graph patterns, i.e.
SELECT *
WHERE
{
..... <-- (no use of GRAPH) matches the default graph
GRAPH <http://ex.com#g1> { .... } <-- matches named graph g1 (assuming g1 is a named graph in the default dataset)
}
Consider an application that needs to execute queries over various
subsets of a database's contents, where the subsets are defined using
various combinations of named graphs. It would certainly be useful to
have standard queries which only required the appropriate
"FROM g1 FROM
g2 etc" prepended. This is easy to do, unless one of the
graphs is the
default graph.
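As a sketch of the pattern described above, the same query body can be scoped to a subset of the data by prepending FROM clauses. The graph IRIs and the foaf:name property are placeholders chosen for illustration only.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Subset made of two named graphs: only the FROM lines vary between subsets,
# the SELECT and WHERE parts stay the same.
SELECT ?person ?name
FROM <http://example.com#g1>
FROM <http://example.com#g2>
WHERE { ?person foaf:name ?name }
```

The FROM clause in SPARQL 1.1 only accepts a graph IRI, so there is no portable line to prepend when the data store's default graph is part of the subset.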
Finally, note that it is not possible in SPARQL 1.1 to
construct a *new* dataset composed of *parts* of the default
dataset of an endpoint plus possible external graphs; such a
feature is not currently among the features addressed in
this round of SPARQL, but has been suggested before [1].
The features being worked on in this round of
standardization have been decided in a voting process at the
beginning of the WG and are documented in the following
document: http://www.w3.org/TR/sparql-features/
Additionally, a list of work items and features postponed
to a future working group is being collected by the group on
a dedicated wiki page [2], which also contains the features
discussed at the beginning of the WG that have not been
considered for this round [3].
Yes, I will be more timely next time and will endeavour to
progress this
topic in the proper way. My apologies for the 'noise'.
Regards,
barry
Among this list, the feature "Composite Datasets" [1] might
partially capture what you have in mind and a future WG might
possibly work out the details of such a feature.
We'd kindly ask you to confirm by a reply to this list that
this addresses your comment.
Axel Polleres, on behalf of the SPARQL WG
1. http://www.w3.org/2009/sparql/wiki/Feature:CompositeDatasets
2. http://www.w3.org/2009/sparql/wiki/Future_Work_Items
3. http://www.w3.org/2009/sparql/wiki/Category:Features