On 06/04/13 20:59, Bob DuCharme wrote:
With the following data,

   @prefix d:   <http://learningsparql.com/ns/data#> .
   @prefix dm:  <http://learningsparql.com/ns/demo#> .
   @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
   @prefix mt:  <http://learningsparql.com/ns/mytypesystem#> .

   d:item2a dm:prop "two" .
   d:item2b dm:prop "two"^^xsd:string .
   d:item2c dm:prop "two"^^mt:potrzebies .
   d:item2d dm:prop "two"@en .

I run this query,

   PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
   PREFIX d: <http://learningsparql.com/ns/data#>

   SELECT ?s
   WHERE { ?s ?p "two"^^xsd:string . }

and Jena 2.7.4 ARQ gives me this result,

   ------------
   | s        |
   ============
   | d:item2b |
   | d:item2a |
   ------------

but running the same query against the same data with Fuseki 0.2.6 gives
me this:

   ------------
   | s        |
   ============
   | d:item2b |
   ------------

Why is this? According to current W3C Recommendations (as opposed to
future plans for RDF 1.1), Fuseki is more correct, right?

Hi Bob,

Not quite - it depends on which recommendations you read :-) Specifically, it depends on how much of the RDF Model Theory (RDF-MT) you include.

How are you running Fuseki?

Jena memory models provide the inference that simple literals and xsd:strings are the same value. These are RDF-MT rules xsd 1a and xsd 1b:

== xsd 1a
        uuu aaa "sss".          ->   uuu aaa "sss"^^xsd:string .
== xsd 1b
        uuu aaa "sss"^^xsd:string .   ->   uuu aaa "sss".

So if you include that spec, you get two rows, because the "two"^^xsd:string in the query matches both "two" and "two"^^xsd:string in the data.
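As a rough illustration of rules xsd 1a and xsd 1b (not how Jena implements them), here is a small Python sketch. The tuple representation of literals and the helper names `literal`, `xsd_1a_1b` and `match_object` are made up for this example.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
MT_POTRZEBIES = "http://learningsparql.com/ns/mytypesystem#potrzebies"

def literal(lexical, datatype=None, lang=None):
    # Hypothetical representation: a literal is (lexical form, datatype, language tag).
    return (lexical, datatype, lang)

def xsd_1a_1b(triples):
    # Add the triples entailed by RDF-MT rules xsd 1a and xsd 1b.
    closed = set(triples)
    for s, p, (lexical, datatype, lang) in triples:
        if datatype is None and lang is None:
            # xsd 1a: simple literal -> xsd:string
            closed.add((s, p, (lexical, XSD_STRING, None)))
        elif datatype == XSD_STRING:
            # xsd 1b: xsd:string -> simple literal
            closed.add((s, p, (lexical, None, None)))
    return closed

def match_object(triples, obj):
    # Term-based matching: the object must be exactly the same term.
    return sorted(s for s, p, o in triples if o == obj)

data = {
    ("d:item2a", "dm:prop", literal("two")),
    ("d:item2b", "dm:prop", literal("two", XSD_STRING)),
    ("d:item2c", "dm:prop", literal("two", MT_POTRZEBIES)),
    ("d:item2d", "dm:prop", literal("two", lang="en")),
}

target = literal("two", XSD_STRING)
print(match_object(data, target))             # ['d:item2b']
print(match_object(xsd_1a_1b(data), target))  # ['d:item2a', 'd:item2b']
```

Without the rules only d:item2b matches. With them, d:item2a matches as well.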


But the memory models also keep those terms apart so that

:x :p "foo" .
:x :p "foo"^^xsd:string .

is stored as two triples.

Storage models just keep those two triples apart. It would be costly to also index on value at scale. TDB, whether in-memory or on-disk, treats the "" and ""^^xsd:string forms separately. So Fuseki/TDB isn't applying those RDF-MT rules.

TDB could have translated ^^xsd:string to simple literals on loading (cf. its treatment of integers), but that risks being inconvenient for ontology storage, where the use of xsd:string is common and simple literals are not treated as xsd:string.

Future:

In RDF 1.1, this changes. All literals have datatypes. For @lang ones it's rdf:langString (SPARQL already includes this).

A simple literal (no language tag, no datatype) becomes surface syntax for xsd:string. So at parse time,

"foo" ==> "foo"^^xsd:string

and

:x :p "foo" .
:x :p "foo"^^xsd:string .

is the same triple - and hence one triple in the graph.

You will then get

   ------------
   | s        |
   ============
   | d:item2b |
   | d:item2a |
   ------------

on all storage types (and printing xsd:strings will be the un-^^ form).
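The RDF 1.1 parse-time mapping can be sketched in Python like this. It is only an illustration under assumed names (`normalize_rdf11` and the tuple layout are hypothetical), not code from Jena or any parser.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

def normalize_rdf11(lexical, datatype=None, lang=None):
    # Return a literal as (lexical form, datatype, language tag), RDF 1.1 style:
    # every literal ends up with a datatype.
    if lang is not None:
        # Language-tagged literals get rdf:langString.
        return (lexical, RDF_LANGSTRING, lang)
    if datatype is None:
        # A simple literal is surface syntax for xsd:string.
        return (lexical, XSD_STRING, None)
    return (lexical, datatype, None)

# :x :p "foo" .  and  :x :p "foo"^^xsd:string .  become the same triple,
# so the set (the graph) holds only one.
graph = {
    (":x", ":p", normalize_rdf11("foo")),
    (":x", ":p", normalize_rdf11("foo", XSD_STRING)),
}
print(len(graph))  # 1
```

Because the mapping happens before anything is stored, term-based matching alone then finds both d:item2a and d:item2b, whatever the storage type.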

While a small change in the grand scheme of things, it's going to be an interesting one.

        Andy


Thanks,

Bob

