Re: Different query results from ARQ and Fuseki

Andy Seaborne Mon, 08 Apr 2013 05:39:08 -0700

On 07/04/13 21:20, Bob DuCharme wrote:

 >How are you running Fuseki?


I used "File upload" on the http://localhost:3030/sparql.tpl screen to
load a ttl file of the data into Fuseki  and then pasted the query into
the "SPARQL Query" field at the top of that form.


Into a TDB dataset, presumably (e.g. --loc=DB)?

Would it be correct to say, as an extremely abbreviated version of the
explanation, that the answer sets are different because ARQ is using
some Jena inferencing derived from RDF-MT, and Fuseki isn't? (I noticed
that Sesame only returns d:item2b, like Fuseki.)

It's ARQ in both cases - there is only one query engine [*] and it's ARQ
both times.


"Extremely abbreviated version"

Jena in-memory storage tests the values of strings. The same value gives
a query hit, even if it is written as "foo"^^xsd:string in one place and
"foo" in another.


Goes away in RDF 1.1.

Are there any other inferencing rules in http://www.w3.org/TR/rdf-mt/
that Jena implements besides xsd 1a and xsd 1b?


Nothing else at the base level.

To get more inference, you need to use an inferencing graph.

TDB does special stuff with integers and decimals (and all their derived
types). It does value testing and stores the value, losing the details
of the data. And it goes faster because of that.


"+1"^^xsd:integer matches "001"^^xsd:short
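
A rough way to picture that value-based matching is the toy Python model below. It is only a sketch under stated assumptions: the stored_form helper, the "integer-value" tag and the short list of integer datatypes are invented for illustration and are not TDB's actual storage format or API.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative subset of xsd:integer and its derived types (assumption:
# the real list handled by TDB is longer).
INTEGER_TYPES = {XSD + name for name in ("integer", "long", "int", "short", "byte")}

def stored_form(lexical, datatype):
    """Hypothetical storage step: keep only the value for integer types."""
    if datatype in INTEGER_TYPES:
        # The lexical form ("+1", "001") and the derived type are dropped.
        return ("integer-value", int(lexical))
    # Everything else is kept term-for-term.
    return (lexical, datatype)

a = stored_form("+1", XSD + "integer")
b = stored_form("001", XSD + "short")
c = stored_form("1", XSD + "string")

print(a == b)  # True: both are stored as the value 1
print(a == c)  # False: "1"^^xsd:string is not an integer
```

Once only the value is kept, the original spelling cannot be recovered from the store, which is the "losing the details" trade-off mentioned above.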

        Andy


Thanks,

Bob

[*] Not true but unrelated - there are two: a "reference" one and the
normal one. The reference engine is very naive, and blindly and simply
executes the query as given - hopefully simple enough that you can look
at the code and be reasonably certain it gets the right answers. It is
used to compare to the normal engine.



On 4/7/2013 5:29 AM, Andy Seaborne wrote:

On 06/04/13 20:59, Bob DuCharme wrote:

With the following data,

   @prefix d:   <http://learningsparql.com/ns/data#> .
   @prefix dm:  <http://learningsparql.com/ns/demo#> .
   @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
   @prefix mt: <http://learningsparql.com/ns/mytypesystem#> .

   d:item2a dm:prop "two" .
   d:item2b dm:prop "two"^^xsd:string .
   d:item2c dm:prop "two"^^mt:potrzebies .
   d:item2d dm:prop "two"@en .

I run this query,

   PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
   PREFIX d: <http://learningsparql.com/ns/data#>

   SELECT ?s
   WHERE { ?s ?p "two"^^xsd:string . }

and Jena 2.7.4 ARQ gives me this result,

   ------------
   | s        |
   ============
   | d:item2b |
   | d:item2a |
   ------------

but running the same query against the same data with Fuseki 0.2.6 gives
me this:

   ------------
   | s        |
   ============
   | d:item2b |
   ------------

Why is this? According to current W3C Recommendations (as opposed to
future plans for RDF 1.1), Fuseki is more correct, right?


Hi Bob,

Not quite - it depends on which recommendations you read :-)
specifically, how much of RDF Model Theory (RDF-MT) you include.

How are you running Fuseki?

Jena memory models provide the inference that simple literals and
xsd:strings are the same value.  These are RDF-MT rules xsd 1a and xsd 1b:

== xsd 1a
      uuu aaa "sss".       ->   uuu aaa "sss"^^xsd:string .
== xsd 1b
     uuu aaa "sss"^^xsd:string .   ->   uuu aaa "sss".

so if you include that spec, you get two rows because "two" matches
"two" and "two"^^xsd:string.
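
To make the effect of the two rules concrete, here is a small Python sketch that applies xsd 1a and xsd 1b to the four triples from the question and then matches the query's object term. It is a toy model, not how Jena is implemented: literals are plain (lexical, datatype, lang) tuples, and the helper names lit, apply_xsd_1a_1b and subjects_matching are made up for this illustration.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
POTRZEBIES = "http://learningsparql.com/ns/mytypesystem#potrzebies"

def lit(lexical, datatype=None, lang=None):
    """Toy literal: a simple literal has neither datatype nor language tag."""
    return (lexical, datatype, lang)

def apply_xsd_1a_1b(triples):
    """Return the triples plus everything rules xsd 1a and xsd 1b add."""
    closed = set(triples)
    for s, p, (lex, dt, lang) in triples:
        if dt is None and lang is None:
            closed.add((s, p, lit(lex, XSD_STRING)))  # xsd 1a
        elif dt == XSD_STRING:
            closed.add((s, p, lit(lex)))              # xsd 1b
    return closed

def subjects_matching(triples, obj):
    """Subjects of triples whose object is exactly the given term."""
    return sorted({s for s, _p, o in triples if o == obj})

data = {
    ("d:item2a", "dm:prop", lit("two")),
    ("d:item2b", "dm:prop", lit("two", XSD_STRING)),
    ("d:item2c", "dm:prop", lit("two", POTRZEBIES)),
    ("d:item2d", "dm:prop", lit("two", lang="en")),
}
query_object = lit("two", XSD_STRING)

without_rules = subjects_matching(data, query_object)
with_rules = subjects_matching(apply_xsd_1a_1b(data), query_object)

print(without_rules)  # ['d:item2b']
print(with_rules)     # ['d:item2a', 'd:item2b']
```

Term-for-term matching gives the one-row answer, and matching after the two rules gives the two-row answer; d:item2c and d:item2d are untouched either way.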


But the memory models also keep those terms apart so that

:x :p "foo" .
:x :p "foo"^^xsd:string .

is stored as two triples.

Storage models just keep those two triples apart.  It would be costly
to also index on value at scale.  TDB, whether in-memory or on-disk,
treats the "" and ""^^xsd:string forms separately.  So Fuseki/TDB
isn't including RDF MT.

TDB could have done the translation of ^^xsd:string to simple literals
(cf. treatment of integers in TDB) on loading but that does risk
being inconvenient for ontology storage where the use of xsd:string is
common and they don't treat simple literals as xsd:string.

Future:

In RDF 1.1, this changes.  All literals have datatypes. For @lang ones
it's rdf:langString (SPARQL already includes this).

A simple literal (no language tag, no datatype) becomes surface syntax
for xsd:string.  So at parse time,

"foo" ==> "foo"^^xsd:string

and

:x :p "foo" .
:x :p "foo"^^xsd:string .

is the same triple - and hence one triple in the graph.
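
The parse-time rewrite can be sketched in a few lines of Python. This is only an illustration of the idea, assuming literals are modelled as (lexical, datatype, lang) tuples; normalize_rdf11 is a hypothetical helper, not a Jena function.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

def normalize_rdf11(lexical, datatype=None, lang=None):
    """Give every literal a datatype, as RDF 1.1 does."""
    if lang is not None:
        # Language-tagged literals get rdf:langString.
        return (lexical, RDF_LANGSTRING, lang)
    if datatype is None:
        # A simple literal is surface syntax for xsd:string.
        return (lexical, XSD_STRING, None)
    return (lexical, datatype, None)

parsed = [
    (":x", ":p", normalize_rdf11("foo")),
    (":x", ":p", normalize_rdf11("foo", XSD_STRING)),
]
graph = set(parsed)  # a graph is a set of triples

print(len(graph))  # 1
```

Because both spellings become the same term before they reach the graph, no storage layer needs a separate value index to treat them alike.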

You will then get

   ------------
   | s        |
   ============
   | d:item2b |
   | d:item2a |
   ------------

on all storage types (and printing xsd:strings will be the un-^^ form).

While a small change in the grand scheme of things, it's going to be
an interesting one.

    Andy


Thanks,

Bob
