-----Original Message-----
From: Andy Seaborne <[email protected]>
Sent: Friday, 6 September 2024 16:40
To: [email protected]
Subject: Re: rdfdiff shows graphs are unequal, but does not list the differences
On 06/09/2024 13:32, John Walker wrote:
Hi Andy,
-----Original Message-----
From: Andy Seaborne <[email protected]>
Sent: Friday, 6 September 2024 10:54
To: [email protected]
Subject: Re: rdfdiff shows graphs are unequal, but does not list the differences
On 05/09/2024 19:12, John Walker wrote:
Hi,
I am working on a project where we cleanse/normalize some RDF data, and I
am using the rdfdiff utility to compare the input and output.
The utility tells me the models are unequal, but it does not list any
statements.
Through a process of trial and error, I could isolate a couple of literals
that are changed, but it’s unclear why the utility does not detect them.
* "1756"^^xsd:int --> "1756"^^xsd:integer
* "2024-03-13T12:52:06.227Z"^^xsd:dateTime -->
"2024-03-13T12:52:06.227000+00:00"^^xsd:dateTime
See attached minimal examples.
$ rdfdiff original.ttl modified.ttl TTL TTL
models are unequal
Does Jena normalize literals when parsing the input files?
Not unless you configure the parser to do that or it goes into TDB.
Are the literal values different, or not?
xsd:dateTime: They are different RDF terms, but they represent the same value.
Am I correct to say the variant with "Z" time zone is the canonical lexical
representation?
Yes.
https://www.w3.org/TR/xmlschema11-2/#f-tzCanFragMap
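As a rough illustration of "different terms, same value" for the two dateTime literals from the original question, here is a minimal sketch in standard-library Python. It is not Jena's API or implementation; the helper name `parse_xsd_datetime` is made up for this example, and it assumes both lexical forms carry a time zone.

```python
from datetime import datetime


def parse_xsd_datetime(lexical: str) -> datetime:
    """Parse a timezone-qualified xsd:dateTime lexical form (simplified sketch)."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit zero offset first.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    return datetime.fromisoformat(lexical)


original = "2024-03-13T12:52:06.227Z"
modified = "2024-03-13T12:52:06.227000+00:00"

# Term comparison: the lexical forms differ character by character.
same_lexical_form = original == modified

# Value comparison: both lexical forms denote the same instant.
same_value = parse_xsd_datetime(original) == parse_xsd_datetime(modified)

print(same_lexical_form, same_value)
```

The two strings differ as terms, yet parse to the same point in time.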
xsd:int/xsd:integer:
TDB1 blurs the difference, TDB2 retains the datatype.
Reading SPARQL 1.1 recommendation, is it correct to say:
"1756"^^xsd:int = "1756"^^xsd:integer produces a type error
That is not a type error. "=" compares values and they have the same value
space (numbers) so they can be compared.
https://www.w3.org/TR/sparql11-query/#OperatorMapping
sameTerm("1756"^^xsd:int, "1756"^^xsd:integer) = false
Correct.
There can be multiple terms for the same value.
also false --
sameTerm("+1756"^^xsd:integer, "1756"^^xsd:integer)
sameTerm("01756"^^xsd:integer, "1756"^^xsd:integer)
This is because RDF 1.1 Concepts and Abstract Syntax literal term equality
requires the lexical forms and the datatype IRIs to compare equal, character
by character.
Also, RDF is not dependent on XSD datatypes. They are suggested but
there is no requirement to handle XSD. There is in SPARQL for a limited set of
datatypes and, pragmatically, many triple stores support a lot more than the
minimum.
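The difference between sameTerm and "=" for the integer literals above can be sketched in plain Python. This is an illustration only, not how Jena or any SPARQL engine implements it: the `TypedLiteral` class and both function names are hypothetical, language tags are ignored, and only xsd:int and xsd:integer are handled.

```python
from typing import NamedTuple

XSD = "http://www.w3.org/2001/XMLSchema#"
# Only the two datatypes from this thread; a real engine supports many more.
INTEGER_TYPES = {XSD + "int", XSD + "integer"}


class TypedLiteral(NamedTuple):
    lexical: str
    datatype: str


def same_term(a: TypedLiteral, b: TypedLiteral) -> bool:
    # Term equality: lexical form and datatype IRI must match exactly.
    return a.lexical == b.lexical and a.datatype == b.datatype


def value_equal(a: TypedLiteral, b: TypedLiteral) -> bool:
    # Value equality for the integer datatypes above: compare the numbers.
    if a.datatype in INTEGER_TYPES and b.datatype in INTEGER_TYPES:
        return int(a.lexical) == int(b.lexical)
    raise TypeError("datatype not handled in this sketch")


as_int = TypedLiteral("1756", XSD + "int")
as_integer = TypedLiteral("1756", XSD + "integer")
with_plus = TypedLiteral("+1756", XSD + "integer")
with_zero = TypedLiteral("01756", XSD + "integer")

for other in (as_int, with_plus, with_zero):
    print(other.lexical, same_term(other, as_integer), value_equal(other, as_integer))
```

All three comparisons against "1756"^^xsd:integer are value-equal, while none of them is the same term.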
I'm a bit puzzled by the example for
"2004-12-31T19:00:00-05:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> and
xsd:dateTime("2005-01-01T00:00:00Z").
Those are not character by character equal, but RDFterm-equal returns true.
They are not term-equal (sameTerm).
They are value-equal (the same point on the timeline).
RDFTerm-Equal is a fallback. Two terms are value-equal if the terms are
the sameTerm regardless of understanding the datatype (datatypes are a
function).
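The fallback behaviour can be sketched as a small dispatch function in plain Python. This is a simplified illustration under stated assumptions, not Jena's code: `TypedLiteral`, `parse_datetime` and `sparql_equals` are hypothetical names, xsd:dateTime is the only "understood" datatype here, dateTimes are assumed to carry a time zone, and the `http://example.org/unknownType` IRI is invented for the example.

```python
from datetime import datetime
from typing import NamedTuple

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


class TypedLiteral(NamedTuple):
    lexical: str
    datatype: str


def parse_datetime(lexical: str) -> datetime:
    # Rewrite a trailing "Z" so that Python versions before 3.11 can parse it.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    return datetime.fromisoformat(lexical)


def sparql_equals(a: TypedLiteral, b: TypedLiteral) -> bool:
    # Understood datatype: compare the values (points on the timeline).
    if a.datatype == XSD_DATETIME and b.datatype == XSD_DATETIME:
        return parse_datetime(a.lexical) == parse_datetime(b.lexical)
    # Fallback: without understanding the datatype, only identical terms
    # are known to be equal.
    if a == b:
        return True
    # Two different literals of an unknown datatype: equality is undecidable.
    raise TypeError("literals are not the same term and cannot be compared by value")


new_york = TypedLiteral("2004-12-31T19:00:00-05:00", XSD_DATETIME)
utc = TypedLiteral("2005-01-01T00:00:00Z", XSD_DATETIME)
print(new_york == utc, sparql_equals(new_york, utc))

unknown_a = TypedLiteral("abc", "http://example.org/unknownType")
unknown_b = TypedLiteral("abd", "http://example.org/unknownType")
print(sparql_equals(unknown_a, unknown_a))
try:
    sparql_equals(unknown_a, unknown_b)
    fallback_error = False
except TypeError:
    fallback_error = True
```

The two dateTime literals are different terms but compare equal by value; for the unknown datatype, only the identical term is reported equal and the other comparison raises an error rather than answering false.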