Hi Andy

> -----Original Message-----
> From: Andy Seaborne <a...@apache.org>
> Sent: Monday, 9 September 2024 22:07
> To: users@jena.apache.org
> Subject: Re: rdfdiff shows graphs are unequal, but does not list the 
> differences
> 
> John,
> 
> So they differ only in the object and the terms are "same value" but not "same
> term"?

Yes, exactly.
I made a gist with the examples:
https://gist.github.com/jaw111/f4a9fb64904c442e36d64fea327e8a81

So, you would need to diff a_10.ttl with b_10.ttl and a_2.ttl with b_2.ttl to 
see the different behaviour.
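
To make the "same value" versus "same term" distinction concrete, here is a minimal Python sketch. It is not Jena's API: the names same_term, same_value and value_of are hypothetical, a literal is modelled as a plain (lexical form, datatype IRI) pair, and only xsd:int, xsd:integer and xsd:dateTime are handled.

```python
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"
INTEGER_TYPES = {XSD + "int", XSD + "integer"}  # assumption: only these two numeric types


def same_term(a, b):
    """Term equality: lexical form and datatype IRI must both match exactly."""
    return a == b


def value_of(literal):
    """Map a (lexical, datatype) pair to a Python value (hypothetical helper)."""
    lexical, datatype = literal
    if datatype in INTEGER_TYPES:
        return int(lexical)
    if datatype == XSD + "dateTime":
        # Older Python versions do not accept a trailing "Z", so rewrite it.
        text = lexical[:-1] + "+00:00" if lexical.endswith("Z") else lexical
        return datetime.fromisoformat(text)
    raise TypeError("unsupported datatype: " + datatype)


def same_value(a, b):
    """Value equality: identical terms are equal, otherwise compare parsed values."""
    return same_term(a, b) or value_of(a) == value_of(b)


int_lit = ("1756", XSD + "int")
integer_lit = ("1756", XSD + "integer")
dt_z = ("2024-03-13T12:52:06.227Z", XSD + "dateTime")
dt_offset = ("2024-03-13T12:52:06.227000+00:00", XSD + "dateTime")

print(same_term(int_lit, integer_lit), same_value(int_lit, integer_lit))  # False True
print(same_term(dt_z, dt_offset), same_value(dt_z, dt_offset))            # False True
```

Under these assumptions, both literal pairs from my original mail are different terms with the same value, which is why a term-based diff should list them.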

John

> 
> Seems to work with the current Jena development codebase, but not in Jena
> 5.1.0.
> 
> "Fixed in the next release".
> 
>      Andy
> 
> On 09/09/2024 10:40, John Walker wrote:
> > Hi Andy,
> >
> > Thanks for the quick reply!
> >
> >
> >> -----Original Message-----
> >> From: Andy Seaborne <a...@apache.org>
> >> Sent: Friday, 6 September 2024 16:40
> >> To: users@jena.apache.org
> >> Subject: Re: rdfdiff shows graphs are unequal, but does not list the
> >> differences
> >>
> >>
> >>
> >> On 06/09/2024 13:32, John Walker wrote:
> >>> Hi Andy,
> >>>
> >>>> -----Original Message-----
> >>>> From: Andy Seaborne <a...@apache.org>
> >>>> Sent: Friday, 6 September 2024 10:54
> >>>> To: users@jena.apache.org
> >>>> Subject: Re: rdfdiff shows graphs are unequal, but does not list
> >>>> the differences
> >>>>
> >>>>
> >>>>
> >>>> On 05/09/2024 19:12, John Walker wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I am working on a project where we cleanse/normalize some RDF data,
> >>>>> and I am using the rdfdiff utility to compare the input and output.
> >>>>> The utility tells me the models are unequal, but it does not list any
> >>>>> statements.
> >>>>>
> >>>>> Through a process of trial and error, I could isolate a couple of
> >>>>> literals that are changed, but it’s unclear why the utility does not
> >>>>> detect them.
> >>>>> * "1756"^^xsd:int --> "1756"^^xsd:integer
> >>>>> * "2024-03-13T12:52:06.227Z"^^xsd:dateTime -->
> >>>>> "2024-03-13T12:52:06.227000+00:00"^^xsd:dateTime
> >>>>>
> >>>>> See attached minimal examples.
> >>>>>
> >>>>> $ rdfdiff original.ttl modified.ttl TTL TTL
> >>>>> models are unequal
> >>>>>
> >>>>> Does Jena normalize literals when parsing the input files?
> >>>>
> >>>> Not unless you configure the parser to do that or it goes into TDB.
> >>>>
> >>>>> Are the literal values different, or not?
> >>>>
> >>>> xsd:dateTime: They are different RDF terms, they represent the same
> >>>> value.
> >>>
> >>> Am I correct to say the variant with "Z" time zone is the canonical
> >>> lexical representation?
> >>
> >> Yes.
> >> https://www.w3.org/TR/xmlschema11-2/#f-tzCanFragMap
> >>
> >>>> xsd:int/xsd:integer:
> >>>> TDB1 blurs the difference, TDB2 retains the datatype.
> >>>
> >>> Reading SPARQL 1.1 recommendation, is it correct to say:
> >>>
> >>> "1756"^^xsd:int = "1756"^^xsd:integer produces a type error
> >>
> >> That is not a type error. "=" compares values and they have the same
> >> value space (numbers) so they can be compared.
> >>
> >> https://www.w3.org/TR/sparql11-query/#OperatorMapping
> >>
> >>> sameTerm("1756"^^xsd:int, "1756"^^xsd:integer) = false
> >>
> >> Correct.
> >>
> >> There can be multiple terms for the same value.
> >>
> >> also false --
> >>
> >> sameTerm("+1756"^^xsd:integer, "1756"^^xsd:integer)
> >> sameTerm("01756"^^xsd:integer, "1756"^^xsd:integer)
> >>
> >>
> >>> This is because RDF 1.1 Concepts and Abstract Syntax literal term
> >>> equality requires the datatype IRIs to compare equal, character by
> >>> character.
> >>
> >> and also RDF is not dependent on XSD datatypes. They are suggested
> >> but there is no requirement to handle XSD. There is in SPARQL for a
> >> limited set of datatypes and pragmatically, many triple stores support
> >> a lot more than the minimum.
> >>
> >>>
> >>> I'm a bit puzzled by the example for
> >>> "2004-12-31T19:00:00-05:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>
> >>> and xsd:dateTime("2005-01-01T00:00:00Z").
> >>> Those are not character by character equal, but the RDFterm-equal
> >>> returns.
> >>
> >> They are not term equals (sameTerm).
> >> They are value equals (the same point on the time line)
> >>
> >> RDFTerm-Equal is a fallback. Two terms are value-equal if the terms
> >> are the sameTerm regardless of understanding the datatype (datatypes
> >> are a function).
> >
> > OK, clear.
> >
> >>
> >> Two dateTimes will be dispatched further up the table by:
> >>
> >> A = B      xsd:dateTime    xsd:dateTime    op:dateTime-equal(A, B)
> >>
> >>
> >>
> >>>
> >>>>
> >>>>
> >>>>> Is this a bug?
> >>>>
> >>>> Which Jena version are you running?
> >>>
> >>> I'm running 4.10.0 locally.
> >>>
> >>>>
> >>>> Jena5 changed to "term equality" everywhere for in-memory, with TDB
> >>>> still storing values.
> >>>
> >>> I'll try with the latest release.
> >
> > Using 5.1.0 the rdfdiff does output the statements with different terms
> > when I try it with my initial files.
> > However, when I try with the smaller examples from my earlier mail, then no
> > diff is shown.
> >
> > $ rdfdiff original.ttl modified.ttl TTL TTL
> > models are unequal
> >
> > It seems strange, but if I add more statements to both files, then it does
> > output the diff when both graphs contain at least 10 statements:
> >
> > $ rdfdiff original.ttl modified.ttl TTL TTL
> > models are unequal
> >
> > < [http://example.com/this, http://purl.org/dc/terms/modified,
> > "2024-03-13T12:52:06.227Z"^^xsd:dateTime]
> > < [http://example.com/this, http://open-services.net/ns/core#shortId,
> > "1756"^^xsd:int]
> > > [http://example.com/this, http://purl.org/dc/terms/modified,
> > "2024-03-13T12:52:06.227000+00:00"^^xsd:dateTime]
> > > [http://example.com/this, http://open-services.net/ns/core#shortId,
> > "1756"^^xsd:integer]
> >
> > Seems like odd behaviour.
> >
> > John
> >
> >>>
> >>> John
> >>>
> >>>>
> >>>>        Andy
> >>>>
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> John Walker
> >>>>> Principal Consultant & co-founder
> >>>>>
> >>>>> Semaku B.V. | Torenallee 20 (SFJ 3D) | 5617 BC Eindhoven | T +31 6
> >>>>> 42590072 | https://semaku.com/
> >>>>> KvK: 58031405 | BTW: NL852842156B01 | IBAN: NL94 INGB 0008 3219 95
> >>>
> >
