Thank you, Andy. 

I agree that working on the triple level is the correct way to approach this. I 
was looking for something quick and dirty that would work with textual diffing 
by a VCS, hence my focus on the blank node labels.

Are there any examples of how to use the isomorphism utilities in Jena?

> On Feb 5, 2022, at 12:48 PM, Andy Seaborne <a...@apache.org> wrote:
> 
> 
> 
> On 04/02/2022 19:09, Shaw, Ryan wrote:
>> Hello,
>> I am trying to experiment with generating diffable N-Triples or flat Turtle 
>> files.
> ...
>> Thanks,
>> Ryan
> 
> 
> Info: There is work on a charter for
> 
> "RDF Dataset Canonicalization and Hash Working Group"
> 
> https://w3c.github.io/rch-wg-charter/
> 
> The end of section 1 has some links to related work.
> 
> Given RDF is inherently unordered, canonicalization and "diff of triples" are 
> related.
> 
> 
> For diff-able files, what counts as "different" between two files?
> 
> Instead of changing the bnode algorithm, have you considered making use of 
> bnode-isomorphism? That is, during a diff, maintain a growing mapping from 
> bnodes in one list of triples to bnodes in the other list?
> Iso.isomorphicTriples
> 
> (The list being the triples in encounter order during parsing.) It works 
> not so much on the syntax as on the abstraction of triples, e.g. a Turtle file 
> and an NT file produced by parsing the TTL file can be defined to be "the 
> same".
> 
> It's fairly portable across files generated by other systems as well, except 
> for Turtle lists - Jena has a fixed order for triple generation for a list but 
> it isn't necessarily the same for all systems.
> 
> Jena's Turtle algorithm, which is in LangTurtleBase, generates in list order, 
> with rdf:first, then rdf:rest; the triple referencing the list appears 
> after the list. It happens to be the way the spec explains it:
>   https://www.w3.org/TR/turtle/#sec-parsing-triples
> but that is defining the outcome and isn't a requirement.
> 
>    Andy
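
The growing bnode mapping suggested in the quoted message can be sketched as
follows. This is a minimal illustration in plain Python, not Jena's Iso code:
the function name match_in_encounter_order, the tuple-of-strings triple
representation, and the "_:" prefix test for blank nodes are all assumptions
made for the example.

```python
def is_bnode(term):
    # Assumption: blank nodes are written with N-Triples-style "_:" labels.
    return term.startswith("_:")


def match_in_encounter_order(left, right):
    """Compare two triple lists position by position.

    Returns the bnode mapping (left label -> right label) if the lists are
    the same up to a one-to-one renaming of blank nodes, otherwise None.
    """
    if len(left) != len(right):
        return None
    forward, reverse = {}, {}  # the mapping grows as triples are compared
    for left_triple, right_triple in zip(left, right):
        for a, b in zip(left_triple, right_triple):
            if is_bnode(a) and is_bnode(b):
                # Reuse an earlier pairing, or record a new one; the pairing
                # must stay consistent in both directions.
                if forward.setdefault(a, b) != b:
                    return None
                if reverse.setdefault(b, a) != a:
                    return None
            elif a != b:
                # IRIs and literals (or bnode vs. non-bnode) must be equal.
                return None
    return forward


left = [("_:b0", "ex:p", "ex:o"), ("ex:s", "ex:q", "_:b0")]
right = [("_:x", "ex:p", "ex:o"), ("ex:s", "ex:q", "_:x")]
mapping = match_in_encounter_order(left, right)
```

Because the comparison is positional, it only treats two files as "the same"
when their triples arrive in the same encounter order; it does not search for
an isomorphism between unordered graphs.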
