Re: Disabling BNode UID generation

Beaudet, David Wed, 09 Feb 2022 09:14:04 -0800

I ran across an API call the other day that checks isomorphism. See the 
TopBraid SHACL library's JUnit test runner. I think it's called by the DASH 
test case class to make sure the resulting graph matches the expected response.



On Feb 9, 2022 11:10, "Shaw, Ryan" <ryans...@unc.edu> wrote:
Thank you, Andy.

I agree that working on the triple level is the correct way to approach this. I 
was looking for something quick and dirty that would work with textual diffing 
by a VCS, hence my focus on the blank node labels.

Are there any examples of how to use the isomorphism utilities in Jena?

> On Feb 5, 2022, at 12:48 PM, Andy Seaborne <a...@apache.org> wrote:
>
>
>
> On 04/02/2022 19:09, Shaw, Ryan wrote:
>> Hello,
>> I am trying to experiment with generating diffable N-Triples or flat Turtle 
>> files.
> ...
>> Thanks,
>> Ryan
>
>
> Info: There is work on a charter for
>
> "RDF Dataset Canonicalization and Hash Working Group"
>
> https://w3c.github.io/rch-wg-charter/
>
> The end of section 1 has some links to related work.
>
> Given RDF is inherently unordered, canonicalization and "diff of triples" are 
> related.
>
>
> For diff-able files, what counts as "different" between two files?
>
> Instead of changing the bnode algorithm, have you considered making use of 
> bnode-isomorphism? That is, during a diff, maintain a growing mapping from 
> bnodes in one list of triples to bnodes in the other list?
> Iso.isomorphicTriples
>
> (The list being the triples in encounter order during parsing.) It works 
> not so much on the syntax as on the abstraction of triples, e.g. a Turtle 
> file and an NT file produced by parsing the TTL file can be defined to be 
> "the same".
>
> It's fairly portable across files generated by other systems as well, except 
> for Turtle lists - Jena has a fixed order for triple generation for a list but 
> it isn't necessarily the same for all systems.
>
> Jena's Turtle algorithm, which is in LangTurtleBase, generates in list order, 
> with rdf:first, then rdf:rest; the triple referencing the list appears 
> after the list. It happens to be the way the spec explains it:
>   
> https://www.w3.org/TR/turtle/#sec-parsing-triples
> but that is defining the outcome and isn't a requirement.
>
>    Andy
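
The "growing mapping" idea from Andy's message can be sketched roughly as
follows. This is not Jena code and not what Iso.isomorphicTriples does
internally; it is a small dependency-free Python illustration. The tuple
representation of triples, the "_:" prefix test, and the function names are
all assumptions made for the example. It compares two triple lists position by
position (encounter order) and only accepts a blank node pairing that stays
consistent and one-to-one.

```python
def is_bnode(term):
    # Assumption for this sketch: terms are strings and blank nodes
    # are written with an N-Triples style "_:" label.
    return term.startswith("_:")


def same_in_encounter_order(left, right):
    """Return True if the two triple lists match position by position,
    up to a consistent one-to-one renaming of blank node labels."""
    if len(left) != len(right):
        return False
    fwd = {}  # bnode label in left  -> bnode label in right
    rev = {}  # bnode label in right -> bnode label in left
    for t1, t2 in zip(left, right):
        for a, b in zip(t1, t2):
            if is_bnode(a) != is_bnode(b):
                return False
            if not is_bnode(a):
                # IRIs and literals must be identical.
                if a != b:
                    return False
                continue
            # Grow the mapping, or reject if it contradicts an earlier pairing.
            if fwd.setdefault(a, b) != b or rev.setdefault(b, a) != a:
                return False
    return True


# Same triples, different blank node labels.
a = [("_:b0", "<http://example/p>", '"x"'),
     ("<http://example/s>", "<http://example/q>", "_:b0")]
b = [("_:genid1", "<http://example/p>", '"x"'),
     ("<http://example/s>", "<http://example/q>", "_:genid1")]
# Second list uses two different blank nodes where the first uses one.
c = [("_:x", "<http://example/p>", '"x"'),
     ("<http://example/s>", "<http://example/q>", "_:y")]

print(same_in_encounter_order(a, b))
print(same_in_encounter_order(a, c))
```

Because the comparison is positional, it only treats two files as "the same"
when both emit triples in the same order, which is the caveat about Turtle
lists raised above. Full graph isomorphism, independent of order, is a harder
problem and is not what this sketch does.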
