Hi Andy, Thank you so much for your patience and help. I think I’ve got a handle on things now and will forge ahead.
I appreciate you raising JENA-1905 <https://issues.apache.org/jira/browse/JENA-1905>, JENA-1906 <https://issues.apache.org/jira/browse/JENA-1906>, and JENA-1907 <https://issues.apache.org/jira/browse/JENA-1907>. I’ll comment on the issues as appropriate.

Thank you again,
Chris

> On Jun 1, 2020, at 4:44 PM, Andy Seaborne <a...@apache.org> wrote:
>
> On 01/06/2020 21:08, Chris Tomlinson wrote:
>> Hi Andy,
>> Not trying to be pedantic below but I’m trying to understand how to think in
>> shacl and establish some expectations of the validation process.
>
> If it helps, the general pattern is
>
> Target ->
>   (Node shape -> property shape->)*
>     Constraint*
>
>>> On May 31, 2020, at 9:40 AM, Andy Seaborne <a...@apache.org> wrote:
>>>
>>> Do we agree that this is a test case?
>>> (one file, data and shapes combined)
>>> Only command line tools needed.
>> I agree that the combined data and shapes file exhibits differences in
>> report results, when interchanging bds:PersonShape and bds:PersonLocalShape.
>>> ------------------------
>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix sh: <http://www.w3.org/ns/shacl#> .
>>> @prefix bdo: <http://purl.bdrc.io/ontology/core/> .
>>> @prefix bdr: <http://purl.bdrc.io/resource/> .
>>> @prefix bds: <http://purl.bdrc.io/ontology/shapes/core/> .
>>>
>>> ## Data:
>>>
>>> bdr:NM0895CB6787E8AC6E
>>>     a bdo:PersonName ;
>>> .
>>>
>>> bdr:P707 a bdo:Person ;
>>>     bdo:personName bdr:NM0895CB6787E8AC6E ;
>>> .
>>>
>>> ## Shapes:
>>>
>>> #bds:PersonShape # 2
>>> bds:PersonLocalShape # 1
>>>     sh:property bds:PersonShape-personName ;
>>>     sh:targetClass bdo:Person ;
>>> .
>>>
>>> bds:PersonShape-personName
>>>     sh:message "PersonName is not well-formed, wrong Class or missing rdfs:label"@en ;
>>>     sh:node bds:PersonNameShape ;
>>>     sh:path bdo:personName ;
>>> .
>>>
>>> bds:PersonNameShape a sh:NodeShape ;
>>>     sh:property bds:PersonNameShape-personNameLabel ;
>>>     sh:targetClass bdo:PersonName ;
>>> .
>>>
>>> bds:PersonNameShape-personNameLabel
>>>     sh:message ":PersonName must have exactly one rdfs:label"@en ;
>>>     sh:minCount 1 ;
>>>     sh:path rdfs:label ;
>>> .
>>> ------------------------
>>>
>>> The difference seems to be that the hash order is different and it affects
>>> finding targets, combined with the fact that targets are nested:
>> I see JENA-1907 <https://issues.apache.org/jira/browse/JENA-1907> raises the
>> issue; I understand:
>>> If A is processed first as a target then the parser shapes now includes B
>>> so processing B is skipped.
>>> Note - the effect is only in the number of times constraints are executed,
>>> once or twice, not whether they are omitted.
>> to say that, in the current test case w/ the hash order issue, when nesting
>> occurs owing to sh:node, then when a violation is found by (A)
>> bds:PersonShape-personName, then the validation does not "go deeper" to
>> consider (B) bds:PersonNameShape, by itself. W/o sh:node, in
>> bds:PersonShape-personName, then both bds:PersonShape-personName and
>> bds:PersonNameShape are parsed as independent targets and executed
>> independently.
>>> bds:PersonLocalShape (target)
>>> -> bds:PersonLocalShape
>>> -> bds:PersonNameShape (target)
>>> -> bds:PersonNameShape-personNameLabel
>> I think the second line above is supposed to be
>> -> bds:PersonShape-personName
>>> Both targets match bdr:P707, one by class, one by property.
>> I understand the NodeShape, bds:PersonLocalShape, matching bdr:P707,
>> meaning, to me, that the constraints expressed in that shape need to be
>> evaluated w/ P707 being the subject (== focus node). I take this to be “by
>> class”.
>> I do not understand how NodeShape, bds:PersonNameShape, matches bdr:P707. I
>> think bds:PersonNameShape matches bdr:NM0895CB6787E8AC6E because of
>> sh:targetClass bdo:PersonName.
>
> 1/
> bds:PersonShape
>     sh:targetClass bdo:Person
> -> bdr:P707
>
> and it has
> sh:property bds:PersonShape-personName ;
> ->
> sh:node bds:PersonNameShape ;
> ->
> sh:property bds:PersonNameShape-personNameLabel ;
>
> 2/
> bds:PersonNameShape a sh:NodeShape ;
>     sh:property bds:PersonNameShape-personNameLabel ;
>     sh:targetClass bdo:PersonName ; <-- which is part of bdr:P707
> -> bdr:NM0895CB6787E8AC6E ;
>
> so two ways to get to bds:PersonNameShape-personNameLabel from target
> declarations.
>
> (try "shacl validate -v")
>
> In case 1 you can see the paths:
> 2 targets.
> each with one focus node
> leading to the same property shape /PersonNameShape-personNameLabel
> which has a constraint.
>
> (I checked the spec and it only says to execute once if the same focus
> node comes up multiple times for the same target shape but here there are two
> different target shapes. TQ shacl agrees.)
>
> F: Focus node
> S: Node Shape
> P: Property Shape
> C: Constraint
>
> NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonLocalShape]
> N: FocusNodes(1): [http://purl.bdrc.io/resource/P707]
> F: http://purl.bdrc.io/resource/P707
> S: NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonLocalShape]
> P: PropertyShape[http://purl.bdrc.io/ontology/shapes/core/PersonShape-personName -> <http://purl.bdrc.io/ontology/core/personName>]
> C: http://purl.bdrc.io/resource/P707 :: Node
> S: NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape]
> P: PropertyShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape-personNameLabel -> <http://www.w3.org/2000/01/rdf-schema#label>]
> C: http://purl.bdrc.io/resource/NM0895CB6787E8AC6E :: minCount[1]
>
> NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape]
> N: FocusNodes(1): [http://purl.bdrc.io/resource/NM0895CB6787E8AC6E]
> F: http://purl.bdrc.io/resource/NM0895CB6787E8AC6E
> S: NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape]
> P: PropertyShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape-personNameLabel -> <http://www.w3.org/2000/01/rdf-schema#label>]
> C: http://purl.bdrc.io/resource/NM0895CB6787E8AC6E :: minCount[1]
>
>>> It should execute twice -
>> I’m not following the referent “it” (but see below, I think I may).
>
> The constraint(s) of bds:PersonShape-personName
>
>> My understanding of (target) bds:PersonLocalShape is that for resources of
>> targetClass, bdo:Person, check that the constraints expressed in
>> bds:PersonShape-personName conform for all objects of bdo:personName where
>> the subject of that property path is bdr:P707 (in this case); and
>> (target) bds:PersonNameShape says that for resources of targetClass,
>> bdo:PersonName, check that the constraints in PersonShape-personNameLabel
>> conform where the resource is a bdo:PersonName, in this case
>> bdr:NM0895CB6787E8AC6E.
>> I don’t see what’s supposed to execute twice.
>
> /PersonNameShape-personNameLabel
>
> and constraint minCount[1] on NM0895CB6787E8AC6E
>
>>> but did you mean to do this in the first place? Note while it is a minCount
>>> failure, because of going through the sh:node, the message is the "wrong
>>> Class" one because executing via bds:PersonShape-personName makes that the
>>> message.
>> I meant to express that for a bdo:Person there must be at least 1
>> bdo:personName - via bds:PersonShape-personName (the test case omits
>> sh:minCount 1 in bds:PersonShape-personName);
>
> Yes - because that minCount was not a factor.
>
> I worked through the data removing each element that did not affect the
> outcome, 3 vs 2, then removed the SPARQL constraint which is not relevant (it
> contributed one violation in both cases) leaving 2 vs 1.
> and that is due to the /PersonNameShape-personNameLabel minCount
>
>> and that a conforming bdo:PersonName must have exactly 1 rdfs:label (the test
>> case omits sh:maxCount 1 in bds:PersonShape-personNameLabel).
>> I used "sh:node bds:PersonNameShape" in the declaration for
>> bds:PersonShape-personName to identify the particular NodeShape that is
>> intended to validate objects of the "sh:path bdo:personName” in this
>> situation.
>> Perhaps I see what is "supposed to execute twice”.
>> With the "sh:node bds:PersonNameShape” in bds:PersonShape-personName, then
>> bds:PersonNameShape validation must be executed (if it hasn’t already been
>> executed); and
>> since bdr:NM0895CB6787E8AC6E will match bds:PersonNameShape separately by
>> considering “sh:targetClass bdo:PersonName” then unless there is some check
>> in the validator to see if a (node, shape) pair has already been executed,
>> then there will be 2 executions instead of just 1.
>>> You can see the differences with "shacl print”.
>> I do see differences w/ “shacl parse” w/ and w/o "sh:node
>> bds:PersonNameShape”. I’ll learn to use the tool.
>> My take away is that I shouldn’t be using sh:node as I have or perhaps I
>> could remove the sh:targetClass from bds:PersonNameShape and use sh:node to
>> steer the validation. But I guess the latter would lead to the generic
>> "PersonName is not well-formed …” message instead of the more specific
>> "PersonName must have exactly one rdfs:label”.
>
> Duplication arises when there is a target that is also referred to by another
> target by some connections through the shapes graph - sh:node is one way of
> doing this.
>
> There are other ways to link in a constraint twice like graph linking:
>
> ## Data:
>
> :foo a :C ;
>     :prop 1 , 2 .
>
> ## Shapes:
>
> :A
>     sh:targetClass :C ;
>     sh:property :P .
>
> :B
>     sh:targetClass :C ;
>     sh:property :P .
>
> :P
>     sh:path :prop ;
>     sh:message "Hello world" ;
>     sh:maxCount 1 .
>
> 2 violations, both with "Hello world", for the same reason
>
>> There seem to be many nuances to shacl.
>> Anyway thanks very much for the valuable information regarding using shacl,
>> Chris
>>>
>>> Andy
>>>
>>> On 29/05/2020 20:39, Chris Tomlinson wrote:
>>>> Hi Andy,
>>>> Thank you for the reply. Focussing on just the first question. I have
>>>> prepared small self-contained tests of jena-shacl from 3.14.0 (JS) and
>>>> TopQuadrant Shacl 1.3.2 (TQ).
>>>> The apps differ only according to differences imposed by the JS and TQ
>>>> APIs:
>>>> ShaclName_validateGraphJS.java <https://pastebin.com/5382xZeL>
>>>> ShaclName_validateGraphTQ.java <https://pastebin.com/3BxmyhqA>
>>>> The DATA_P707.ttl <https://pastebin.com/ugCZfABj> contains the three
>>>> needed triples from the ontology and the bare minimum from the example
>>>> P707 with two different errors in two of the PersonName instances.
>>>> The ShapeName_01.ttl <https://pastebin.com/jDqzvPTe> contains the shape
>>>> definitions and all tests are performed only by changing the name on line
>>>> 9.
>>>> The ShaclName_validateGraphJS-results-PersonShape.txt
>>>> <https://pastebin.com/seEfWKNa> shows the results when the JS app is run
>>>> with the name bds:PersonShape and gives the expected results.
>>>> The ShaclName_validateGraphJS-results-PersonLocalShape…
>>>> <https://pastebin.com/q1SWMC4H> shows the results when the JS app is run
>>>> with the name bds:PersonLocalShape and gives unexpected results. Namely,
>>>> the expected violation regarding the PersonName which uses skos:prefLabel
>>>> instead of rdfs:label is erroneously reported as conforming.
>>>> The ShaclName_validateGraphJS-results-varying.txt
>>>> <https://pastebin.com/CNwnE5kg> shows results for names ranging from “P”,
>>>> “Pe”, “Per” thru “PersonLocal”, “PersonShape” up to “PersonLocalShape”,
>>>> “PersonLocalShaper”, and finally “PersonLocalShapers” for the JS app. In
>>>> the table a “0” means the unexpected result and a “1” means the expected
>>>> result - 7 names produce unexpected results and 20 names produce expected
>>>> results.
>>>> The ShaclName_validateGraphTQ-results.txt <https://pastebin.com/BQnStjVq>
>>>> shows the results when the TQ app is run for any spelling of the name on
>>>> line 9 of ShapeName_01.ttl <https://pastebin.com/jDqzvPTe>. The results
>>>> are the expected results as with some spellings of the name in the JS
>>>> case. TQ shows no variation owing to the name on line 9 as is expected.
>>>> (Note: The TQ engine needed to be re-initialized for each use otherwise it
>>>> accumulated results. This is why there is an init of the
>>>> ShaclSimpleValidator at each use in the JS app even though it is not
>>>> needed. I just wanted to produce as much as possible an apples-to-apples
>>>> comparison of JS and TQ.)
>>>> (Note: The TQ report does not include sh:conforms true ; in the results,
>>>> just: [ a sh:ValidationReport ] . I don’t know if this conforms to
>>>> the SHACL spec but that’s another matter.)
>>>> The results from the command line tests show the same as the above.
>>>> Running with line 9 of ShapeName_01.ttl <https://pastebin.com/jDqzvPTe>
>>>> set to bds:PersonLocalShape:
>>>> shacl v -s ShapeName_01.ttl -d DATA_P707.ttl > PersonLocalShape_JS_Results.ttl <https://pastebin.com/M9s859Kc>
>>>> produces the unexpected results, namely there is no detail regarding the
>>>> missing rdfs:label on bdr:NM0895CB6787E8AC6E.
>>>> However, running with line 9 of ShapeName_01.ttl
>>>> <https://pastebin.com/jDqzvPTe> set to bds:PersonShape:
>>>> shacl v -s ShapeName_01.ttl -d DATA_P707.ttl > PersonShape_JS_Results.ttl <https://pastebin.com/DhBNucpX>
>>>> produces the expected results, in that the detail regarding the missing
>>>> rdfs:label on bdr:NM0895CB6787E8AC6E is present among the results.
>>>> I did not set up the TQ command line but I think the above TQ results make
>>>> this testing unnecessary.
>>>> I think these tests show that there is an unexpected dependence on a shape
>>>> name in the JS library and not in the TQ library.
>>>> I think this is an error
>>>> and I can open a JIRA issue if appropriate.
>>>> A consideration I have is that we want to be able to use the fuseki shacl
>>>> endpoint for some processing and hence need to understand the expected
>>>> behavior of the JS library which is integrated.
>>>> Thank you again for your help
>>>> Chris
>>>>> On May 29, 2020, at 6:26 AM, Andy Seaborne <a...@apache.org> wrote:
>>>>>
>>>>>> Question 1: regarding the name bds:PersonShape at line 9 of
>>>>>> ShapeName_01.ttl <https://pastebin.com/spJJAsJ3>. With that name the
>>>>>> results of running ShaclName_validateGraph.java
>>>>>> <https://pastebin.com/qvUy2XeB> are as expected, see
>>>>>> ShapeName-results-PersonShape.txt <https://pastebin.com/Hbk4dj04>.
>>>>>> There are two errors in P707_nameErrs02.ttl
>>>>>> <https://pastebin.com/8wZeMiEU> regarding bdr:NMC2A097019ABA499F and
>>>>>> bdr:NM0895CB6787E8AC6E which are reported in the
>>>>>> ShapeName-results-PersonShape.txt <https://pastebin.com/Hbk4dj04> file.
>>>>>> However, if the name at line 9 of ShapeName_01.ttl
>>>>>> <https://pastebin.com/spJJAsJ3> is changed to: bds:PersonLocalShape or
>>>>>> bds:Frogs; then detail for bdr:NM0895CB6787E8AC6E reports, (see
>>>>>> ShapeName-results-PersonLocalShape.txt <https://pastebin.com/f4F9h1E2>):
>>>>>> [ a sh:ValidationReport ;
>>>>>>   sh:conforms true ] .
>>>>>> instead of:
>>>>>> [ a sh:ValidationReport ;
>>>>>>   sh:conforms false ;
>>>>>>   sh:result [ a sh:ValidationResult ;
>>>>>>     sh:focusNode bdr:NM0895CB6787E8AC6E ;
>>>>>>     sh:resultMessage ":PersonName must have exactly one rdfs:label"@en ;
>>>>>>     sh:resultPath rdfs:label ;
>>>>>>     sh:resultSeverity sh:Violation ;
>>>>>>     sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
>>>>>>     sh:sourceShape bds:PersonNameShape-personNameLabel
>>>>>>   ]
>>>>>> ] .
>>>>>> which is the result with bds:PersonShape at line 9 of ShapeName_01.ttl
>>>>>> <https://pastebin.com/spJJAsJ3>.
>>>>>> In fact changing the name to
>>>>>> bds:FrogTarts also produces the expected results.
>>>>>> Summary: If the shape name at line 9 of ShapeName_01.ttl
>>>>>> <https://pastebin.com/spJJAsJ3> is either bds:PersonShape or
>>>>>> bds:FrogTarts then the results are as expected; while if the shape name
>>>>>> is either bds:PersonLocalShape or bds:Frogs then one of the detail
>>>>>> results disappears and is replaced by sh:conforms true.
>>>>>> Why this dependence on the shape name? The shape name isn’t referred to
>>>>>> elsewhere in ShapeName_01.ttl <https://pastebin.com/spJJAsJ3>.
>>>>>
>>>>> A way to check is to run both Jena Shacl and TQ Shacl and see if they get
>>>>> the same violations.
>>>>>
>>>>> I ran the shapes and data in both and get 32 violations (with no ontology
>>>>> added)
>>>>>
>>>>> and then running with the datafile as P707+ontology. Now 5 results each.
>>>>>
>>>>> shacl v -s ShapeName_01.ttl -d P707_nameErrs02.ttl > V1.ttl
>>>>>
>>>>> tb-shacl -shapesfile ShapeName_01.ttl -datafile P707_nameErrs02.ttl
>>>>>
>>>>> The name of the shape does not seem to make a difference when run like
>>>>> this.
>>>>>
>>>>> Have you tried with targetNode to select the node to validate? With a
>>>>> subset of the shapes? That would make discussing it much easier as would
>>>>> a self-contained data (the ontology isn't particularly small).
>>>>>
>>>>> Do you have an example which has one target shape and shows differences?
>>>>>
>>>>> This:
>>>>>
>>>>> bds:PersonShape-personName
>>>>>     a sh:PropertyShape ;
>>>>>     sh:class bdo:PersonName ;
>>>>>     sh:message "PersonName is not well-formed, wrong Class or missing rdfs:label"@en ;
>>>>>     sh:minCount 1 ;
>>>>>     sh:node bds:PersonNameShape ;
>>>>>     sh:nodeKind sh:IRI ;
>>>>>     sh:path bdo:personName ;
>>>>> .
>>>>>
>>>>> (and others) could be split up into separate shapes, one per constraint
>>>>> (this has node kind, node shape, and minCount) which might make the
>>>>> report clearer
>>>>>
>>>>> bds:PersonNameShape also has a target - it can get called via two
>>>>> different routes.
>>>>>
>>>>> It's quite complicated to track what's going on.
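
For anyone following the "two different routes" point, here is a small Python sketch of the pattern Target -> (Node shape -> property shape)* -> Constraint, applied to the reduced test case from this thread. It is a toy model only, not Jena's or TopQuadrant's implementation: the dictionaries and the function names (`values`, `run_node_shape`, `run_property_shape`, `validate`) are made up for illustration, only sh:targetClass, sh:property, sh:node and sh:minCount are modelled, and it counts constraint executions rather than building a validation report.

```python
from collections import Counter

# Data graph from the test case: subject -> list of (predicate, object).
DATA = {
    "bdr:P707": [
        ("rdf:type", "bdo:Person"),
        ("bdo:personName", "bdr:NM0895CB6787E8AC6E"),
    ],
    "bdr:NM0895CB6787E8AC6E": [("rdf:type", "bdo:PersonName")],
}

# Node shapes, each with its own target declaration (hypothetical encoding).
NODE_SHAPES = {
    "bds:PersonLocalShape": {
        "targetClass": "bdo:Person",
        "property": ["bds:PersonShape-personName"],
    },
    "bds:PersonNameShape": {
        "targetClass": "bdo:PersonName",
        "property": ["bds:PersonNameShape-personNameLabel"],
    },
}

# Property shapes: a path plus either a nested sh:node or a minCount.
PROPERTY_SHAPES = {
    "bds:PersonShape-personName": {
        "path": "bdo:personName",
        "node": "bds:PersonNameShape",
    },
    "bds:PersonNameShape-personNameLabel": {
        "path": "rdfs:label",
        "minCount": 1,
    },
}


def values(node, path):
    """Return the objects reachable from node via a single predicate."""
    return [o for p, o in DATA.get(node, []) if p == path]


def run_node_shape(shape, focus, log):
    """Run every property shape of a node shape against one focus node."""
    for ps_name in NODE_SHAPES[shape]["property"]:
        run_property_shape(ps_name, focus, log)


def run_property_shape(ps_name, focus, log):
    """Evaluate minCount, then follow sh:node into the nested node shape."""
    ps = PROPERTY_SHAPES[ps_name]
    vals = values(focus, ps["path"])
    if "minCount" in ps:
        # Record one constraint execution: (shape, focus node, conforms?).
        log.append((ps_name, focus, len(vals) >= ps["minCount"]))
    if "node" in ps:
        for v in vals:
            run_node_shape(ps["node"], v, log)


def validate():
    """Start from every target declaration, with no (node, shape) dedup."""
    log = []
    for shape, decl in NODE_SHAPES.items():
        for focus in DATA:
            if decl["targetClass"] in values(focus, "rdf:type"):
                run_node_shape(shape, focus, log)
    return log


if __name__ == "__main__":
    for entry, count in Counter(validate()).items():
        print(count, entry)
```

In this model the minCount check of bds:PersonNameShape-personNameLabel on bdr:NM0895CB6787E8AC6E is recorded twice: once via the bdo:Person target going through sh:node, and once via the bdo:PersonName target directly. Removing either the sh:node link or the sh:targetClass on bds:PersonNameShape from the dictionaries leaves a single execution.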