Thanks! Peter
From: Sloane, Brandon [mailto:[email protected]]
Sent: Thursday, 5 December 2019 11:59 AM
To: Peter Kostouros <[email protected]>; [email protected]
Subject: Re: Daffodil parsing fails on optional elements when maxOccurs set to "unbounded", passes when set to "999"

The issue is that you have maxOccurs="unbounded" on elements which are potentially 0 bits long, in particular F_Record and B_Record. Both of those elements have only optional children. This means that they will never fail to parse. Instead they will succeed in parsing, but consume 0 bits. Because they can occur an unbounded number of times, Daffodil considers this to be an error, and backtracks (and subsequently throws an unrelated error down the line). When maxOccurs is finite, Daffodil will parse the 0 bits a finite number of times before resuming the parse normally.

The simplest solution is to add an explicit assertion that F_Record and B_Record are non-empty:

    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert>
          { dfdl:contentLength(.,'bits') gt 0 }
        </dfdl:assert>
      </xs:appinfo>
    </xs:annotation>

Attached, you will find a version of pug_records.xsd that takes this approach.

While this is not technically a bug in Daffodil, it really should issue a warning when this situation arises. I have opened a ticket to that effect: https://issues.apache.org/jira/browse/DAFFODIL-2247

Given the above, you may be wondering why you do not see thousands of empty instances of F_Record when maxOccurs="9999". I believe this is the correct behavior in this case as defined by section 9.4.2.3, but I would need to read the spec very closely to be sure that this is not a bug in Daffodil.
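
For reference, the following is a sketch of where that assertion would sit in the F_Record declaration quoted in the original message further down this thread. It is an assumption about the placement, not a copy of the attached pug_records.xsd (which is not reproduced here); it relies on the xs and dfdl prefixes being bound as in the poster's schema, and B_Record would presumably need the same annotation.

```xml
<!-- Sketch only: F_Record from the original message with the suggested assert added. -->
<xs:element dfdl:lengthKind="implicit" name="F_Record" minOccurs="0" maxOccurs="unbounded">
  <!-- xs:annotation must be the first child of xs:element. -->
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Intended to fail an occurrence that consumed no data,
           so the unbounded array stops instead of repeating a 0-bit parse. -->
      <dfdl:assert>{ dfdl:contentLength(., 'bits') gt 0 }</dfdl:assert>
    </xs:appinfo>
  </xs:annotation>
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="F01_Record" minOccurs="0" />
      <xs:element dfdl:lengthKind="implicit" name="F02_Records" minOccurs="0">
        <xs:complexType>
          <xs:sequence>
            <xs:element ref="F02_Record" minOccurs="0" maxOccurs="unbounded" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

The assertion is attached to the repeating element itself rather than to its optional children, since it is the repeating element that can succeed while consuming 0 bits.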
Regards,
Brandon

________________________________
From: Peter Kostouros <[email protected]>
Sent: Wednesday, December 4, 2019 4:52 PM
To: [email protected]
Subject: RE: Daffodil parsing fails on optional elements when maxOccurs set to "unbounded", passes when set to "999"

Hi

I have attached a dataset that shows the problem (PUG.IN) as well as its corresponding parsed output when the schema has set maxOccurs limits on selected optional elements (PUG_999.IN.XML). The F_LOOP records in file PUG.IN start with "31" and "32A".

Peter

From: Sloane, Brandon [mailto:[email protected]]
Sent: Thursday, 5 December 2019 2:44 AM
To: [email protected]
Subject: Re: Daffodil parsing fails on optional elements when maxOccurs set to "unbounded", passes when set to "999"

The only thing that stands out to me is that the error you are seeing should be coming from ControlRecord, which isn't part of the quoted schema. Other than that, I am not sure what the issue could be (unless your data actually parses more than 999 instances when unbounded is used).

Do you have example data that you can share which demonstrates the problem?

________________________________
From: Peter Kostouros <[email protected]>
Sent: Wednesday, December 4, 2019 12:35 AM
To: [email protected]
Subject: Daffodil parsing fails on optional elements when maxOccurs set to "unbounded", passes when set to "999"

Hi

I hope someone can point me in the right direction to help me understand behaviour I have seen with a particular schema when parsing a file.

I have a schema modelled on the NACHA schema files found in the DFDLSchemas/NACHA directory on github. In my case, with respect to optional (embedded) looping elements:

1. Parsing is unsuccessful when the maxOccurs attribute is set to "unbounded" (Parse Error: Failed to populate ControlRecord[1]. Cause: Parse Error: Assertion failed: Not Control Record);
2. Parsing is successful when maxOccurs is limited to, say, "999".

Below is a snippet from the schema referred to above that results in the error:

    <!-- F LOOP -->
    <xs:element dfdl:lengthKind="implicit" name="F_Records" minOccurs="0">
      <xs:complexType>
        <xs:sequence>
          <xs:element dfdl:lengthKind="implicit" name="F_Record" minOccurs="0" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence>
                <xs:element ref="F01_Record" minOccurs="0" />
                <xs:element dfdl:lengthKind="implicit" name="F02_Records" minOccurs="0">
                  <xs:complexType>
                    <xs:sequence>
                      <xs:element ref="F02_Record" minOccurs="0" maxOccurs="unbounded" />
                    </xs:sequence>
                  </xs:complexType>
                </xs:element>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

I have attached schema files that demonstrate this issue, so I hope someone can advise me on what I should be doing instead.

I am using the Daffodil 2.4.0 release as well as the 2.5 snapshot Java APIs, and both show similar behaviour; I have also seen this behaviour when running the files through the daffodil command line, with the following tunables:

    "unqualifiedPathStepPolicy" = "defaultNamespace"
    "suppressSchemaDefinitionWarnings" = "multipleChoiceBranches noEmptyDefault"

Peter

This e-mail and any attachment is intended for the party to which it is addressed and may contain confidential information or be subject to professional privilege. Its transmission is not intended to place the contents into the public domain. If you have received this e-mail in error, please notify us immediately and delete the email and all copies. AWTA Ltd does not warrant that this e-mail is virus or error free.
By opening this e-mail and any attachment the user assumes all responsibility for any loss or damage resulting from such action, whether or not caused by the negligence of AWTA Ltd. The contents of this e-mail and any attachments are subject to copyright and may not be reproduced, adapted or transmitted without the prior written permission of the copyright owner.
