Here is a trick used in one schema I've seen:

<xs:group name="requireNoDataLeft">
  <xs:sequence>
    <xs:element name="data" type="tns:tIntField" dfdl:length="1" minOccurs="0"/>
    <xs:sequence>
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:assert test="{ fn:not(fn:exists(data)) }"
                       message="Data found where none was expected." />
        </xs:appinfo>
      </xs:annotation>
    </xs:sequence>
  </xs:sequence>
</xs:group>

So a group reference to "requireNoDataLeft" states "There cannot be
any more data available."

This is mostly for the case where there is a surrounding "box" of data,
such as an element with lengthKind 'explicit', and you expect the
described contents to use up everything in that box.

So if your first choice branch ends with a group ref to
"requireNoDataLeft" then it must consume all available data, and will
fail (and backtrack the choice to the next one) if there is data
available after it.
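
A rough, untested sketch of what that could look like for the choice
quoted below. It assumes the group is defined in the schema's "tns"
namespace and that the choice sits inside some bounded "box" as described
above; the child elements are elided here and stay exactly as in the
quoted schema.

<xs:choice>
  <xs:element name="MilitaryDayTime">
    <xs:complexType>
      <xs:sequence>
        <!-- Day, HourTime, MinuteTime, TimeZone as in the quoted schema -->
        <!-- Assumed placement: last item of the first branch. If any data
             remains, the assert fails and the choice backtracks. -->
        <xs:group ref="tns:requireNoDataLeft"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="DateTimeGroup">
    <xs:complexType>
      <xs:sequence>
        <!-- Day through Year as in the quoted schema, unchanged -->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:choice>

With this in place the branch order does not need to change: input with
a trailing month and year fails the first branch and is retried against
DateTimeGroup.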


On Tue, May 3, 2022 at 1:52 PM Roger L Costello <[email protected]> wrote:

> The “left over data” error occurs when there is a choice where the first
> branch matches the same data as the second branch and the second branch
> matches a bit more. Input data that matches the second branch fails because
> the first branch parses the input and then stops and reports left over
> data. See example below.
>
>
>
> Is there a workaround? (without manually shuffling the order of the
> branches in the choice)
>
>
>
> <xs:choice>
>     <xs:element name="MilitaryDayTime">
>         <xs:complexType>
>             <xs:sequence dfdl:separator="">
>                 <xs:element name="Day" type="non-zero-length-string"
> dfdl:lengthPattern="[0-9]{2}"/>
>                 <xs:element name="HourTime" type="non-zero-length-string"
> dfdl:lengthPattern="[0-9]{2}"/>
>                 <xs:element name="MinuteTime"
> type="non-zero-length-string" dfdl:lengthPattern="[0-9]{2}"/>
>                 <xs:element name="TimeZone" type="non-zero-length-string"
> dfdl:lengthPattern="..."/>
>             </xs:sequence>
>         </xs:complexType>
>     </xs:element>
>    <xs:element name="DateTimeGroup">
>         <xs:complexType>
>             <xs:sequence dfdl:separator="">
>                 <xs:element name="Day" type="non-zero-length-string"
> dfdl:lengthPattern="[0-9]{2}"/>
>                 <xs:element name="HourTime" type="non-zero-length-string"
> dfdl:lengthPattern="[0-9]{2}"/>
>                 <xs:element name="MinuteTime"
> type="non-zero-length-string" dfdl:lengthPattern="[0-9]{2}"/>
>                 <xs:element name="TimeZone" type="non-zero-length-string"
> dfdl:lengthPattern="..."/>
>                 <xs:element name="MonthName" type="non-zero-length-string"
> dfdl:lengthPattern="…"/>
>                 <xs:element name="Year" type="non-zero-length-string"
> dfdl:lengthPattern="[0-9]{4}"/>
>             </xs:sequence>
>         </xs:complexType>
>     </xs:element>
> </xs:choice>
>
>
>
>
>
> *From:* Mike Beckerle <[email protected]>
> *Sent:* Monday, May 2, 2022 10:02 AM
> *To:* [email protected]
> *Subject:* [EXT] Re: Catalog the causes of the dreaded “left over data”
> error message
>
>
>
> I first encountered left-over-data with a dead-simple file format. Just a
> top level element named "records" with a minOccurs="0"
> maxOccurs="unbounded" array of elements named "record".
>
>
>
> Due to minOccurs="0" such a schema is very happy to "successfully" parse
> zero records, and tell you the entire file contents are "left over data".
>
>
>
> I learned one often wants to have minOccurs="1" to force it to at least be
> successful on one record.
>
>
>
>
>
>
>
> On Fri, Apr 15, 2022 at 9:48 AM Roger L Costello <[email protected]>
> wrote:
>
> Hi Folks,
>
> Have you encountered the “left over data” error message? If you’ve worked
> with Daffodil for more than 5 minutes, you undoubtedly have.
>
> The problem with that error message is it gives you absolutely no clue
> what’s causing the problem.
>
> Perhaps if we start cataloging the things that triggered the error
> message, then the Daffodil team will be able to provide better diagnostics.
> Here’s my contribution to said catalog.
>
> -----------------------
>
> In recent weeks I have encountered the dreaded “left over data” error
> message twice. After enormous effort I was able to figure out what the
> problems were in my DFDL schema. First I need to describe my DFDL schema.
>
> My DFDL schema consists of a series of element declarations and within
> each element are declarations of subelements:
>
> A
>     A.1
>     A.2
>     …
> B
>     B.1
>     B.2
>     …
> …
>
> Each subelement is of type string and uses a regex to describe the
> subelement’s data (i.e., the subelements use dfdl:lengthKind=”pattern” and
> dfdl:lengthPattern=”regex”)
>
> The first time that I got the “left over data” error message I found the
> cause was due to this bug in my DFDL schema: a dfdl:lengthPattern listed
> the regex alternatives in the wrong order (shortest to longest instead of
> longest to shortest). The error message said that Daffodil stopped
> consuming input at element G. The actual element containing the regex in
> wrong order was element G.2 (Daffodil stopped consuming input pretty near
> the problem)
>
> After I fixed that bug I immediately got another “left over data” error at
> element J. After much more effort I found the bug: a regex erroneously had
> spaces in it. In this case, the error message said that Daffodil stopped
> consuming input at element J. The actual element containing the regex with
> spaces was element K.5 (Daffodil stopped consuming input pretty far from
> the problem)
>
> /Roger
>
>
