Roger,

You need more than just the -V limited option. You really do need the
validString simpleType (the one that uses dfdl:assert with the
dfdl:checkConstraints function).

Otherwise your schema elements don't have the right composition behaviors
when combined with other elements.

Thought experiment: With the pattern facets erased, your dateTimeWrapper
element is either "-" or *any string of 16 characters*.

Pretty sure that's not what you want. And it's not the behavior someone
would expect if you give them a dateTime element type to reuse to create a
larger schema.

In your schema, you want the pattern facets to matter to the parse
algorithm. Hence, the need for the validString trick.
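
As a rough sketch of that trick (the exact declaration in your schema may
differ; this assumes the xs and dfdl prefixes are bound as usual and that
the schema has no target namespace, otherwise the type name needs a
prefix), the base type carries the assert and each part derives from it
instead of from xs:string:

```xml
<!-- Sketch only: a reusable base type whose assert runs the facet checks
     while parsing, so a facet failure becomes a parse error rather than
     just a validation error. -->
<xs:simpleType name="validString">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
    </xs:appinfo>
  </xs:annotation>
  <xs:restriction base="xs:string"/>
</xs:simpleType>

<!-- Year restricts validString, so its pattern facet is checked
     as soon as the element is parsed. -->
<xs:element name="Year" dfdl:lengthKind="explicit" dfdl:length="4">
  <xs:simpleType>
    <xs:restriction base="validString">
      <xs:pattern value="[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

With the parts declared this way, input such as xxxx0919T134700Z should
fail to parse as a DateTime whether or not -V limited is given, because
the assert on Year fails.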

Strings are like this a lot. Often a string is something highly
structured, where the value must match a specific syntax to be considered
well-formed.

You can do this by using DFDL's types like dateTime or numbers with
dfdl:representation='text' and the various properties that constrain the
format of these.

Or you can just use string, and XSD pattern/enumeration facets, but then
you need the pattern facets to be checked as the parse algorithm proceeds
element by element.
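
For comparison, here is a minimal sketch of the first option for the
non-nil case only. It assumes the remaining required properties
(lengthUnits, calendarCheckPolicy, calendarTimeZone, calendarLanguage,
and so on) come from a dfdl:format that is already in scope, and it
treats the trailing Z as a literal character rather than a parsed time
zone:

```xml
<!-- Sketch only: the whole 16-character value as one text dateTime.
     The T and Z are quoted in the pattern, so they are matched literally. -->
<xs:element name="DateTime" type="xs:dateTime"
            dfdl:representation="text"
            dfdl:lengthKind="explicit"
            dfdl:length="16"
            dfdl:calendarPatternKind="explicit"
            dfdl:calendarPattern="yyyyMMdd'T'HHmmss'Z'"/>
```

Here the format properties themselves constrain what parses, so no facet
checking is needed for the syntax, but you give up the separate Year,
Month, Day, etc. child elements.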

-mikeb




On Wed, Sep 21, 2022 at 12:38 PM Roger L Costello <[email protected]>
wrote:

> Hi Folks,
>
>
>
> Please let me know of anything that is unclear.  /Roger
>
>
> --------------------------------------------------------------------------------------
>
>
> 3. Fixed length, nillable, composite, no choice
>
>
>
> A composite field is one that is composed of parts. There is no separator
> between the parts. The parts may be fixed length or variable length. The
> parts are non-nillable, although the composite field itself may be
> nillable.
>
> This section deals with composite fields containing parts that are all
> fixed length and the field is nillable.
>
> We will create a DFDL schema for a “DateTime” field that has a date and
> time, separated by T. Here is a sample value:
>
> 20220919T134700Z
>
> That is one value with 8 parts:
>
> The first four digits (2022) are the year.
>
> The next two digits (09) are the month.
>
> The next two digits (19) are the day.
>
> The T separates the date from the time.
>
> The 13 is the hour.
>
> The 47 is the minute.
>
> The 00 is the second.
>
> The Z is the time zone.
>
> In other words, the DateTime is September 19, 2022 at 1:47pm Zulu.
>
> Here is another example of a valid DateTime value:
>
> -
>
> That value means no data was available to populate the field.
>
> Field Requirements:
>
> >>  Fixed length (16)
>
> >>  Nillable, hyphen is the nil value
>
> >>  Composite, 8 parts
>
>
>
> Here is an XML Schema declaration of DateTime, sans any DFDL properties (I
> highlighted in yellow the field name – DateTime – and its part names):
>
> <xs:element name="DateTime" nillable="true">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:element name="Year">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{4}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Month">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:enumeration value="01"/>
>                         <xs:enumeration value="02"/>
>                         <xs:enumeration value="03"/>
>                         <xs:enumeration value="04"/>
>                         <xs:enumeration value="05"/>
>                         <xs:enumeration value="06"/>
>                         <xs:enumeration value="07"/>
>                         <xs:enumeration value="08"/>
>                         <xs:enumeration value="09"/>
>                         <xs:enumeration value="10"/>
>                         <xs:enumeration value="11"/>
>                         <xs:enumeration value="12"/>
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Day">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}"/>
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="DateTimeSeparator">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:enumeration value="T" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Hour">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Minute">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Second">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="TimeZone">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:enumeration value="Z" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
>
> All parts have fixed length. To each part add these two DFDL properties:
>
> dfdl:lengthKind="explicit"
> dfdl:length="__"
>
> For example, Year has a fixed length of 4. Here is its declaration, with
> the DFDL properties (in yellow) added:
>
> <xs:element name="Year"
>                       dfdl:lengthKind="explicit"
>                       dfdl:length="4">
>     <xs:simpleType>
>         <xs:restriction base="xs:string">
>             <xs:pattern value="[0-9]{4}" />
>         </xs:restriction>
>     </xs:simpleType>
> </xs:element>
>
> Use the same strategy for the other fields.
>
> As I stated earlier, DateTime is nillable with hyphen as the nil value.
> Further, DateTime has a complexType. That is a problem. See section 2 for a
> complete discussion of the problem with nillable complexTypes and how to
> deal with it.
>
> Here’s the DFDL schema for DateTime (the DFDL properties are shown in
> yellow):
>
> <xs:element name="DateTime" nillable="true">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:element name="Year"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="4">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{4}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>            </xs:element>
>             <xs:element name="Month"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="2">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:enumeration value="01"/>
>                         <xs:enumeration value="02"/>
>                         <xs:enumeration value="03"/>
>                         <xs:enumeration value="04"/>
>                         <xs:enumeration value="05"/>
>                         <xs:enumeration value="06"/>
>                         <xs:enumeration value="07"/>
>                         <xs:enumeration value="08"/>
>                         <xs:enumeration value="09"/>
>                         <xs:enumeration value="10"/>
>                         <xs:enumeration value="11"/>
>                         <xs:enumeration value="12"/>
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Day"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="2">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}"/>
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="DateTimeSeparator"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="1">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:enumeration value="T" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Hour"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="2">
>                 <xs:simpleType>
>                    <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Minute"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="2">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="Second"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="2">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:pattern value="[0-9]{2}" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>             <xs:element name="TimeZone"
>                                    dfdl:lengthKind="explicit"
>                                    dfdl:length="1">
>                 <xs:simpleType>
>                     <xs:restriction base="xs:string">
>                         <xs:enumeration value="Z" />
>                     </xs:restriction>
>                 </xs:simpleType>
>             </xs:element>
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
>
> Notice that the last part (TimeZone) has no DFDL added. This is because I
> am assuming that it is followed by the delimiter for the DateTime field.
>
> Recall from section 2 that the workaround for nillable complexTypes is to
> create a wrapper element that has a choice with two branches: the first
> branch is a simple string element to deal with nil values and the second
> branch is (in this case) the non-nil DateTime element. So the DFDL schema
> for our example has this structure:
>
> <xs:element name="DateTimeWrapper">
>     <xs:complexType>
>         <xs:choice dfdl:choiceLengthKind="implicit">
>             <xs:element name="DateTime_" type="xs:string" nillable="true"
>                                    dfdl:nilKind="literalValue"
>                                    dfdl:nilValue="-">
>                 <xs:annotation>
>                     <xs:appinfo source="http://www.ogf.org/dfdl/">
>                         <dfdl:assert>{ fn:nilled(.) }</dfdl:assert>
>                     </xs:appinfo>
>                 </xs:annotation>
>             </xs:element>
>             <xs:element name="DateTime">
>                 <!-- *see above* -->
>             </xs:element>
>         </xs:choice>
>     </xs:complexType>
> </xs:element>
>
> One last (important) point: When parsing input with Daffodil use the -V
> limited option. The option instructs Daffodil to validate each part of
> the composite field against the XSD facets. With this erroneous input
> value:
>
> xxxx0919T134700Z
>
>
>
> Daffodil gives this very helpful error message on parsing:
>
>
>
> [error] Validation Error: Year failed facet checks due to: facet
> pattern(s): [0-9]{4}
>
>
>
> If you don’t use the -V limited option, then Daffodil won’t validate the
> parts against the XSD facets. Consequently, Daffodil will not report any
> errors with the above erroneous input. Why? Because if we ignore the facets
> in this element declaration:
>
> <xs:element name="Year"
>                       dfdl:lengthKind="explicit"
>                       dfdl:length="4">
>     <xs:simpleType>
>         <xs:restriction base="xs:string">
>             <xs:pattern value="[0-9]{4}" />
>         </xs:restriction>
>     </xs:simpleType>
> </xs:element>
>
> then it is simply saying that the input must be a string of length 4 and
> “xxxx” certainly fits that specification.
>
