I want to add one important thought to this.
You don't want your <invalid>...</invalid> element to actually be
considered "valid" by the XML Schema (which is the DFDL schema).
If you just construct an <invalid> .... hexBinary here </invalid> element
when parsing fails, e.g., as the final branch of an xs:choice, then that
will be VALID xml.
So I find it helpful to define "always invalid" types, and use those for
these sorts of error-tolerating catch-alls.
One way is like this:
<complexType name="alwaysInvalid">
  <sequence>
    <element name="hex" maxOccurs="0"
             dfdl:occursCountKind='parsed'>
      <simpleType>
        <restriction base="xs:hexBinary">
          <length value="0"/>
        </restriction>
      </simpleType>
    </element>
  </sequence>
</complexType>
The idea here is that maxOccurs="0" means that if there is an instance of
this hex element at all, it is invalid.
The length facet of 0 adds a second check: the hex element, if it exists and
contains even 1 byte, violates the facet and so is invalid.
The challenge with this is that different XML tools tolerate maxOccurs="0"
to different degrees. I believe it is technically allowed by XSD, but I
recall tools other than Xerces (esp. user interfaces associated with
writing XSD) giving me a hard time about it. Your mileage may vary.
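For instance, a minimal sketch of Roger's choice with the catch-all branch
switched to this type might look like the following. This is untested and
rests on assumptions: that "ex:" is the prefix bound to the schema's target
namespace, and that the length properties for the hex element come from
whatever dfdl:format is in scope.

```xml
<!-- Sketch only: the "ex:" prefix and the in-scope dfdl:format are assumptions. -->
<xs:choice>
  <!-- First branch: well-formed input, a digit followed by a letter. -->
  <xs:sequence dfdl:terminator="%NL;">
    <xs:element name="valid" type="xs:string"
                dfdl:lengthKind="pattern"
                dfdl:lengthPattern="[0-9][a-zA-Z]"/>
  </xs:sequence>
  <!-- Final branch: error-tolerating catch-all. Any captured content
       should fail XSD validation because of the alwaysInvalid type. -->
  <xs:element name="invalid" type="ex:alwaysInvalid"/>
</xs:choice>
```

With this, the parse should still succeed on bad input, but validation
should flag the resulting infoset as invalid.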
On Fri, Sep 8, 2023 at 8:29 AM Steve Lawrence wrote:
> If you want different element names, there's really no way to avoid a
> choice. You can get fancy with a hidden group, restriction, choice
> dispatch and checkConstraints, and that avoids parsing the same field
> twice, but it's very ugly and I wouldn't recommend it. I've put an
> example of what this might look like at the end of this email.
>
> In general, I would recommend not trying to have separate valid/invalid
> elements, but instead parse the field and rely on restrictions for
> validation. For example:
>
> <element name="field" dfdl:lengthKind="delimited"
>          dfdl:terminator="%NL;">
>   <simpleType>
>     <restriction base="xs:string">
>       <pattern value="[0-9][a-zA-Z]" />
>     </restriction>
>   </simpleType>
> </element>
>
> This works very similarly, but now field is used for the well-formed
> content, regardless of whether it's valid, and the restriction will be
> used to validate it, either with Daffodil's internal "limited"
> validation, "full" Xerces validation, or external validation. An added
> benefit is that it doesn't have to parse the data twice for invalid data.
>
>
> Below is the approach mentioned at the top. Note that it uses the same
> technique as above, but it hides the field element and uses choice
> dispatch, inputValueCalc, and outputValueCalc to create the
> invalid/valid elements.
>
> <group name="hiddenField">
>   <sequence>
>     <element name="field" dfdl:terminator="%NL;"
>       dfdl:outputValueCalc="{ if (fn:exists(../valid)) then ../valid
>         else ../invalid }">
>       <simpleType>
>         <restriction base="xs:string">
>           <pattern value="[0-9][a-zA-Z]" />
>         </restriction>
>       </simpleType>
>     </element>
>   </sequence>
> </group>
>
> <element name="root">
>   <complexType>
>     <sequence>
>       <sequence dfdl:hiddenGroupRef="ex:hiddenField" />
>       <choice dfdl:choiceDispatchKey="{
>         xs:string(dfdl:checkConstraints(./field)) }">
>         <sequence dfdl:choiceBranchKey="true">
>           <element name="valid" type="xs:string"
>             dfdl:inputValueCalc="{ ../field }" />
>         </sequence>
>         <sequence dfdl:choiceBranchKey="false">
>           <element name="invalid" type="xs:string"
>             dfdl:inputValueCalc="{ ../field }" />
>         </sequence>
>       </choice>
>     </sequence>
>   </complexType>
> </element>
>
>
> On 2023-09-07 05:04 PM, Roger L Costello wrote:
> > Hi Folks,
> >
> > Good input contains a digit followed by a letter, e.g., this is good
> > input: 1H
> >
> > Anything else is bad input, e.g., this is bad input: 1H23
> >
> > If the input is good, I want to put the input into a <valid> element,
> > e.g.,
> >
> > <valid>1H</valid>
> >
> > If the input is bad, I want to put the input into an <invalid> element,
> > e.g.,
> >
> > <invalid>1H23</invalid>
> >
> > This DFDL seems to work:
> >
> > <xs:choice>
> >   <xs:sequence dfdl:terminator="%NL;">
> >     <xs:element name="valid" type="xs:string" dfdl:lengthKind="pattern"
> >       dfdl:lengthPattern="[0-9][a-zA-Z]"/>
> >   </xs:sequence>
> >   <xs:element name="invalid" type="xs:string"/>
> > </xs:choice>
> >
> > But that doesn’t seem like a good solution. Is there a better way to
> > solve this problem?
> >
> > /Roger
> >
>
>