You actually only need N alternatives, one for each possible number of spaces that could precede the nil character. For length 10, there could be anywhere between 0 and 9 spaces, so you need one alternative for each:

- %SP;- %SP;%SP;- ... %SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;-

The actual length of the nilValue property is on the order of n^2 (I think it's more precisely n(n+1)/2 characters long), but it's straightforward to generate, and you could have a number of pregenerated simpleTypes or dfdl:formats for reuse.

It still doesn't scale great, but it's not as bad as factorial.
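
As a rough illustration of how such a list could be pregenerated, here is a minimal Python sketch. The function name and its parameters are made up for this example and are not part of Daffodil or DFDL. It emits one alternative per possible count of leading spaces, 0 through N-1. Written out literally, with each %SP; entity taking four characters and single spaces between alternatives, the string comes to 2n^2 - 1 characters, so the growth is quadratic either way.

```python
def nil_value_alternatives(length: int, nil_char: str = "-",
                           pad_entity: str = "%SP;") -> str:
    """Build a whitespace-separated dfdl:nilValue list for a fixed-length field.

    Hypothetical helper: produces `length` alternatives, where alternative k
    is k pad entities followed by the nil character (k = 0 .. length-1).
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return " ".join(pad_entity * k + nil_char for k in range(length))


if __name__ == "__main__":
    for n in (3, 10):
        value = nil_value_alternatives(n)
        # Print the field length, the literal string length, and the list.
        print(n, len(value), value)
```

For a field of length 3 this yields the same list used in the schema quoted further down in this thread.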



On 8/10/22 8:33 AM, Roger L Costello wrote:
Hi Steve,

No, that is not a viable solution. As you observed, it doesn't scale. For a 
fixed length field of 10, we would need to specify something like 10-factorial 
alternatives (or is it 2^^10 alternatives?) in dfdl:nilValue.

/Roger

-----Original Message-----
From: Steve Lawrence <[email protected]>
Sent: Wednesday, August 10, 2022 8:25 AM
To: [email protected]
Subject: [EXT] Re: Conflicting requirements: fixed length field, nillable, some 
enumeration values shorter than the required length

I'm not sure I love this possible solution, and it doesn't scale very
well, but what about something like this:

    <element name="field" type="xs:string" nillable="true"
      dfdl:lengthKind="explicit"
      dfdl:length="3"
      dfdl:textStringJustification="left"
      dfdl:textTrimKind="padChar"
      dfdl:textPadKind="padChar"
      dfdl:textStringPadCharacter="%SP;"
      dfdl:nilKind="literalValue"
      dfdl:nilValue="- %SP;- %SP;%SP;-" />

So the field is left-justified and right-padded with spaces. Left padded
spaces are not trimmed, so a field like " A " will show up in the
infoset with the left space and fail validation. And the nilValue is set
to all the combinations of the nil character preceded with a space.
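
To make the intended behavior concrete, here is a small Python model of it. This is only a sketch of the logic described above, not how Daffodil is implemented: the names are invented for the example, the nil literals are the nilValue list with each %SP; resolved to a space, and the "invalid" outcome stands in for a validation error on the enumeration facets.

```python
ENUM_VALUES = {"AB", "ABC"}        # enumeration facets from the thread
NIL_LITERALS = {"-", " -", "  -"}  # "- %SP;- %SP;%SP;-" with %SP; as a space


def classify(field: str, pad: str = " ") -> str:
    """Return 'nil', 'valid', or 'invalid' for a 3-character field.

    Assumed model: only right padding is trimmed (left justification),
    so any leading space survives into the value.
    """
    if len(field) != 3:
        raise ValueError("field must be exactly 3 characters")
    trimmed = field.rstrip(pad)
    if trimmed in NIL_LITERALS:
        return "nil"
    return "valid" if trimmed in ENUM_VALUES else "invalid"


if __name__ == "__main__":
    for raw in ["AB ", "ABC", "-  ", " - ", "  -", " AB"]:
        print(repr(raw), "->", classify(raw))
```

Under this model the right-justified " AB" keeps its leading space and is rejected, while the hyphen is accepted in any of the three positions.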

Like I said, this doesn't scale because you need N nilValues for a
string of length N. And this doesn't scale at all for delimited length
fields where you don't know the length of the field, unless you just add
a bunch of nilValues up to some size.

If we had something like %SP*; (similar to how we have %WSP*;), then the
nilValue could just be "%SP*;-" and this would scale without issue, and
work for both fixed length and delimited length fields. I believe %SP*;
has come up in the past, so this might be another argument to add it.


On 8/10/22 7:54 AM, Roger L Costello wrote:
Thanks Mike. I implemented your approach. It fails to detect invalid input. Let
me explain.

Input specifications:

    * Fixed length field (3)
    * Nillable, hyphen is the nil value, the hyphen may be anywhere within the 3
      character field
    * Values must be left-justified

Here are examples of valid inputs:

…/AB /…

…/ABC/…

…/-  /…

…/ - /…

…/  -/…

Your solution permits this input (I tested it, Daffodil gives no error or 
warning):

…/ AB/…

Notice that the value is right-justified. That is invalid.

/Roger

*From:* Mike Beckerle <[email protected]>
*Sent:* Monday, August 8, 2022 3:58 PM
*To:* [email protected]
*Subject:* [EXT] Re: Conflicting requirements: fixed length field, nillable,
some enumeration values shorter than the required length

So I think your requirements are this:

* fixed length 5

* the hyphen nil indicator may have spaces around it

* canonical form is left justified for "-" or any value.

This is the best I could do. I had to surround the nillable element with another
element so as to get left-justification by way of filling of the unused region
of a complex type, with fillByte which is %SP;.

If you want center justified hyphens for the nil case and left-justified strings
for the value case, then I think it's not possible to model this without using
separate elements for the nil and value. (That solution not shown here.)

<element name="Foo"
  dfdl:length="5"
  dfdl:lengthKind="explicit"
  dfdl:terminator="/"
  dfdl:fillByte="%SP;">
  <!--
     The above achieves canonical unparse
     as left-justified fixed length because
     the fillByte will be used to fill unused
     space on the right.

     This only works for fixed length left-justified data.
     If this was right-justified, this trick would not work.
     -->
  <complexType>
    <sequence>
      <!--
       The below achieves trimming of spaces either side,
       but only when parsing. Nothing is added when unparsing.
       -->
      <element name="value" nillable="true"
        dfdl:nilValue="-"
        dfdl:lengthKind="delimited"
        dfdl:textStringJustification="center"
        dfdl:textTrimKind="padChar"
        dfdl:textPadKind="none">
        <simpleType>
          <restriction base="xs:string">
            <enumeration value="AB"/>
            <enumeration value="ABC"/>
          </restriction>
        </simpleType>
      </element>
    </sequence>
  </complexType>
</element>

The TDML file I created for this has these tests in it showing that this works:

     <parserTestCase name="foo1" root="Foo" model="s" roundTrip="onePass">
       <document>-    /</document>
       <infoset>
         <dfdlInfoset>
           <ex:Foo xmlns=""><value xsi:nil="true"/></ex:Foo>
         </dfdlInfoset>
       </infoset>
     </parserTestCase>

     <parserTestCase name="foo2" root="Foo" model="s" roundTrip="twoPass">
       <document> -   /</document>
       <infoset>
         <dfdlInfoset>
           <ex:Foo xmlns=""><value xsi:nil="true"/></ex:Foo>
         </dfdlInfoset>
       </infoset>
     </parserTestCase>

     <parserTestCase name="foo3" root="Foo" model="s" roundTrip="twoPass">
       <document> AB  /</document>
       <infoset>
         <dfdlInfoset>
           <ex:Foo xmlns=""><value>AB</value></ex:Foo>
         </dfdlInfoset>
       </infoset>
     </parserTestCase>

     <parserTestCase name="foo4" root="Foo" model="s" roundTrip="onePass">
       <document>AB   /</document>
       <infoset>
         <dfdlInfoset>
           <ex:Foo xmlns=""><value>AB</value></ex:Foo>
         </dfdlInfoset>
       </infoset>
     </parserTestCase>

On Mon, Aug 8, 2022 at 10:22 AM Roger L Costello <[email protected]> wrote:

      Hi Mike,

      I gave your suggested approach a try. It failed.

      With this input:

      …/AB /…

      it works.

      With this input:

      …/ - /…

      it fails, producing this error:

      [error] Validation Error: Foo failed facet checks due to: facet
      enumeration(s): AB|ABC

      Further, even if the approach were to work with this example where the
      field length is 3, it would be an untenable approach for longer fixed
      fields. For example, if the field length was 10, then the nilValue would
      need something like 10-factorial whitespace-separated values.

      Do you have another suggested approach?

      /Roger

      *From:* Mike Beckerle <[email protected]>
      *Sent:* Monday, August 8, 2022 9:38 AM
      *To:* [email protected]
      *Subject:* [EXT] Re: Conflicting requirements: fixed length field,
      nillable, some enumeration values shorter than the required length

      I would try making the nilValue "%SP;-%SP; -". That is two separate
      possibilities for nilValue, one is space-hyphen-space, the other just
      hyphen. (It's a whitespace-separated list of nil value tokens.)

      The first one will be used for unparsing. Both will be tried for parsing.

      That along with justification left might work.

      On Mon, Aug 8, 2022 at 8:01 AM Roger L Costello <[email protected]> wrote:

          Hi Folks,

          I have an input field that is fixed length (3). If there is no data,
          the field is to be populated with a hyphen (of course, it must be
          padded with spaces to the required length). The schema has a
          simpleType with enumeration facets. Some enumeration values are less
          than the required length.

          Here's how I specify the field:

          <xs:element name="Foo"
               nillable="true"
               dfdl:nilKind="literalValue"
               dfdl:nilValue="-"
               dfdl:lengthKind="explicit"
               dfdl:length="3"
               dfdl:textTrimKind="padChar"
               dfdl:textPadKind="padChar"
               dfdl:textStringPadCharacter="%SP;"
               dfdl:textStringJustification="center">
               <xs:simpleType>
                   <xs:restriction base="xs:string">
                       <xs:enumeration value="AB"/>
                       <xs:enumeration value="ABC"/>
                   </xs:restriction>
               </xs:simpleType>
          </xs:element>

          Notice dfdl:textStringJustification="center" which is fine for the
          nillable value (hyphen) but not for a regular value such as AB which
          should be left justified. As the schema is, the input could contain
          this (assume slash separators):

          .../ AB/...

          which is incorrect.

          So, there are conflicting requirements: the nillable value needs
          dfdl:textStringJustification="center" whereas the normal values need
          dfdl:textStringJustification="left". What to do about this?

          /Roger

