I don't think this:

<xs:element name="field" type="xs:string"
  dfdl:lengthKind="explicit" dfdl:length="9"
  dfdl:textPadKind="padChar"
  dfdl:textTrimKind="padChar"
  dfdl:textStringPadCharacter="%SP;"
  dfdl:textStringJustification="left"
  dfdl:nilKind="literalCharacter"
  dfdl:nilValue="%SP;"
  nillable="true" />

should produce nil when the input is all spaces. The nilValue is not WSP* or
WSP+; it's SP, and the value is a fixed length of 9 characters, so it can
*never* be just one space, which is what the nilValue is. So I think this can
never produce a nil value.

The rest of the email seems right, though. I just want to add some
motivation.

This is an interesting thing in DFDL. Sometimes you can't get out what you want 
as XML because in DFDL, the primary requirement is to describe the format of 
the data as it is. In your data representation, the field isn't optional. It's 
mandatory. It has to be there, and it occupies a fixed length of 9 characters.
So DFDL deliberately doesn't let you model this as an optional field. The
physical format often constrains the logical model in DFDL.

DFDL's job is to describe the input format. Not so much to describe how to 
transform it to what your preference is. That's really a job for other tools.

That said, tricks like what Steve suggested, where you use a choice to model
something not as an optional element but as an alternative between two things,
one of which is just syntax and the other of which is an element, are the kind
of thing you have to do to force it to produce what you prefer. This sort of
thing isn't really a trick. It's used extensively in many formats.




________________________________
From: Steve Lawrence <[email protected]>
Sent: Friday, February 21, 2020 9:15 AM
To: [email protected] <[email protected]>
Subject: Re: How to remove empty elements in the output XML?

I think there might be a bug with nillable strings and padding when the
padChar is the same as the nilValue. The following should result in a
nilled element when the data is all spaces, but it doesn't.

<xs:element name="field" type="xs:string"
  dfdl:lengthKind="explicit" dfdl:length="9"
  dfdl:textPadKind="padChar"
  dfdl:textTrimKind="padChar"
  dfdl:textStringPadCharacter="%SP;"
  dfdl:textStringJustification="left"
  dfdl:nilKind="literalCharacter"
  dfdl:nilValue="%SP;"
  nillable="true" />

However, if you want the element to not be in the infoset at all when
the data is all spaces, as opposed to a nilled element, you need a
different method. Keep in mind that something needs to parse those empty
strings if the element is missing.

A technique that seems to work well (although it is maybe a bit messy)
is something like this:

<xs:choice>
  <xs:sequence dfdl:initiator="%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;" />
  <xs:element name="field" ... />
</xs:choice>

So in this case we first try to parse 9 spaces via an empty sequence
with an initiator. If that fails, then we try to parse those 9
characters as "field". So if 9 spaces were found then the field element
will not be in the infoset. Note that field should have minOccurs="1"
now--it's not optional since the optionality is handled by the sequence.

This is a little messy, so I'd recommend defining some formats to make
it more clear, e.g.:

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:defineFormat name="empty9">
        <dfdl:format initiator="%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;" />
      </dfdl:defineFormat>
    </xs:appinfo>
  </xs:annotation>

  <xs:choice>
    <xs:sequence dfdl:ref="empty9" />
    <xs:element name="field" ... />
  </xs:choice>
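
To make the intended behavior of the choice concrete, here is a rough Python
model of it. This is only a sketch of the logic described above, not how
Daffodil implements choices; the function name parse_field and its
(value, new_pos) return shape are made up for illustration, and it assumes a
left-justified field padded with spaces to 9 characters.

```python
from typing import Optional, Tuple

FIELD_LENGTH = 9  # fixed length, as in dfdl:length="9"
PAD = " "         # pad character, as in %SP;


def parse_field(data: str, pos: int = 0) -> Tuple[Optional[str], int]:
    """Model the two choice branches for one fixed-length slot.

    Returns (value, new_pos). value is None when the syntax-only
    branch matched, i.e. no "field" element would be in the infoset.
    (Hypothetical helper, for illustration only.)
    """
    chunk = data[pos:pos + FIELD_LENGTH]
    if len(chunk) < FIELD_LENGTH:
        raise ValueError("not enough data for a 9-character field")
    # Branch 1: the empty sequence whose initiator is nine spaces.
    if chunk == PAD * FIELD_LENGTH:
        return None, pos + FIELD_LENGTH
    # Branch 2: the element itself; left-justified, so trim trailing pad.
    return chunk.rstrip(PAD), pos + FIELD_LENGTH
```

With this model, parse_field("         ") consumes the 9 spaces and yields no
value, while parse_field("abc      ") yields "abc". Either way 9 characters
are consumed, which is the point: something always parses that slot.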




On 2/20/20 1:03 PM, Patrick Grandjean wrote:
> Hi,
>
> I am parsing a text format and some optional elements are encoded as empty
> strings or strings with space characters only. How can I have these elements
> omitted in the output XML?
>
> The element is declared as:
>
> <xs:element name="field1" type="xs:string" minOccurs="0" dfdl:length="9"
> dfdl:lengthKind="explicit" />
>
> Thanks to the properties textStringJustification="center" and
> textTrimKind="padChar", the parsed string is trimmed and the output XML
> looks like:
>
> <field1></field1>
>
> I have tried specifying properties emptyValueDelimiterPolicy, nilKind,
> nilValueDelimiterPolicy and nilValue but can't find a combination to have
> this element removed in the output XML.
>
> Is it possible? If yes, could you please show me how?
>
> Thanks,
> Patrick.
>
