Hi all,

Sorry for the late reply. Thank you for having a look at this!

Patrick.

On Fri, Feb 21, 2020 at 6:54 PM Beckerle, Mike <[email protected]> wrote:

> Ah. I missed the nilKind 'literalCharacter' entirely.
>
> So, yeah, we need to test the interaction here. I believe for nilKind
> literalCharacter, one must examine for nil value before trimming pad chars.
> I know we have some tests for literalCharacter, but one where it is the
> same as the pad character with textTrimKind="padChar" I doubt.
>
> When unparsing, it's simpler. If the element is nilled, produce the
> nilValue character repeating for the field width.
>
> If the value is present and empty string when unparsing, then padding will
> fill in the spaces via the textStringPadCharacter.
>
> This is another example of a format that doesn't "round trip", because an
> empty string value would be written out as all spaces, and would be parsed
> back in as a nilled element. I.e., the canonical interpretation is nilled,
> not empty string.
>
> ------------------------------
> *From:* Steve Lawrence <[email protected]>
> *Sent:* Friday, February 21, 2020 4:38 PM
> *To:* Beckerle, Mike <[email protected]>; [email protected] <[email protected]>
> *Subject:* Re: How to remove empty elements in the output XML?
>
> nilValue is %SP; but nilKind="literalCharacter". So it is nil if all
> characters in the "NilLiteralCharacters" region are spaces. If I'm
> reading the spec correctly, padding isn't applied with
> NilLiteralCharacters, so all the spaces should be part of it.
>
> I guess an alternative would be to do nilKind="literalValue" and
> nilValue="%WSP*;" but that doesn't seem to work either.
>
> On 2/21/20 5:27 PM, Beckerle, Mike wrote:
> > I don't think this:
> >
> > <xs:element name="field" type="xs:string"
> >     dfdl:lengthKind="explicit" dfdl:length="9"
> >     dfdl:textPadKind="padChar"
> >     dfdl:textTrimKind="padChar"
> >     dfdl:textStringPadCharacter="%SP;"
> >     dfdl:textStringJustification="left"
> >     dfdl:nilKind="literalCharacter"
> >     dfdl:nilValue="%SP;"
> >     nillable="true" />
> >
> > should produce nil when the input is all spaces. The nilValue is not WSP*
> > or WSP+, it's SP, and the value is fixed length 9 chars, so it can *never*
> > be just one space, which is the nilValue. So I think this can never
> > produce a nil value.
> >
> > The rest of the email seems right tho. I just want to add this sort of
> > motivation.
> >
> > This is an interesting thing in DFDL. Sometimes you can't get out what you
> > want as XML because in DFDL, the primary requirement is to describe the
> > format of the data as it is. In your data representation, the field isn't
> > optional. It's mandatory. It has to be there and occupies 9 characters of
> > fixed length. So DFDL doesn't let you model this as an optional field on
> > purpose. The physical format often constrains the logical model in DFDL.
> >
> > DFDL's job is to describe the input format. Not so much to describe how to
> > transform it to what your preference is. That's really a job for other
> > tools.
> >
> > That said, tricks like what Steve suggested where you use choices to model
> > something as not an optional element, but an alternative of two things,
> > one of which is just syntax, the other of which is an element.... that's
> > the kind of thing you have to do to force it to produce what you prefer.
> > This sort of thing isn't really a trick. It's extensively used in many
> > formats.
> >
> > --------------------------------------------------------------------------------
> > *From:* Steve Lawrence <[email protected]>
> > *Sent:* Friday, February 21, 2020 9:15 AM
> > *To:* [email protected] <[email protected]>
> > *Subject:* Re: How to remove empty elements in the output XML?
> >
> > I think there might be a bug with nillable strings and padding when
> > padChar is the same as the nilValue. The following should result in a
> > nilled element when the data is all spaces but doesn't.
> >
> > <xs:element name="field" type="xs:string"
> >     dfdl:lengthKind="explicit" dfdl:length="9"
> >     dfdl:textPadKind="padChar"
> >     dfdl:textTrimKind="padChar"
> >     dfdl:textStringPadCharacter="%SP;"
> >     dfdl:textStringJustification="left"
> >     dfdl:nilKind="literalCharacter"
> >     dfdl:nilValue="%SP;"
> >     nillable="true" />
> >
> > However, if you want the element to not be in the infoset at all when
> > the data is all spaces, as opposed to a nilled element, you need a
> > different method. Keep in mind that something needs to parse those empty
> > strings if the element is missing.
> >
> > A technique that seems to work well (although it is maybe a bit messy)
> > is something like this:
> >
> > <xs:choice>
> >   <xs:sequence dfdl:initiator="%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;" />
> >   <xs:element name="field" ... />
> > </xs:choice>
> >
> > So in this case we first try to parse 9 spaces via an empty sequence
> > with an initiator. If that fails, then we try to parse those 9
> > characters as "field". So if 9 spaces were found then the field element
> > will not be in the infoset. Note that field should have minOccurs="1"
> > now--it's not optional since the optionality is handled by the sequence.
> >
> > This is a little messy, so I'd recommend defining some formats to make
> > it more clear, e.g.:
> >
> > <xs:annotation>
> >   <xs:appinfo source="http://www.ogf.org/dfdl/">
> >     <dfdl:defineFormat name="empty9">
> >       <dfdl:format initiator="%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;%SP;" />
> >     </dfdl:defineFormat>
> >   </xs:appinfo>
> > </xs:annotation>
> >
> > <xs:choice>
> >   <xs:sequence dfdl:ref="empty9" />
> >   <xs:element name="field" ... />
> > </xs:choice>
> >
> >
> > On 2/20/20 1:03 PM, Patrick Grandjean wrote:
> >> Hi,
> >>
> >> I am parsing a text format and some optional elements are encoded as
> >> empty strings or strings with space characters only. How to have these
> >> elements omitted in the output XML?
> >>
> >> The element is declared as:
> >>
> >> <xs:element name="field1" type="xs:string" minOccurs="0"
> >>     dfdl:length="9" dfdl:lengthKind="explicit" />
> >>
> >> Thanks to the properties textStringJustification="center" and
> >> textTrimKind="padChar", the parsed string is trimmed and the output XML
> >> looks like:
> >>
> >> <field1></field1>
> >>
> >> I have tried specifying properties emptyValueDelimiterPolicy, nilKind,
> >> nilValueDelimiterPolicy and nilValue but can't find a combination to
> >> have this element removed in the output XML.
> >>
> >> Is it possible? If yes, could you please show me how?
> >>
> >> Thanks,
> >> Patrick.
