Mike

Appreciate the suggested workaround. I did incorporate/test your snippet
per Mar 16, 2022 at 12:21 PM [below] w/ following anticipated results:

[1] Satellite numbers w/ leading whitespace(s) yield lossless unparse
results.
[2] Satellite numbers w/ leading zero(s) or whitespace(s)+zero(s) yield
unparse results that are 'numerically' equivalent |HOWEVER| the unparsed
target ASCII file fails to compare w/ the parsed source ASCII file due to
<<<dfdl:textNumberPattern="####0">>> formatting that trims leading
(numerically insignificant) zero(s) - see attached parse/source and
unparse/target files.
NB "00000" formatting yields opposite results by trimming leading
whitespace(s).
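
For illustration, the two cases can be mimicked outside Daffodil with a
minimal Python sketch. This is only a rough simulation of the pad/trim
behavior described above, not Daffodil code; the helper names (parse_num,
unparse_hash_pattern, unparse_zero_pattern) are hypothetical.

```python
# Simulation only: mimics a 5-character, right-justified numeric field.
# Helper names are hypothetical and are not part of Daffodil.

WIDTH = 5


def parse_num(text: str) -> int:
    """Parse the field as a number; leading spaces and zeros carry no value."""
    return int(text.strip())


def unparse_hash_pattern(value: int) -> str:
    """Roughly what "####0" with space padding produces: left-pad with spaces."""
    return str(value).rjust(WIDTH, " ")


def unparse_zero_pattern(value: int) -> str:
    """Roughly what "00000" produces: left-pad with zeros."""
    return str(value).zfill(WIDTH)


for source in ["  123", "00123", " 0123"]:
    value = parse_num(source)
    print(
        repr(source), "->", value,
        "| ####0:", repr(unparse_hash_pattern(value)),
        "| 00000:", repr(unparse_zero_pattern(value)),
    )
```

Once the field is reduced to a number, neither pattern can reproduce all
three source forms; each one restores only its own style of padding.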

The issue/concern of 'lossless' parse/unparse processing for our
organization is fundamental. Our organization has no control over
customers' [legacy] pre-/post- processes |AND| the format of input data.
Ergo lossless end-to-end data transformation is essential b/c if/when
source/target data fail to compare, we're placed in the untenable position
of explaining differences on a case-by-case basis.

The impetus/urgency for my questions below re: ticket is that producing
lossless end-to-end data transformation results via string processing with a
suite of XPATH functions is more important/suitable than yielding
numerically 'equivalent' results.
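
A minimal Python sketch of the string-processing idea, under the assumption
that the field is carried through as its original text and a number is
derived from it only when needed. Again a simulation, not Daffodil or XPATH;
the function names are hypothetical.

```python
# Simulation only: the field is kept as its original 5-character string,
# and a numeric value is derived from it on demand.
# Function names are hypothetical and are not part of Daffodil.


def parse_as_string(text: str) -> str:
    """Keep the raw field text, including leading spaces and zeros."""
    return text


def numeric_value(raw: str) -> int:
    """Derive the number for validation or comparison; the raw text is untouched."""
    return int(raw.strip())


def unparse_as_string(raw: str) -> str:
    """Write back exactly what was read."""
    return raw


for source in ["  123", "00123", " 0123"]:
    raw = parse_as_string(source)
    print(repr(source), "->", numeric_value(raw), "->", repr(unparse_as_string(raw)))
```

All three source forms map to the same number yet unparse byte-for-byte
identical to the input, which is the property we need.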

Thx - Attila

On Fri, Mar 18, 2022 at 5:12 AM Attila Horvath <[email protected]>
wrote:

> I know you're preparing to release 3.3.0.
>
> When do you think this issue might be resolved? Which point release are you
> targeting?
>
> On a related subject, Daffodil implements a subset of XPATH functions.
> Might dev-team consider implementing all XPATH functions in lieu of
> workarounds?
>
> Thx in advance - Attila
>
> On Wed, Mar 16, 2022 at 12:26 PM Mike Beckerle <[email protected]>
> wrote:
>
>> Created https://issues.apache.org/jira/browse/DAFFODIL-2676
>>
>> On Wed, Mar 16, 2022 at 12:21 PM Mike Beckerle <[email protected]>
>> wrote:
>>
>> > Ok, I found the attachment. Sorry for the delay.
>> >
>> > The challenge here is you are thinking the
>> > xs:unsignedInt(../Line1.02-Satellite) call will tolerate whitespace,
>> > which it seems it does not.
>> >
>> > I think this is a Daffodil bug, as the constructors like xs:unsignedInt
>> > are supposed to work like they do in XPath, and the XPath functions spec
>> > says that, when converting from strings, whitespace normalization
>> > applies, which trims all leading and trailing whitespace. It's less clear
>> > about whether interior whitespace is collapsed, but definitely
>> > leading/trailing seem to be trimmed.
>> >
>> > So I'll add a JIRA ticket about this.
>> >
>> > For how to work around, I suggest parsing the satellite field not as a
>> > string, but as an unsignedInt from the start.
>> >
>> > So like:
>> >
>> > <xs:element name="satellite-num-range" type="xs:unsignedInt"
>> > dfdl:lengthKind="explicit" dfdl:length="5"
>> >   dfdl:textTrimKind="padChar" dfdl:textPadKind="padChar"
>> > dfdl:textNumberPadCharacter="%SP;" dfdl:textNumberJustification="right"
>> >   dfdl:textNumberPattern="####0"/>
>> >
>> > I didn't run this, but I think this will remove leading spaces on parse,
>> > and add leading spaces on unparse, for your 5 character element.
>> >
>> > Another way to express this, since you need only leading padding, is
>> > this:
>> >
>> > <xs:element name="satellite-num-range" type="xs:unsignedInt"
>> > dfdl:lengthKind="explicit" dfdl:length="5"
>> >   dfdl:textNumberPattern="* ####0"/>
>> >
>> > In that textNumberPattern the "* " means spaces are the pad character to
>> > be used, and when there is no digit for the position of a "#" then the
>> > pad character from the pattern (not the textNumberPadCharacter) is used.
>> >
>> > Both kinds of padding can be used together. E.g., you could have number
>> > text right justified in a fixed-length field of width 6, using "*" to
>> > pad to width 5, so that you can get " **123".
>> >
>> > <xs:element name="starPadNum" type="xs:unsignedInt"
>> > dfdl:lengthKind="explicit" dfdl:length="6"
>> >   dfdl:textTrimKind="padChar" dfdl:textPadKind="padChar"
>> > dfdl:textNumberPadCharacter="%SP;" dfdl:textNumberJustification="right"
>> >   dfdl:textNumberPattern="**####0"/>
>> >
>> > I didn't run these, but this is, I believe, how it is supposed to work.
>> >
>> >
>> >
>>
>>
