Mike,

Appreciate the suggested workaround. I incorporated and tested your snippet from Mar 16, 2022 at 12:21 PM [below], w/ the following anticipated results:

[1] Satellite numbers w/ leading whitespace(s) yield lossless unparse results.

[2] Satellite numbers w/ leading zero(s) or whitespace(s)+zero(s) yield unparse results that are 'numerically' equivalent. HOWEVER, the unparsed target ASCII file fails to compare w/ the parsed source ASCII file, because dfdl:textNumberPattern="####0" formatting trims the leading non-significant characters - see attached parse/source and unparse/target files.

NB: "00000" formatting yields the opposite result by trimming leading whitespace(s).

The issue/concern of 'lossless' parse/unparse processing is fundamental for our organization. We have no control over customers' [legacy] pre-/post-processes or over the format of input data. Ergo, lossless end-to-end data transformation is essential b/c if/when source and target data fail to compare, we're placed in the untenable position of explaining differences on a case-by-case basis.

The impetus/urgency for my questions below re: the ticket is that, for us, producing lossless end-to-end data transformation results via string processing with the suite of XPath functions is more important/suitable than yielding numerically 'equivalent' results.

Thx - Attila

On Fri, Mar 18, 2022 at 5:12 AM Attila Horvath <[email protected]> wrote:

> I know you're preparing to release 3.3.0.
>
> When do you think this issue might be resolved? Which point release are you
> targeting?
>
> On a related subject, Daffodil implements a subset of XPath functions.
> Might dev-team consider implementing all XPath functions in lieu of
> workarounds?
>
> Thx in advance - Attila
>
> On Wed, Mar 16, 2022 at 12:26 PM Mike Beckerle <[email protected]>
> wrote:
>
>> Created https://issues.apache.org/jira/browse/DAFFODIL-2676
>>
>> On Wed, Mar 16, 2022 at 12:21 PM Mike Beckerle <[email protected]>
>> wrote:
>>
>> > Ok, I found the attachment. Sorry for the delay.
>> >
>> > The challenge here is you are thinking the
>> > xs:unsignedInt(../Line1.02-Satellite) call will tolerate whitespace.
>> > Which it seems they do not.
>> >
>> > I think this is a Daffodil bug, as the constructors like xs:unsignedInt
>> > are supposed to work like they do in XPath, and the XPath functions spec
>> > says when converting from strings, that whitespace normalization applies -
>> > which trims all leading and trailing whitespace. It's less clear
>> > about whether interior whitespace is collapsed, but definitely
>> > leading/trailing seem to be trimmed.
>> >
>> > So I'll add a JIRA ticket about this.
>> >
>> > For how to work around, I suggest parsing the satellite field not as a
>> > string, but as an unsignedInt from the start.
>> >
>> > So like:
>> >
>> > <xs:element name="satellite-num-range" type="xs:unsignedInt"
>> > dfdl:lengthKind="explicit" dfdl:length="5"
>> > dfdl:textTrimKind="padChar" dfdl:textPadKind="padChar"
>> > dfdl:textNumberPadCharacter="%SP;" dfdl:textNumberJustification="right"
>> > dfdl:textNumberPattern="####0"/>
>> >
>> > I didn't run this, but I think this will remove leading spaces, and add
>> > leading spaces to your 5 character element.
>> >
>> > Another way to express this, since you need only leading padding, is this:
>> >
>> > <xs:element name="satellite-num-range" type="xs:unsignedInt"
>> > dfdl:lengthKind="explicit" dfdl:length="5"
>> > dfdl:textNumberPattern="* ####0"/>
>> >
>> > In that textNumberPattern the "* " means spaces are the pad character to
>> > be used, and when there is no digit for the position of a "#" then the pad
>> > character from the pattern (not the textNumberPadCharacter) is used.
>> >
>> > Both kinds of padding can be used together. E.g., you could have number
>> > text right justified in a fixed-length field of width 6, using "*" to pad
>> > to width 5 so that you can get " **123".
>> >
>> > <xs:element name="starPadNum" type="xs:unsignedInt"
>> > dfdl:lengthKind="explicit" dfdl:length="6"
>> > dfdl:textTrimKind="padChar" dfdl:textPadKind="padChar"
>> > dfdl:textNumberPadCharacter="%SP;" dfdl:textNumberJustification="right"
>> > dfdl:textNumberPattern="* ####0"/>
>> >
>> > I didn't run these, but this is, I believe, how it is supposed to work.
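
Illustration of the lossless concern raised at the top of this message: a minimal Python sketch, not Daffodil code, of a 5-character field that is parsed to a number and then unparsed. The helper names (parse_field, unparse_spaces, unparse_zeros) are hypothetical, and the functions only approximate the behavior reported for the "####0" (space-padded) and "00000" (zero-padded) patterns.

```python
def parse_field(text: str) -> int:
    """Approximate parsing a fixed-width numeric field: trim space padding,
    then convert to an integer (leading zeros carry no numeric meaning)."""
    return int(text.strip(" "))


def unparse_spaces(value: int, width: int = 5) -> str:
    """Approximate unparsing with space padding, right justified."""
    return str(value).rjust(width, " ")


def unparse_zeros(value: int, width: int = 5) -> str:
    """Approximate unparsing with zero padding."""
    return str(value).zfill(width)


if __name__ == "__main__":
    # Three source fields that are numerically equal but textually different.
    for source in ("  123", "00123", " 0123"):
        value = parse_field(source)
        spaces = unparse_spaces(value)
        zeros = unparse_zeros(value)
        print(f"source={source!r} "
              f"space-padded={spaces!r} lossless={spaces == source} "
              f"zero-padded={zeros!r} lossless={zeros == source}")
```

Under these assumptions each padding style reproduces only the inputs that already use that style, and a mixed field such as " 0123" is reproduced by neither. Once the text has been reduced to a number, the original padding is gone, which is why carrying the field as a string is the only way this sketch could round-trip every input.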
