Hi Mike,
Thank you! That is very helpful.
I added all the things you suggested. See below. Unfortunately, that just
resulted in stripping off the rightmost null (hex 0) symbols, leaving this:
<field-name>marker-col?C????xFE</field-name>
Someone on StackOverflow says that C indicates "character field" and xFE
indicates 254 bytes. I'm not sure that that is true, however.
What I desire (I think) is this:
<field-name>marker-col</field-name>
Suggestions?
<xs:element name="field-name"
type="xs:string"
dfdl:length="32"
dfdl:lengthKind="explicit"
dfdl:lengthUnits="characters"
dfdl:textTrimKind="padChar"
dfdl:textStringPadCharacter="%NUL;"
dfdl:textStringJustification="left"/>
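
For what it's worth, the result above is what plain right-side trimming would
give. A rough Python sketch, assuming trimming only removes pad characters
from the right end of a left-justified string (this is just an emulation for
illustration, not how Daffodil implements textTrimKind; the variable names are
made up):

```python
# The 32 bytes of the field, copied from the hex dump in the original question.
raw = bytes.fromhex(
    "6D 61 72 6B 65 72 2D 63 6F 6C 00 43 00 00 00 00"
    "FE 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

# Emulate right-side trimming of the NUL pad character: only the trailing
# run of NULs is removed, so the NULs in the middle of the field survive.
trimmed = raw.rstrip(b"\x00")

# trimmed is b'marker-col\x00C\x00\x00\x00\x00\xfe' (17 bytes), which is the
# "marker-col?C????xFE" seen in the infoset.
```

Trimming stops at the first non-pad byte it meets from the right (the FE), so
it never gets as far as the NUL that follows "marker-col".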
From: Mike Beckerle <[email protected]>
Sent: Monday, October 1, 2018 11:02 AM
To: [email protected]
Subject: Re: How to declare a string element where the string stops at the
first null (hex 0) symbol?
Hi Roger,
Looks like you are looking to create a 32-byte long element with NUL "padding".
Question: Is there always at least one NUL at the end, or can a field name use
up all 32 bytes with non-NUL characters? I'm going to guess here (because it's
more common in data I've seen), that a field name that occupies all 32 bytes
would not have a NUL at all.
In that case this is fixed-length data (dfdl:lengthKind="explicit"), and the
properties that do what you want are for "padding/trimming", in section 13.2
of the DFDL spec and, as this is a string element (not a number or boolean),
section 13.4.
textTrimKind (used for parsing)
textPadKind (used for unparsing)
textStringJustification (which side the text is padded/trimmed on, or "center"
justified)
textStringPadCharacter="%NUL;" (note: must use DFDL Entity to represent this.)
truncateSpecifiedLengthString (if string is too long on unparse - chop it, or
is it an error?)
These names seem bulky, but they let a single DFDL format have left-justified
text strings and right-justified text numbers at the same time, since it is
very common for elements to need different justification directions.
A note for when you are testing: the DFDL spec requires that, on unparse, the
padding/fill area after the data be filled with the pad character. So data like
your example will not "round trip", as the junk that is there won't be preserved.
If you create a TDML test, you will need to set roundTrip="twoPass" to get it
to compare the infoset after re-parsing the data it unparsed.
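
To make the round-trip point concrete, here is a small Python sketch. It only
illustrates the padding behaviour described above and is not Daffodil code;
`unparse_field` and `FIELD_LEN` are hypothetical names, and it assumes the
parsed value is "marker-col":

```python
FIELD_LEN = 32  # fixed field length in bytes, per the dBase description

def unparse_field(name: str) -> bytes:
    """Hypothetical helper: left-justify the name and fill the rest with NUL."""
    return name.encode("ascii").ljust(FIELD_LEN, b"\x00")

# The original 32 bytes from the sample file, including the junk after the NUL.
original = bytes.fromhex(
    "6D 61 72 6B 65 72 2D 63 6F 6C 00 43 00 00 00 00"
    "FE 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

rewritten = unparse_field("marker-col")

# Same length and same name, but the 'C' and 0xFE bytes are gone, so a
# byte-for-byte comparison with the original data fails.
same_bytes = rewritten == original  # False
```

Re-parsing `rewritten` still yields the same name, which is why comparing the
infoset after a second parse passes where comparing the bytes does not.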
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, October 1, 2018 10:27 AM
To: [email protected]
Subject: How to declare a string element where the string stops at the first
null (hex 0) symbol?
Hi Folks,
I am working on a DFDL schema for parsing dBase files.
One of its fields is "Field Name". The dBase specification says this about that
field:
Field name in ASCII, zero-filled, 32 bytes.
I have a sample dBase file with this hex value for field name:
6D 61 72 6B 65 72 2D 63 6F 6C 00 43 00 00 00 00
FE 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
My hex editor also displays the hex as text:
marker-col.C....þ...............
I believe the actual field name is "marker-col" and the rest is garbage. (I
have this belief because I have a dBase tool and it displays "marker-col")
How do I declare, in DFDL, that the element's value is, "The text up to, but
not including, the first null (hex 0) symbol; discard the null symbol and all
the following hex digits"?
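
To pin down the behaviour being asked for, here is a minimal Python sketch of
the rule (not DFDL; the function name `text_before_first_nul` is made up, and
it assumes a field with no NUL at all should be returned whole):

```python
def text_before_first_nul(field: bytes, encoding: str = "ascii") -> str:
    """Return the text before the first NUL byte, or the whole field if none."""
    head, _sep, _rest = field.partition(b"\x00")
    return head.decode(encoding)

# The 32-byte sample field from the hex dump above.
sample = bytes.fromhex(
    "6D 61 72 6B 65 72 2D 63 6F 6C 00 43 00 00 00 00"
    "FE 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

name = text_before_first_nul(sample)  # "marker-col"
```

Everything after the first NUL, including the 'C' and the FE byte, is ignored
rather than decoded.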
/Roger