> That says the input is an integer that has exactly 2 digits.

Not quite. That DFDL snippet says that Daffodil will consume two
characters in the current encoding (so it treats multi-byte encodings
how you would expect), and then converts that two-character string to an
integer based on the dfdl:textNumberPattern and other textNumber
properties. So there's really no concept of parsing a certain number of
digits.

Part of the reason for this is that when representation is text, the
fundamental unit really is a single character. Groupings of these
characters might be interpreted as numbers, or delimiters, or other
things, but at its core the basic unit is a single character (or a byte
or bit in some formats).

Additionally, thinking about numbers as just digits is pretty limiting
in many cases, since numbers are often more than just the 0-9 digits.
They can often include positive and negative signs, grouping separators,
decimal separators, exponent characters, infinity characters, null
representations, prefixes/suffixes, and on and on. As a simple example,
the string "-1" is a two-character number that has only a single digit.
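
To make that concrete, here is the element from the question with a
number pattern spelled out. The dfdl:textNumberRep and
dfdl:textNumberPattern values are assumptions added for illustration
(the original snippet presumably inherits them from a dfdl:format in
scope), and the xs and dfdl prefixes are assumed to be bound as usual:

```xml
<!-- Sketch only: the pattern "#0" is an assumed value, not from the original schema. -->
<xs:element name="DataEntry"
     type="xs:int"
     dfdl:lengthKind="explicit"
     dfdl:length="2"
     dfdl:lengthUnits="characters"
     dfdl:textNumberRep="standard"
     dfdl:textNumberPattern="#0"/>
<!-- Two characters are consumed first, then converted to a number.
     Input "42" should give 42; input "-1" should give -1, which is
     two characters but only one digit. -->
```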

So the use of lengthUnits="digits" would really only be useful with
unsigned integers with no grouping separators/exponents/prefixes/etc.
Although that might be somewhat common, it's really just
lengthUnits="characters" with type="xs:unsignedInt" and a
dfdl:textNumberPattern that only accepts digits and nothing else.
lengthUnits="digits" is really just a restriction of the more general
case using existing DFDL properties.
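
A sketch of that restriction, written with existing properties, might
look like the following. The pattern "00" and the
dfdl:textNumberCheckPolicy setting are assumptions about one way to
reject anything but digits; whether that is airtight depends on the
other textNumber properties in scope, so it is worth testing against
real data rather than taking as definitive:

```xml
<!-- Sketch only: "digits" expressed as characters plus a digits-only pattern. -->
<xs:element name="DataEntry"
     type="xs:unsignedInt"
     dfdl:lengthKind="explicit"
     dfdl:length="2"
     dfdl:lengthUnits="characters"
     dfdl:textNumberRep="standard"
     dfdl:textNumberCheckPolicy="strict"
     dfdl:textNumberPattern="00"/>
```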


On 8/12/19 2:34 PM, Costello, Roger L. wrote:
> Hello DFDL community,
> 
> I want to confirm my understanding of the following DFDL:
> 
> <xs:element name="DataEntry"
>      type="xs:int"
>      dfdl:lengthKind="explicit"
>      dfdl:length="2"
>      dfdl:lengthUnits="characters"/>
> 
> That says the input is an integer that has exactly 2 digits.
> 
> Right?
> 
> It seems kind of strange to say the length units are characters. It would be
> less strange if I could say the length units are “digits” but that, of course,
> is not legal. Any explanation of why length units of characters makes sense?
> 
> I reckon the above DFDL is kind of equivalent to the following XML Schema, 
> right?
> 
> <xs:element name="DataEntry">
> <xs:simpleType>
> <xs:restriction base="xs:int">
> <xs:length value="2"/>
> </xs:restriction>
> </xs:simpleType>
> </xs:element>
> 
