Hi Steve,

I really like your explanation: "When you think about lengthKind="prefixed", 
it might help to think of it as syntactic sugar of this". That is very helpful.

> Maybe you already have lengthUnits defined
> in the default format, and when you added it
> to the simpleType you used the same value

I do have lengthUnits defined in the default format. However, I explicitly 
set lengthUnits=characters on the simpleType, and then lengthUnits=bytes on 
the simpleType, and observed no difference in behavior.

/Roger

-----Original Message-----
From: Steve Lawrence <[email protected]> 
Sent: Friday, February 7, 2020 12:01 PM
To: [email protected]
Subject: [EXT] Re: Why is dfdl:lengthUnits=bytes required with 
lengthKind=prefixed?

It looks like we might have a bug with dfdl:lengthUnits="characters".
I'm seeing odd behavior too. I'll take a look into this.

And you do need lengthUnits on both the simple type and the name element. Maybe 
you already have lengthUnits defined in the default format, and when you added 
it to the simpleType you used the same value, so it was effectively the same?

When you think about lengthKind="prefixed", it might help to think of it as 
syntactic sugar of this:

  <xs:sequence>
    <xs:element name="length" type="prefix-type" />
    <xs:element name="name"
      type="xs:string"
      dfdl:lengthKind="explicit"
      dfdl:length="{ ../length }"
      ... />
  </xs:sequence>

So it's really two separate parses. We first parse a "length" field that 
results in an integer, and it needs all the normal properties related to 
integer parsing, including length/lengthKind/lengthUnits/etc. We then parse the 
"name" field that uses the resulting number of the "length" field as the 
length. But that number is unitless (e.g. 8). We use the lengthUnits of "name" 
to determine how to interpret that number.

When you think about it like this, it may become clearer why lengthUnits is 
required for both the simple type and the "name" element: they really are two 
different elements with different properties.

Note that you can mix and match lengthUnits. So, for example, the prefix length 
could have lengthUnits="bits", and the element could have lengthUnits="bytes". 
So when we parse "8", we could interpret that as 8 bits or 8 bytes depending on 
the lengthUnits property of "name".
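
As a rough sketch of the prefixed form with lengthUnits spelled out in both 
places, assuming everything not shown (representation, encoding, etc.) comes 
from the default format, and using "bytes" in both places only for 
illustration:

  <!-- lengthUnits here says how to interpret the parsed prefix value
       (e.g. 8) when measuring the length of "name" -->
  <xs:element name="name"
    type="xs:string"
    dfdl:lengthKind="prefixed"
    dfdl:lengthUnits="bytes"
    dfdl:prefixLengthType="prefix-type"
    dfdl:prefixIncludesPrefixLength="no" />

  <!-- lengthUnits here says how to measure the prefix field itself,
       i.e. what dfdl:length="1" means -->
  <xs:simpleType name="prefix-type"
    dfdl:lengthKind="explicit"
    dfdl:length="1"
    dfdl:lengthUnits="bytes">
    <xs:restriction base="xs:integer" />
  </xs:simpleType>

The two lengthUnits values are independent of each other, so they do not have 
to match.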



On 2/7/20 11:14 AM, Costello, Roger L. wrote:
> Hi Folks,
> 
> My input is this:
> 
> 8John Doe
> 
> I used lengthKind=prefixed in my schema. See below. Notice on the 
> element declaration for "name" I specify dfdl:lengthUnits="bytes". I 
> originally specified "characters" instead of "bytes" and that resulted 
> in an error (unconsumed data). This is so counterintuitive. First, why 
> do I even need to specify lengthUnits on "name"? Second, although I 
> don't show it, I tried putting lengthUnits on the simpleType and it 
> didn't matter what value I assigned to lengthUnits ... I thought you 
> always had to specify lengthUnits whenever you specify 
> lengthKind=explicit ... apparently not. I couldn't find any 
> explanation of this in the specification.  /Roger
> 
> <xs:element name="input">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:element name="name" 
>               type="xs:string" 
>               dfdl:lengthKind="prefixed" 
>               dfdl:lengthUnits="bytes" 
>               dfdl:prefixLengthType="prefix-type" 
>               dfdl:prefixIncludesPrefixLength="no"/>
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
> 
> <xs:simpleType name="prefix-type" 
>               dfdl:lengthKind="explicit"
>               dfdl:length="1">
>     <xs:restriction base="xs:integer" />
> </xs:simpleType>
> 
> 
