Hi Steve,

Thanks for the quick reply, that appears to be exactly what I'm looking for! Is 
there any chance you could try sending me the example.dfdl.xsd file again? The 
attachment didn't seem to make it through correctly.


-Brett

________________________________
From: Steve Lawrence <[email protected]>
Sent: Wednesday, September 12, 2018 1:04:39 PM
To: [email protected]; Gedvilas, Brett L2
Subject: Re: DFDL Schema help

Hi Brett,

The recent 2.2.0 release adds a feature that does just what you need,
called "data layering". It's not officially part of the DFDL spec, but
the proposal is found here:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75979671

Essentially, what you'll want to do is specify a data layer transform
of "fourbyteswap" on your data. This layer transform swaps the bytes of
each 4 byte chunk for the given length of data, effectively making them
big-endian-like. You can then parse the individual fields using a
bigEndian byteOrder and explicit bit lengths. I've attached an example
schema that parses your 4 bytes of example data to give you an idea of
what such a schema would look like.
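
To make the transform concrete, here is a rough Python sketch (not Daffodil
code, and not the attached schema) of what a four-byte swap followed by
big-endian bit-field parsing amounts to. The names four_byte_swap and
read_fields are made up for illustration, and the field widths (4, 16 and 12
bits) are taken from the example in the original question.

```python
def four_byte_swap(data: bytes) -> bytes:
    """Reverse the byte order inside each 4-byte chunk of data."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


def read_fields(data: bytes, widths):
    """Read consecutive unsigned bit fields, most significant bit first."""
    value = int.from_bytes(data, "big")
    remaining = len(data) * 8
    fields = []
    for width in widths:
        remaining -= width
        fields.append((value >> remaining) & ((1 << width) - 1))
    return fields


raw = bytes([0x01, 0x20, 0x00, 0x90])   # bytes as they sit in memory
widths = [4, 16, 12]                     # assumed field widths from the example

naive = read_fields(raw, widths)         # reading the raw stream directly
swapped = four_byte_swap(raw)            # bytes become 90 00 20 01
a, b, c = read_fields(swapped, widths)   # fields after the swap
```

Reading the raw bytes directly reproduces the incorrect 0x0, 0x1200 and 0x090
from the original question, while reading after the swap gives 9, 2 and 1.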

The data in data.bin is:

  0x01 20 00 90

To parse with the daffodil CLI, you can run:

  daffodil parse -s example.dfdl.xsd data.bin

The resulting XML infoset should be:

  <Data>
    <a>9</a>
    <b>2</b>
    <c>1</c>
  </Data>
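
For comparison, the mask-and-shift approach described in the original question
can be used to cross-check those values outside of DFDL. This is only a sketch:
the function name unpack_word is hypothetical, and the shifts assume the 4/16/12
bit layout from the example, with the 32-bit word stored little-endian.

```python
import struct


def unpack_word(raw: bytes):
    """Split one little-endian 32-bit word into 4-, 16- and 12-bit fields."""
    (word,) = struct.unpack("<I", raw)   # interpret 4 bytes as little-endian uint32
    a = (word >> 28) & 0xF               # top 4 bits
    b = (word >> 12) & 0xFFFF            # middle 16 bits
    c = word & 0xFFF                     # low 12 bits
    return a, b, c


fields = unpack_word(bytes([0x01, 0x20, 0x00, 0x90]))
```

With the example bytes this yields the same a, b and c as the infoset above.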

- Steve


On 09/12/2018 02:14 PM, Gedvilas, Brett L2 wrote:
> Hi everyone,
>
>
> I am a new daffodil user and I was looking for input on a DFDL schema
> definition I'm trying to create. I'm working with some binary physics data,
> the format of which can loosely be described as fields that are aggregated
> together and packed into a single 32-bit integer before being written to
> memory. The gist of the issue is that because not all fields fall nicely on
> 1-byte divisions, different pieces of a field will get jumbled if you read
> the data as a linear stream from memory. This is best illustrated by a simple
> example:
>
>
> Consider the following 32-bit hex value: 0x90 00 20 01
>
>
> The problem arises because the values that have meaning in context are 0x9
> (consisting of 4 bits), 0x0002 (16 bits), and finally 0x001 (12 bits).
>
>
> When this value gets stored in memory on a little endian architecture we see
> the following: 0x01 20 00 90. Trying to read those bit sequences as a stream
> from memory will yield 0x0, 0x1200, and 0x090, which are clearly incorrect.
>
>
> The simplest approach I can envision is to read in the value as an entire
> 32-bit value and then perform some processing via masks/bit shift in order to
> extract the correct values. Is there a more straightforward solution to this
> problem? Or does anyone have experience or insights solving this issue using
> daffodil?
>
>
> Thanks!
>
>
> Brett
