Some comments:

1) I like the idea that the layers write to a variable, but it seems
like the variable names are hard-coded in the layer transformer? What are
your thoughts on having the variable defined in a property so that the
user has more control over its naming/definition, maybe via
something like dfdlx:runtimeProperties? For example:

  <xs:sequence dfdlx:layerTransform="checksum"
      dfdlx:runtimeProperties="resultVariable=checksumPart1">...

2) For the IPv4 layer, it feels a bit unfortunate to have to split the
checksum into two separate layers, since the algorithm is really just a
checksum over the whole header with the checksum field itself treated as
if it were zero. Is it possible to have a property that just specifies
that the Nth byte doesn't contribute? Maybe something like:

  <xs:sequence dfdlx:layerTransform="checksum"
      dfdlx:runtimeProperties="ignoreByte=5">...
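
To make the "treated as if it were zero" idea concrete, here is a minimal Python sketch of the IPv4 header checksum computed in a single pass. It is not Daffodil code; the function and constant names are made up for illustration. It relies on the fact that the checksum field occupies bytes 10 and 11 of the IPv4 header, i.e. the sixth 16-bit word.

```python
import struct

# Byte offset of the 16-bit header checksum field within an IPv4 header.
CHECKSUM_OFFSET = 10


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement checksum over the header, skipping the checksum field.

    Hypothetical helper for illustration only. Skipping the field is
    equivalent to treating it as zero, so no second layer is needed.
    """
    if len(header) < 20 or len(header) % 4 != 0:
        raise ValueError("IPv4 header must be at least 20 bytes, in 4-byte units")

    total = 0
    for offset in range(0, len(header), 2):
        if offset == CHECKSUM_OFFSET:
            continue  # the checksum field contributes nothing to the sum
        (word,) = struct.unpack_from("!H", header, offset)  # big-endian 16-bit word
        total += word

    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


# A commonly cited example header; its checksum field holds 0xB861.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(hex(ipv4_header_checksum(sample)))  # 0xb861
```

Because the field is skipped rather than read, the same function works for both parsing (verify against the stored value) and unparsing (compute the value to store).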


3) As for implementing the checksums, have you put any thought into
making that extensible? For example, I'm wondering whether we could have
only a single "checksum" layer, with dfdlx:runtimeProperties determining
which algorithm to use. E.g.

  <xs:sequence dfdlx:layerTransform="checksum"
      dfdlx:runtimeProperties="algorithm=crc32">...

  <xs:sequence dfdlx:layerTransform="checksum"
      dfdlx:runtimeProperties="algorithm=ipv4header">...

People could then register different checksum algorithms without
having to implement their own layer. Or maybe we keep it simple and
the default checksum layer just supports a handful of the most common
checksums (maybe those supported by some preexisting checksum library).

People could still implement their own pluggable checksum layer if they
need something we don't support, but this would cover the most common
cases and avoid a proliferation of layers that are basically the same
except for some minor algorithm details.
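
As a rough illustration of the single-layer idea, here is a Python sketch of a registry keyed by the "algorithm" value. Everything in it is hypothetical: the registry, the function names, and the assumption that dfdlx:runtimeProperties holds whitespace-separated key=value pairs. The built-in entries just wrap Python's standard zlib module, standing in for "some preexisting checksum library".

```python
import zlib
from typing import Callable, Dict

# Hypothetical registry: algorithm name -> function from bytes to an integer checksum.
CHECKSUM_ALGORITHMS: Dict[str, Callable[[bytes], int]] = {}


def register_checksum(name: str, func: Callable[[bytes], int]) -> None:
    """Register an algorithm so the single "checksum" layer can find it by name."""
    CHECKSUM_ALGORITHMS[name] = func


def parse_runtime_properties(text: str) -> Dict[str, str]:
    """Parse 'key=value key2=value2' (an assumed syntax) into a dict."""
    props: Dict[str, str] = {}
    for pair in text.split():
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed property: {pair!r}")
        props[key] = value
    return props


def compute_checksum(runtime_properties: str, data: bytes) -> int:
    """Look up the algorithm named in the properties and apply it to the data."""
    props = parse_runtime_properties(runtime_properties)
    name = props.get("algorithm")
    if name not in CHECKSUM_ALGORITHMS:
        raise KeyError(f"unknown checksum algorithm: {name!r}")
    return CHECKSUM_ALGORITHMS[name](data)


# A handful of common algorithms supplied by default.
register_checksum("crc32", lambda data: zlib.crc32(data) & 0xFFFFFFFF)
register_checksum("adler32", lambda data: zlib.adler32(data) & 0xFFFFFFFF)

# A user-registered algorithm: no new layer needed, just a new registry entry.
register_checksum("sum8", lambda data: sum(data) & 0xFF)

print(hex(compute_checksum("algorithm=crc32", b"123456789")))  # 0xcbf43926
```

The point of the sketch is only that adding an algorithm is a one-line registration rather than a whole new layer implementation.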


On 7/30/21 2:29 PM, Beckerle, Mike wrote:
> I would like comments on the layering enhancement to enable checksum 
> computations in DFDL schemas.
> 
> 
> This is a high-priority feature for Daffodil's next release 3.2.0, especially 
> for cybersecurity applications of Daffodil, which I know a number of us are 
> involved in.
> 
> 
> I've produced a mock-up of how it would look, with lots of annotations in a
> WIP pull request on the ethernetIP DFDL schema. I only did the mock-up for the
> IPV4 element, so look at that element in the ethernetIP.dfdl.xsd.
> 
> (UDP and TCP packets have their own additional checksums - I didn't mock up 
> those, just IPV4)
> 
> 
> This is at https://github.com/DFDLSchemas/ethernetIP/pull/1 
> 
> 
> This doesn't run, it's just an initial mock-up of the ideas for 
> checksum/CRC/parity recomputation capability as a further simple extension of 
> the existing DFDL layering extension.
> 
> 
> The layering extension itself is described here:
> 
> https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Data+Layering+for+Base64%2C+Line-Folding%2C+Compression%2C+Etc
> 
> 
> I did notice that none of the published DFDLSchemas actually use the layering 
> transforms that we've built into Daffodil. There are some non-public DFDL 
> schemas that do use this extension to do line-folding transformations.
> 
> 
> There are, however, tests showing the DFDL layering extension in daffodil's
> code base. See
> 
> https://github.com/apache/daffodil/blob/master/daffodil-test/src/test/resources/org/apache/daffodil/layers/layers.tdml
> and search for dfdlx:layerTransform property.
> 
> 
> The mock-up effectively proposes allowing layer transforms to read and write
> DFDL variables, as a means of them accepting input parameters, and as the
> means of them computing and returning output results.
> 
> 
> I plan to do a couple other mock-ups of a check-digit calculation, and some 
> parity bit computations, but this IPV4 is enough to get the gist of the idea.
> 
> 
> I'd appreciate feedback on this, which you can do on the pull request in the 
> usual github code review manner.
> 
> 
> -mikeb
> 
> 
> 
> 
> Mike Beckerle | Principal Engineer
> 
> mbecke...@owlcyberdefense.com <mailto:bhum...@owlcyberdefense.com>
> 
> P +1-781-330-0412
> 
