Hi Amy,

What would you be looking for beyond what is already in our Wikipedia
description https://en.wikipedia.org/wiki/Data_Format_Description_Language ?

One thing I would emphasize about DFDL: most people think parsing data is
the hard problem and serializing it is relatively simple. But if the
serializer actually computes all the stored lengths for you, so that the
user doesn't have to know anything about the actual format, then
serialization turns out to be far more complex than parsing, particularly
if you want streaming behavior. I gave a talk about this at ApacheCon in
2018. The slides are available here:
https://s.apache.org/apacheconNA2018-dfdl
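
To make the asymmetry concrete, here is a minimal sketch. It assumes a
hypothetical toy format: a 2-byte big-endian body length, followed by items
that each carry a 1-byte length prefix. The function names are made up for
illustration, and this is not DFDL or Daffodil code.

```python
import struct
from io import BytesIO


def parse_message(stream):
    """Parse: each stored length says exactly how many bytes to read next."""
    (body_len,) = struct.unpack(">H", stream.read(2))
    body = BytesIO(stream.read(body_len))
    names = []
    while body.tell() < body_len:
        (item_len,) = struct.unpack(">B", body.read(1))
        names.append(body.read(item_len).decode("utf-8"))
    return names


def unparse_message(names):
    """Unparse: each length prefix depends on data that comes after it."""
    parts = []
    for name in names:
        encoded = name.encode("utf-8")
        if len(encoded) > 255:
            raise ValueError("item does not fit a 1-byte length prefix")
        # The stored length is the encoded byte count, not the character count.
        parts.append(struct.pack(">B", len(encoded)) + encoded)
    body = b"".join(parts)
    # The body length is the first field in the output, but it is only known
    # once the whole body has been encoded, so nothing can be emitted earlier.
    return struct.pack(">H", len(body)) + body
```

The parser can consume its input front to back. The unparser has to hold
the whole body before it can write the first two bytes, which is where the
trouble with streaming starts.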

To date, the primary use case for DFDL has been cybersecurity. Data must be
parsed, validated, and "unparsed" back to its original form in order to
ensure that the data is, in fact, in that format and will not crash
applications. The threat is not so much malware as "just bad data" causing
denial-of-service.

For your study, I would suggest you also look at ASN.1 Encoding Control
Notation (ECN). This has been an ISO standard since 2008. We normally think
of ASN.1 as a prescriptive data format, but ECN extends it so that you can
specify the representation of the data. See:
https://en.wikipedia.org/wiki/Encoding_Control_Notation

I think it would be very helpful if a paper really compared/contrasted
these approaches.


On Fri, Apr 19, 2024 at 10:24 AM Roberts, Amy L2 <[email protected]>
wrote:

> Hello!
>
> I am working with a team on a tool, Awkward Kaitai, that gives people
> tools to work with binary data once that data has been described with a
> custom language.  If you're interested in more details the project is
> currently hosted at
> https://github.com/ManasviGoyal/kaitai_struct_awkward_runtime and is
> meant to integrate with a larger project, https://kaitai.io/.
>
> I am writing because DFDL is a tool that solves a similar problem.
>
> We are currently writing a paper that provides examples of different
> custom-data problems in different domains and provides an overview of tools
> that help scientists work with such data and I wanted to reach out to the
> DFDL community to see if anyone would be interested in joining our paper as
> an author.
>
> I'd be delighted to have you contribute in any way you'd like as an
> author, and am particularly interested in having you:
>
> - Contribute a section about your tool
> - Show how your tool deals with a toy data file (I'm suggesting
> https://github.com/det-lab/dataReaderWriter/blob/master/kaitai/ksy/animal.ksy
> but would be happy to consider other options!)
> - Help identify any similar tools that we should include in our review
> - Help identify any use cases that we could include in our "Use Cases"
> section
>
> Thanks so much for your work in this area!
>
> Best,
>
> Amy
>
> Amy Roberts
>
> Assistant Professor of Physics
>
> [email protected]
>
