Hi Amy,

What would you be looking for beyond what is already in our Wikipedia description?
https://en.wikipedia.org/wiki/Data_Format_Description_Language

One thing I would emphasize about DFDL is that most people think data parsing is the hard problem and data serialization is relatively simple. But if the serializer actually computes all the stored lengths for you, so that the user doesn't have to know anything about the actual format, then serialization turns out to be far more complex than parsing, particularly if you want streaming behavior.

I gave a talk about this at ApacheCon in 2018. The slides are available here:
https://s.apache.org/apacheconNA2018-dfdl

To date, the primary use case for DFDL is cybersecurity related. Data must be parsed, validated, and "unparsed" back to its original form in order to ensure that the data is, in fact, in that format and will not crash applications. The threat is not so much malware as "just bad data" causing denial of service.

For your study, I would suggest you also look at ASN.1 Encoding Control Notation (ECN). This has been an ISO standard since 2008. We normally think of ASN.1 as a prescriptive data format, but ECN extends it so that you specify the representation of the data. See:
https://en.wikipedia.org/wiki/Encoding_Control_Notation

I think it would be very helpful if a paper really compared and contrasted these approaches.

On Fri, Apr 19, 2024 at 10:24 AM Roberts, Amy L2 <[email protected]> wrote:

> Hello!
>
> I am working with a team on a tool, Awkward Kaitai, that gives people
> tools to work with binary data once that data has been described with a
> custom language. If you're interested in more details the project is
> currently hosted at
> https://github.com/ManasviGoyal/kaitai_struct_awkward_runtime and is
> meant to integrate with a larger project, https://kaitai.io/.
>
> I am writing because DFDL is a tool that solves a similar problem.
>
> We are currently writing a paper that provides examples of different
> custom-data problems in different domains and provides an overview of tools
> that help scientists work with such data and I wanted to reach out to the
> DFDL community to see if anyone would be interested in joining our paper as
> an author.
>
> I'd be delighted to have you contribute in any way you'd like as an
> author, and am particularly interested in having you:
>
> - Contribute a section about your tool
> - Show how your tool deals with a toy data file (I'm suggesting
> https://github.com/det-lab/dataReaderWriter/blob/master/kaitai/ksy/animal.ksy
> but would be happy to consider other options!)
> - Help identify any similar tools that we should include in our review
> - Help identify any use cases that we could include in our "Use Cases"
> section
>
> Thanks so much for your work in this area!
>
> Best,
>
> Amy
>
> Amy Roberts
>
> Assistant Professor of Physics
>
> [email protected]
>
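
As a rough illustration of the point in the reply above about stored lengths, here is a minimal Python sketch. The toy format (a 4-byte big-endian length in front of each item and in front of the whole message) and all function names are made up for illustration; this is not DFDL and not how any DFDL implementation works internally. It only shows why the parser can consume its input front to back, while the unparser has to buffer the whole body before it can emit the first byte.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or fail; a short read means truncated data."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("truncated input")
    return data


def parse_message(stream):
    """Parse: every length is stored ahead of the bytes it describes,
    so the parser always knows how much to read next."""
    (body_len,) = struct.unpack(">I", read_exact(stream, 4))
    body = io.BytesIO(read_exact(stream, body_len))
    items = []
    while body.tell() < body_len:
        (item_len,) = struct.unpack(">I", read_exact(body, 4))
        items.append(read_exact(body, item_len))
    return items


def unparse_message(items, out):
    """Unparse: the outer length comes first in the output, but its value
    is only known once every item has been serialized, so the body is
    buffered in memory instead of being streamed."""
    body = io.BytesIO()
    for item in items:
        body.write(struct.pack(">I", len(item)))
        body.write(item)
    encoded = body.getvalue()
    out.write(struct.pack(">I", len(encoded)))  # computed, not user-supplied
    out.write(encoded)


if __name__ == "__main__":
    buf = io.BytesIO()
    unparse_message([b"cat", b"", b"giraffe"], buf)
    buf.seek(0)
    print(parse_message(buf))
```

The caller passes only the items and never a length; the unparser derives both levels of length itself. With deeper nesting, each level needs its own deferred length, which is where the buffering cost and the conflict with streaming come from.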
