Here's a slide I use about performance to give people some expectations of what 
they should be able to achieve if calling Daffodil with a pre-compiled DFDL 
schema. I did this test using the Daffodil command-line tool.

This is for dense binary data. Mostly binary integers, flags, short strings.

For verbose textual data the Daffodil overhead will be higher, but the point 
here is that if something is taking about 2 seconds extra, that's not 
per-parse/unparse overhead. That's some expensive initialization being done 
repeatedly when it doesn't need to be.

[Inline image: performance slide]
________________________________
From: Sloane, Brandon <[email protected]>
Sent: Friday, August 30, 2019 10:47 AM
To: [email protected] <[email protected]>
Subject: Re: How to speed up DFDL processing?

Serializing data to XML (or JSON) will never give you optimal performance, and 
Daffodil has not been heavily optimized.

Having said that, there are 2 common sources of slowness that can be avoided 
by users:

1) Schema compilation. There is an ongoing effort to improve this. Users can 
mitigate it by precompiling schemas with `daffodil save-parser` and then 
parsing with `daffodil parse -P`. If using Daffodil as a library, you can also 
compile once at initialization and then reuse the compiled schema throughout 
the program's lifetime.

2) JVM startup time. There is not much Daffodil can do about this one. There 
are a couple of options for users:

a) Use the --stream option, which allows a single instance of Daffodil to parse 
a stream of messages
b) Use Daffodil as a library from a long-lived process
c) Use a third-party tool to speed up JVM startup time
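
To make the "compile once, then reuse" idea concrete, here is a minimal sketch 
of the pattern in Python. It does not use Daffodil's API: `StandInParser`, 
`get_parser`, `handle_message`, and the file name `example.dfdl.xsd` are all 
hypothetical, and the `time.sleep` call only simulates an expensive one-time 
step such as schema compilation. The point is just that a long-lived process 
can pay the setup cost on the first message and skip it afterwards.

```python
import time
from functools import lru_cache

COMPILE_CALLS = 0  # counts how often the expensive step actually runs


class StandInParser:
    """Hypothetical stand-in for a compiled parser (not Daffodil's API)."""

    def __init__(self, schema_path: str):
        self.schema_path = schema_path

    def parse(self, data: bytes) -> dict:
        # Placeholder "parse": a real parser would build an infoset here.
        return {"schema": self.schema_path, "length": len(data)}


@lru_cache(maxsize=None)
def get_parser(schema_path: str) -> StandInParser:
    """Build the parser on first use, then return the cached instance."""
    global COMPILE_CALLS
    COMPILE_CALLS += 1
    time.sleep(0.05)  # simulated one-time cost (assumed, for illustration)
    return StandInParser(schema_path)


def handle_message(schema_path: str, data: bytes) -> dict:
    # Every message after the first pays only the per-parse cost.
    return get_parser(schema_path).parse(data)


# Three messages against the same schema trigger only one "compilation".
results = [handle_message("example.dfdl.xsd", bytes(n)) for n in (4, 8, 16)]
```

If a dataflow instead builds the parser (or starts a new JVM) for every 
message, the fixed setup cost shows up as added latency on each one, which 
would look a lot like a constant delay per message.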
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Friday, August 30, 2019 9:55 AM
To: [email protected] <[email protected]>
Subject: How to speed up DFDL processing?


Hello DFDL community,

A project that is using DFDL reported this to me:

    A comment was made about the latency of
    using Daffodil.  One of the developers that
    is implementing DFDL said that they were
    seeing a 2 second increase in the latency for
    the dataflow using Daffodil.

How to respond to this? Is an addition of 2 seconds to a dataflow to be 
expected? How to make things run faster?

/Roger
