https://issues.apache.org/jira/browse/DAFFODIL-1971

I believe that I have the work for this pretty much wrapped up in this PR 
(other than rebasing onto main) - https://github.com/apache/daffodil/pull/987

Before finishing that, I figured it would be worth discussing how these changes 
can break existing schemas.  Schemas that have been using discriminator 
expressions with speculative choices can be particularly effected as 
discriminator expressions are now evaluated regardless of the success of 
parsing the rest of the choice branch.  Prior to this change the discriminator 
expression would only be evaluated if the rest of the choice branch parsed 
successfully.

3 out of the ~80 DFDL schema projects we have in our regression test suite are 
effected by this change.


  *
GIF
     *
This schema breaks because it uses discriminator expressions to test if Byte 
elements match some expected value.  Prior to this change the expression would 
only run if the rest of the choice branch successfully parsed and the Byte 
elements checked by the expression would match.  With this change a speculative 
choice branch will fail to parse but the discriminator expression will still be 
evaluated and fail to match the Byte elements.
     *
Suggested fix: Use choice dispatch based on these expected Byte elements and 
remove speculative parsing and discriminators altogether.
     *
https://github.com/DFDLSchemas/GIF/pull/8/commits/afb10382514753c5dbe6fb69f77e0ac6da6505ab


  *
Edifact
     *
This schema used discriminator expressions that always evaluated to true.  
Prior to this change this expression only evaluated when the rest of the choice 
branch parsed successfully.  With this change the expression will be run even 
if the choice branch fails, causing obvious issues.
     *
Suggested fix: Use a nested sequence to contain the choice branch body followed 
by another nested sequence for the discriminator expression.  This will cause 
the discriminator expression to only be ran if the the first nested sequence 
containing the choice branch body parses.
     *
https://github.com/DFDLSchemas/EDIFACT/pull/9/commits/476fe78e424e01580d5caf302a87aaaf703c609b


  *
JPEG2000 (Internal schema)
     *
This schema also uses discriminator expressions to check if a ID element 
matches expected values
     *
Suggested fix: Use choice dispatch based on the ID element and remove 
speculative parsing and discriminators altogether.

With these suggested changes all schemas in our regression suite pass with the 
changes for DAFFODIL-1971.  I'd like to get this merged into the next release 
if possible now that I have confirmed viable work-arounds for at least the 
problematic schemas that we are aware of.

Thoughts?

Reply via email to