I'm not saying I completely disagree, but there are more and more DFDL
schemas to consider and many to which we have no visibility.

For example, at one company I was told there are over 300 DFDL schemas for
major aircraft systems. All are on secure networks where we have no access,
and I was told that nobody is going to downgrade those so we can see them
or run regression testing on them. They expect this to grow to 600+ DFDL
schemas.

So I think we need to consider backward compatibility more and more heavily
over time. The fact that we've done breaking changes before with sufficient
doc is not necessarily enough any longer.

Going forward breaking changes like this need to be carefully phased.
We may need, for example, to send out preliminary announcements on the
users list, for example, outlining release plans and anticipating
compatibility issues.


On Tue, May 6, 2025 at 1:24 PM Steve Lawrence <slawre...@apache.org> wrote:

> I feel like as long as the release notes are clear about how to maintain
> compatibility, it's not unreasonable to put this in a 3.11.0 release.
> We've
> changed behavior of many things before (especially when it's related to
> fixing
> non-spec compliant bugs, which technically this is) as long as there were
> relatively simple workarounds that were clearly explain in release notes
> or we
> didn't think many schemas relied on the old behavior.
>
> In this case, the release can be something like:
>
> To maintain the previous behavior, change choice branches with
> discriminators
> like this:
>
>    <choice>
>      <sequence ... >
>        <!-- discriminator -->
>        <!-- content -->
>      </sequence>
>      ...
>    </choice>
>
> To this:
>
>    <choice>
>      <sequence>
>        <sequence ...>
>          <!-- content -->
>        </sequence>
>        <sequence>
>          <!-- discriminator -->
>        </sequence>
>      </sequence>
>      ...
>    </choice>
>
> There are usually better ways to rewrite a schema, like using choice
> dispatch
> instead of speculative parsing, but this is a generic and relatively
> simple
> solution for all schemas that do this. We should also add a warning if a
> discriminator is placed directly on a choice branch, since usually what a
> schema
> author intended was to evaluate the discriminator at the beginning of the
> branch
> and it should simply be wrapped in a sequence.
>
> Though, I'm not against a toggle flag if we really want to maintain
> backwards
> compatibility with the old broken behavior, and then 4.0.0 can toggle the
> switch
> to the correct behavior.
>
> If we still think this requires a major version bump, I would suggest we
> just
> wait until 4.0.0 with the API, scala 3, and other large breaking changes.
> Doing
> a single 4.0.0 release for this and then another major version bump to
> 5.0.0 for
> the API and scala 3 changes seems unnecessary. I also don't know of any
> schemas
> really waiting on this fix, so we don't really need to rush it. We're just
> trying to merge this since it's been in the merge queue for a long time.
>
>
> On 2025-05-06 12:42 PM, Mike Beckerle wrote:
> > Plan on a rather large release-note section, perhaps even a whole
> > daffodil-site page of its own to discuss the changes required to schemas
> > due to this bug-fix.
> >
> > This will need a substantial worked example, discussion of the
> > false-positives problem where malformed data might be masked if a branch
> > which should have been positively discriminated instead backtracks and a
> > subsequent branch of that choice is somehow satisfied by default, because
> > the schema was written with assumption that the discriminator would be
> > evaluated, but it wasn't. This is a pretty big deal.
> >
> > Since this is an incompatible breaking change we have to think about
> > version numbering here. Last chatter I heard we are planning for a
> 3.11.0,
> > then 4.0.0.
> > But 3.11 is just scala version change, and 4.0.0 is Scala 3 and required
> > API changes, but neither of these have any a functional change to schemas
> > and DFDL language behavior. (I think. Correct me if I am wrong.)
> >
> > Does this incompatible breaking change require us to then create a
> release
> > 5.0.0 or do we want to muddy the waters of 4.0.0 by adding this
> functional
> > change in as well?
> >
> > Also, we should think about if we want to have a switch on the fix for
> this
> > behavior change which defaults to "fixed" behavior, but the release note
> > would tell people to either fix their schemas or turn on the "beforeFix"
> > behavior with the switch. Or we could introduce this fix with the switch
> > set to "beforeFix" behavior, but then flip the default when we're ready
> to
> > release daffodil version 5.0.0?
> >
> >
> >
> >
> >
> >
> >
> >
> > On Tue, May 6, 2025 at 11:57 AM Adams, Joshua
> > <jad...@owlcyberdefense.com.invalid> wrote:
> >
> >> https://issues.apache.org/jira/browse/DAFFODIL-1971
> >>
> >> I believe that I have the work for this pretty much wrapped up in this
> PR
> >> (other than rebasing onto main) -
> >> https://github.com/apache/daffodil/pull/987
> >>
> >> Before finishing that, I figured it would be worth discussing how these
> >> changes can break existing schemas.  Schemas that have been using
> >> discriminator expressions with speculative choices can be particularly
> >> effected as discriminator expressions are now evaluated regardless of
> the
> >> success of parsing the rest of the choice branch.  Prior to this change
> the
> >> discriminator expression would only be evaluated if the rest of the
> choice
> >> branch parsed successfully.
> >>
> >> 3 out of the ~80 DFDL schema projects we have in our regression test
> suite
> >> are effected by this change.
> >>
> >>
> >>    *
> >> GIF
> >>       *
> >> This schema breaks because it uses discriminator expressions to test if
> >> Byte elements match some expected value.  Prior to this change the
> >> expression would only run if the rest of the choice branch successfully
> >> parsed and the Byte elements checked by the expression would match.
> With
> >> this change a speculative choice branch will fail to parse but the
> >> discriminator expression will still be evaluated and fail to match the
> Byte
> >> elements.
> >>       *
> >> Suggested fix: Use choice dispatch based on these expected Byte elements
> >> and remove speculative parsing and discriminators altogether.
> >>       *
> >>
> >>
> https://github.com/DFDLSchemas/GIF/pull/8/commits/afb10382514753c5dbe6fb69f77e0ac6da6505ab
> >>
> >>
> >>    *
> >> Edifact
> >>       *
> >> This schema used discriminator expressions that always evaluated to
> true.
> >> Prior to this change this expression only evaluated when the rest of the
> >> choice branch parsed successfully.  With this change the expression
> will be
> >> run even if the choice branch fails, causing obvious issues.
> >>       *
> >> Suggested fix: Use a nested sequence to contain the choice branch body
> >> followed by another nested sequence for the discriminator expression.
> This
> >> will cause the discriminator expression to only be ran if the the first
> >> nested sequence containing the choice branch body parses.
> >>       *
> >>
> >>
> https://github.com/DFDLSchemas/EDIFACT/pull/9/commits/476fe78e424e01580d5caf302a87aaaf703c609b
> >>
> >>
> >>    *
> >> JPEG2000 (Internal schema)
> >>       *
> >> This schema also uses discriminator expressions to check if a ID element
> >> matches expected values
> >>       *
> >> Suggested fix: Use choice dispatch based on the ID element and remove
> >> speculative parsing and discriminators altogether.
> >>
> >> With these suggested changes all schemas in our regression suite pass
> with
> >> the changes for DAFFODIL-1971.  I'd like to get this merged into the
> next
> >> release if possible now that I have confirmed viable work-arounds for at
> >> least the problematic schemas that we are aware of.
> >>
> >> Thoughts?
> >>
> >
>
>

Reply via email to