I think there is an invariant that inserting these alignment suspensions is 
violating.

Information propagates across the chain of connected DOSs. Any time a later 
one in the chain needs information derived from earlier ones, a request 
ripples back along the chain. So there is an assumption that you can't insert 
a split except at the end of the chain.

That kind of lazy propagation algorithm is incompatible with inserting new 
splits after such information has been requested, since the result then 
depends on timing.
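To make the concern concrete, here is a minimal sketch of one way such a timing issue could arise. It is not Daffodil's code: `Segment`, `start_bit`, `insert_after`, and `true_start` are hypothetical names, and the sketch assumes that each element of the chain caches an answer once a request has rippled back to it.

```python
from typing import Optional


class Segment:
    """Hypothetical stand-in for one DOS in the chain (not Daffodil's API)."""

    def __init__(self, length_bits: int, prev: Optional["Segment"] = None):
        self.length_bits = length_bits
        self.prev = prev
        self._start: Optional[int] = None  # assumed cache, filled on first request

    def start_bit(self) -> int:
        # Lazy propagation: the request ripples back to earlier segments
        # only when someone asks, and the answer is then remembered.
        if self._start is None:
            if self.prev is None:
                self._start = 0
            else:
                self._start = self.prev.start_bit() + self.prev.length_bits
        return self._start


def insert_after(target: Segment, successor: Segment, length_bits: int) -> Segment:
    """Insert a new segment between target and successor (a mid-chain split)."""
    new = Segment(length_bits, prev=target)
    successor.prev = new
    return new


def true_start(seg: Segment) -> int:
    """Recompute the start position from scratch, ignoring any cached answers."""
    total = 0
    node = seg.prev
    while node is not None:
        total += node.length_bits
        node = node.prev
    return total


a = Segment(8)
b = Segment(16, prev=a)
c = Segment(4, prev=b)

early_answer = c.start_bit()      # request ripples back through b to a
insert_after(a, b, 8)             # new split inserted mid-chain afterwards
late_answer = c.start_bit()       # still the cached value
recomputed = true_start(c)        # what the chain now actually implies
```

In this sketch `late_answer` no longer agrees with `recomputed`, because the answer was handed out before the mid-chain insertion happened. Appending only at the end of the chain avoids this, since nothing downstream of the new element can have asked yet.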

So I think what we need to do is unconditionally insert a split for a DOS 
associated with the conditional separator. That DOS will always be there in 
the chain, but will behave either as zero length if the separator proves to be 
suppressed, or like an MTA if the separator proves to be not suppressed.
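A rough sketch of that idea follows, with the caveat that every name in it (`ConditionalMTASegment`, `SeparatorState`, `resolve`, `length_bits`) is hypothetical and the alignment arithmetic is an assumption about what "behaves like an MTA" would mean, namely padding up to the next alignment boundary.

```python
from enum import Enum
from typing import Optional


class SeparatorState(Enum):
    UNKNOWN = 0      # suspension has not decided yet
    SUPPRESSED = 1   # separator turned out to be suppressed
    PRESENT = 2      # separator turned out to be not suppressed


class ConditionalMTASegment:
    """Hypothetical placeholder that is always in the chain from the start."""

    def __init__(self, alignment_bits: int):
        self.alignment_bits = alignment_bits
        self.state = SeparatorState.UNKNOWN

    def resolve(self, suppressed: bool) -> None:
        # Called once the separator decision is known.
        self.state = (
            SeparatorState.SUPPRESSED if suppressed else SeparatorState.PRESENT
        )

    def length_bits(self, start_bit: int) -> Optional[int]:
        if self.state is SeparatorState.UNKNOWN:
            return None  # not known yet; later segments have to wait
        if self.state is SeparatorState.SUPPRESSED:
            return 0     # behaves as zero length
        # Assumed MTA behavior: pad up to the next alignment boundary.
        return (-start_bit) % self.alignment_bits


pending = ConditionalMTASegment(alignment_bits=8)
before = pending.length_bits(start_bit=3)

pending.resolve(suppressed=False)
as_mta = pending.length_bits(start_bit=3)

other = ConditionalMTASegment(alignment_bits=8)
other.resolve(suppressed=True)
as_zero = other.length_bits(start_bit=3)
```

The point of the design is that the chain's shape never changes after the fact: the placeholder occupies its position up front, and only its length is decided later.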

Does that make sense?

Note that in most text-centric formats, optimizers are supposed to optimize 
out these MTAs, since a format that is all text doesn't need them. So this 
insertion of MTAs, or of the PossiblySuppressedConditionalMTAs just suggested, 
should not be needed except in mixed text+binary or multiple-text-encoding 
situations, which are rare.

I do suspect these optimizations aren't working right currently though, so 
we're seeing these MTAs and other alignment regions being inserted more 
frequently than they should be. (Which also has performance implications.)


________________________________
From: Steve Lawrence <slawre...@apache.org>
Sent: Thursday, September 24, 2020 1:58 PM
To: dev@daffodil.apache.org <dev@daffodil.apache.org>
Subject: Handling Nested Suspensions

In an effort to evaluate suspensions earlier so that we can minimize
excess buffering, I've discovered that this is causing problems related
to suspensions that create other suspensions. I think normally this
isn't an issue because we evaluate suspensions at the end, so the nested
suspensions don't actually need to suspend. But now that I'm trying to
evaluate suspensions earlier, the nested suspensions actually do need to
suspend.

Only about a dozen tests fail right now with my changes to evaluate
suspensions earlier, but the simplest one is
test_sequenceWithComplexType. This test is a sequence of prefix
separated elements:

https://github.com/apache/incubator-daffodil/blob/master/daffodil-test/src/test/resources/org/apache/daffodil/section14/sequence_groups/sequenceWithComplexType.dfdl.xsd

I believe what is happening during unparse is we create a suspension to
determine if the unparsed element is zero length or not, which
determines if we should unparse a separator. At some point, the
suspension determines the unparsed element is not zero length, and the
suspension runs the separator unparser and then finishes.

The problem is that running this separator unparser triggers an
alignment unparser, which at the time of evaluation needs to suspend.
The DOS that this alignment unparser suspends on has already been split
(other things have suspended before we reevaluate the separator
suspension), which currently fails due to an assertion. We currently
only ever allow suspending from the last DOS, not a previous DOS.

My first instinct was to change this assumption to allow suspensions on
an older DOS, and to just sort of stick the new suspension DOS in the
middle of the DOSs. I have this working, but this still results in blocks
on the alignment suspensions. I haven't quite figured out what's causing
these blocks. I suspect that there's just something extra that I need to
do to allow arbitrarily inserting DOSs.

So there's definitely more debugging to do, but I wanted to bring this up
before I get too far down the rabbit hole.

Any thoughts if this is the right approach, or perhaps there's some
other fundamental change to how suspensions need to be handled to allow
this to work?
