Re: Handling Nested Suspensions

Steve Lawrence Thu, 24 Sep 2020 12:42:00 -0700

So if we always create a split for the MTA regardless, then it needs to
be a special MTA unparser that knows the results of the Separator
suspension to determine if it should insert any alignment at all.


So I think the options so far are:

1) Separator suspension knows about both the alignment *and* the
separator, and once it determines a separator is needed, it then
suspends until it knows if alignment bytes are needed. The suspension
continuation outputs alignment bits *and* unparses the separator.

2) Separator suspension knows nothing about alignment, only the
separator. When we create a separator suspension we first create a new
OptionalMTA suspension. The OptionalMTA test waits until the Separator
suspension finishes, and based on the results of the Separator
suspension it may output alignment bits or just collapse to zero bits.
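
As a rough illustration of option 2, here is a toy Python model (not
Daffodil code). Buffer, start_bit, resolve_separator and
resolve_optional_mta are made-up names, and each Buffer stands in for one
DOS in the chain. It assumes a prefix separator, so the MTA split sits
just before the separator split, and that both splits are created up
front so nothing is inserted into the chain later.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Buffer:
    """Toy stand-in for one DOS; bits stays None until the length is known."""
    name: str
    bits: Optional[int] = None


def start_bit(chain: List[Buffer], index: int) -> Optional[int]:
    """Start position of chain[index], or None if an earlier buffer is unresolved."""
    total = 0
    for buf in chain[:index]:
        if buf.bits is None:
            return None
        total += buf.bits
    return total


def resolve_separator(sep: Buffer, element: Buffer, sep_bits: int) -> bool:
    """Separator suspension: knows nothing about alignment, only the element length."""
    if element.bits is None:
        return False  # still blocked
    sep.bits = sep_bits if element.bits > 0 else 0
    return True


def resolve_optional_mta(chain: List[Buffer], mta_index: int,
                         sep: Buffer, alignment: int) -> bool:
    """OptionalMTA suspension: waits for the separator result, then pads or collapses."""
    if sep.bits is None:
        return False  # separator suspension has not finished yet
    mta = chain[mta_index]
    if sep.bits == 0:
        mta.bits = 0  # separator suppressed, so collapse to zero bits
        return True
    start = start_bit(chain, mta_index)
    if start is None:
        return False  # position not known yet
    mta.bits = (-start) % alignment  # pad up to the next alignment boundary
    return True


# Both splits exist before anything asks about positions.
chain = [Buffer("prior", 5), Buffer("mta"), Buffer("sep"), Buffer("element")]
prior, mta, sep, element = chain

resolve_optional_mta(chain, 1, sep, 8)   # returns False: blocked on the separator
element.bits = 16                        # the element turns out to be non-empty
resolve_separator(sep, element, 8)       # separator is needed, 8 bits long
resolve_optional_mta(chain, 1, sep, 8)   # mta.bits is now 3 (5 padded up to 8)
```

If the element had turned out to be zero length, both the sep and mta
buffers would end up with 0 bits, and anything downstream would see the
same positions as if they had never been there.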

I'm not sure if there is a clear advantage of one over the other.
Whether or not the suspensions are distinct unparsers, they are still
potentially aware of each other to some degree. And both still need a
change to the grammar to uncouple the MTA unparser from Separator
unparser. Looking at the grammar, I don't think that will be too bad though.

I think it comes down to either one suspension with one split but maybe
more complex test/continuation logic, or two separate splits and
suspensions with simpler distinct logic. The overhead of extra splits is
maybe best to be avoided, but by separating into two distinct
suspensions the MTA one should optimize out (if our logic was fixed). So
maybe this feels like the better approach.

Distinct suspensions with some sort of communication also feels like it
could be more easily generalized so that if there are other cases of
nested suspensions, they just get split out into two suspensions and an
ability for one suspension to ask questions about another suspension.
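
A minimal sketch of that generalization, again as a toy Python model
rather than anything in Daffodil: Suspension and run_all are
hypothetical names, and a suspension "asks a question" about another
simply by reading its recorded result. Blocked suspensions are retried
until they all finish, and a pass with no progress is reported as a
deadlock.

```python
from typing import Callable, List, Optional


class Suspension:
    """Hypothetical suspension: an attempt that may block, plus its recorded result."""

    def __init__(self, name: str, attempt: Callable[[], Optional[str]]):
        self.name = name
        self._attempt = attempt  # returns a result, or None while still blocked
        self.result: Optional[str] = None

    @property
    def done(self) -> bool:
        return self.result is not None

    def run(self) -> bool:
        if not self.done:
            self.result = self._attempt()
        return self.done


def run_all(suspensions: List[Suspension]) -> List[str]:
    """Retry blocked suspensions until all finish; return the completion order."""
    order: List[str] = []
    pending = list(suspensions)
    while pending:
        still_pending = []
        for s in pending:
            if s.run():
                order.append(s.name)
            else:
                still_pending.append(s)
        if len(still_pending) == len(pending):  # no progress in this pass
            raise RuntimeError("deadlock: " + ", ".join(s.name for s in pending))
        pending = still_pending
    return order


element_bits = 16  # assumed already known for this example

separator = Suspension(
    "separator",
    lambda: "emit" if element_bits > 0 else "suppress",
)
# The MTA suspension only asks whether the separator finished and what it decided.
optional_mta = Suspension(
    "optional_mta",
    lambda: None if not separator.done
    else ("align" if separator.result == "emit" else "collapse"),
)

# Registered in the "wrong" order on purpose: the MTA blocks once, then completes.
order = run_all([optional_mta, separator])
```

Here order comes out as separator first, then optional_mta, even though
the MTA suspension was registered first.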


On 9/24/20 3:25 PM, Beckerle, Mike wrote:
> Ah... I have to wrap my head around the unparser again..... I am not sure of 
> any of this, just that we have to put the splits to the DOS in order before 
> anything starts making inquiries of the locations of things.
> 
> I think my point is that we have to split before and after the conditional 
> separator perhaps twice, one for the MTA (unless optimized out) and again 
> once for the separator. Both those DOS would be zero length if later we 
> show the separator is to be suppressed. For the separator-proper's DOS, that 
> would happen naturally. For the MTA DOS, we need new logic to make it 
> collapse to zero also.
> 
> I think what we are doing now is waiting to see whether the separator-proper 
> is zero length and if so, then inserting the MTA split, but I think that is a 
> race with other things unparsing (like the very next separated element) that 
> ask for position information.
> 
> This might work now because the late evaluation of all the suspensions means 
> that there is no possibility of any ripple propagation of information 
> incrementally. So adding splits late never comes after info has already 
> rippled forward.
> 
> 
> 
> ________________________________
> From: Steve Lawrence <[email protected]>
> Sent: Thursday, September 24, 2020 2:33:39 PM
> To: [email protected] <[email protected]>
> Subject: Re: Handling Nested Suspensions
> 
> So are you suggesting that the suspension that determines if a separator
> should be laid down *also* determines and lays down necessary alignment?
> So the grammar changes so that separators no longer include MTA during
> unparse, and that instead is handled by the various separator unparsers
> and the suspensions they create?
> 
> And if we come across other issues of nested suspensions the general
> solution is likely to combine the separate suspensions into a single
> suspension?
> 
> 
> On 9/24/20 2:12 PM, Beckerle, Mike wrote:
>>
>> I think there is an invariant that inserting these alignment suspensions is 
>> violating.
>>
>> The chain of connected DOSs, well... information propagates across them. Any 
>> time a later one in the chain wants to know information derived from prior 
>> ones, a request ripples back the chain. So there is an assumption that you 
>> can't insert a split, except at the end of the chain.
>>
>> That kind of lazy propagation algorithm is incompatible with there being 
>> timing issues associated with inserting new splits after such information 
>> has been requested.
>>
>> So I think what we need to do is insert, unconditionally, a split for a DOS 
>> associated with the conditional separator, where that DOS will be there in 
>> the chain, but will either behave as 0 length if the separator proves to be 
>> suppressed, or behaves like an MTA if the separator proves to be 
>> not-suppressed.
>>
>> Does that make sense?
>>
>> Note that in most text-centric formats, optimizers are supposed to be 
>> optimizing out these MTA, since a format that is all text doesn't need them, 
>> so this insertion of MTA, or these just suggested 
>> PossiblySuppressedConditionalMTAs, should not be needed except in mixed 
>> text+binary or multiple text encoding situations, which are rare.
>>
>> I do suspect these optimizations aren't working right currently though, so 
>> we're seeing these MTA and other alignment regions being inserted more 
>> frequently than they should.  (Which has performance implications also....)
>>
>>
>>
>>
>>
>>
>> ________________________________
>> From: Steve Lawrence <[email protected]>
>> Sent: Thursday, September 24, 2020 1:58 PM
>> To: [email protected] <[email protected]>
>> Subject: Handling Nested Suspensions
>>
>> In an effort to evaluate suspensions earlier so that we can minimize
>> excess buffering, I've discovered that this is causing problems related
>> to suspensions that create other suspensions. I think normally this
>> isn't an issue because we evaluate suspensions at the end, so the nested
>> suspensions don't actually need to suspend. But now that I'm trying to
>> evaluate suspensions earlier, the nested suspensions actually do need to
>> suspend.
>>
>> Only about a dozen tests fail right now with my changes to evaluate
>> suspensions earlier, but the simplest one is
>> test_sequenceWithComplexType. This test is a sequence of prefix
>> separated elements:
>>
>> https://github.com/apache/incubator-daffodil/blob/master/daffodil-test/src/test/resources/org/apache/daffodil/section14/sequence_groups/sequenceWithComplexType.dfdl.xsd
>>
>> I believe what is happening during unparse is we create a suspension to
>> determine if the unparsed element is zero length or not, which
>> determines if we should unparse a separator. At some point, the
>> suspension determines the unparsed element is not zero length, and the
>> suspension runs the separator unparser and then finishes.
>>
>> The problem is that running this separator unparser triggers an
>> alignment unparser, which at the time of evaluation needs to suspend.
>> The DOS that this alignment unparser suspends on has already been split
>> (other things have suspended before we reevaluate the separator
>> suspension), which currently fails due to an assertion. We currently
>> only ever allow suspending from the last DOS, not a previous DOS.
>>
>> My first instinct was to change this assumption to allow suspensions on
>> an older DOS, and to just sort of stick the new suspension DOS in the
>> middle of DOS's. I have this working, but this still results in blocks
>> on the alignment suspensions. I haven't quite figured out what's causing
>> these blocks. I suspect that there's just something extra that I need to
>> do to allow arbitrarily inserting DOSs.
>>
>> So there's definitely more debugging to do, but wanted to bring this up
>> before I get too far down the rabbit hole.
>>
>> Any thoughts if this is the right approach, or perhaps there's some
>> other fundamental change to how suspensions need to be handled to allow
>> this to work?
>>
> 
>
