My own thoughts inline on the three ideas discussed.

On Mon, Dec 4, 2017 at 3:19 PM, Eugene Kirpichov <[email protected]> wrote:

> Continuation triggers are still worse. For context: a continuation trigger
> is the trigger that's set on the output of a GBK and controls further
> aggregation of the results of this aggregation by downstream GBKs. The
> output shouldn't just use the same trigger as the input, because e.g. if
> the input trigger said "wait for an hour before emitting a pane", that
> doesn't mean that we should wait for another hour before emitting a result
> of aggregating the result of the input trigger. Continuation triggers try
> to simulate the behavior "as if a pane of the input propagated through the
> entire pipeline", but the implementation of individual continuation
> triggers doesn't do that. E.g. the continuation of the "first N elements
> in pane" trigger is "first 1 element in pane", and if the results of a
> first GBK are further grouped by a second GBK onto a coarser key (e.g. if
> everything is grouped onto the same key), that effectively means that, of
> the keys of the first GBK, only one survives and all others are dropped
> (which is what happened in the data loss bug).
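To make that failure mode concrete, here is a minimal sketch of the two-GBK
shape in the Beam Java SDK. The windowing and trigger classes are the real
API; the pipeline itself and the `events` input are hypothetical, and the
comments describe the pre-fix continuation behavior under discussion:

    import org.apache.beam.sdk.transforms.Count;
    import org.apache.beam.sdk.transforms.Sum;
    import org.apache.beam.sdk.transforms.Values;
    import org.apache.beam.sdk.transforms.windowing.AfterPane;
    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.joda.time.Duration;

    // First GBK: count per key, firing once at least 100 elements have
    // arrived for the window. elementCountAtLeast(100) is a terminating
    // ("once") trigger.
    PCollection<KV<String, Long>> perKey =
        events  // PCollection<KV<String, String>>, assumed to exist
            .apply(Window.<KV<String, String>>into(
                    FixedWindows.of(Duration.standardMinutes(10)))
                .triggering(AfterPane.elementCountAtLeast(100))
                .discardingFiredPanes()
                .withAllowedLateness(Duration.ZERO))
            .apply(Count.perKey());

    // Second GBK: everything grouped onto one (global) key. perKey carries
    // the continuation trigger of elementCountAtLeast(100), which is
    // elementCountAtLeast(1) -- also terminating. The global sum can fire
    // after the count for a *single* upstream key arrives and then finish,
    // dropping the counts for every other key.
    PCollection<Long> total =
        perKey
            .apply(Values.create())
            .apply(Sum.longsGlobally().withoutDefaults());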
Additional flavor: the full version of the faulty spec for "once trigger" is
that the continuation trigger is a once trigger that fires once *after
awaiting all of the once-fired data from upstream*. This spec is not met: the
continuation trigger does not actually wait for all upstream data. FWIW we
could still try to meet that spec while forbidding dropping - a correct
design would keep the "once firing" effect, but bugs would manifest as an
extra firing instead of data loss. But I also don't think the spec is worth
keeping.

> The ultimate fix to all of these things is
> https://s.apache.org/beam-sink-triggers . However, it is a huge model
> change, and meanwhile we have to do something. The options are, in order
> of increasing backward incompatibility (but incompatibility in a
> "rejecting something that previously was accepted but extremely dangerous"
> kind of way):
>
> - *Make the continuation trigger of most triggers be the "always-fire"
> trigger.* Seems that this should be the case for all triggers except the
> watermark trigger. This will definitely increase safety, but lead to more
> eager firing of downstream aggregations. It also will violate a user's
> expectation that a fire-once trigger fires everything downstream only
> once, but that expectation appears impossible to satisfy safely.

+1 This doesn't solve the issue, but I think we should do it in addition. It
makes the upstream trigger govern latency, and we can define the always-fire
trigger to be compatible with any other trigger, letting that other trigger
govern the output. This will fix the issue that a three-way join should
still work (or at least not crash!) when expressed as successive binary
joins. Today it does not, because the continuation trigger of the
processing-time trigger is the synchronized-processing-time trigger, so the
trigger of the first binary join is judged incompatible with the remaining
input. Also, most runners do not implement synchronized processing time, and
its utility is a bit questionable in this space anyhow.

Note: there are only three leaf triggers: processing time, element count,
and end of window. Element count already has "always fire" as its
continuation, though in a form that might appear incompatible with other
triggers. Processing time has the problematic synchronized processing time -
since that already doesn't work, let's just make its continuation "always
fire" too. And leave the end-of-window trigger alone.

> - *Make the continuation trigger of some triggers be the "invalid"
> trigger, *i.e. require the user to set it explicitly: there's in general
> no good and safe way to infer what a trigger on a second GBK "truly"
> should be, based on the trigger of the PCollection input into a first GBK.
> This is especially true for terminating triggers.

-0.5 Meh from me. Triggers are already painful enough. Manual interventions
like this would be extra annoying and would often cross composite
boundaries, breaking abstractions.

> - *Prohibit top-level terminating triggers entirely.* This will ensure
> that the only data that ever gets dropped is "droppably late" data.

+1 I've wanted this for a long time, but it didn't make it in for v2. Given
that the current behavior is data loss, I would be OK with doing it without
a major version bump. I know most people only want to fix code bugs in minor
versions, but I tend to also favor fixing semantic bugs like this. Anyone
who is currently not losing data will continue to not lose data. (A sketch
of how users would adapt, by wrapping terminating triggers in
Repeatedly.forever, is below.)

> Do people think that these options are sensible?
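For that last option, the user-facing change is small: any terminating
trigger can be wrapped in Repeatedly.forever to make it non-terminating, so
the worst case becomes an extra firing rather than dropped data. A minimal
sketch against the Beam Java windowing API (the classes are real; the window
size, element count, and lateness values are illustrative):

    import org.apache.beam.sdk.transforms.windowing.AfterPane;
    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Repeatedly;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.KV;
    import org.joda.time.Duration;

    // Terminating top-level trigger (would be rejected under option 3):
    // after the first pane fires, the trigger finishes and any further
    // data for the window is dropped.
    Window<KV<String, String>> terminating =
        Window.<KV<String, String>>into(
                FixedWindows.of(Duration.standardMinutes(10)))
            .triggering(AfterPane.elementCountAtLeast(100))
            .discardingFiredPanes()
            .withAllowedLateness(Duration.standardHours(1));

    // Non-terminating equivalent: same latency to the first pane, but
    // later data produces additional panes instead of being dropped.
    Window<KV<String, String>> repeating =
        Window.<KV<String, String>>into(
                FixedWindows.of(Duration.standardMinutes(10)))
            .triggering(Repeatedly.forever(AfterPane.elementCountAtLeast(100)))
            .discardingFiredPanes()
            .withAllowedLateness(Duration.standardHours(1));

The always-fire continuation from the first option is effectively this same
wrapped shape with a count of one: Repeatedly.forever(AfterPane.elementCountAtLeast(1)).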
> +Kenn Knowles <[email protected]> +Thomas Groh <[email protected]> +Ben
> Chambers <[email protected]> is this a fair summary of our discussion?

Yup.

Kenn
