Re: [DISCUSS] Deferring (pre) combine for merging windows.

2016-10-24 Thread Aljoscha Krettek
@Amit: Yes, Flink is more "what you write is what you get". For example, in Flink we have a Fold function for windows which cannot be efficiently computed with merging windows (it would require using a "group by" window and then folding the iterable). We just don't allow this. For Beam, I think
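To illustrate the point about folds and merging windows, here is a minimal plain-Python sketch. It does not use the Flink API: `fold_fn`, `merge_sessions`, and the sample windows are hypothetical names and data chosen for illustration. A fold only knows how to add one element to an accumulator, so two already-folded windows cannot be combined when they merge; the elements have to be kept and folded again.

```python
from functools import reduce


def fold_fn(acc, x):
    # Hypothetical fold: order-dependent, and there is no way to merge two accumulators.
    return acc * 2 + x


def merge_sessions(windows):
    # windows: list of (start, end, elements). Overlapping windows are merged
    # and their buffered elements are concatenated in window order.
    merged = []
    for start, end, elems in sorted(windows, key=lambda w: w[0]):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_elems = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_elems + list(elems))
        else:
            merged.append((start, end, list(elems)))
    return merged


windows = [(0, 5, [1, 2]), (4, 9, [3, 4])]

# Folding each window eagerly, before the merge is known.
eager = [reduce(fold_fn, elems, 0) for _, _, elems in windows]

# Buffering the elements ("group by" window) and folding after the merge.
merged = merge_sessions(windows)
buffered = [reduce(fold_fn, elems, 0) for _, _, elems in merged]

# Trying to reuse the eager results by feeding one into the fold gives a different answer.
naive = fold_fn(eager[0], eager[1])
```

In this sketch the buffered result differs from anything obtainable from the two eager results with `fold_fn` alone, which is why the elements must be retained until the merged window fires.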

Re: [DISCUSS] Deferring (pre) combine for merging windows.

2016-10-22 Thread Robert Bradshaw
On Sat, Oct 22, 2016 at 2:38 AM, Amit Sela wrote:
> I understand the semantics, but I feel like there might be a different point of view for open-source runners.
It seems we're losing a major promise of the runner interchangeability story if different runners can give

Re: [DISCUSS] Deferring (pre) combine for merging windows.

2016-10-22 Thread Amit Sela
I understand the semantics, but I feel like there might be a different point of view for open-source runners. Dataflow is a service, and it tries to do its best to optimize execution while users don't have to worry about internal implementation (they are not aware of it). I can assure

Re: [DISCUSS] Deferring (pre) combine for merging windows.

2016-10-21 Thread Robert Bradshaw
Combine.perKey() is defined as GroupByKey() | Combine.values(). A runner is free, in fact encouraged, to take advantage of the associative properties of CombineFn to compute the result of GroupByKey() | Combine.values() as cheaply as possible, but it is incorrect to produce something that could
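The equivalence described above can be sketched in plain Python, without the Beam SDK. This is only an illustration under assumptions: `SumFn`, `group_then_combine`, `precombine_then_merge`, and the sample bundles are hypothetical names, with the method names chosen to resemble a CombineFn. The first function is the reference definition (group, then combine the values); the second pre-combines within each bundle and merges the partial accumulators, which is valid only because the combine operation is associative.

```python
from collections import defaultdict


class SumFn:
    # Hypothetical stand-in for a CombineFn that sums values.
    def create_accumulator(self):
        return 0

    def add_input(self, acc, value):
        return acc + value

    def merge_accumulators(self, accumulators):
        return sum(accumulators)

    def extract_output(self, acc):
        return acc


def group_then_combine(elements, fn):
    # Reference semantics: GroupByKey followed by combining each key's values.
    grouped = defaultdict(list)
    for key, value in elements:
        grouped[key].append(value)
    result = {}
    for key, values in grouped.items():
        acc = fn.create_accumulator()
        for value in values:
            acc = fn.add_input(acc, value)
        result[key] = fn.extract_output(acc)
    return result


def precombine_then_merge(bundles, fn):
    # Optimized execution: combine within each bundle first, then merge accumulators per key.
    partials = defaultdict(list)
    for bundle in bundles:
        local = {}
        for key, value in bundle:
            acc = local[key] if key in local else fn.create_accumulator()
            local[key] = fn.add_input(acc, value)
        for key, acc in local.items():
            partials[key].append(acc)
    return {
        key: fn.extract_output(fn.merge_accumulators(accs))
        for key, accs in partials.items()
    }


bundles = [[("a", 1), ("b", 2)], [("a", 3)], [("b", 4), ("a", 5)]]
flat = [kv for bundle in bundles for kv in bundle]

reference = group_then_combine(flat, SumFn())
optimized = precombine_then_merge(bundles, SumFn())
```

Both paths yield the same per-key result here; an optimization is acceptable only as long as its output stays indistinguishable from the reference path.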

Re: [DISCUSS] Deferring (pre) combine for merging windows.

2016-10-21 Thread Amit Sela
Please excuse my typos and apply "s/differ/defer/g" ;-).
Amit.
On Fri, Oct 21, 2016 at 2:59 PM Amit Sela wrote:
> I'd like to raise an issue that was discussed in BEAM-696. I won't recap here because it would be

[DISCUSS] Deferring (pre) combine for merging windows.

2016-10-21 Thread Amit Sela
I'd like to raise an issue that was discussed in BEAM-696. I won't recap here because it would be extensive (and probably exhaustive), and I'd also like to restart the discussion here rather than summarize it. *The problem* In the case of (main)