Robert Burke created BEAM-10302:
-----------------------------------

             Summary: Respect timestamp OutputTime Windowing Strategy 
configuration in Lifted CombineFns.
                 Key: BEAM-10302
                 URL: https://issues.apache.org/jira/browse/BEAM-10302
             Project: Beam
          Issue Type: New Feature
          Components: sdk-go
            Reporter: Robert Burke


The Go SDK currently retains an arbitrary timestamp per key per bundle when 
performing a lifted combine. 
However, depending on the windowing strategy, a prefered time could be 
specified.
https://github.com/apache/beam/blob/a5b2046b10bebc59c5bde41d4cb6498058fdada2/model/pipeline/src/main/proto/beam_runner_api.proto#L901

The code in question for the Go SDK:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/combine.go#L395

At present this implementation is "correct", as the default output time is 
Unspecified, and there's no user mechanism to configure a windowing strategy to 
this granularity.

So there are a few parts to this.
1. Propagate the windowing strategy information to exec.LiftedCombine somehow 
and implement the correct output. This can be done whether or not 2 is 
implemented.
2. Provide a trigger configuration for beam.WindowInto, so this can be 
configured on the user side. This is significantly more work.

This matters only when using windows that are not the Global Window, and when 
using a Lifted Combine, which commonly only happens in batch contexts. However, 
since Beam is a unified model, the windowing features should work correctly in 
both execution modes of a Go SDK pipeline.







--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to