Robert Burke created BEAM-10302:
-----------------------------------
Summary: Respect timestamp OutputTime Windowing Strategy
configuration in Lifted CombineFns.
Key: BEAM-10302
URL: https://issues.apache.org/jira/browse/BEAM-10302
Project: Beam
Issue Type: New Feature
Components: sdk-go
Reporter: Robert Burke
The Go SDK currently retains an arbitrary timestamp per key per bundle when
performing a lifted combine.
However, depending on the windowing strategy, a prefered time could be
specified.
https://github.com/apache/beam/blob/a5b2046b10bebc59c5bde41d4cb6498058fdada2/model/pipeline/src/main/proto/beam_runner_api.proto#L901
The code in question for the Go SDK:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/combine.go#L395
At present this implementation is "correct", as the default output time is
Unspecified, and there's no user mechanism to configure a windowing strategy to
this granularity.
So there are a few parts to this.
1. Propagate the windowing strategy information to exec.LiftedCombine somehow
and implement the correct output. This can be done whether or not 2 is
implemented.
2. Provide a trigger configuration for beam.WindowInto, so this can be
configured on the user side. This is significantly more work.
This matters only when using windows that are not the Global Window, and when
using a Lifted Combine, which commonly only happens in batch contexts. However,
since Beam is a unified model, the windowing features should work correctly in
both execution modes of a Go SDK pipeline.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)