Hi all,

Recently, I've been working on adding CombineFn.setup and
CombineFn.teardown to Python SDK [1] (Java SDK is also on the list, but for
now, it's only Python). Here's my implementation:
https://github.com/apache/beam/pull/13048. Would someone be willing to take
a look?

The functionality is based on `Operation.start` and `Operation.finish`. It
works on all runners I tested: Flink, Spark and both direct runners. The
only configuration that makes problems is the batch Dataflow worker without
`--experiments=beam_fn_api` being set. I wonder how big the problem is
though, since new pipelines are going to use the new Dataflow Runner V2 in
the near future.

Thank you,
Kamil

[1] https://issues.apache.org/jira/browse/BEAM-3736

Reply via email to