lostluck commented on code in PR #27632: URL: https://github.com/apache/beam/pull/27632#discussion_r1289369073
##########
website/www/site/content/en/contribute/runner-guide.md:
##########
@@ -399,59 +370,28 @@ fast path as an optimization.
 ### Special mention: the Combine composite

 A composite transform that is almost always treated specially by a runner is
-`Combine` (per key), which applies an associative and commutative operator to
+`CombinePerKey`, which applies an associative and commutative operator to
 the elements of a `PCollection`. This composite is not a primitive. It is
 implemented in terms of `ParDo` and `GroupByKey`, so your runner will work
 without treating it - but it does carry additional information that you
 probably want to use for optimizations: the associative-commutative operator,
 known as a `CombineFn`.

+Generally runners will want to implement this via what is called
+combiner lifting, where a new operation is placed before the `GroupByKey`
+that does partial (within-bundle) combining, which often requires a slight
+modification of what comes after the `GroupByKey` as well.
+An example of this transformation can be found in the

Review Comment:
   Same for Splittable DoFns too, but once I get around to State and Timers, Fusion, or Drain, that section will adjust a bit probably.

   https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/handlepardo.go#L108

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
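For readers unfamiliar with the combiner lifting described in the quoted hunk, here is a minimal sketch in plain Go. It is not Beam or Prism code: `KV`, `precombine`, `groupByKey`, and `mergeAndExtract` are hypothetical names chosen for illustration, and the `CombineFn` is assumed to be a simple integer sum so that the accumulator and the output have the same type.

```go
package main

import "fmt"

// KV is a hypothetical key/value element. A real runner handles encoded elements.
type KV struct {
	Key   string
	Value int
}

// precombine is the lifted step placed before the GroupByKey. It combines
// values within a single bundle and emits one partial sum per key.
func precombine(bundle []KV) []KV {
	acc := make(map[string]int)
	var order []string // remember first-seen key order for stable output
	for _, kv := range bundle {
		if _, seen := acc[kv.Key]; !seen {
			order = append(order, kv.Key)
		}
		acc[kv.Key] += kv.Value
	}
	out := make([]KV, 0, len(order))
	for _, k := range order {
		out = append(out, KV{Key: k, Value: acc[k]})
	}
	return out
}

// groupByKey stands in for the runner's shuffle. It collects every value
// emitted for a key across all bundles.
func groupByKey(bundles ...[]KV) map[string][]int {
	grouped := make(map[string][]int)
	for _, bundle := range bundles {
		for _, kv := range bundle {
			grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
		}
	}
	return grouped
}

// mergeAndExtract is the modified step after the GroupByKey. It merges
// partial sums (accumulators) instead of combining raw input values.
func mergeAndExtract(grouped map[string][]int) map[string]int {
	result := make(map[string]int, len(grouped))
	for k, partials := range grouped {
		for _, p := range partials {
			result[k] += p
		}
	}
	return result
}

func main() {
	bundleA := []KV{{"a", 1}, {"b", 2}, {"a", 3}}
	bundleB := []KV{{"a", 4}, {"b", 5}}

	// Each bundle is reduced to one element per key before the shuffle.
	grouped := groupByKey(precombine(bundleA), precombine(bundleB))
	fmt.Println(mergeAndExtract(grouped)) // prints map[a:8 b:7]
}
```

In this sketch the shuffle sees four elements rather than five. The saving grows with the number of repeated keys per bundle, which is why runners tend to treat this composite specially.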
