Daniel Halperin created BEAM-68:
-----------------------------------

             Summary: Support for limiting parallelism of a step
                 Key: BEAM-68
                 URL: https://issues.apache.org/jira/browse/BEAM-68
             Project: Beam
          Issue Type: New Feature
          Components: beam-model
            Reporter: Daniel Halperin


Users may want to limit the parallelism of a step. Two classic uses cases are:

- User wants to produce at most k files, so sets TextIO.Write.withNumShards(k).
- External API only supports k QPS, so user sets a limit of k/(expected 
QPS/step) on the ParDo that makes the API call.

Unfortunately, there is no way to do this effectively within the Beam model. A 
GroupByKey with exactly k keys will guarantee that only k elements are 
produced, but runners are free to break fusion in ways that each element may be 
processed in parallel later.

To implement this functionaltiy, I believe we need to add this support to the 
Beam Model.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to