We could allow parameterizing transforms by using transform identifiers from the pipeline, e.g.

  options = ['--parameterize=MyTransform;parallelism=5']
  with Pipeline.create(PipelineOptions(options)) as p:
    p | Create(1, 2, 3) | 'MyTransform' >> ParDo(..)


Those hints should always be optional, such that a pipeline continues to run on all runners.

-Max

On 28.06.20 14:30, Reuven Lax wrote:
However such a parameter would be specific to a single transform, whereas maxNumWorkers is a global parameter today.

On Sat, Jun 27, 2020 at 10:31 PM Daniel Collins <[email protected] <mailto:[email protected]>> wrote:

    I could imagine for example, a 'parallelismHint' field in the base
    parameters that could be set to maxNumWorkers when running on
    dataflow or an equivalent parameter when running on flink. It would
    be useful to get a default value for the sharding in the Reshuffle
    changes here https://github.com/apache/beam/pull/11919, but more
    generally to have some decent guess on how to best shard work. Then
    it would be runner-agnostic; you could set it to something like
    numCpus on the local runner for instance.

    On Sat, Jun 27, 2020 at 2:04 AM Reuven Lax <[email protected]
    <mailto:[email protected]>> wrote:

        It's an interesting question - this parameter is clearly very
        runner specific (e.g. it would be meaningless for the Dataflow
        runner, where parallelism is not a static constant). How should
        we go about passing runner-specific options per transform?

        On Fri, Jun 26, 2020 at 1:14 PM Akshay Iyangar
        <[email protected] <mailto:[email protected]>> wrote:

            Hi beam community,____

            __ __

            So I had brought this issue in our slack channel but I guess
            this warrants a deeper discussion and if we do go about what
            is the POA for it.____

            __ __

            So basically currently for Flink Runner we don’t support
            operator level parallelism which native Flink provides OOTB.
            So I was wondering what the community feels about having
            some way to pass parallelism for individual operators esp.
              for some of the existing IO’s ____

            __ __

            Wanted to know what people think of this.____

            __ __

            Thanks ____

            Akshay I____

Reply via email to