Re: Individual Parallelism support for Flink Runner

Reuven Lax Sun, 28 Jun 2020 05:30:39 -0700

However such a parameter would be specific to a single transform,
whereas maxNumWorkers is a global parameter today.


On Sat, Jun 27, 2020 at 10:31 PM Daniel Collins <dpcoll...@google.com>
wrote:

> I could imagine for example, a 'parallelismHint' field in the base
> parameters that could be set to maxNumWorkers when running on dataflow or
> an equivalent parameter when running on flink. It would be useful to get a
> default value for the sharding in the Reshuffle changes here
> https://github.com/apache/beam/pull/11919, but more generally to have
> some decent guess on how to best shard work. Then it would be
> runner-agnostic; you could set it to something like numCpus on the local
> runner for instance.
>
> On Sat, Jun 27, 2020 at 2:04 AM Reuven Lax <re...@google.com> wrote:
>
>> It's an interesting question - this parameter is clearly very runner
>> specific (e.g. it would be meaningless for the Dataflow runner, where
>> parallelism is not a static constant). How should we go about passing
>> runner-specific options per transform?
>>
>> On Fri, Jun 26, 2020 at 1:14 PM Akshay Iyangar <aiyan...@godaddy.com>
>> wrote:
>>
>>> Hi beam community,
>>>
>>>
>>>
>>> So I had brought this issue in our slack channel but I guess this
>>> warrants a deeper discussion and if we do go about what is the POA for it.
>>>
>>>
>>>
>>> So basically currently for Flink Runner we don’t support operator level
>>> parallelism which native Flink provides OOTB. So I was wondering what the
>>> community feels about having some way to pass parallelism for individual
>>> operators esp.  for some of the existing IO’s
>>>
>>>
>>>
>>> Wanted to know what people think of this.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Akshay I
>>>
>>

Re: Individual Parallelism support for Flink Runner

Reply via email to