Re: Reduce parallelism without network transfer.

2018-02-06 Thread Piotr Nowojski
Hi, Rebalance is a safer default setting that protects against data skew. Even the smallest data skew can create a bottleneck much larger than the serialisation/network transfer cost, especially if one changes the parallelism to a value that’s not a result of multiplication or division (l
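
The skew argument can be illustrated with a small, self-contained simulation. This is only a rough sketch and not Flink code: `downstream_load` is a hypothetical helper, the contiguous-group mapping used for "rescale" is a simplifying assumption, and the 7-to-3 case is an assumed example of a parallelism change that does not divide evenly.

```python
def downstream_load(upstream, downstream, records_per_upstream, strategy):
    """Count how many records each downstream subtask receives.

    Toy model (assumption, not Flink's implementation):
    - "rebalance": every upstream subtask round-robins over ALL downstream subtasks.
    - "rescale": every upstream subtask sends only to one downstream subtask,
      chosen by splitting the upstream subtasks into contiguous groups.
    """
    load = [0] * downstream
    for up in range(upstream):
        if strategy == "rebalance":
            for rec in range(records_per_upstream):
                load[(up + rec) % downstream] += 1
        elif strategy == "rescale":
            load[up * downstream // upstream] += records_per_upstream
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return load


# 7 upstream subtasks, 3 downstream subtasks, 300 records per upstream subtask.
print(downstream_load(7, 3, 300, "rebalance"))  # [700, 700, 700]
print(downstream_load(7, 3, 300, "rescale"))    # [900, 600, 600]
```

In this model the first downstream subtask carries about 29% more than the average under rescale, while rebalance stays even at the price of sending records to every downstream subtask.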

Re: Reduce parallelism without network transfer.

2018-02-05 Thread Kien Truong
Thanks Piotr, it works. May I ask why the default behavior when reducing parallelism is rebalance and not rescale? Regards, Kien

Re: Reduce parallelism without network transfer.

2018-02-05 Thread Piotr Nowojski
Hi, It should work like this out of the box if you use the rescale method: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/index.html#physical-partitioning
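
To make the 120-to-30 case from the original question concrete, here is a minimal sketch of which upstream subtasks would feed each downstream subtask under a rescale-style connection. It is a simplified model rather than Flink's actual implementation: `rescale_assignment` is a hypothetical helper, and the contiguous grouping is an assumption. Whether the data really stays inside one task manager also depends on how subtasks are placed into slots, which this sketch does not model.

```python
from collections import defaultdict


def rescale_assignment(upstream, downstream):
    """Map each downstream subtask index to the upstream subtasks feeding it.

    Assumption: upstream subtasks are split into contiguous, near-equal groups,
    one group per downstream subtask.
    """
    groups = defaultdict(list)
    for up in range(upstream):
        groups[up * downstream // upstream].append(up)
    return dict(groups)


# Numbers from the thread: one operator goes from parallelism 120 to 30.
assignment = rescale_assignment(120, 30)
print(assignment[0])                          # [0, 1, 2, 3]
print({len(v) for v in assignment.values()})  # {4}
```

Each downstream subtask reads from exactly 4 upstream subtasks here, which matches the 4 slots per task manager mentioned in the question.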

Reduce parallelism without network transfer.

2018-02-02 Thread Kien Truong
Hi, Assume that I have a streaming job using 30 task managers with 4 slots each. I want to change the parallelism of 1 operator from 120 to 30. Is there any way for each subtask of this operator to get data from the 4 upstream subtasks running in the same task manager, thus avoiding network comp