Hi  Yingjie,

+1 for this FLIP. I'm pretty sure this will greatly improve the ease
of batch jobs.

Looks like "taskmanager.memory.framework.off-heap.batch-shuffle.size"
and "taskmanager.network.sort-shuffle.min-buffers" are related to
network memory and framework.off-heap.size.

My question is, what is the maximum parallelism a job can have with
the default configuration? (Does this break out of the box)

How much network memory and framework.off-heap.size are required for
how much parallelism in the default configuration?

I do feel that this correspondence is a bit difficult to control at
the moment, and it would be best if a rough table could be provided.

Best,
Jingsong

On Tue, Dec 14, 2021 at 2:16 PM Yingjie Cao <kevin.ying...@gmail.com> wrote:
>
> Hi Jiangang,
>
> Thanks for your suggestion.
>
> >>> The config can affect the memory usage. Will the related memory configs 
> >>> be changed?
>
> I think we will not change the default network memory settings. My best 
> expectation is that the default value can work for most cases (though may not 
> the best) and for other cases, user may need to tune the memory settings.
>
> >>> Can you share the tpcds results for different configs? Although we change 
> >>> the default values, it is helpful to change them for different users. In 
> >>> this case, the experience can help a lot.
>
> I did not keep all previous TPCDS results, but from the results, I can tell 
> that on HDD, always using the sort-shuffle is a good choice. For small jobs, 
> using sort-shuffle may not bring much performance gain, this may because that 
> all shuffle data can be cached in memory (page cache), this is the case if 
> the cluster have enough resources. However, if the whole cluster is under 
> heavy burden or you are running large scale jobs, the performance of those 
> small jobs can also be influenced. For large-scale jobs, the configurations 
> suggested to be tuned are taskmanager.network.sort-shuffle.min-buffers and 
> taskmanager.memory.framework.off-heap.batch-shuffle.size, you can increase 
> these values for large-scale batch jobs.
>
> BTW, I am still running TPCDS tests these days and I can share these results 
> soon.
>
> Best,
> Yingjie
>
> 刘建刚 <liujiangangp...@gmail.com> 于2021年12月10日周五 18:30写道:
>>
>> Glad to see the suggestion. In our test, we found that small jobs with the 
>> changing configs can not improve the performance much just as your test. I 
>> have some suggestions:
>>
>> The config can affect the memory usage. Will the related memory configs be 
>> changed?
>> Can you share the tpcds results for different configs? Although we change 
>> the default values, it is helpful to change them for different users. In 
>> this case, the experience can help a lot.
>>
>> Best,
>> Liu Jiangang
>>
>> Yun Gao <yungao...@aliyun.com.invalid> 于2021年12月10日周五 17:20写道:
>>>
>>> Hi Yingjie,
>>>
>>> Very thanks for drafting the FLIP and initiating the discussion!
>>>
>>> May I have a double confirmation for 
>>> taskmanager.network.sort-shuffle.min-parallelism that
>>> since other frameworks like Spark have used sort-based shuffle for all the 
>>> cases, does our
>>> current circumstance still have difference with them?
>>>
>>> Best,
>>> Yun
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------
>>> From:Yingjie Cao <kevin.ying...@gmail.com>
>>> Send Time:2021 Dec. 10 (Fri.) 16:17
>>> To:dev <d...@flink.apache.org>; user <user@flink.apache.org>; user-zh 
>>> <user...@flink.apache.org>
>>> Subject:Re: [DISCUSS] Change some default config values of blocking shuffle
>>>
>>> Hi dev & users:
>>>
>>> I have created a FLIP [1] for it, feedbacks are highly appreciated.
>>>
>>> Best,
>>> Yingjie
>>>
>>> [1] 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability
>>> Yingjie Cao <kevin.ying...@gmail.com> 于2021年12月3日周五 17:02写道:
>>>
>>> Hi dev & users,
>>>
>>> We propose to change some default values of blocking shuffle to improve the 
>>> user out-of-box experience (not influence streaming). The default values we 
>>> want to change are as follows:
>>>
>>> 1. Data compression 
>>> (taskmanager.network.blocking-shuffle.compression.enabled): Currently, the 
>>> default value is 'false'.  Usually, data compression can reduce both disk 
>>> and network IO which is good for performance. At the same time, it can save 
>>> storage space. We propose to change the default value to true.
>>>
>>> 2. Default shuffle implementation 
>>> (taskmanager.network.sort-shuffle.min-parallelism): Currently, the default 
>>> value is 'Integer.MAX', which means by default, Flink jobs will always use 
>>> hash-shuffle. In fact, for high parallelism, sort-shuffle is better for 
>>> both stability and performance. So we propose to reduce the default value 
>>> to a proper smaller one, for example, 128. (We tested 128, 256, 512 and 
>>> 1024 with a tpc-ds and 128 is the best one.)
>>>
>>> 3. Read buffer of sort-shuffle 
>>> (taskmanager.memory.framework.off-heap.batch-shuffle.size): Currently, the 
>>> default value is '32M'. Previously, when choosing the default value, both 
>>> ‘32M' and '64M' are OK for tests and we chose the smaller one in a cautious 
>>> way. However, recently, it is reported in the mailing list that the default 
>>> value is not enough which caused a buffer request timeout issue. We already 
>>> created a ticket to improve the behavior. At the same time, we propose to 
>>> increase this default value to '64M' which can also help.
>>>
>>> 4. Sort buffer size of sort-shuffle 
>>> (taskmanager.network.sort-shuffle.min-buffers): Currently, the default 
>>> value is '64' which means '64' network buffers (32k per buffer by default). 
>>> This default value is quite modest and the performance can be influenced. 
>>> We propose to increase this value to a larger one, for example, 512 (the 
>>> default TM and network buffer configuration can serve more than 10 result 
>>> partitions concurrently).
>>>
>>> We already tested these default values together with tpc-ds benchmark in a 
>>> cluster and both the performance and stability improved a lot. These 
>>> changes can help to improve the out-of-box experience of blocking shuffle. 
>>> What do you think about these changes? Is there any concern? If there are 
>>> no objections, I will make these changes soon.
>>>
>>> Best,
>>> Yingjie
>>>


-- 
Best, Jingsong Lee

Reply via email to