[GitHub] [flink] gaoyunhaii commented on a change in pull request #18350: [FLINK-25636][network] Change some default config values of blocking shuffle for better usability

GitBox Sat, 15 Jan 2022 20:29:50 -0800


gaoyunhaii commented on a change in pull request #18350:
URL: https://github.com/apache/flink/pull/18350#discussion_r785388438




##########
File path: 
docs/layouts/shortcodes/generated/all_taskmanager_network_section.html
##########
@@ -136,15 +136,15 @@
         </tr>
         <tr>
             <td><h5>taskmanager.network.sort-shuffle.min-buffers</h5></td>
-            <td style="word-wrap: break-word;">64</td>
+            <td style="word-wrap: break-word;">512</td>
             <td>Integer</td>
-            <td>Minimum number of network buffers required per sort-merge 
blocking result partition. For production usage, it is suggested to increase 
this config value to at least 2048 (64M memory if the default 32K memory 
segment size is used) to improve the data compression ratio and reduce the 
small network packets. Usually, several hundreds of megabytes memory is enough 
for large scale batch jobs. Note: you may also need to increase the size of 
total network memory to avoid the 'insufficient number of network buffers' 
error if you are increasing this config value.</td>
+            <td>Minimum number of network buffers required per blocking result 
partition for sort-shuffle. For production usage, it is suggested to increase 
this config value to at least 2048 (64M memory if the default 32K memory 
segment size is used) to improve the data compression ratio and reduce the 
small network packets. Usually, several hundreds of megabytes memory is enough 
for large scale batch jobs. Note: you may also need to increase the size of 
total network memory to avoid the 'insufficient number of network buffers' 
error if you are increasing this config value.</td>
         </tr>
         <tr>
             <td><h5>taskmanager.network.sort-shuffle.min-parallelism</h5></td>
-            <td style="word-wrap: break-word;">2147483647</td>
+            <td style="word-wrap: break-word;">1</td>
             <td>Integer</td>
-            <td>Parallelism threshold to switch between sort-merge blocking 
shuffle and the default hash-based blocking shuffle, which means for batch jobs 
of small parallelism, the hash-based blocking shuffle will be used and for 
batch jobs of large parallelism, the sort-merge one will be used. Note: For 
production usage, if sort-merge blocking shuffle is enabled, you may also need 
to enable data compression by setting 
'taskmanager.network.blocking-shuffle.compression.enabled' to true and tune 
'taskmanager.network.sort-shuffle.min-buffers' and 
'taskmanager.memory.framework.off-heap.batch-shuffle.size' for better 
performance.</td>
+            <td>Parallelism threshold to switch between sort-based blocking 
shuffle and hash-based blocking shuffle, which means for batch jobs of smaller 
parallelism, hash-shuffle will be used and for jobs of larger parallelism, 
sort-shuffle will be used. The default value 1 means that sort-shuffle is the 
default option. Note: For production usage, you may also need to enable data 
compression by setting 
'taskmanager.network.blocking-shuffle.compression.enabled' to true and tune 
'taskmanager.network.sort-shuffle.min-buffers' and 
'taskmanager.memory.framework.off-heap.batch-shuffle.size' for better 
performance.</td>

Review comment:
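       `for jobs of larger parallelism` -> `for jobs of equal or larger 
parallelism`? A job whose parallelism equals the threshold also uses 
sort-shuffle, which is why the default value 1 selects it for every job.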
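
The boundary case behind this comment can be sketched as below. This is only an illustration of the documented rule (hash-shuffle when the parallelism is below the configured value, sort-shuffle otherwise). The class and method names are hypothetical and are not Flink APIs.

```java
public final class ShuffleThresholdSketch {

    /**
     * Hypothetical helper (not part of Flink): returns true when sort-shuffle
     * would be chosen under the documented rule, i.e. when the parallelism is
     * NOT below taskmanager.network.sort-shuffle.min-parallelism.
     */
    static boolean usesSortShuffle(int parallelism, int minParallelism) {
        return parallelism >= minParallelism;
    }

    public static void main(String[] args) {
        int oldDefault = Integer.MAX_VALUE; // default before this change
        int newDefault = 1;                 // default proposed in this PR

        // With the new default, even parallelism 1 selects sort-shuffle.
        System.out.println(usesSortShuffle(1, newDefault));

        // With the old default, a job of parallelism 100 stays on hash-shuffle.
        System.out.println(usesSortShuffle(100, oldDefault));

        // The equal case is the one the wording "larger" leaves out.
        System.out.println(usesSortShuffle(8, 8));
    }
}
```

The third call is the case the current description does not cover: a parallelism equal to the threshold already switches to sort-shuffle.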

##########
File path: docs/content.zh/docs/ops/batch/blocking_shuffle.md
##########
@@ -68,11 +68,11 @@ Flink [DataStream API]({{< ref 
"docs/dev/datastream/execution_mode" >}}) 和 [Ta
 
 ## Sort Shuffle
 
-`Sort Shuffle` 是 1.13 版中引入的另一种 blocking shuffle 实现。不同于 `Hash Shuffle`，sort 
shuffle 
将每个分区结果写入到一个文件。当多个下游任务同时读取结果分片，数据文件只会被打开一次并共享给所有的读请求。因此，集群使用更少的资源。例如：节点和文件描述符以提升稳定性。此外，通过写更少的文件和尽可能线性的读取文件，尤其是在使用机械硬盘情况下
 sort shuffle 可以获得比 hash shuffle 更好的性能。另外，`sort shuffle` 使用额外管理的内存作为读数据缓存并不依赖 
`sendfile` 或 `mmap` 机制，因此也适用于 [SSL]({{< ref 
"docs/deployment/security/security-ssl" >}})。关于 sort shuffle 的更多细节请参考 
[FLINK-19582](https://issues.apache.org/jira/browse/FLINK-19582) 和 
[FLINK-19614](https://issues.apache.org/jira/browse/FLINK-19614)。
+`Sort Shuffle` 是 1.13 版中引入的另一种 blocking shuffle 实现，它在 1.15 版本成为默认。不同于 `Hash 
Shuffle`，sort shuffle 
将每个分区结果写入到一个文件。当多个下游任务同时读取结果分片，数据文件只会被打开一次并共享给所有的读请求。因此，集群使用更少的资源。例如：节点和文件描述符以提升稳定性。此外，通过写更少的文件和尽可能线性的读取文件，尤其是在使用机械硬盘情况下
 sort shuffle 可以获得比 hash shuffle 更好的性能。另外，`sort shuffle` 使用额外管理的内存作为读数据缓存并不依赖 
`sendfile` 或 `mmap` 机制，因此也适用于 [SSL]({{< ref 
"docs/deployment/security/security-ssl" >}})。关于 sort shuffle 的更多细节请参考 
[FLINK-19582](https://issues.apache.org/jira/browse/FLINK-19582) 和 
[FLINK-19614](https://issues.apache.org/jira/browse/FLINK-19614)。
 
 当使用sort blocking shuffle的时候有些配置需要适配:
 - [taskmanager.network.blocking-shuffle.compression.enabled]({{< ref 
"docs/deployment/config" 
>}}#taskmanager-network-blocking-shuffle-compression-enabled): 配置该选项以启用 shuffle 
data 压缩，大部分任务建议开启除非你的数据压缩比率比较低。
-- [taskmanager.network.sort-shuffle.min-parallelism]({{< ref 
"docs/deployment/config" >}}#taskmanager-network-sort-shuffle-min-parallelism): 
根据下游任务的并行度配置该选项以启用 sort shuffle。如果并行度低于设置的值，则使用 `hash shuffle`，否则 `sort 
shuffle`。
+- [taskmanager.network.sort-shuffle.min-parallelism]({{< ref 
"docs/deployment/config" >}}#taskmanager-network-sort-shuffle-min-parallelism): 
根据下游任务的并行度配置该选项以启用 sort shuffle。如果并行度低于设置的值，则使用 `hash shuffle`，否则 `sort 
shuffle`。对于 1.15 以下的版本，它的默认值是 `Integer.MAX_VALUE`，所以默认情况下总是会使用 `hash shuffle`。从 
1.15 开始，它的默认值是 1, 所以着默认情况下总是会使用 `sort shuffle`。

Review comment:
       `着默认情况` -> `默认情况`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
