[GitHub] [flink] reswqa commented on a diff in pull request #21890: [FLINK-30860][doc] Add document for hybrid shuffle with adaptive batch scheduler

2023-02-09 Thread via GitHub


reswqa commented on code in PR #21890:
URL: https://github.com/apache/flink/pull/21890#discussion_r1102386579


##
docs/content.zh/docs/ops/batch/batch_shuffle.md:
##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 
 To use hybrid shuffle mode, you need to configure the 
[execution.batch-shuffle-mode]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling 
strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
 
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you 
want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< 
ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. 
See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" 
>}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` at the same time, see 
[speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) 
for details.
+
+Hybrid shuffle divides the partition data consumption constraints between 
producer and consumer into the following three cases:
+
+- **ALL_PRODUCERS_FINISHED** : hybrid partition data can be consumed only when 
all producers are finished.
+- **ONLY_FINISHED_PRODUCERS** : hybrid partition data can be consumed when its 
producer is finished.
+- **UNFINISHED_PRODUCERS** : hybrid partition data can be consumed even if its 
producer is unfinished.
+
+If `SpeculativeExecution` is enabled, the default constraint is 
`ONLY_FINISHED_PRODUCERS`, which brings some performance improvement over 
blocking shuffle. Otherwise, the default constraint is `UNFINISHED_PRODUCERS`, 
which behaves like a pipelined shuffle. The constraint can be configured via 
[jobmanager.partition.hybrid.partition-data-consume-constraint]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-partition-hybrid-partition-data-consume-constraint).

Review Comment:
   Added some descriptions of this part.
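
   For readers following this thread, the three consume constraints and the defaults described in the diff above can be sketched in Python. This is a minimal model of the documented behavior only, not Flink's scheduler code; the names `ConsumeConstraint`, `default_constraint`, and `can_consume` are hypothetical and chosen purely for illustration.

```python
from enum import Enum
from typing import Sequence


class ConsumeConstraint(Enum):
    """Hypothetical model of the three constraints listed in the diff."""
    ALL_PRODUCERS_FINISHED = "ALL_PRODUCERS_FINISHED"
    ONLY_FINISHED_PRODUCERS = "ONLY_FINISHED_PRODUCERS"
    UNFINISHED_PRODUCERS = "UNFINISHED_PRODUCERS"


def default_constraint(speculative_execution_enabled: bool) -> ConsumeConstraint:
    """Default constraint as described in the diff (assumed, not Flink code)."""
    if speculative_execution_enabled:
        return ConsumeConstraint.ONLY_FINISHED_PRODUCERS
    return ConsumeConstraint.UNFINISHED_PRODUCERS


def can_consume(
    constraint: ConsumeConstraint,
    producer_finished: Sequence[bool],
    producer_index: int,
) -> bool:
    """Return whether the partition of one producer may be consumed.

    producer_finished[i] is True once producer i has finished.
    """
    if constraint is ConsumeConstraint.ALL_PRODUCERS_FINISHED:
        # Every producer must be done before any partition is readable.
        return all(producer_finished)
    if constraint is ConsumeConstraint.ONLY_FINISHED_PRODUCERS:
        # Only this partition's own producer must be done.
        return producer_finished[producer_index]
    # UNFINISHED_PRODUCERS: data may be read while the producer still runs.
    return True
```

   With two producers of which only the first has finished, the first partition is readable under `ONLY_FINISHED_PRODUCERS` but not under `ALL_PRODUCERS_FINISHED`, while `UNFINISHED_PRODUCERS` allows reading either partition.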



##
docs/content.zh/docs/ops/batch/batch_shuffle.md:
##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 
 To use hybrid shuffle mode, you need to configure the 
[execution.batch-shuffle-mode]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling 
strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
 
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you 
want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< 
ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. 
See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" 
>}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` at the same time, see 
[speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) 
for details.

Review Comment:
   Removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] reswqa commented on a diff in pull request #21890: [FLINK-30860][doc] Add document for hybrid shuffle with adaptive batch scheduler

2023-02-09 Thread via GitHub


reswqa commented on code in PR #21890:
URL: https://github.com/apache/flink/pull/21890#discussion_r1102385798


##
docs/content.zh/docs/ops/batch/batch_shuffle.md:
##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 
 To use hybrid shuffle mode, you need to configure the 
[execution.batch-shuffle-mode]({{< ref "docs/deployment/config" 
>}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling 
strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
 
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you 
want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< 
ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. 
See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" 
>}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` at the same time, see 
[speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) 
for details.
+
+Hybrid shuffle divides the partition data consumption constraints between 
producer and consumer into the following three cases:
+
+- **ALL_PRODUCERS_FINISHED** : hybrid partition data can be consumed only when 
all producers are finished.
+- **ONLY_FINISHED_PRODUCERS** : hybrid partition data can be consumed when its 
producer is finished.
+- **UNFINISHED_PRODUCERS** : hybrid partition data can be consumed even if its 
producer is unfinished.
+
+If `SpeculativeExecution` is enabled, the default constraint is 
`ONLY_FINISHED_PRODUCERS`, which brings some performance improvement over 
blocking shuffle. Otherwise, the default constraint is `UNFINISHED_PRODUCERS`, 
which behaves like a pipelined shuffle. The constraint can be configured via 
[jobmanager.partition.hybrid.partition-data-consume-constraint]({{< ref 
"docs/deployment/config" 
>}}#jobmanager-partition-hybrid-partition-data-consume-constraint).
+
+ Index Spilling
+
+Hybrid shuffle indexes the shuffle data held in memory and on disk. Generally 
speaking, all index entries can be cached in memory to speed up index retrieval. 
However, for large batch jobs, this in-memory index may cause out-of-memory (OOM) errors.
+Therefore, hybrid shuffle supports spilling index data to disk. The following 
configuration options control this behavior:
+
+- 
**[taskmanager.network.hybrid-shuffle.num-retained-in-memory-regions-max]({{< 
ref "docs/deployment/config" 
>}}#taskmanager-network-hybrid-shuffle-num-retained-in-memory-regions-max)** : 
Controls the maximum number of hybrid shuffle index regions retained in memory. Increasing this 
value allows more index entries to be cached in memory.
+- **[taskmanager.network.hybrid-shuffle.spill-index-segment-size]({{< ref 
"docs/deployment/config" 
>}}#taskmanager-network-hybrid-shuffle-spill-index-segment-size)** : Controls 
the segment size (in bytes) of the hybrid shuffle spilled-file data index.

Review Comment:
   Agree with you, removed.


