[GitHub] [flink] reswqa commented on a diff in pull request #21890: [FLINK-30860][doc] Add document for hybrid shuffle with adaptive batch scheduler
reswqa commented on code in PR #21890:
URL: https://github.com/apache/flink/pull/21890#discussion_r1102386579

## docs/content.zh/docs/ops/batch/batch_shuffle.md: ##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 To use hybrid shuffle mode, you need to configure the [execution.batch-shuffle-mode]({{< ref "docs/deployment/config" >}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` in the same time, see [speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) for details.
+
+Hybrid shuffle divides the partition data consumption constraints between producer and consumer into the following three cases:
+
+- **ALL_PRODUCERS_FINISHED** : hybrid partition data can be consumed only when all producers are finished.
+- **ONLY_FINISHED_PRODUCERS** : hybrid partition data can be consumed when its producer is finished.
+- **UNFINISHED_PRODUCERS** : hybrid partition data can be consumed even if its producer is un-finished.
+
+If `SpeculativeExecution` is enabled, the default constraint is `ONLY_FINISHED_PRODUCERS` to bring some performance optimization compared with blocking shuffle. Otherwise, the default constraint is `UNFINISHED_PRODUCERS` to perform pipelined-like shuffle. These could be configured via [jobmanager.partition.hybrid.partition-data-consume-constraint]({{< ref "docs/deployment/config" >}}#jobmanager-partition-hybrid-partition-data-consume-constraint).
Review Comment:
Added some descriptions of this part.

## docs/content.zh/docs/ops/batch/batch_shuffle.md: ##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 To use hybrid shuffle mode, you need to configure the [execution.batch-shuffle-mode]({{< ref "docs/deployment/config" >}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` in the same time, see [speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) for details.

Review Comment:
Removed.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [flink] reswqa commented on a diff in pull request #21890: [FLINK-30860][doc] Add document for hybrid shuffle with adaptive batch scheduler
reswqa commented on code in PR #21890:
URL: https://github.com/apache/flink/pull/21890#discussion_r1102385798

## docs/content.zh/docs/ops/batch/batch_shuffle.md: ##
@@ -114,12 +114,34 @@ Hybrid shuffle provides two spilling strategies:
 To use hybrid shuffle mode, you need to configure the [execution.batch-shuffle-mode]({{< ref "docs/deployment/config" >}}#execution-batch-shuffle-mode) to `ALL_EXCHANGES_HYBRID_FULL` (full spilling strategy) or `ALL_EXCHANGES_HYBRID_SELECTIVE` (selective spilling strategy).
+ Supports AdaptiveBatchScheduler and SpeculativeExecution
+
+Hybrid shuffle currently supports `AdaptiveBatchScheduler` by default. If you want to use `DefaultScheduler`, please configure the [jobmanager.scheduler]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler) to `DefaultScheduler`. See [elastic_scaling]({{< ref "docs/deployment/elastic_scaling" >}}#adaptive-batch-scheduler) for details.
+
+If you want to enable `SpeculativeExecution` in the same time, see [speculative_execution]({{< ref "docs/deployment/speculative_execution" >}}) for details.
+
+Hybrid shuffle divides the partition data consumption constraints between producer and consumer into the following three cases:
+
+- **ALL_PRODUCERS_FINISHED** : hybrid partition data can be consumed only when all producers are finished.
+- **ONLY_FINISHED_PRODUCERS** : hybrid partition data can be consumed when its producer is finished.
+- **UNFINISHED_PRODUCERS** : hybrid partition data can be consumed even if its producer is un-finished.
+
+If `SpeculativeExecution` is enabled, the default constraint is `ONLY_FINISHED_PRODUCERS` to bring some performance optimization compared with blocking shuffle. Otherwise, the default constraint is `UNFINISHED_PRODUCERS` to perform pipelined-like shuffle. These could be configured via [jobmanager.partition.hybrid.partition-data-consume-constraint]({{< ref "docs/deployment/config" >}}#jobmanager-partition-hybrid-partition-data-consume-constraint).
+
+ Index Spilling
+
+Hybrid shuffle indexes the shuffle data in memory and disk. Generally speaking, all index can be cached in memory to speed up index retrieval. However, for large batch jobs, this part of memory may bring OOM risks.
+Therefore, hybrid shuffle supports spilling index data to disk. The following configuration options can control this behavior:
+
+- **[taskmanager.network.hybrid-shuffle.num-retained-in-memory-regions-max]({{< ref "docs/deployment/config" >}}#taskmanager-network-hybrid-shuffle-num-retained-in-memory-regions-max)** : Controls the max number of hybrid retained regions in memory. Increasing this value will allow more index entries to be cached in memory.
+- **[taskmanager.network.hybrid-shuffle.spill-index-segment-size]({{< ref "docs/deployment/config" >}}#taskmanager-network-hybrid-shuffle-spill-index-segment-size)** : Controls the segment size(in bytes) of hybrid spilled file data index.

Review Comment:
Agree with you, removed.
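
For readers following the thread, the three consumption constraints and the default-selection rule described in the quoted hunk can be modelled in a few lines. This is only an illustrative sketch, not Flink code: the `ConsumeConstraint` enum and the `default_constraint` and `can_consume` functions are hypothetical names invented here, and the semantics are taken solely from the wording of the diff above.

```python
from enum import Enum
from typing import Sequence


class ConsumeConstraint(Enum):
    """Hypothetical mirror of the three constraint names quoted in the diff."""
    ALL_PRODUCERS_FINISHED = "ALL_PRODUCERS_FINISHED"
    ONLY_FINISHED_PRODUCERS = "ONLY_FINISHED_PRODUCERS"
    UNFINISHED_PRODUCERS = "UNFINISHED_PRODUCERS"


def default_constraint(speculative_execution_enabled: bool) -> ConsumeConstraint:
    """Pick the default as the diff describes it (assumption: no explicit override)."""
    if speculative_execution_enabled:
        return ConsumeConstraint.ONLY_FINISHED_PRODUCERS
    return ConsumeConstraint.UNFINISHED_PRODUCERS


def can_consume(constraint: ConsumeConstraint,
                producer_finished: Sequence[bool],
                producer_index: int) -> bool:
    """Return True if data from the given producer may be consumed.

    producer_finished[i] says whether producer i has finished.
    """
    if constraint is ConsumeConstraint.ALL_PRODUCERS_FINISHED:
        # Nothing is consumable until every producer has finished.
        return all(producer_finished)
    if constraint is ConsumeConstraint.ONLY_FINISHED_PRODUCERS:
        # Only the producer that owns this partition must have finished.
        return producer_finished[producer_index]
    # UNFINISHED_PRODUCERS: data may be consumed while the producer still runs.
    return True
```

With two producers of which only the first has finished, `can_consume` in this model allows reading from producer 0 under `ONLY_FINISHED_PRODUCERS` but from neither producer under `ALL_PRODUCERS_FINISHED`.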