Yuxin Tan created FLINK-35690:
---------------------------------

Summary: Release Testing: Verify FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn
Key: FLINK-35690
URL: https://issues.apache.org/jira/browse/FLINK-35690
Project: Flink
Issue Type: Sub-task
Reporter: Yuxin Tan
Fix For: 1.20.0

This is a follow-up test for https://issues.apache.org/jira/browse/FLINK-35533

In Flink 1.20, we proposed integrating Flink's Hybrid Shuffle with Apache Celeborn through a pluggable remote tier interface. To verify this feature, follow these two main steps.

1. Implement the Celeborn tier.
* Implement a new tier factory and tier for Celeborn, covering the APIs TierFactory, TierMasterAgent, TierProducerAgent, and TierConsumerAgent.
* The implementations should support granular data management at the Segment level on both the client and server sides.

2. Use the implemented tier to shuffle data.
* Compile Flink and Celeborn.
* Deploy the Celeborn service.
** Deploy a new Celeborn service with the newly compiled packages. Refer to the documentation (https://celeborn.apache.org/docs/latest/) to deploy the cluster.
* Add the compiled Flink plugin jar (celeborn-client-flink-xxx.jar) to the Flink classpath.
* Configure the options to enable the feature.
** Set the option taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class to the new Celeborn tier factory class. In addition to this option, add the following options.
{code:java}
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_FULL
celeborn.master.endpoints: <the celeborn endpoint address>
celeborn.client.shuffle.partition.type: MAP
{code}
* Run some test examples (e.g., WordCount) to verify the feature.
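
Before running the test examples, it may help to sanity-check that the four options named above are all present in the Flink configuration. The following is a minimal sketch, not part of Flink or Celeborn: the helper names (parse_flat_conf, check_conf) are hypothetical, and it assumes the configuration uses flat "key: value" lines as in the snippet above. It only checks presence and the two fixed values; it cannot tell whether the factory class or the endpoint address is actually correct.

```python
# Hypothetical pre-flight check for the options listed in this issue.
# Assumes flat "key: value" lines; nested YAML is not handled.

REQUIRED_OPTIONS = (
    "taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class",
    "execution.batch-shuffle-mode",
    "celeborn.master.endpoints",
    "celeborn.client.shuffle.partition.type",
)

# Options whose value is fixed by the verification instructions.
EXPECTED_VALUES = {
    "execution.batch-shuffle-mode": "ALL_EXCHANGES_HYBRID_FULL",
    "celeborn.client.shuffle.partition.type": "MAP",
}


def parse_flat_conf(text):
    """Parse flat 'key: value' lines into a dict, skipping blanks and comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so values like host:port stay intact.
        key, value = line.split(":", 1)
        options[key.strip()] = value.strip()
    return options


def check_conf(text):
    """Return a list of problems; an empty list means all checks passed."""
    options = parse_flat_conf(text)
    problems = []
    for key in REQUIRED_OPTIONS:
        if not options.get(key):
            problems.append(f"missing: {key}")
    for key, expected in EXPECTED_VALUES.items():
        actual = options.get(key)
        if actual and actual != expected:
            problems.append(f"{key}: expected {expected}, found {actual}")
    return problems
```

Calling check_conf on the contents of the configuration file returns an empty list when every option is set as described, and otherwise one message per missing or mismatched option.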