This is an automated email from the ASF dual-hosted git repository.

jshao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git
commit 2c1c554bb9a47a25e56164d1af2efa1acff66cd8
Author: frankliee <frankz...@tencent.com>
AuthorDate: Tue Jun 28 11:02:00 2022 +0800

    [Improvement] Move detailed client configuration to individual doc (#201)
    
    ### What changes were proposed in this pull request?
    1. Move the detailed configuration to the docs subdirectory.
    2. Add a doc for the client quorum setting.
    
    ### Why are the changes needed?
    Update doc.
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    Just doc.
---
 README.md                 |  22 +------
 docs/client_guide.md      | 148 ++++++++++++++++++++++++++++++++++++++++++++++
 docs/coordinator_guide.md |   8 +++
 docs/index.md             |   8 +++
 docs/pageA.md             |   7 ---
 docs/server_guide.md      |   7 +++
 6 files changed, 173 insertions(+), 27 deletions(-)

diff --git a/README.md b/README.md
index 51a1ed0..eba4fd3 100644
--- a/README.md
+++ b/README.md
@@ -233,27 +233,9 @@ The important configuration is listed as following.
 |rss.server.flush.cold.storage.threshold.size|64M|The threshold of data size for LOCALFILE and HDFS if MEMORY_LOCALFILE_HDFS is used|
 
-### Spark Client
+### Shuffle Client
 
-|Property Name|Default|Description|
-|---|---|---|
-|spark.rss.writer.buffer.size|3m|Buffer size for single partition data|
-|spark.rss.writer.buffer.spill.size|128m|Buffer size for total partition data|
-|spark.rss.coordinator.quorum|-|Coordinator quorum|
-|spark.rss.storage.type|-|Supports MEMORY_LOCALFILE, MEMORY_HDFS, MEMORY_LOCALFILE_HDFS|
-|spark.rss.client.send.size.limit|16m|The max data size sent to shuffle server|
-|spark.rss.client.read.buffer.size|32m|The max data size read from storage|
-|spark.rss.client.send.threadPool.size|10|The thread size for send shuffle data to shuffle server|
-
-
-### MapReduce Client
-
-|Property Name|Default|Description|
-|---|---|---|
-|mapreduce.rss.coordinator.quorum|-|Coordinator quorum|
-|mapreduce.rss.storage.type|-|Supports MEMORY_LOCALFILE, MEMORY_HDFS, MEMORY_LOCALFILE_HDFS|
-|mapreduce.rss.client.max.buffer.size|3k|The max buffer size in map side|
-|mapreduce.rss.client.read.buffer.size|32m|The max data size read from storage|
+For more details on advanced configuration, please see [Firestorm Shuffle Client Guide](https://github.com/Tencent/Firestorm/blob/master/docs/client_guide.md).
 
 ## LICENSE
diff --git a/docs/client_guide.md b/docs/client_guide.md
new file mode 100644
index 0000000..95b960b
--- /dev/null
+++ b/docs/client_guide.md
@@ -0,0 +1,148 @@
+---
+layout: page
+displayTitle: Firestorm Shuffle Client Guide
+title: Firestorm Shuffle Client Guide
+description: Firestorm Shuffle Client Guide
+---
+# Firestorm Shuffle Client Guide
+
+Firestorm is designed as a unified shuffle engine for multiple computing frameworks, including Apache Spark and Apache Hadoop.
+Firestorm provides pluggable client plugins to enable remote shuffle in Spark and MapReduce.
+
+## Deploy
+This document describes how to deploy the Firestorm client plugins with Spark and MapReduce.
+
+### Deploy Spark Client Plugin
+
+1. Add the client jar to the Spark classpath, e.g., SPARK_HOME/jars/
+
+   The jar for Spark2 is located at <RSS_HOME>/jars/client/spark2/rss-client-XXXXX-shaded.jar
+
+   The jar for Spark3 is located at <RSS_HOME>/jars/client/spark3/rss-client-XXXXX-shaded.jar
+
+2. Update the Spark conf to enable Firestorm, e.g.,
+
+   ```
+   spark.shuffle.manager org.apache.spark.shuffle.RssShuffleManager
+   spark.rss.coordinator.quorum <coordinatorIp1>:19999,<coordinatorIp2>:19999
+   # Note: For Spark2, spark.sql.adaptive.enabled should be false because Spark2 doesn't support AQE.
+   ```
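+
+   The same settings can also be passed on the command line instead of in the conf file. A minimal sketch (the main class and application jar below are placeholders):
+
+   ```
+   spark-submit \
+     --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
+     --conf spark.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999 \
+     --class com.example.YourApp your-app.jar
+   ```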
+
+### Support Spark Dynamic Allocation
+
+To support Spark dynamic allocation with Firestorm, the Spark code needs to be patched.
+Two patches, for spark-2.4.6 and spark-3.1.2, are provided in the spark-patches folder for reference.
+
+After applying the patch and rebuilding Spark, add the following configuration to the Spark conf to enable dynamic allocation:
+   ```
+   spark.shuffle.service.enabled false
+   spark.dynamicAllocation.enabled true
+   ```
+
+### Deploy MapReduce Client Plugin
+
+1. Add the client jar to the classpath of each NodeManager, e.g., <HADOOP>/share/hadoop/mapreduce/
+
+   The jar for MapReduce is located at <RSS_HOME>/jars/client/mr/rss-client-mr-XXXXX-shaded.jar
+
+2. Update the MapReduce conf to enable Firestorm, e.g.,
+
+   ```
+   -Dmapreduce.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999
+   -Dyarn.app.mapreduce.am.command-opts=org.apache.hadoop.mapreduce.v2.app.RssMRAppMaster
+   -Dmapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.RssMapOutputCollector
+   -Dmapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.RssShuffle
+   ```
+
+Note that RssMRAppMaster automatically disables slow start (i.e., `mapreduce.job.reduce.slowstart.completedmaps=1`)
+and job recovery (i.e., `yarn.app.mapreduce.am.job.recovery.enable=false`).
+
+## Configuration
+
+The important client configurations are listed as follows.
+
+### Common Setting
+These configurations are shared by all types of clients.
+
+|Property Name|Default|Description|
+|---|---|---|
+|<client_type>.rss.coordinator.quorum|-|Coordinator quorum|
+|<client_type>.rss.writer.buffer.size|3m|Buffer size for single partition data|
+|<client_type>.rss.storage.type|-|Supports MEMORY_LOCALFILE, MEMORY_HDFS, MEMORY_LOCALFILE_HDFS|
+|<client_type>.rss.client.read.buffer.size|14m|The max data size read from storage|
+|<client_type>.rss.client.send.threadPool.size|5|The thread pool size for sending shuffle data to the shuffle server|
+
+Notice:
+
+1. `<client_type>` should be `spark` or `mapreduce` (see the expansion example below)
+
+2. `<client_type>.rss.coordinator.quorum` is required; the other configurations are optional when coordinator dynamic configuration is enabled.
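+
+For example, with placeholder values, the common settings expand as follows for each client type:
+
+```
+# Spark: set in the Spark conf
+spark.rss.coordinator.quorum <coordinatorIp1>:19999,<coordinatorIp2>:19999
+spark.rss.storage.type MEMORY_LOCALFILE
+
+# MapReduce: passed as -D options
+-Dmapreduce.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999
+-Dmapreduce.rss.storage.type=MEMORY_LOCALFILE
+```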
+
+### Adaptive Remote Shuffle Enabling
+
+To choose between the built-in shuffle and remote shuffle automatically, Firestorm supports adaptive enabling.
+The client should use `DelegationRssShuffleManager` and provide its unique <access_id> so that the coordinator can decide whether remote shuffle should be enabled.
+
+```
+spark.shuffle.manager org.apache.spark.shuffle.DelegationRssShuffleManager
+spark.rss.access.id=<access_id>
+```
+
+Notice:
+Currently, this feature supports Spark only.
+
+Other configuration:
+
+|Property Name|Default|Description|
+|---|---|---|
+|spark.rss.access.timeout.ms|10000|The timeout to access the Firestorm coordinator|
+
+
+### Client Quorum Setting
+
+Firestorm supports a client-side quorum protocol to tolerate shuffle server crashes.
+This feature is implemented on the client side: the shuffle writer sends each block to multiple servers, and shuffle readers can fetch block data from any one of them.
+Since sending multiple replicas of each block degrades shuffle performance and increases resource consumption, it is designed as an optional feature.
+
+|Property Name|Default|Description|
+|---|---|---|
+|<client_type>.rss.data.replica|1|The max number of servers each block can be sent to by the client in the quorum protocol|
+|<client_type>.rss.data.replica.write|1|The min number of servers each block must be sent to successfully|
+|<client_type>.rss.data.replica.read|1|The min number of servers from which metadata must be fetched successfully|
+
+Notice:
+
+1. `spark.rss.data.replica.write` + `spark.rss.data.replica.read` > `spark.rss.data.replica`
+
+Recommended examples:
+
+1. Performance First (default)
+```
+spark.rss.data.replica 1
+spark.rss.data.replica.write 1
+spark.rss.data.replica.read 1
+```
+
+2. Fault-tolerant First
+```
+spark.rss.data.replica 3
+spark.rss.data.replica.write 2
+spark.rss.data.replica.read 2
+```
+
+### Spark Specialized Setting
+
+The important configurations are listed as follows.
+
+|Property Name|Default|Description|
+|---|---|---|
+|spark.rss.writer.buffer.spill.size|128m|Buffer size for total partition data|
+|spark.rss.client.send.size.limit|16m|The max data size sent to shuffle server|
+
+
+### MapReduce Specialized Setting
+
+|Property Name|Default|Description|
+|---|---|---|
+|mapreduce.rss.client.max.buffer.size|3k|The max buffer size on the map side|
+|mapreduce.rss.client.batch.trigger.num|50|The max batch of buffers to send data on the map side|
\ No newline at end of file
diff --git a/docs/coordinator_guide.md b/docs/coordinator_guide.md
new file mode 100644
index 0000000..049ec18
--- /dev/null
+++ b/docs/coordinator_guide.md
@@ -0,0 +1,8 @@
+---
+layout: page
+displayTitle: Firestorm Coordinator Guide
+title: Firestorm Coordinator Guide
+description: Firestorm Coordinator Guide
+---
+
+# Firestorm Coordinator Guide
\ No newline at end of file
diff --git a/docs/index.md b/docs/index.md
index 3af2657..5d2599f 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -12,6 +12,14 @@ to store shuffle data on remote servers.
 
 ![Rss Architecture](asset/rss_architecture.png)
 
+More advanced details for Firestorm users are available in the following:
+
+- [Firestorm Coordinator Guide](coordinator_guide.html)
+
+- [Firestorm Shuffle Server Guide](server_guide.html)
+
+- [Firestorm Shuffle Client Guide](client_guide.html)
+
 Here you can read API docs for Firestorm along with its submodules.
diff --git a/docs/pageA.md b/docs/pageA.md
deleted file mode 100644
index 50ef92e..0000000
--- a/docs/pageA.md
+++ /dev/null
@@ -1,7 +0,0 @@
----
-layout: page
-displayTitle: A
-title: A
-description: Firestorm documentation homepage
----
-
\ No newline at end of file
diff --git a/docs/server_guide.md b/docs/server_guide.md
new file mode 100644
index 0000000..3356f61
--- /dev/null
+++ b/docs/server_guide.md
@@ -0,0 +1,7 @@
+---
+layout: page
+displayTitle: Firestorm Shuffle Server Guide
+title: Firestorm Shuffle Server Guide
+description: Firestorm Shuffle Server Guide
+---
+# Firestorm Shuffle Server Guide
\ No newline at end of file