This is an automated email from the ASF dual-hosted git repository.

roryqi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git


The following commit(s) were added to refs/heads/master by this push:
     new 9f860546 [DOC] Migrate the coordinator doc from README to docs page (#153)
9f860546 is described below

commit 9f860546b4aecb6fa175d72e85464e29c722cb13
Author: Junfan Zhang <[email protected]>
AuthorDate: Thu Aug 11 11:48:43 2022 +0800

    [DOC] Migrate the coordinator doc from README to docs page (#153)
    
    ### What changes were proposed in this pull request?
    [DOC] Migrate the coordinator doc from README to docs page
    
    ## Why are the changes needed?
    The dedicated doc page will benefit users to find configs
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    No need
---
 README.md                                          | 12 +---
 .../uniffle/coordinator/CoordinatorConf.java       |  2 +-
 docs/coordinator_guide.md                          | 79 ++++++++++++++++++++++
 3 files changed, 81 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 24d8675a..1731a16a 100644
--- a/README.md
+++ b/README.md
@@ -226,17 +226,7 @@ The important configuration is listed as following.
 
 ### Coordinator
 
-|Property Name|Default|        Description|
-|---|---|---|
-|rss.coordinator.server.heartbeat.timeout|30000|Timeout if can't get heartbeat from shuffle server|
-|rss.coordinator.assignment.strategy|PARTITION_BALANCE|Strategy for assigning shuffle server, PARTITION_BALANCE should be used for workload balance|
-|rss.coordinator.app.expired|60000|Application expired time (ms), the heartbeat interval should be less than it|
-|rss.coordinator.shuffle.nodes.max|9|The max number of shuffle server when do the assignment|
-|rss.coordinator.dynamicClientConf.path|-|The path of configuration file which have default conf for rss client|
-|rss.coordinator.exclude.nodes.file.path|-|The path of configuration file which have exclude nodes|
-|rss.coordinator.exclude.nodes.check.interval.ms|60000|Update interval (ms) for exclude nodes|
-|rss.rpc.server.port|-|RPC port for coordinator|
-|rss.jetty.http.port|-|Http port for coordinator|
+For more details of advanced configuration, please see [Uniffle Coordinator Guide](https://github.com/apache/incubator-uniffle/blob/master/docs/coordinator_guide.md).
 
 ### Shuffle Server
 
diff --git a/coordinator/src/main/java/org/apache/uniffle/coordinator/CoordinatorConf.java b/coordinator/src/main/java/org/apache/uniffle/coordinator/CoordinatorConf.java
index 18bd6614..765f9702 100644
--- a/coordinator/src/main/java/org/apache/uniffle/coordinator/CoordinatorConf.java
+++ b/coordinator/src/main/java/org/apache/uniffle/coordinator/CoordinatorConf.java
@@ -119,7 +119,7 @@ public class CoordinatorConf extends RssBaseConf {
       .intType()
      .checkValue(ConfigUtils.POSITIVE_INTEGER_VALIDATOR_2, "dynamic client conf update interval in seconds")
       .defaultValue(120)
-      .withDescription("Accessed candidates update interval in seconds");
+      .withDescription("The dynamic client conf update interval in seconds");
   public static final ConfigOption<String> COORDINATOR_REMOTE_STORAGE_CLUSTER_CONF = ConfigOptions
       .key("rss.coordinator.remote.storage.cluster.conf")
       .stringType()
diff --git a/docs/coordinator_guide.md b/docs/coordinator_guide.md
index 274c875e..6764b529 100644
--- a/docs/coordinator_guide.md
+++ b/docs/coordinator_guide.md
@@ -21,3 +21,82 @@ license: |
 ---
 
 # Uniffle Coordinator Guide
+
+Uniffle is a unified remote shuffle service for compute engines, the role of coordinator is responsibility for
+collecting status of shuffle server and doing the assignment for the job.
+
+## Deploy
+This document will introduce how to deploy Uniffle coordinators.
+
+### Steps
+1. unzip package to RSS_HOME
+2. update RSS_HOME/bin/rss-env.sh, eg,
+   ```
+     JAVA_HOME=<java_home>
+     HADOOP_HOME=<hadoop home>
+     XMX_SIZE="16g"
+   ```
+3. update RSS_HOME/conf/coordinator.conf, eg,
+   ```
+     rss.rpc.server.port 19999
+     rss.jetty.http.port 19998
+     rss.coordinator.server.heartbeat.timeout 30000
+     rss.coordinator.app.expired 60000
+     rss.coordinator.shuffle.nodes.max 5
+     # enable dynamicClientConf, and coordinator will be responsible for most of client conf
+     rss.coordinator.dynamicClientConf.enabled true
+     # config the path of client conf
+     rss.coordinator.dynamicClientConf.path <RSS_HOME>/conf/dynamic_client.conf
+     # config the path of excluded shuffle server
+     rss.coordinator.exclude.nodes.file.path <RSS_HOME>/conf/exclude_nodes
+   ```
+4. update <RSS_HOME>/conf/dynamic_client.conf, rss client will get default conf from coordinator eg,
+   ```
+    # MEMORY_LOCALFILE_HDFS is recommandation for production environment
+    rss.storage.type MEMORY_LOCALFILE_HDFS
+    # multiple remote storages are supported, and client will get assignment from coordinator
+    rss.coordinator.remote.storage.path hdfs://cluster1/path,hdfs://cluster2/path
+    rss.writer.require.memory.retryMax 1200
+    rss.client.retry.max 100
+    rss.writer.send.check.timeout 600000
+    rss.client.read.buffer.size 14m
+   ```
+5. start Coordinator
+   ```
+    bash RSS_HOME/bin/start-coordnator.sh
+   ```
+
+## Configuration
+
+### Common settings
+|Property Name|Default|        Description|
+|---|---|---|
+|rss.coordinator.server.heartbeat.timeout|30000|Timeout if can't get heartbeat from shuffle server|
+|rss.coordinator.server.periodic.output.interval.times|30|The periodic interval times of output alive nodes. The interval sec can be calculated by (rss.coordinator.server.heartbeat.timeout/3 * rss.coordinator.server.periodic.output.interval.times). Default output interval is 5min.|
+|rss.coordinator.assignment.strategy|PARTITION_BALANCE|Strategy for assigning shuffle server, PARTITION_BALANCE should be used for workload balance|
+|rss.coordinator.app.expired|60000|Application expired time (ms), the heartbeat interval should be less than it|
+|rss.coordinator.shuffle.nodes.max|9|The max number of shuffle server when do the assignment|
+|rss.coordinator.dynamicClientConf.path|-|The path of configuration file which have default conf for rss client|
+|rss.coordinator.exclude.nodes.file.path|-|The path of configuration file which have exclude nodes|
+|rss.coordinator.exclude.nodes.check.interval.ms|60000|Update interval (ms) for exclude nodes|
+|rss.coordinator.access.checkers|org.apache.uniffle.coordinator.AccessClusterLoadChecker|The access checkers will be used when the spark client use the DelegationShuffleManager, which will decide whether to use rss according to the result of the specified access checkers|
+|rss.coordinator.access.loadChecker.memory.percentage|15.0|The minimal percentage of available memory percentage of a server|
+|rss.coordinator.dynamicClientConf.enabled|false|whether to enable dynamic client conf, which will be fetched by spark client|
+|rss.coordinator.dynamicClientConf.path|-|The dynamic client conf of this cluster and can be stored in HDFS or local|
+|rss.coordinator.dynamicClientConf.updateIntervalSec|120|The dynamic client conf update interval in seconds|
+|rss.coordinator.remote.storage.cluster.conf|-|Remote Storage Cluster related conf with format $clusterId,$key=$value, separated by ';'|
+|rss.rpc.server.port|-|RPC port for coordinator|
+|rss.jetty.http.port|-|Http port for coordinator|
+
+### AccessClusterLoadChecker settings
+|Property Name|Default|        Description|
+|---|---|---|
+|rss.coordinator.access.loadChecker.serverNum.threshold|-|The minimal required number of healthy shuffle servers when being accessed by client|
+
+### AccessCandidatesChecker settings
+AccessCandidatesChecker is one of the built-in access checker, which will allow user to define the candidates list to use rss.  
+
+|Property Name|Default|        Description|
+|---|---|---|
+|rss.coordinator.access.candidates.updateIntervalSec|120|Accessed candidates update interval in seconds, which is only valid when AccessCandidatesChecker is enabled.|
+|rss.coordinator.access.candidates.path|-|Accessed candidates file path, the file can be stored on HDFS|
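
Note on the `rss.coordinator.server.periodic.output.interval.times` row above: the table gives the output interval as `rss.coordinator.server.heartbeat.timeout/3 * rss.coordinator.server.periodic.output.interval.times` and states a default of 5min. The sketch below only checks that arithmetic with the documented defaults (30000 and 30). It assumes the heartbeat timeout is in milliseconds, which is what makes the result agree with the stated 5 minutes. The class and method names are hypothetical and are not part of Uniffle.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not a Uniffle class.
final class OutputInterval {

    // Applies the formula from the table: heartbeat timeout / 3 * interval times.
    // Assumption: heartbeatTimeoutMillis is expressed in milliseconds.
    static long outputIntervalMillis(long heartbeatTimeoutMillis, int intervalTimes) {
        return heartbeatTimeoutMillis / 3 * intervalTimes;
    }

    public static void main(String[] args) {
        // Documented defaults: heartbeat.timeout = 30000, interval.times = 30.
        long millis = outputIntervalMillis(30000L, 30);
        System.out.println(millis + " ms = " + Duration.ofMillis(millis).toMinutes() + " min");
    }
}
```

With the defaults this gives 30000 / 3 * 30 = 300000 ms, which matches the 5min default stated in the table.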
