This is an automated email from the ASF dual-hosted git repository.

rickyma pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git
The following commit(s) were added to refs/heads/master by this push:
     new 0a608347e  [#1892] improvement(docs): Provide guidance on configuring memory-related configs (#1893)
0a608347e is described below

commit 0a608347ef1cbfc7933a3fabf29c2d51520e365d
Author: RickyMa <rick...@tencent.com>
AuthorDate: Fri Jul 12 21:19:24 2024 +0800

    [#1892] improvement(docs): Provide guidance on configuring memory-related configs (#1893)

    ### What changes were proposed in this pull request?

    Provide guidance on configuring memory-related configs.

    ### Why are the changes needed?

    For https://github.com/apache/incubator-uniffle/issues/1892.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    No need.
---
 docs/client_guide/spark_client_guide.md | 18 ++++++-------
 docs/server_guide.md                    | 47 ++++++++++++++++++++++++++------
 2 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/docs/client_guide/spark_client_guide.md b/docs/client_guide/spark_client_guide.md
index e7d57141e..b08a1bdf4 100644
--- a/docs/client_guide/spark_client_guide.md
+++ b/docs/client_guide/spark_client_guide.md
@@ -78,15 +78,15 @@ Local shuffle reader as its name indicates is suitable and optimized for spark'
 
 The important configuration is listed as following.
 
-| Property Name                                          | Default | Description                                                                                                                                                                    |
-|--------------------------------------------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| spark.rss.writer.buffer.spill.size                     | 128m    | Buffer size for total partition data                                                                                                                                           |
-| spark.rss.client.send.size.limit                       | 16m     | The max data size sent to shuffle server                                                                                                                                       |
-| spark.rss.client.unregister.thread.pool.size           | 10      | The max size of thread pool of unregistering                                                                                                                                   |
-| spark.rss.client.unregister.request.timeout.sec        | 10      | The max timeout sec when doing unregister to remote shuffle-servers                                                                                                            |
-| spark.rss.client.off.heap.memory.enable                | false   | The client use off heap memory to process data                                                                                                                                 |
-| spark.rss.client.remote.storage.useLocalConfAsDefault  | false   | This option is only valid when the remote storage path is specified. If ture, the remote storage conf will use the client side hadoop configuration loaded from the classpath  |
-| spark.rss.hadoop.*                                     | -       | The prefix key for Hadoop conf. For Spark like that: `spark.rss.hadoop.fs.defaultFS=hdfs://rbf-x1`, this will be as `fs.defaultFS=hdfs://rbf-x1` for Hadoop storage            |
+| Property Name                                          | Default | Description                                                                                                                                                                    |
+|--------------------------------------------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| spark.rss.writer.buffer.spill.size                     | 128m    | Buffer size for total partition data. It is recommended to set this to 512m (1g is preferable; in theory the larger the better, but the executor's own memory must be considered, since a value that is too large may cause executor OOM). Increasing this value can effectively improve task performance and alleviate server-side GC pressure. |
+| spark.rss.client.send.size.limit                       | 16m     | The max data size sent to the shuffle server                                                                                                                                   |
+| spark.rss.client.unregister.thread.pool.size           | 10      | The max size of the thread pool used for unregistering                                                                                                                         |
+| spark.rss.client.unregister.request.timeout.sec        | 10      | The max timeout in seconds when unregistering from remote shuffle servers                                                                                                      |
+| spark.rss.client.off.heap.memory.enable                | false   | Whether the client uses off-heap memory to process data                                                                                                                        |
+| spark.rss.client.remote.storage.useLocalConfAsDefault  | false   | This option is only valid when the remote storage path is specified. If true, the remote storage conf will use the client-side Hadoop configuration loaded from the classpath  |
+| spark.rss.hadoop.*                                     | -       | The prefix key for Hadoop conf. For Spark, `spark.rss.hadoop.fs.defaultFS=hdfs://rbf-x1` will be passed to the Hadoop storage as `fs.defaultFS=hdfs://rbf-x1`                  |
 
 ### Block id bits
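As an illustration of the guidance in the table above, here is a minimal sketch of applying the recommended 512m spill buffer when submitting a Spark job. The application class, jar, and executor memory size below are hypothetical placeholders, not settings from the commit:

```
# Sketch: apply the recommended writer spill buffer (512m). Size the executor
# memory with this buffer in mind, since an undersized executor can OOM.
spark-submit \
  --conf spark.rss.writer.buffer.spill.size=512m \
  --conf spark.executor.memory=8g \
  --class com.example.MyApp my-app.jar   # hypothetical application
```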
diff --git a/docs/server_guide.md b/docs/server_guide.md
index d2d5e968c..1dcff9eaa 100644
--- a/docs/server_guide.md
+++ b/docs/server_guide.md
@@ -145,20 +145,47 @@ Finally, to improve the speed of writing to HDFS for a single partition, the val
 ### Netty
 In version 0.8.0, we introduced Netty. Enabling Netty on ShuffleServer can significantly reduce GC time in high-throughput scenarios. We can enable Netty through the parameters `rss.server.netty.port` and `rss.rpc.server.type`. Note: After setting the parameter `rss.rpc.server.type` to `GRPC_NETTY`, ShuffleServer will be tagged with `GRPC_NETTY`, that is, the node can only be assigned to clients with `spark.rss.client.type=GRPC_NETTY`.
-When enabling Netty, we should also consider memory related configurations, the following is an example.
+When enabling Netty, we should also consider memory-related configurations.
+
+#### Memory Configuration Principles
+
+- Reserve about `15%` of the machine's memory (space for the OS slab, page cache, buffers, kernel stack, etc.)
+- Recommended ratio of heap memory to off-heap memory: `1 : 9`
+- `rss.server.buffer.capacity` + `rss.server.read.buffer.capacity` + reserved = maximum off-heap memory
+- Recommended ratio of `rss.server.read.buffer.capacity` to `rss.server.buffer.capacity`: `1 : 18`
+
+Note: The reserved memory can be adjusted according to the actual situation; if the machine's memory is relatively small, reserving 1g is completely sufficient.
+
+##### rss-env.sh
+
+Assuming the machine has 470g of memory and reserves 15% of it (about 70g), the principle above (heap : off-heap = 1 : 9) gives:
 
-#### rss-env.sh
 ```
-XMX_SIZE=20g
-MAX_DIRECT_MEMORY_SIZE=120g
+heap = (470 - 70) * 1 / 10 = 40g
+off-heap = (470 - 70) * 9 / 10 = 360g
+heap + off-heap = 400g
 ```
-#### server.conf
+
+So, `rss-env.sh` will be:
+
+```
+XMX_SIZE=40g
+MAX_DIRECT_MEMORY_SIZE=360g
+```
+
+##### server.conf
+
+Generally, a `rss.server.read.buffer.capacity` of 20g is enough; you can keep an eye on the metric `read_used_buffer_size` to confirm this.
+
+If we reserve 10g and dedicate the remaining off-heap memory to `rss.server.buffer.capacity`, then, again assuming a machine with 470g of memory, the configs will be:
+
 ```
-rss.server.buffer.capacity 110g
-rss.server.read.buffer.capacity 5g
+rss.server.buffer.capacity 330g
+rss.server.read.buffer.capacity 20g
 ```
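To make the sizing above repeatable, here is a hedged shell sketch (not part of Uniffle) that derives all four values from the total machine memory using the stated ratios; the 15% OS reservation, the fixed 20g read buffer, and the 10g off-heap reservation mirror the assumptions in the worked example:

```
#!/bin/sh
# Sketch: derive Uniffle memory configs from total machine memory (in GB),
# following the principles above. Integer arithmetic; round as needed.
TOTAL_GB=470
OS_RESERVED_GB=$((TOTAL_GB * 15 / 100))   # ~15% for the OS          -> 70
USABLE_GB=$((TOTAL_GB - OS_RESERVED_GB))  #                          -> 400
HEAP_GB=$((USABLE_GB / 10))               # heap : off-heap = 1 : 9  -> 40
OFF_HEAP_GB=$((USABLE_GB - HEAP_GB))      #                          -> 360
READ_BUFFER_GB=20                         # "20g is generally enough"
OFF_HEAP_RESERVED_GB=10                   # reserved inside off-heap
BUFFER_GB=$((OFF_HEAP_GB - READ_BUFFER_GB - OFF_HEAP_RESERVED_GB))  # -> 330

echo "rss-env.sh:  XMX_SIZE=${HEAP_GB}g MAX_DIRECT_MEMORY_SIZE=${OFF_HEAP_GB}g"
echo "server.conf: rss.server.buffer.capacity ${BUFFER_GB}g"
echo "server.conf: rss.server.read.buffer.capacity ${READ_BUFFER_GB}g"
```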
-#### Example of server conf
+##### Example of server conf
 ```
 rss.rpc.server.port 19999
 rss.jetty.http.port 19998
@@ -169,8 +196,8 @@ rss.storage.type MEMORY_LOCALFILE_HDFS
 rss.coordinator.quorum <coordinatorIp1>:19999,<coordinatorIp2>:19999
 rss.storage.basePath /data1/rssdata,/data2/rssdata....
 rss.server.flush.thread.alive 10
-rss.server.buffer.capacity 110g
-rss.server.read.buffer.capacity 5g
+rss.server.buffer.capacity 330g
+rss.server.read.buffer.capacity 20g
 rss.server.heartbeat.interval 10000
 rss.rpc.message.max.size 1073741824
 rss.server.preAllocation.expired 120000
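Finally, to watch `read_used_buffer_size` as the guide suggests, something like the following could work. The `/metrics/server` path on the Jetty HTTP port is an assumption about the Uniffle metrics endpoint, so verify it against your version before relying on it:

```
# Assumed endpoint: JSON metrics on the Jetty HTTP port (19998 in the example
# conf above); the /metrics/server path is unverified, check your deployment.
curl -s "http://<shuffleServerIp>:19998/metrics/server" \
  | grep -o '"read_used_buffer_size"[^,}]*'
```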