This is an automated email from the ASF dual-hosted git repository.
zuston pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/uniffle.git
The following commit(s) were added to refs/heads/master by this push:
new f3bc84fbe [#2494] feat(spark): Enable overlapping compression by default (#2588)
f3bc84fbe is described below
commit f3bc84fbe82f85bcc6d0be2b6a8cf17bdb1291ff
Author: Junfan Zhang <[email protected]>
AuthorDate: Mon Aug 25 10:32:55 2025 +0800
[#2494] feat(spark): Enable overlapping compression by default (#2588)
### What changes were proposed in this pull request?
1. Enable overlapping compression by default
2. Add doc for this feature
### Why are the changes needed?
for #2494
### Does this PR introduce _any_ user-facing change?
Yes.
### How was this patch tested?
Needn't
---
.../main/java/org/apache/spark/shuffle/RssSparkConfig.java | 2 +-
.../spark/shuffle/writer/WriteBufferManagerTest.java | 4 ++++
docs/client_guide/spark_client_guide.md | 14 +++++++++++++-
3 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkConfig.java b/client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkConfig.java
index 8af38703b..6e2536caa 100644
--- a/client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkConfig.java
+++ b/client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkConfig.java
@@ -42,7 +42,7 @@ public class RssSparkConfig {
   public static final ConfigOption<Boolean> RSS_WRITE_OVERLAPPING_COMPRESSION_ENABLED =
       ConfigOptions.key("rss.client.write.overlappingCompressionEnable")
           .booleanType()
-          .defaultValue(false)
+          .defaultValue(true)
           .withDescription("Whether to overlapping compress shuffle blocks.");
   public static final ConfigOption<Integer> RSS_WRITE_OVERLAPPING_COMPRESSION_THREADS_PER_VCORE =
diff --git a/client-spark/common/src/test/java/org/apache/spark/shuffle/writer/WriteBufferManagerTest.java b/client-spark/common/src/test/java/org/apache/spark/shuffle/writer/WriteBufferManagerTest.java
index 501b57e44..7ebce3c54 100644
--- a/client-spark/common/src/test/java/org/apache/spark/shuffle/writer/WriteBufferManagerTest.java
+++ b/client-spark/common/src/test/java/org/apache/spark/shuffle/writer/WriteBufferManagerTest.java
@@ -120,6 +120,10 @@ public class WriteBufferManagerTest {
     conf.set(
         RssSparkConfig.SPARK_RSS_CONFIG_PREFIX + RssClientConf.BLOCKID_TASK_ATTEMPT_ID_BITS.key(),
         String.valueOf(layout.taskAttemptIdBits));
+    conf.set(
+        RssSparkConfig.SPARK_RSS_CONFIG_PREFIX
+            + RssSparkConfig.RSS_WRITE_OVERLAPPING_COMPRESSION_ENABLED.key(),
+        "false");
     if (!compress) {
       conf.set(RssSparkConfig.SPARK_SHUFFLE_COMPRESS_KEY, String.valueOf(false));
     }
diff --git a/docs/client_guide/spark_client_guide.md b/docs/client_guide/spark_client_guide.md
index 11f072afb..2f06cda7c 100644
--- a/docs/client_guide/spark_client_guide.md
+++ b/docs/client_guide/spark_client_guide.md
@@ -191,4 +191,16 @@ spark.plugins org.apache.spark.UnifflePlugin
```
 To enable this feature in the Spark History Server, place the Uniffle client JAR file into the jars directory of your Spark HOME.
-A restart of the History Server may be required for the changes to take effect.
\ No newline at end of file
+A restart of the History Server may be required for the changes to take effect.
+
+### Overlapping compression for shuffle write
+
+This mechanism allows compression to overlap with upstream data reading, maximizing shuffle write throughput. It can improve shuffle write speed by up to 50%. Now this is enabled by default.
+
+The feature can be enabled or disabled through the following configuration:
+
+| Property Name                                          | Default   | Description                                                                                                                                      |
+|--------------------------------------------------------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------------|
+| spark.rss.client.write.overlappingCompressionEnable    | true      | Whether to overlapping compress shuffle blocks.                                                                                                  |
+| rss.client.write.overlappingCompressionThreadsPerVcore | -1        | Specifies the ratio of overlapping compression threads to Spark executor vCores. By default, all cores on the machine are used for compression. |
+
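
Because this commit flips the default to `true`, jobs that want the previous behavior now have to opt out explicitly. The following is a minimal Java sketch of that, not part of the commit. It assumes the property is passed to Spark as an ordinary `--conf` entry. The class and method names (`OverlappingCompressionConf`, `disableOverlappingCompression`, `toSubmitArgs`) are hypothetical; only the property key is taken from the documentation table in the diff above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; it is not part of Uniffle or Spark.
public class OverlappingCompressionConf {
  // Property key as listed in the documentation table added by this commit.
  public static final String ENABLE_KEY =
      "spark.rss.client.write.overlappingCompressionEnable";

  // Returns the settings that opt out of the new default (true).
  public static Map<String, String> disableOverlappingCompression() {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put(ENABLE_KEY, "false");
    return conf;
  }

  // Renders each entry as a "--conf key=value" pair for spark-submit.
  public static List<String> toSubmitArgs(Map<String, String> conf) {
    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> entry : conf.entrySet()) {
      args.add("--conf");
      args.add(entry.getKey() + "=" + entry.getValue());
    }
    return args;
  }

  public static void main(String[] args) {
    // Prints: --conf spark.rss.client.write.overlappingCompressionEnable=false
    System.out.println(String.join(" ", toSubmitArgs(disableOverlappingCompression())));
  }
}
```

The test change in `WriteBufferManagerTest` above does the equivalent inside the test setup by setting the same key to `"false"` on the Spark configuration.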