This is an automated email from the ASF dual-hosted git repository.
zuston pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/uniffle.git
The following commit(s) were added to refs/heads/master by this push:
new ece59eee1 [#2537] feat(spark): Introduce option to activate small
cache in grpc server (#2538)
ece59eee1 is described below
commit ece59eee1bcec0f5d710bdbd12beded51babe8e7
Author: Junfan Zhang <[email protected]>
AuthorDate: Tue Jul 8 15:49:37 2025 +0800
[#2537] feat(spark): Introduce option to activate small cache in grpc
server (#2538)
### What changes were proposed in this pull request?
Introduce a config option, `rss.rpc.netty.smallCacheEnabled`, to enable the
small cache of the Netty allocator used by the gRPC server.
### Why are the changes needed?
For #2537.
When partition reassignment is enabled in the production environment, we
observed that some Spark jobs failed due to gRPC request timeouts
(DEADLINE_EXCEEDED). Upon investigating the Spark driver logs, we found severe
GC events, indicating significant memory pressure on the driver process.
Based on PR #1780, the small cache appears to be effective in gRPC mode.
This PR enables the small cache by default, because GRPC_NETTY has become
the default RPC mode.
### Does this PR introduce _any_ user-facing change?
Yes. A new option is added and is enabled by default:
`rss.rpc.netty.smallCacheEnabled=true`
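As a rough sketch of opting out: per the diff in this commit, setting the
option to `false` makes the server build its allocator with
`GrpcNettyUtils.createPooledByteBufAllocator(true, 0, 0, 0, 0)` instead of the
small-cache-only variant. The file name and the space-separated key/value
layout below are assumptions for illustration and are not part of this commit.

```
# Hypothetical shuffle server config fragment (file name and layout assumed).
# Disables the small cache of the Netty allocator used by the gRPC server.
rss.rpc.netty.smallCacheEnabled false
```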
### How was this patch tested?
Existing unit tests.
---
.../main/java/org/apache/uniffle/common/config/RssBaseConf.java | 7 +++++++
.../src/main/java/org/apache/uniffle/common/rpc/GrpcServer.java | 9 ++++-----
docs/server_guide.md | 1 +
3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/common/src/main/java/org/apache/uniffle/common/config/RssBaseConf.java b/common/src/main/java/org/apache/uniffle/common/config/RssBaseConf.java
index d5be88f97..9997bc1e8 100644
--- a/common/src/main/java/org/apache/uniffle/common/config/RssBaseConf.java
+++ b/common/src/main/java/org/apache/uniffle/common/config/RssBaseConf.java
@@ -55,6 +55,13 @@ public class RssBaseConf extends RssConf {
.defaultValue(true)
.withDescription("If enable metrics for rpc connection");
+ public static final ConfigOption<Boolean> RPC_NETTY_SMALL_CACHE_ENABLED =
+ ConfigOptions.key("rss.rpc.netty.smallCacheEnabled")
+ .booleanType()
+ .defaultValue(true)
+ .withDescription(
+ "The option to control whether the small cache of the Netty allocator used by gRPC is enabled.");
+
public static final ConfigOption<Integer> RPC_NETTY_PAGE_SIZE =
ConfigOptions.key("rss.rpc.netty.pageSize")
.intType()
diff --git a/common/src/main/java/org/apache/uniffle/common/rpc/GrpcServer.java b/common/src/main/java/org/apache/uniffle/common/rpc/GrpcServer.java
index 7a93b2b82..70ff4a7a6 100644
--- a/common/src/main/java/org/apache/uniffle/common/rpc/GrpcServer.java
+++ b/common/src/main/java/org/apache/uniffle/common/rpc/GrpcServer.java
@@ -104,15 +104,14 @@ public class GrpcServer implements ServerInterface {
private Server buildGrpcServer(int serverPort) {
boolean isMetricsEnabled = rssConf.getBoolean(RssBaseConf.RPC_METRICS_ENABLED);
long maxInboundMessageSize = rssConf.getLong(RssBaseConf.RPC_MESSAGE_MAX_SIZE);
- ServerType serverType = rssConf.get(RssBaseConf.RPC_SERVER_TYPE);
int pageSize = rssConf.getInteger(RssBaseConf.RPC_NETTY_PAGE_SIZE);
int maxOrder = rssConf.getInteger(RssBaseConf.RPC_NETTY_MAX_ORDER);
int smallCacheSize = rssConf.getInteger(RssBaseConf.RPC_NETTY_SMALL_CACHE_SIZE);
PooledByteBufAllocator pooledByteBufAllocator =
- serverType == ServerType.GRPC
- ? GrpcNettyUtils.createPooledByteBufAllocator(true, 0, 0, 0, 0)
- : GrpcNettyUtils.createPooledByteBufAllocatorWithSmallCacheOnly(
- true, 0, pageSize, maxOrder, smallCacheSize);
+ rssConf.getBoolean(RssBaseConf.RPC_NETTY_SMALL_CACHE_ENABLED)
+ ? GrpcNettyUtils.createPooledByteBufAllocatorWithSmallCacheOnly(
+ true, 0, pageSize, maxOrder, smallCacheSize)
+ : GrpcNettyUtils.createPooledByteBufAllocator(true, 0, 0, 0, 0);
ServerBuilder<?> builder =
NettyServerBuilder.forPort(serverPort)
.executor(pool)
diff --git a/docs/server_guide.md b/docs/server_guide.md
index a07134d8b..4c3da4201 100644
--- a/docs/server_guide.md
+++ b/docs/server_guide.md
@@ -75,6 +75,7 @@ This document will introduce how to deploy Uniffle shuffle servers.
| rss.coordinator.rpc.client.type | GRPC | The client type for coordinator rpc client. [...]
| rss.rpc.server.type | GRPC_NETTY | Shuffle server type, supports GRPC_NETTY, GRPC. The default value is GRPC_NETTY. We recommend using GRPC_NETTY to enable Netty on the server side for better stability and performance. [...]
| rss.rpc.server.port | 19999 | RPC port for Shuffle server, if set zero, grpc server start on random port. [...]
+| rss.rpc.netty.smallCacheEnabled | true | The option to control whether the small cache of the Netty allocator used by gRPC is enabled. [...]
| rss.rpc.netty.pageSize | 4096 | The value of pageSize for PooledByteBufAllocator when using gRPC internal Netty on the server-side. This configuration will only take effect when rss.rpc.server.type is set to GRPC_NETTY. [...]
| rss.rpc.netty.maxOrder | 3 | The value of maxOrder for PooledByteBufAllocator when using gRPC internal Netty on the server-side. This configuration will only take effect when rss.rpc.server.type is set to GRPC_NETTY. [...]
| rss.rpc.netty.smallCacheSize | 1024 | The value of smallCacheSize for PooledByteBufAllocator when using gRPC internal Netty on the server-side. This configuration will only take effect when rss.rpc.server.type is set to GRPC_NETTY. [...]