roczei commented on PR #55028:
URL: https://github.com/apache/spark/pull/55028#issuecomment-4371644332
@aajisaka
> Did you update the jar in both NodeManager (for external shuffle service)
> and Spark classpath (both driver and executor)?
I've identified the cause of the issue. The property
`spark.network.crypto.cipher` must be set to `AES/GCM/NoPadding` in
yarn-site.xml so that the external shuffle service uses it. Without this
configuration, the following error appears in the NodeManager log:
```
2026-05-04 13:10:27,652 WARN org.apache.spark.network.server.TransportChannelHandler: Exception in connection from /10.140.126.143:33358
java.lang.IllegalArgumentException: Too large frame: 4683959760308586577
    at org.sparkproject.guava.base.Preconditions.checkArgument(Preconditions.java:203)
    at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:148)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:98)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at org.apache.spark.network.crypto.CtrTransportCipher$DecryptionHandler.channelRead(CtrTransportCipher.java:195)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at org.sparkproject.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at org.sparkproject.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
    at org.sparkproject.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:171)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:796)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:732)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:658)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at org.sparkproject.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
    at org.sparkproject.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at org.sparkproject.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
```
It works perfectly after setting the above.
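
For reference, a minimal sketch of the yarn-site.xml entry described above. It assumes the usual Hadoop `<property>` layout inside the file's existing `<configuration>` element. The location of yarn-site.xml, and the assumption that each NodeManager must be restarted before the external shuffle service picks up the value, depend on the deployment and are not confirmed in this thread.

```xml
<!-- Goes inside the existing <configuration> element of yarn-site.xml
     on every NodeManager host that runs the external shuffle service. -->
<property>
  <name>spark.network.crypto.cipher</name>
  <!-- Property name and value as given in the comment above. -->
  <value>AES/GCM/NoPadding</value>
</property>
```
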
> Added 1 commit for Java 8 compatibility in
> https://github.com/apache/spark/pull/55621. If you run the cluster with Java 8,
> would you try the latest one?
I've verified this with both JDK 17 and JDK 8. It functions correctly in
both environments.