Remzi Yang created SPARK-52081:
----------------------------------

             Summary: NoClassDefFoundError in Spark Connect when using both the Spark Connect service and the Thrift server on k8s
                 Key: SPARK-52081
                 URL: https://issues.apache.org/jira/browse/SPARK-52081
             Project: Spark
          Issue Type: Bug
          Components: Connect
    Affects Versions: 3.5.5
         Environment: k8s (OrbStack on Mac M1)
            Reporter: Remzi Yang
I created a Spark cluster on k8s using start-thriftserver.sh and added SparkConnectPlugin to `spark.plugins` to enable both the JDBC and gRPC services.

{code:java}
/opt/spark/sbin/start-thriftserver.sh \
  --master k8s://https://kubernetes:443 \
  --deploy-mode client \
  --packages org.apache.spark:spark-connect_2.12:3.5.5 \
  --conf hive.server2.transport.mode=http \
  --conf "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp" \
  --conf spark.driver.host=xxx \
  --conf spark.kubernetes.container.image=apache/spark:3.5.5 \
  --conf spark.kubernetes.driver.pod.name=xxx \
  --conf spark.kubernetes.executor.podNamePrefix=xxx \
  --conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin
{code}

(I execute the spark-submit from a pod in k8s, so both the driver and the executors are k8s pods.)

I then port-forwarded ports 15002 and 10000 to local and used Apache Superset to connect to the JDBC port (10000). I ran some queries in Superset and everything was fine. However, when I tried to create a Spark Connect client in Python:

{code:python}
spark = SparkSession.builder.remote("sc://localhost:15002").create()
{code}

it hung forever. I went to look at the driver log and found this error:

{code:java}
25/05/12 08:58:42 WARN ChannelInitializer: Failed to initialize a channel.
Closing: [id: 0x722f3759, L:/127.0.0.1:15002 - R:/127.0.0.1:47692]
java.lang.NoClassDefFoundError: org/sparkproject/connect/grpc/netty/NettyServerTransport
	at org.sparkproject.connect.grpc.netty.NettyServer$1.initChannel(NettyServer.java:241)
	at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129)
	at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112)
	at io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:1114)
	at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:609)
	at io.netty.channel.DefaultChannelPipeline.access$100(DefaultChannelPipeline.java:46)
	at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1463)
	at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1115)
	at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:650)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:514)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:416)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)
25/05/12 08:58:42 WARN ChannelInitializer: Failed to initialize a channel.
Closing: [id: 0x7bd68e70, L:/127.0.0.1:15002 - R:/127.0.0.1:47690]
java.lang.NoClassDefFoundError: org/sparkproject/connect/grpc/netty/NettyServerTransport
	at org.sparkproject.connect.grpc.netty.NettyServer$1.initChannel(NettyServer.java:241)
	at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129)
	at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112)
	at io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:1114)
	at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:609)
	at io.netty.channel.DefaultChannelPipeline.access$100(DefaultChannelPipeline.java:46)
	at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1463)
	at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1115)
	at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:650)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:514)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:416)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)
{code}

I also see some Thrift server errors, but I don't know whether they are related:

{code:java}
25/05/12 08:58:01 ERROR TThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
	at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
	at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
	at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:52)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
{code}

However, if I don't touch the Thrift server and only use Spark Connect, everything works fine.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org