[ https://issues.apache.org/jira/browse/SPARK-45201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767622#comment-17767622 ]
Sebastian Daberdaku edited comment on SPARK-45201 at 10/11/23 9:58 AM:
-----------------------------------------------------------------------

After spending hours analyzing the project pom files, I discovered two things. First, the shade plugin is relocating the guava/failureaccess package twice in the connect jars (once by the module shade plugin, once by the base project plugin). I created a simple patch that prevents the relocation of failureaccess by the base plugin. I am attaching the patch file [^spark-3.5.0.patch] to this Jira issue; I do not have time to create a pull request. You can apply the patch by navigating into the source folder and running:

{{patch -p1 < spark-3.5.0.patch}}

Second, the spark-connect-common jar produced by make-distribution is redundant and was the cause of the class loading issues. Removing it resolved all the issues I had.
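For anyone unfamiliar with applying a patch file, here is a minimal, self-contained sketch of what the {{patch -p1}} invocation above does. The directory names and file contents below are invented stand-ins for the Spark source tree and spark-3.5.0.patch:

```shell
#!/bin/sh
# Hypothetical demo of `patch -p1`; "a"/"b" and the pom.xml content
# stand in for the real Spark source tree and spark-3.5.0.patch.
set -e
work=$(mktemp -d)
cd "$work"

# Two versions of a file, mimicking "before" and "after" the fix.
mkdir -p a/src b/src
printf 'relocate failureaccess: yes\n' > a/src/pom.xml
printf 'relocate failureaccess: no\n'  > b/src/pom.xml

# Unified diff with one leading path component (a/..., b/...),
# the same layout real patch files use. diff exits 1 on differences,
# so guard it under `set -e`.
diff -ru a b > fix.patch || true

# Apply it from the tree root; -p1 strips the leading "a/" or "b/".
mkdir -p tree/src
cp a/src/pom.xml tree/src/pom.xml
cd tree
patch -p1 < ../fix.patch
cat src/pom.xml
```

The `-p1` flag strips the first path component from the paths recorded in the diff, which is why the command must be run from the root of the extracted source folder.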
> NoClassDefFoundError: InternalFutureFailureAccess when compiling Spark 3.5.0
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-45201
>                 URL: https://issues.apache.org/jira/browse/SPARK-45201
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect
>    Affects Versions: 3.5.0
>            Reporter: Sebastian Daberdaku
>            Priority: Major
>         Attachments: Dockerfile, spark-3.5.0.patch
>
> I am trying to compile Spark 3.5.0 and make a distribution that supports Spark Connect and Kubernetes. The compilation seems to complete correctly, but when I try to run the Spark Connect server on Kubernetes I get a "NoClassDefFoundError" as follows:
> {code:java}
> Exception in thread "main" java.lang.NoClassDefFoundError: org/sparkproject/guava/util/concurrent/internal/InternalFutureFailureAccess
>     at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>     at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
>     at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
>     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
>     at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>     at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
>     at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
>     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
>     at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>     at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
>     at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
>     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
>     at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.<init>(LocalCache.java:3511)
>     at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.<init>(LocalCache.java:3515)
>     at org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2168)
>     at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2079)
>     at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4011)
>     at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4034)
>     at org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:5010)
>     at org.apache.spark.storage.BlockManagerId$.getCachedBlockManagerId(BlockManagerId.scala:146)
>     at org.apache.spark.storage.BlockManagerId$.apply(BlockManagerId.scala:127)
>     at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:536)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:625)
>     at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2888)
>     at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1099)
>     at scala.Option.getOrElse(Option.scala:189)
>     at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1093)
>     at org.apache.spark.sql.connect.service.SparkConnectServer$.main(SparkConnectServer.scala:34)
>     at org.apache.spark.sql.connect.service.SparkConnectServer.main(SparkConnectServer.scala)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.ClassNotFoundException: org.sparkproject.guava.util.concurrent.internal.InternalFutureFailureAccess
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
>     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
>     ... 56 more{code}
> My build command is as follows:
> {code:java}
> export MAVEN_OPTS="-Xss256m -Xmx8g -XX:ReservedCodeCacheSize=2g"
> ./dev/make-distribution.sh --name spark --pip -Pscala-2.12 -Pconnect -Pkubernetes -Phive -Phive-thriftserver -Phadoop-3 -Dhadoop.version="3.3.4" -Dhive.version="2.3.9" -Dhive23.version="2.3.9" -Dhive.version.short="2.3"{code}
> I am building Spark on Debian Bookworm, with Java 8u382 and Maven 3.8.8. I get the same error even when I omit the -Pconnect profile from Maven and simply add the "org.apache.spark:spark-connect_2.12:3.5.0" jar with the appropriate Spark config. On the other hand, if I download the pre-built Spark package and add the spark-connect jar, this error does not appear. What could I possibly be missing in my build environment? I have omitted the yarn, mesos and sparkr profiles (which are used in the distributed build) on purpose, but I do not see how these affect Spark Connect.
> Any help will be appreciated!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
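[Editorial note] The failure mode in the trace above, a NoClassDefFoundError caused by a ClassNotFoundException for the relocated failureaccess class, can be reproduced in isolation. The small probe below is not part of Spark; it simply looks up the relocated name from the trace, which only resolves on a classpath whose shaded guava jar also bundles a relocated failureaccess:

```java
// Hypothetical probe for the relocated failureaccess class by name.
// On a plain JVM classpath (no shaded Spark jars) the lookup fails,
// which is the condition behind the NoClassDefFoundError in the trace.
public class RelocationProbe {
    public static void main(String[] args) {
        String relocated =
            "org.sparkproject.guava.util.concurrent.internal.InternalFutureFailureAccess";
        try {
            Class.forName(relocated);
            System.out.println("relocated class present");
        } catch (ClassNotFoundException e) {
            System.out.println("relocated class missing: " + e.getMessage());
        }
    }
}
```

Running the probe against a candidate distribution's jars (via `java -cp 'jars/*' RelocationProbe`) is a quick way to check whether the shaded failureaccess classes actually made it into the build.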