Please see these logs. The error is thrown in the executor:

23/01/02 15:14:44 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ExceptionInInitializerError
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.lang.invoke.SerializedLambda.readResolve(SerializedLambda.java:230)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1274)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2093)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1655)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:83)
        at org.apache.spark.scheduler.Task.run(Task.scala:127)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:385)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2574)
        at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:934)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:928)
        at TestMain$.<init>(TestMain.scala:12)
        at TestMain$.<clinit>(TestMain.scala)

On Mon, 2 Jan 2023 at 8:29 PM, Sean Owen <sro...@gmail.com> wrote:

> It's not running on the executor; that's not the issue. See your stack
> trace, where it clearly happens in the driver.
>
> On Mon, Jan 2, 2023 at 8:58 AM Shrikant Prasad <shrikant....@gmail.com>
> wrote:
>
>> Even if I set the master as yarn, it will not have access to the rest of
>> the Spark confs. It will need spark.yarn.app.id.
>>
>> The main issue is: if it works as-is in Spark 2.3, why does it not work
>> in Spark 3, i.e. why is the session getting created on the executor?
>> Another thing we tried was removing the df-to-rdd conversion, just for
>> debugging, and with that it works in Spark 3.
>>
>> So it might be something to do with the df-to-rdd conversion, or a
>> serialization behavior change from Spark 2.3 to Spark 3.0, if there is
>> any. But we couldn't find the root cause.
>>
>> Regards,
>> Shrikant
>>
>> On Mon, 2 Jan 2023 at 7:54 PM, Sean Owen <sro...@gmail.com> wrote:
>>
>>> So call .setMaster("yarn"), per the error
>>>
>>> On Mon, Jan 2, 2023 at 8:20 AM Shrikant Prasad <shrikant....@gmail.com>
>>> wrote:
>>>
>>>> We are running it in cluster deploy mode with yarn.
>>>>
>>>> Regards,
>>>> Shrikant
>>>>
>>>> On Mon, 2 Jan 2023 at 6:15 PM, Stelios Philippou <stevo...@gmail.com>
>>>> wrote:
>>>>
>>>>> Can we see your Spark configuration parameters?
>>>>>
>>>>> The master URL refers to, in Java terms,
>>>>> new SparkConf()....setMaster("local[*]")
>>>>> according to where you want to run this.
>>>>>
>>>>> On Mon, 2 Jan 2023 at 14:38, Shrikant Prasad <shrikant....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to migrate a Spark application from Spark 2.3 to 3.0.1.
>>>>>>
>>>>>> The issue can be reproduced with the sample code below:
>>>>>>
>>>>>> object TestMain {
>>>>>>
>>>>>>   val session =
>>>>>>     SparkSession.builder().appName("test").enableHiveSupport().getOrCreate()
>>>>>>
>>>>>>   def main(args: Array[String]): Unit = {
>>>>>>     import session.implicits._
>>>>>>     val a = session.sparkContext.parallelize(Array(("A",1),("B",2)))
>>>>>>       .toDF("_c1","_c2").rdd.map(x => x(0).toString).collect()
>>>>>>     println(a.mkString("|"))
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> It runs successfully in Spark 2.3 but fails in Spark 3.0.1 with the
>>>>>> exception below:
>>>>>>
>>>>>> Caused by: org.apache.spark.SparkException: A master URL must be set
>>>>>> in your configuration
>>>>>>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:394)
>>>>>>         at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
>>>>>>         at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
>>>>>>         at scala.Option.getOrElse(Option.scala:189)
>>>>>>         at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
>>>>>>         at TestMain$.<init>(TestMain.scala:7)
>>>>>>         at TestMain$.<clinit>(TestMain.scala)
>>>>>>
>>>>>> From the exception it appears that Spark 3 tries to create the Spark
>>>>>> session on the executor as well, whereas it is not created again on
>>>>>> the executor in Spark 2.3.
>>>>>>
>>>>>> Can anyone help identify why there is this change in behavior?
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Shrikant
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Shrikant Prasad
>>>>>
>>>> --
>>>> Regards,
>>>> Shrikant Prasad
>>>
>> --
>> Regards,
>> Shrikant Prasad
>
--
Regards,
Shrikant Prasad
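[Editor's note] The stack trace points at the likely mechanism: the task's lambda is deserialized on the executor via java.lang.invoke.SerializedLambda.readResolve, which requires loading and initializing the lambda's capturing class, TestMain$. Because `val session = ...getOrCreate()` lives in the object body, it runs inside TestMain$'s static initializer (<clinit>), so the executor attempts to build a SparkSession with no driver configuration. A common workaround, sketched below (not from the thread, and not tested against this exact job), is to keep SparkSession construction out of the object's field initializers, for example by creating it inside main:

```scala
import org.apache.spark.sql.SparkSession

object TestMain {

  def main(args: Array[String]): Unit = {
    // Built only when main() runs, i.e. only on the driver. The object's
    // static initializer no longer creates a SparkSession, so initializing
    // TestMain$ on an executor (during lambda deserialization) is harmless.
    val session = SparkSession.builder()
      .appName("test")
      .enableHiveSupport()
      .getOrCreate()

    import session.implicits._

    val a = session.sparkContext
      .parallelize(Array(("A", 1), ("B", 2)))
      .toDF("_c1", "_c2")
      .rdd
      .map(x => x(0).toString)
      .collect()

    println(a.mkString("|"))
  }
}
```

An alternative with the same effect is marking the field `@transient lazy val session = ...`, which defers initialization until first driver-side access instead of tying it to class loading.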