Even if I set the master to yarn, the session created on the executor will
not have access to the rest of the Spark confs; it will still need
spark.yarn.app.id.
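
For illustration, hard-coding the master would look roughly like this (a
sketch using the standard SparkSession builder API; the point is what would
still be missing on the executor side):

import org.apache.spark.sql.SparkSession

// Sketch: even with the master hard-coded, a session built on an executor
// would still lack the YARN-provided confs (spark.yarn.app.id among them)
// that spark-submit injects into the driver's configuration.
val session = SparkSession.builder()
  .master("yarn")
  .appName("test")
  .enableHiveSupport()
  .getOrCreate()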

The main issue is: if this works as-is in Spark 2.3, why does it not work in
Spark 3, i.e. why is the session getting created on the executor?
Another thing we tried, just for debugging, was removing the DataFrame-to-RDD
conversion, and with that change it works in Spark 3.
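
The debug variant looked roughly like this (a sketch; the select/as
projection is a stand-in for dropping the .rdd.map step, not the exact code
we ran):

import session.implicits._

// Stay in the DataFrame/Dataset API instead of dropping to an RDD, so no
// user lambda has to be serialized to the executors.
val a = session.sparkContext
  .parallelize(Array(("A", 1), ("B", 2)))
  .toDF("_c1", "_c2")
  .select("_c1")
  .as[String]
  .collect()
println(a.mkString("|"))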

So it might be something to do with the DataFrame-to-RDD conversion, or with
a serialization behavior change from Spark 2.3 to Spark 3.0, if there is any.
But we couldn't find the root cause.
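
One hypothesis, going by the TestMain$.<clinit> frame in the stack trace
quoted below: Spark 3 is built against Scala 2.12, where closures compile to
Java lambdas that deserialize through their enclosing class, so deserializing
the map closure on an executor could force TestMain$'s static initializer,
and with it the getOrCreate() call, to run there; Spark 2.3's Scala 2.11
closures were standalone anonymous classes. Assuming that hypothesis holds, a
minimal restructuring that keeps session creation out of the object
initializer would be:

import org.apache.spark.sql.SparkSession

object TestMain {
  def main(args: Array[String]): Unit = {
    // Build the session inside main() so it only ever runs on the driver;
    // merely initializing TestMain$ on an executor is then harmless.
    val session = SparkSession.builder()
      .appName("test")
      .enableHiveSupport()
      .getOrCreate()
    import session.implicits._

    val a = session.sparkContext
      .parallelize(Array(("A", 1), ("B", 2)))
      .toDF("_c1", "_c2")
      .rdd
      .map(x => x(0).toString)
      .collect()
    println(a.mkString("|"))
  }
}

Declaring the field as a lazy val should have a similar effect, since class
initialization would then no longer force session creation.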

Regards,
Shrikant

On Mon, 2 Jan 2023 at 7:54 PM, Sean Owen <sro...@gmail.com> wrote:

> So call .setMaster("yarn"), per the error
>
> On Mon, Jan 2, 2023 at 8:20 AM Shrikant Prasad <shrikant....@gmail.com>
> wrote:
>
>> We are running it in cluster deploy mode on YARN.
>>
>> Regards,
>> Shrikant
>>
>> On Mon, 2 Jan 2023 at 6:15 PM, Stelios Philippou <stevo...@gmail.com>
>> wrote:
>>
>>> Can we see your Spark configuration parameters?
>>>
>>> The master URL is set, as per the Java API,
>>> new SparkConf()....setMaster("local[*]")
>>> according to where you want to run this.
>>>
>>> On Mon, 2 Jan 2023 at 14:38, Shrikant Prasad <shrikant....@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to migrate a Spark application from Spark 2.3 to 3.0.1.
>>>>
>>>> The issue can be reproduced with the sample code below:
>>>>
>>>> object TestMain {
>>>>
>>>>   val session =
>>>>     SparkSession.builder().appName("test").enableHiveSupport().getOrCreate()
>>>>
>>>>   def main(args: Array[String]): Unit = {
>>>>     import session.implicits._
>>>>     val a = session.sparkContext.parallelize(Array(("A", 1), ("B", 2)))
>>>>       .toDF("_c1", "_c2").rdd.map(x => x(0).toString).collect()
>>>>     println(a.mkString("|"))
>>>>   }
>>>> }
>>>>
>>>> It runs successfully on Spark 2.3 but fails on Spark 3.0.1 with the
>>>> exception below:
>>>>
>>>> Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
>>>>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:394)
>>>>   at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
>>>>   at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
>>>>   at scala.Option.getOrElse(Option.scala:189)
>>>>   at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
>>>>   at TestMain$.<init>(TestMain.scala:7)
>>>>   at TestMain$.<clinit>(TestMain.scala)
>>>>
>>>> From the exception it appears that Spark 3 tries to create the Spark
>>>> session on the executor as well, whereas Spark 2.3 does not create it
>>>> again on the executor.
>>>>
>>>> Can anyone help in identifying why there is this change in behavior?
>>>>
>>>> Thanks and Regards,
>>>>
>>>> Shrikant
>>>>
>>>> --
>>>> Regards,
>>>> Shrikant Prasad
>>>>
>>> --
>> Regards,
>> Shrikant Prasad
>>
> --
Regards,
Shrikant Prasad
