Alon Shoham created SPARK-24456:
-----------------------------------

             Summary: Spark submit - server environment variables are 
overwritten by client environment variables 
                 Key: SPARK-24456
                 URL: https://issues.apache.org/jira/browse/SPARK-24456
             Project: Spark
          Issue Type: Bug
          Components: Spark Submit
    Affects Versions: 2.3.0
            Reporter: Alon Shoham


When submitting a Spark application with --deploy-mode cluster to a Spark 
standalone cluster, environment variables from the client machine overwrite 
the server's environment variables. 

 

We use the *SPARK_DIST_CLASSPATH* environment variable to add extra required 
dependencies to the application. We observed that the client machine's 
SPARK_DIST_CLASSPATH overwrites the remote server machine's value, resulting 
in application submission failure. 

 

We have inspected the code and found:

1. In org.apache.spark.deploy.Client line 86:

{code:scala}
val command = new Command(mainClass,
  Seq("{{WORKER_URL}}", "{{USER_JAR}}", driverArgs.mainClass) ++ driverArgs.driverOptions,
  sys.env, classPathEntries, libraryPathEntries, javaOpts)
{code}

Note that *sys.env* (the client machine's full environment) is shipped inside the Command.

2. In org.apache.spark.launcher.WorkerCommandBuilder line 35:

{code:scala}
childEnv.putAll(command.environment.asJava)
childEnv.put(CommandBuilderUtils.ENV_SPARK_HOME, sparkHome)
{code}

Line 35 shows that the server machine's environment is overwritten by the 
client's command environment, while line 36 restores SPARK_HOME to the 
server value.

We think the bug can be fixed by adding a line that restores 
SPARK_DIST_CLASSPATH to its server value, similar to what is already done 
for SPARK_HOME.
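
A minimal, self-contained sketch of the merge-then-restore behavior and the proposed fix (the maps serverEnv and commandEnv, and the example paths, are illustrative stand-ins for the Worker's environment and the shipped command.environment; the real change would live in WorkerCommandBuilder):

{code:scala}
import scala.collection.mutable

// Simulated server-side environment (what the Worker process sees).
val serverEnv = Map(
  "SPARK_HOME"           -> "/opt/spark-server",
  "SPARK_DIST_CLASSPATH" -> "/opt/server/extra-jars/*"
)

// Environment captured on the client via sys.env and shipped in the Command.
val commandEnv = Map(
  "SPARK_HOME"           -> "/home/client/spark",
  "SPARK_DIST_CLASSPATH" -> "/home/client/jars/*"
)

val childEnv = mutable.Map[String, String]()
childEnv ++= serverEnv
// Mirrors childEnv.putAll(command.environment.asJava): client values win.
childEnv ++= commandEnv
// Mirrors the existing restore of SPARK_HOME to the server value.
childEnv("SPARK_HOME") = serverEnv("SPARK_HOME")
// Proposed fix: restore SPARK_DIST_CLASSPATH the same way, if the server set it.
serverEnv.get("SPARK_DIST_CLASSPATH").foreach(v => childEnv("SPARK_DIST_CLASSPATH") = v)

println(childEnv("SPARK_DIST_CLASSPATH"))  // server value, not the client's
{code}

With this extra line the driver launched on the Worker keeps the server's extra dependency classpath instead of the client's, which is what caused the submission failure above.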

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
