[ https://issues.apache.org/jira/browse/SPARK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alon Shoham updated SPARK-24456:
--------------------------------
    Description:
When submitting a Spark application with --deploy-mode cluster to a Spark standalone cluster, environment variables from the client machine overwrite the server's environment variables.

We use the *SPARK_DIST_CLASSPATH* environment variable to add extra required dependencies to the application. We observed that the client machine's SPARK_DIST_CLASSPATH overwrites the remote server machine's value, which causes application submission to fail.

We have inspected the code and found:

1. In org.apache.spark.deploy.Client, line 86:

    val command = new Command(mainClass,
      Seq("{{WORKER_URL}}", "{{USER_JAR}}", driverArgs.mainClass) ++ driverArgs.driverOptions,
      sys.env, classPathEntries, libraryPathEntries, javaOpts)

   Note that sys.env, the client's environment, is passed to the Command.

2. In org.apache.spark.launcher.WorkerCommandBuilder, line 35:

    childEnv.putAll(command.environment.asJava)
    childEnv.put(CommandBuilderUtils.ENV_SPARK_HOME, sparkHome)

Line 35 overwrites the environment on the server machine, while line 36 restores SPARK_HOME to the server value.

We think the bug can be fixed by adding a line that restores SPARK_DIST_CLASSPATH to its server value, as is already done for SPARK_HOME.

> Spark submit - server environment variables are overwritten by client
> environment variables
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24456
>                 URL: https://issues.apache.org/jira/browse/SPARK-24456
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.3.0
>            Reporter: Alon Shoham
>            Priority: Minor
>
> When submitting a Spark application with --deploy-mode cluster to a Spark
> standalone cluster, environment variables from the client machine overwrite
> the server's environment variables.
>
> We use the *SPARK_DIST_CLASSPATH* environment variable to add extra required
> dependencies to the application. We observed that the client machine's
> SPARK_DIST_CLASSPATH overwrites the remote server machine's value, which
> causes application submission to fail.
>
> We have inspected the code and found:
> 1. In org.apache.spark.deploy.Client, line 86:
>     val command = new Command(mainClass,
>       Seq("{{WORKER_URL}}", "{{USER_JAR}}", driverArgs.mainClass) ++ driverArgs.driverOptions,
>       sys.env, classPathEntries, libraryPathEntries, javaOpts)
> 2. In org.apache.spark.launcher.WorkerCommandBuilder, line 35:
>     childEnv.putAll(command.environment.asJava)
>     childEnv.put(CommandBuilderUtils.ENV_SPARK_HOME, sparkHome)
> Line 35 overwrites the environment on the server machine, while line 36
> restores SPARK_HOME to the server value.
> We think the bug can be fixed by adding a line that restores
> SPARK_DIST_CLASSPATH to its server value, as is already done for SPARK_HOME.
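
To illustrate the fix proposed in the description, here is a minimal, self-contained sketch of the merge behaviour. It is a simplified model and not the actual WorkerCommandBuilder code: the names EnvMergeSketch, ServerOwned and buildChildEnv are hypothetical, and plain Scala maps stand in for the worker's and the client's environments.

```scala
object EnvMergeSketch {
  // Hypothetical list of variables whose server-side value should win.
  // SPARK_HOME is already handled this way; SPARK_DIST_CLASSPATH is the
  // addition suggested in this issue.
  val ServerOwned: Set[String] = Set("SPARK_HOME", "SPARK_DIST_CLASSPATH")

  // serverEnv models the worker machine's environment, clientEnv models the
  // sys.env captured on the submitting machine.
  def buildChildEnv(
      serverEnv: Map[String, String],
      clientEnv: Map[String, String]): Map[String, String] = {
    // Client values overwrite server values, as the putAll call does today.
    val merged = serverEnv ++ clientEnv
    // Restore server-owned variables to the server value, or drop them when
    // the server does not define them (an assumption of this sketch).
    ServerOwned.foldLeft(merged) { (env, name) =>
      serverEnv.get(name) match {
        case Some(value) => env.updated(name, value)
        case None        => env - name
      }
    }
  }
}
```

With this model, a client-side SPARK_DIST_CLASSPATH no longer replaces the server's value, while other client variables are still forwarded. Whether a variable missing on the server should be dropped or kept from the client is a design choice that the issue does not settle.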