Ken Williams created SPARK-6681:
-----------------------------------
Summary: JAVA_HOME error with upgrade to Spark 1.3.0
Key: SPARK-6681
URL: https://issues.apache.org/jira/browse/SPARK-6681
Project: Spark
Issue Type: Bug
Components: Spark Submit
Affects Versions: 1.3.0
Environment: Client is Mac OS X version 10.10.2, cluster is running
HDP 2.1 stack.
Reporter: Ken Williams
I’m trying to upgrade a Spark project, written in Scala, from Spark 1.2.1 to
1.3.0, so I changed my `build.sbt` like so:
{code}
-libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.1" % "provided"
+libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0" % "provided"
{code}
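(The jar itself comes from the standard sbt-assembly plugin; I doubt the exact invocation matters, but for completeness it's roughly this:)
{code}
# assumes the usual sbt-assembly setup; exact command may differ
sbt clean assembly
{code}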
Then I submit the resulting assembly jar:
{code}
HADOOP_CONF_DIR=/etc/hadoop/conf \
spark-submit \
--driver-class-path=/etc/hbase/conf \
--conf spark.hadoop.validateOutputSpecs=false \
--conf spark.yarn.jar=hdfs:/apps/local/spark-assembly-1.3.0-hadoop2.4.0.jar \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--deploy-mode=cluster \
--master=yarn \
--class=TestObject \
--num-executors=54 \
target/scala-2.11/myapp-assembly-1.2.jar
{code}
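One thing I still want to rule out is a stale or missing assembly at the {{spark.yarn.jar}} path, so I plan to check it with something like:
{code}
# sanity check (not done yet): confirm the 1.3.0 assembly really is at the path spark.yarn.jar points to
hdfs dfs -ls /apps/local/spark-assembly-1.3.0-hadoop2.4.0.jar
{code}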
The job fails to submit, with the following exception in the terminal:
{code}
15/03/19 10:30:07 INFO yarn.Client:
15/03/19 10:20:03 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1420225286501_4698 failed 2 times due to AM Container for appattempt_1420225286501_4698_000002 exited with exitCode: 127 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{code}
Finally, I go and check the YARN app master’s web interface (since the job is
there, I know it at least made it that far), and the only logs it shows are
these:
{code}
Log Type: stderr
Log Length: 61
/bin/bash: {{JAVA_HOME}}/bin/java: No such file or directory
Log Type: stdout
Log Length: 0
{code}
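I haven't dug further than the web UI yet; if it helps, I can also pull the full aggregated container logs with something like:
{code}
# assumes YARN log aggregation is enabled; application ID taken from the diagnostics above
yarn logs -applicationId application_1420225286501_4698
{code}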
I’m not sure how to interpret that. Is {{ {{JAVA_HOME}} }} a literal (braces included) that’s somehow making it into a launch script? Is this coming from the worker nodes or from the driver? Anything I can do to experiment and troubleshoot?
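One experiment I'm considering (not tried yet) is forcing {{JAVA_HOME}} explicitly for the AM and the executors by adding extra flags to the {{spark-submit}} invocation above:
{code}
# untested idea: pass JAVA_HOME through explicitly instead of relying on the nodes' hadoop-env.sh
  --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
  --conf spark.executorEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
{code}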
I do have {{JAVA_HOME}} set in the Hadoop config files on all the nodes of the cluster:
{code}
% grep JAVA_HOME /etc/hadoop/conf/*.sh
/etc/hadoop/conf/hadoop-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
/etc/hadoop/conf/yarn-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
{code}
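I haven't yet verified on every node that the JDK really exists at that path, but spot-checking a worker would look something like this (the hostname here is just a placeholder):
{code}
# placeholder hostname -- substitute a real NodeManager host
ssh some-worker-node 'ls -l /usr/jdk64/jdk1.6.0_31/bin/java'
{code}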
Has this behavior changed between 1.2.1 and 1.3.0? Using 1.2.1 and making no other changes, the job completes fine.
(Note: I originally posted this on the Spark mailing list and also on Stack
Overflow, I'll update both places if/when I find a solution.)