Ken Williams created SPARK-6681:
-----------------------------------

             Summary: JAVA_HOME error with upgrade to Spark 1.3.0
                 Key: SPARK-6681
                 URL: https://issues.apache.org/jira/browse/SPARK-6681
             Project: Spark
          Issue Type: Bug
          Components: Spark Submit
    Affects Versions: 1.3.0
         Environment: Client is Mac OS X version 10.10.2, cluster is running the HDP 2.1 stack.
            Reporter: Ken Williams
I'm trying to upgrade a Spark project, written in Scala, from Spark 1.2.1 to 1.3.0, so I changed my {{build.sbt}} like so:

{code}
-libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.1" % "provided"
+libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0" % "provided"
{code}

then I build an {{assembly}} jar and submit it:

{code}
HADOOP_CONF_DIR=/etc/hadoop/conf \
    spark-submit \
    --driver-class-path=/etc/hbase/conf \
    --conf spark.hadoop.validateOutputSpecs=false \
    --conf spark.yarn.jar=hdfs:/apps/local/spark-assembly-1.3.0-hadoop2.4.0.jar \
    --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
    --deploy-mode=cluster \
    --master=yarn \
    --class=TestObject \
    --num-executors=54 \
    target/scala-2.11/myapp-assembly-1.2.jar
{code}

The job fails to submit, with the following exception in the terminal:

{code}
15/03/19 10:30:07 INFO yarn.Client:
15/03/19 10:20:03 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1420225286501_4698 failed 2 times due to AM Container for
       appattempt_1420225286501_4698_000002 exited with exitCode: 127 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{code}

Finally, I go and check the YARN application master's web interface (since the job shows up there, I know it at least made it that far), and the only logs it shows are these:

{code}
Log Type: stderr
Log Length: 61
/bin/bash: {{JAVA_HOME}}/bin/java: No such file or directory

Log Type: stdout
Log Length: 0
{code}

I'm not sure how to interpret that - is the {{JAVA_HOME}} in that message (braces included) a literal placeholder that's somehow making it into the container launch script unsubstituted? Is this coming from the worker nodes or the driver? Is there anything I can do to experiment and troubleshoot?

I do have {{JAVA_HOME}} set in the Hadoop config files on all the nodes of the cluster:

{code}
% grep JAVA_HOME /etc/hadoop/conf/*.sh
/etc/hadoop/conf/hadoop-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
/etc/hadoop/conf/yarn-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
{code}

Has this behavior changed in 1.3.0 relative to 1.2.1? Using 1.2.1 and making no other changes, the job completes fine.

(Note: I originally posted this on the Spark mailing list and also on Stack Overflow; I'll update both places if/when I find a solution.)
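One experiment I'm considering, as a workaround rather than a fix: passing {{JAVA_HOME}} to the application master and executors explicitly through Spark's environment-variable confs, so the container launch doesn't depend on whatever placeholder is (not) being substituted. I'm assuming {{spark.yarn.appMasterEnv.*}} and {{spark.executorEnv.*}} apply to this situation, and I'm reusing the JDK path from {{hadoop-env.sh}} above; treat this as a sketch:

{code}
# Hypothetical troubleshooting sketch, not a confirmed fix:
# point the AM and executors at the cluster's JDK explicitly.
# /usr/jdk64/jdk1.6.0_31 is the JAVA_HOME from hadoop-env.sh above; adjust if your nodes differ.
HADOOP_CONF_DIR=/etc/hadoop/conf \
    spark-submit \
    --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
    --conf spark.executorEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
    --deploy-mode=cluster \
    --master=yarn \
    --class=TestObject \
    target/scala-2.11/myapp-assembly-1.2.jar
{code}

If that makes the error go away, it would at least tell me the failure is in the container-side {{JAVA_HOME}} substitution rather than in my application or assembly jar.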