Hi Zork,

From the exception, it is still caused by hdp.version not being propagated correctly. Can you check whether there is a typo?
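The "bad substitution" in the logs below comes from bash itself: `hdp.version` is a Java system-property placeholder, and a dotted name is not a valid shell parameter name, so when the unexpanded `${hdp.version}` reaches launch_container.sh the shell refuses to expand it. A minimal sketch of the same failure (the path is illustrative):

```shell
# A dotted name like hdp.version is legal as a Java system property but not
# as a shell parameter, so bash rejects ${hdp.version} with "bad substitution"
# -- the same error launch_container.sh hits when hdp.version is not filled in.
out=$(bash -c 'echo /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo.jar' 2>&1)
rc=$?
printf 'exit status: %s\nmessage: %s\n' "$rc" "$out"
```

This is why the placeholder has to be substituted with a concrete version string (via -Dhdp.version=...) before the container launch script is generated.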
[root@c6402 conf]# more java-opts
-Dhdp.version=2.2.0.0-2041
[root@c6402 conf]# more spark-defaults.conf
spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

This is an HDP-specific question, so you may want to move the topic to the HDP forum.

Thanks.

Zhan Zhang

On Apr 13, 2015, at 3:00 AM, Zork Sail <zorks...@gmail.com> wrote:

Hi Zhan,

Alas, setting:

    -Dhdp.version=2.2.0.0-2041

does not help. I still get the same error:

15/04/13 09:53:59 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1428918838408
         final status: UNDEFINED
         tracking URL: http://foo.bar.site:8088/proxy/application_1427875242006_0037/
         user: test
15/04/13 09:54:00 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:01 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:02 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:03 INFO yarn.Client: Application report for application_1427875242006_0037 (state: FAILED)
15/04/13 09:54:03 INFO yarn.Client:
         client token: N/A
         diagnostics: Application application_1427875242006_0037 failed 2 times due to AM Container for appattempt_1427875242006_0037_000002 exited with exitCode: 1
For more detailed output, check application tracking page: http://foo.bar.site:8088/proxy/application_1427875242006_0037/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0037_02_000001
Exit code: 1
Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
        at
org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1428918838408
         final status: FAILED
         tracking URL: http://foo.bar.site:8088/cluster/app/application_1427875242006_0037
         user: test
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
        at
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

On Fri, Apr 10, 2015 at 8:50 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

Hi Zork,

There was a script change in Spark 1.3 in how Spark is started. You can try putting a java-opts file in your conf/ directory with the following contents:

    -Dhdp.version=2.2.0.0-2041

Please let me know whether it works or not.

Thanks.

Zhan Zhang

On Apr 10, 2015, at 7:21 AM, Zork Sail <zorks...@gmail.com> wrote:

Many thanks. Yet even after setting:

    spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
    spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

in SPARK_HOME/conf/spark-defaults.conf, I still have exactly the same error log as before.

On Fri, Apr 10, 2015 at 5:44 PM, Ted Yu <yuzhih...@gmail.com> wrote:

Zork:
See http://search-hadoop.com/m/JW1q5iQhwz1

On Apr 10, 2015, at 5:08 AM, Zork Sail <zorks...@gmail.com> wrote:

I have built Spark with the command:

    mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package

What is missing in this command to build it for YARN? I have also tried the latest pre-built version with Hadoop support. In both cases I get the same errors described above. What else can be wrong? Maybe Spark 1.3.0 does not support Hadoop 2.6?

On Fri, Apr 10, 2015 at 3:29 PM, Sean Owen <so...@cloudera.com> wrote:

I see at least two possible problems: maybe you did not build Spark for YARN, and it looks like a variable hdp.version is expected in your environment but is not set (this isn't specific to Spark).

On Fri, Apr 10, 2015 at 6:34 AM, Zork Sail <zorks...@gmail.com> wrote:
>
> Please help! Completely stuck trying to run Spark 1.3.0 on YARN!
> I have `Hadoop 2.6.0.2.2.0.0-2041` with `Hive 0.14.0.2.2.0.0-2041`.
> After building Spark with the command:
>
>     mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package
>
> I try to run the Pi example on YARN with the following command:
>
>     export HADOOP_CONF_DIR=/etc/hadoop/conf
>     /var/home2/test/spark/bin/spark-submit \
>         --class org.apache.spark.examples.SparkPi \
>         --master yarn-cluster \
>         --executor-memory 3G \
>         --num-executors 50 \
>         hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar \
>         1000
>
> I get the exception `application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1`,
> which in fact is `Diagnostics: Exception from container-launch.` (please see the log below).
>
> The application tracking URL reveals the following messages:
>
>     java.lang.Exception: Unknown container. Container either has not started
>     or has already completed or doesn't belong to this node at all
>
> and also:
>
>     Error: Could not find or load main class
>     org.apache.spark.deploy.yarn.ApplicationMaster
>
> I have Hadoop working fine on 4 nodes and am completely at a loss as to how to make
> Spark work on YARN. Please advise where to look; any ideas would be of
> great help, thank you!
>
> Spark assembly has been built with Hive, including Datanucleus jars on classpath
> 15/04/06 10:53:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable
> 15/04/06 10:53:42 INFO impl.TimelineClientImpl: Timeline service address: http://etl-hdp-yarn.foo.bar.com:8188/ws/v1/timeline/
> 15/04/06 10:53:42 INFO client.RMProxy: Connecting to ResourceManager at etl-hdp-yarn.foo.bar.com/192.168.0.16:8050
> 15/04/06 10:53:42 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
> 15/04/06 10:53:42 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
> 15/04/06 10:53:42 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
> 15/04/06 10:53:42 INFO yarn.Client: Setting up container launch context for our AM
> 15/04/06 10:53:42 INFO yarn.Client: Preparing resources for our AM container
> 15/04/06 10:53:43 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 15/04/06 10:53:43 INFO yarn.Client: Uploading resource file:/var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar -> hdfs://etl-hdp-nn1.foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0029/spark-assembly-1.3.0-hadoop2.6.0.jar
> 15/04/06 10:53:44 INFO yarn.Client: Source and destination file systems are the same.
Not copying hdfs:/user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
> 15/04/06 10:53:44 INFO yarn.Client: Setting up the launch environment for our AM container
> 15/04/06 10:53:44 INFO spark.SecurityManager: Changing view acls to: test
> 15/04/06 10:53:44 INFO spark.SecurityManager: Changing modify acls to: test
> 15/04/06 10:53:44 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test); users with modify permissions: Set(test)
> 15/04/06 10:53:44 INFO yarn.Client: Submitting application 29 to ResourceManager
> 15/04/06 10:53:44 INFO impl.YarnClientImpl: Submitted application application_1427875242006_0029
> 15/04/06 10:53:45 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
> 15/04/06 10:53:45 INFO yarn.Client:
>          client token: N/A
>          diagnostics: N/A
>          ApplicationMaster host: N/A
>          ApplicationMaster RPC port: -1
>          queue: default
>          start time: 1428317623905
>          final status: UNDEFINED
>          tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/
>          user: test
> 15/04/06 10:53:46 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
> 15/04/06 10:53:47 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
> 15/04/06 10:53:48 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
> 15/04/06 10:53:49 INFO yarn.Client: Application report for application_1427875242006_0029 (state: FAILED)
> 15/04/06 10:53:49 INFO yarn.Client:
>          client token: N/A
>          diagnostics: Application application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1
> For more detailed output, check application tracking page: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/ Then, click on links to logs of each attempt.
> Diagnostics: Exception from container-launch.
> Container id: container_1427875242006_0029_02_000001
> Exit code: 1
> Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>
> Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/launch_container.sh: line 27:
$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>         at org.apache.hadoop.util.Shell.run(Shell.java:455)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> Container exited with a non-zero exit code 1
> Failing this attempt. Failing the application.
>          ApplicationMaster host: N/A
>          ApplicationMaster RPC port: -1
>          queue: default
>          start time: 1428317623905
>          final status: FAILED
>          tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/cluster/app/application_1427875242006_0029
>          user: test
> Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
>         at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
>         at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
>         at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
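For anyone hitting the same wall: the fixes discussed in this thread boil down to making sure hdp.version is defined for both the driver and the YARN ApplicationMaster. A minimal sketch of the two approaches (a scratch directory stands in for $SPARK_HOME/conf here; the version string is the one from this thread and must match your cluster's /usr/hdp/<version> directory):

```shell
# Sketch of the workaround discussed above. CONF_DIR is a scratch stand-in
# for $SPARK_HOME/conf; adjust HDP_VERSION to your actual HDP build number.
CONF_DIR=$(mktemp -d)
HDP_VERSION=2.2.0.0-2041

# Option 1 (Spark 1.3+ launch scripts): a conf/java-opts file
echo "-Dhdp.version=${HDP_VERSION}" > "${CONF_DIR}/java-opts"

# Option 2: spark-defaults.conf entries so both the driver and the YARN AM
# JVMs receive the system property
cat >> "${CONF_DIR}/spark-defaults.conf" <<EOF
spark.driver.extraJavaOptions -Dhdp.version=${HDP_VERSION}
spark.yarn.am.extraJavaOptions -Dhdp.version=${HDP_VERSION}
EOF

cat "${CONF_DIR}/java-opts" "${CONF_DIR}/spark-defaults.conf"
```

The same two properties can also be passed on the command line at submit time via `--conf spark.driver.extraJavaOptions=-Dhdp.version=...` and `--conf spark.yarn.am.extraJavaOptions=-Dhdp.version=...`, which avoids editing conf files when testing.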