Re: Spark 1.3.0: Running Pi example on YARN fails
Hi Zork,

From the exception, it is still caused by hdp.version not being propagated correctly. Can you check whether there is any typo?

[root@c6402 conf]# more java-opts
-Dhdp.version=2.2.0.0-2041

[root@c6402 conf]# more spark-defaults.conf
spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

This is an HDP-specific question, so you can move the topic to the HDP forum.

Thanks.

Zhan Zhang

On Apr 13, 2015, at 3:00 AM, Zork Sail <zorks...@gmail.com> wrote:

> Hi Zhan,
>
> Alas setting:
>
> -Dhdp.version=2.2.0.0–2041
>
> Does not help. Still get the same error:
> [...]
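One kind of typo that is easy to miss when checking `java-opts` and `spark-defaults.conf` by eye is a non-ASCII character, for example an en dash (–) pasted in place of the ASCII hyphen in the version string. The following is a minimal sketch of such a check, not part of Spark or HDP: `find_non_ascii` is a hypothetical helper name, and the sample text is made up for illustration. Whether this is the cause of the failure in this thread is not confirmed.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits


# Hypothetical file contents for illustration only.
# The second line uses an en dash (U+2013) instead of an ASCII hyphen.
sample = (
    "spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041\n"
    "spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0\u20132041\n"
)

for line_no, col, ch in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: U+{ord(ch):04X}")
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to the same function.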
Re: Spark 1.3.0: Running Pi example on YARN fails
Hi Zhan,

Alas setting:

-Dhdp.version=2.2.0.0–2041

Does not help. Still get the same error:

15/04/13 09:53:59 INFO yarn.Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1428918838408
    final status: UNDEFINED
    tracking URL: http://foo.bar.site:8088/proxy/application_1427875242006_0037/
    user: test
15/04/13 09:54:00 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:01 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:02 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:03 INFO yarn.Client: Application report for application_1427875242006_0037 (state: FAILED)
15/04/13 09:54:03 INFO yarn.Client:
    client token: N/A
    diagnostics: Application application_1427875242006_0037 failed 2 times due to AM Container for appattempt_1427875242006_0037_02 exited with exitCode: 1
For more detailed output, check application tracking page: http://foo.bar.site:8088/proxy/application_1427875242006_0037/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0037_02_01
Exit code: 1
Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_01/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_01/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1428918838408
    final status: FAILED
    tracking URL: http://foo.bar.site:8088/cluster/app/application_1427875242006_0037
    user: test
Exception in thread main org.apache.spark.SparkException: Application finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at
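The `bad substitution` message in this log is reported by the shell that runs `launch_container.sh`: the classpath still contains the literal placeholder `${hdp.version}`, and bash does not accept a dot inside a `${...}` parameter name. So the placeholder was never replaced with an actual version before the script was written. As a rough illustration, the sketch below scans a string for such placeholders. `find_unexpandable_placeholders` is a hypothetical helper name, and the sample is a shortened excerpt of the classpath from the log.

```python
import re

# Matches ${...} and captures whatever is between the braces.
PLACEHOLDER = re.compile(r"\$\{([^}]*)\}")
# A plain shell variable name: letters, digits, underscores, not starting with a digit.
SHELL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def find_unexpandable_placeholders(text):
    """Return the distinct ${...} names that are not plain shell variable names.

    Deliberately simple: valid bash forms such as ${VAR:-default} are also
    flagged, so treat the result as a hint, not a verdict.
    """
    bad = []
    for name in PLACEHOLDER.findall(text):
        if not SHELL_NAME.fullmatch(name) and name not in bad:
            bad.append(name)
    return bad


# Shortened excerpt of the classpath shown in the log.
classpath = (
    "$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:"
    "/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:"
    "/etc/hadoop/conf/secure"
)

print(find_unexpandable_placeholders(classpath))
```

Any name reported here has to be substituted before the launch script is generated, which is what the `-Dhdp.version=...` Java option is meant to achieve.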
Spark 1.3.0: Running Pi example on YARN fails
I have `Hadoop 2.6.0.2.2.0.0-2041` with `Hive 0.14.0.2.2.0.0-2041`.

After building Spark with the command:

    mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package

I try to run the Pi example on YARN with the following command:

    export HADOOP_CONF_DIR=/etc/hadoop/conf
    /var/home2/test/spark/bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    --executor-memory 3G \
    --num-executors 50 \
    hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar \
    1000

I get exceptions: `application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_02 exited with exitCode: 1`, which in fact is `Diagnostics: Exception from container-launch.` (please see the log below).

The application tracking URL reveals the following messages:

    java.lang.Exception: Unknown container. Container either has not started or has already completed or doesn't belong to this node at all

and also:

    Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

I have Hadoop working fine on 4 nodes and am completely at a loss as to how to make Spark work on YARN. Please advise where to look; any ideas would be of great help, thank you!

Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/04/06 10:53:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/06 10:53:42 INFO impl.TimelineClientImpl: Timeline service address: http://etl-hdp-yarn.foo.bar.com:8188/ws/v1/timeline/
15/04/06 10:53:42 INFO client.RMProxy: Connecting to ResourceManager at etl-hdp-yarn.foo.bar.com/192.168.0.16:8050
15/04/06 10:53:42 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
15/04/06 10:53:42 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
15/04/06 10:53:42 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/04/06 10:53:42 INFO yarn.Client: Setting up container launch context for our AM
15/04/06 10:53:42 INFO yarn.Client: Preparing resources for our AM container
15/04/06 10:53:43 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/04/06 10:53:43 INFO yarn.Client: Uploading resource file:/var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar -> hdfs://etl-hdp-nn1.foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0029/spark-assembly-1.3.0-hadoop2.6.0.jar
15/04/06 10:53:44 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
15/04/06 10:53:44 INFO yarn.Client: Setting up the launch environment for our AM container
15/04/06 10:53:44 INFO spark.SecurityManager: Changing view acls to: test
15/04/06 10:53:44 INFO spark.SecurityManager: Changing modify acls to: test
15/04/06 10:53:44 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test); users with modify permissions: Set(test)
15/04/06 10:53:44 INFO yarn.Client: Submitting application 29 to ResourceManager
15/04/06 10:53:44 INFO impl.YarnClientImpl: Submitted application application_1427875242006_0029
15/04/06 10:53:45 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:45 INFO yarn.Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1428317623905
    final status: UNDEFINED
    tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/
    user: test
15/04/06 10:53:46 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:47 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:48 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:49 INFO yarn.Client: Application report for application_1427875242006_0029 (state: FAILED)
15/04/06 10:53:49 INFO yarn.Client:
    client token: N/A
    diagnostics: Application application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_02 exited with exitCode: 1
For more detailed output, check application tracking page: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0029_02_01
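When reading a client log like this one, it can help to reduce it to the sequence of states each application went through before looking at the diagnostics. The sketch below assumes the `Application report for <id> (state: <STATE>)` line format seen in this log; `state_transitions` is a hypothetical helper name and not part of Spark.

```python
import re

# Line format assumed from the yarn.Client output in this thread.
REPORT = re.compile(r"Application report for (application_\d+_\d+) \(state: (\w+)\)")


def state_transitions(log_text):
    """Map each application id to its reported states, collapsing consecutive repeats."""
    transitions = {}
    for app_id, state in REPORT.findall(log_text):
        states = transitions.setdefault(app_id, [])
        if not states or states[-1] != state:
            states.append(state)
    return transitions


# A few lines copied from the log above.
log = """\
15/04/06 10:53:45 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:46 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:49 INFO yarn.Client: Application report for application_1427875242006_0029 (state: FAILED)
"""

for app_id, states in state_transitions(log).items():
    print(app_id, " -> ".join(states))
```

An application that goes from ACCEPTED straight to FAILED without ever reaching RUNNING, as here, points at the ApplicationMaster container failing to start rather than at the job itself.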