Re: Spark 1.3.0: Running Pi example on YARN fails

2015-04-13 Thread Zhan Zhang
Hi Zork,

From the exception, the failure is still caused by hdp.version not being
propagated correctly. Can you check whether there is a typo?

[root@c6402 conf]# more java-opts
-Dhdp.version=2.2.0.0-2041

[root@c6402 conf]# more spark-defaults.conf
spark.driver.extraJavaOptions  -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions  -Dhdp.version=2.2.0.0-2041
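
The unexpanded ${hdp.version} placeholders in the exception's classpath come from
the Hadoop configuration (on HDP 2.2 they typically appear in
mapreduce.application.classpath), which you can confirm with a quick grep. If
editing spark-defaults.conf is inconvenient, the same two properties can also be
passed per run on the spark-submit command line; a minimal sketch, assuming the
same HDP 2.2 version string as above:

    # check where the unexpanded ${hdp.version} placeholders come from
    grep -n 'hdp.version' /etc/hadoop/conf/mapred-site.xml /etc/hadoop/conf/yarn-site.xml

    # equivalent to the spark-defaults.conf entries above, passed only for this submission
    spark-submit \
      --conf spark.driver.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 \
      --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 \
      --master yarn-cluster \
      ...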

This is an HDP-specific question, so you may want to move the topic to the HDP forum.


Thanks.

Zhan Zhang


On Apr 13, 2015, at 3:00 AM, Zork Sail <zorks...@gmail.com> wrote:


Re: Spark 1.3.0: Running Pi example on YARN fails

2015-04-13 Thread Zork Sail
Hi Zhan,
Alas, setting:

-Dhdp.version=2.2.0.0-2041

does not help. I still get the same error:
15/04/13 09:53:59 INFO yarn.Client:
 client token: N/A
 diagnostics: N/A
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: default
 start time: 1428918838408
 final status: UNDEFINED
 tracking URL:
http://foo.bar.site:8088/proxy/application_1427875242006_0037/
 user: test
15/04/13 09:54:00 INFO yarn.Client: Application report for
application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:01 INFO yarn.Client: Application report for
application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:02 INFO yarn.Client: Application report for
application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:03 INFO yarn.Client: Application report for
application_1427875242006_0037 (state: FAILED)
15/04/13 09:54:03 INFO yarn.Client:
 client token: N/A
 diagnostics: Application application_1427875242006_0037 failed 2 times
due to AM Container for appattempt_1427875242006_0037_02 exited with
exitCode: 1
For more detailed output, check application tracking page:
http://foo.bar.site:8088/proxy/application_1427875242006_0037/Then, click
on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0037_02_01
Exit code: 1
Exception message:
/mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_01/launch_container.sh:
line 27:
$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:
bad substitution

Stack trace: ExitCodeException exitCode=1:
/mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_01/launch_container.sh:
line 27:
$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:
bad substitution

at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: default
 start time: 1428918838408
 final status: FAILED
 tracking URL:
http://foo.bar.site:8088/cluster/app/application_1427875242006_0037
 user: test
Exception in thread "main" org.apache.spark.SparkException: Application
finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at

Spark 1.3.0: Running Pi example on YARN fails

2015-04-06 Thread Zork Sail
I have `Hadoop 2.6.0.2.2.0.0-2041` with `Hive 0.14.0.2.2.0.0-2041`.

After building Spark with the command:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive
-Phive-thriftserver -DskipTests package

I try to run the Pi example on YARN with the following command:

export HADOOP_CONF_DIR=/etc/hadoop/conf
/var/home2/test/spark/bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--executor-memory 3G \
--num-executors 50 \
hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar \
1000
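
Both jars referenced above are in place (a quick check using the same paths; the
client log below also shows them being resolved):

    # sanity checks: the assembly built above and the examples jar on HDFS
    ls /var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar
    hdfs dfs -ls /user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar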

I get the exception `application_1427875242006_0029 failed 2 times due to AM
Container for appattempt_1427875242006_0029_02 exited with exitCode: 1`,
which in fact is `Diagnostics: Exception from container-launch.` (please see
the log below).

The application tracking URL reveals the following messages:

java.lang.Exception: Unknown container. Container either has not
started or has already completed or doesn't belong to this node at all

and also:

Error: Could not find or load main class
org.apache.spark.deploy.yarn.ApplicationMaster

I have Hadoop working fine on 4 nodes but am completely at a loss as to how to
make Spark work on YARN. Please advise where to look; any ideas would be of
great help. Thank you!
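
For reference, the full output of the failed containers can also be pulled with
the yarn CLI after the application finishes (a sketch, assuming log aggregation
is enabled on the cluster):

    # fetch aggregated container logs for the failed application
    yarn logs -applicationId application_1427875242006_0029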

Spark assembly has been built with Hive, including Datanucleus jars on
classpath
15/04/06 10:53:40 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
15/04/06 10:53:42 INFO impl.TimelineClientImpl: Timeline service
address: http://etl-hdp-yarn.foo.bar.com:8188/ws/v1/timeline/
15/04/06 10:53:42 INFO client.RMProxy: Connecting to ResourceManager at
etl-hdp-yarn.foo.bar.com/192.168.0.16:8050
15/04/06 10:53:42 INFO yarn.Client: Requesting a new application from
cluster with 4 NodeManagers
15/04/06 10:53:42 INFO yarn.Client: Verifying our application has not
requested more than the maximum memory capability of the cluster (4096 MB
per container)
15/04/06 10:53:42 INFO yarn.Client: Will allocate AM container, with
896 MB memory including 384 MB overhead
15/04/06 10:53:42 INFO yarn.Client: Setting up container launch context
for our AM
15/04/06 10:53:42 INFO yarn.Client: Preparing resources for our AM
container
15/04/06 10:53:43 WARN shortcircuit.DomainSocketFactory: The
short-circuit local reads feature cannot be used because libhadoop cannot
be loaded.
15/04/06 10:53:43 INFO yarn.Client: Uploading resource
file:/var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar
-> hdfs://etl-hdp-nn1.foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0029/spark-assembly-1.3.0-hadoop2.6.0.jar
15/04/06 10:53:44 INFO yarn.Client: Source and destination file systems
are the same. Not copying
hdfs:/user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
15/04/06 10:53:44 INFO yarn.Client: Setting up the launch environment
for our AM container
15/04/06 10:53:44 INFO spark.SecurityManager: Changing view acls to:
test
15/04/06 10:53:44 INFO spark.SecurityManager: Changing modify acls to:
test
15/04/06 10:53:44 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(test); users with modify permissions: Set(test)
15/04/06 10:53:44 INFO yarn.Client: Submitting application 29 to
ResourceManager
15/04/06 10:53:44 INFO impl.YarnClientImpl: Submitted application
application_1427875242006_0029
15/04/06 10:53:45 INFO yarn.Client: Application report for
application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:45 INFO yarn.Client:
 client token: N/A
 diagnostics: N/A
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: default
 start time: 1428317623905
 final status: UNDEFINED
 tracking URL:
http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/
 user: test
15/04/06 10:53:46 INFO yarn.Client: Application report for
application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:47 INFO yarn.Client: Application report for
application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:48 INFO yarn.Client: Application report for
application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:49 INFO yarn.Client: Application report for
application_1427875242006_0029 (state: FAILED)
15/04/06 10:53:49 INFO yarn.Client:
 client token: N/A
 diagnostics: Application application_1427875242006_0029 failed 2
times due to AM Container for appattempt_1427875242006_0029_02 exited
with  exitCode: 1
For more detailed output, check application tracking page:
http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/Then,
click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0029_02_01
