Re: spark on yarn is trying to use file:// instead of hdfs://

Koert Kuipers Fri, 20 Jun 2014 09:58:44 -0700

yeah sure, see below. i strongly suspect it's something i misconfigured
that causes yarn to use the local filesystem by mistake.


*********************

[koert@cdh5-yarn ~]$ /usr/local/lib/spark/bin/spark-submit --class
org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3
--executor-cores 1
hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar 10
14/06/20 12:54:40 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
14/06/20 12:54:40 INFO RMProxy: Connecting to ResourceManager at
cdh5-yarn.tresata.com/192.168.1.85:8032
14/06/20 12:54:41 INFO Client: Got Cluster metric info from
ApplicationsManager (ASM), number of NodeManagers: 1
14/06/20 12:54:41 INFO Client: Queue info ... queueName: root.default,
queueCurrentCapacity: 0.0, queueMaxCapacity: -1.0,
      queueApplicationCount = 0, queueChildQueueCount = 0
14/06/20 12:54:41 INFO Client: Max mem capabililty of a single resource in
this cluster 8192
14/06/20 12:54:41 INFO Client: Preparing Local resources
14/06/20 12:54:41 WARN BlockReaderLocal: The short-circuit local reads
feature cannot be used because libhadoop cannot be loaded.
14/06/20 12:54:41 INFO Client: Uploading
hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar to
file:/home/koert/.sparkStaging/application_1403201750110_0060/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar
14/06/20 12:54:43 INFO Client: Setting up the launch environment
14/06/20 12:54:43 INFO Client: Setting up container launch context
14/06/20 12:54:43 INFO Client: Command for starting the Spark
ApplicationMaster: List($JAVA_HOME/bin/java, -server, -Xmx512m,
-Djava.io.tmpdir=$PWD/tmp, -Dspark.akka.retry.wait=\"30000\",
-Dspark.storage.blockManagerTimeoutIntervalMs=\"120000\",
-Dspark.storage.blockManagerHeartBeatMs=\"120000\",
-Dspark.app.name=\"org.apache.spark.examples.SparkPi\",
-Dspark.akka.frameSize=\"10000\", -Dspark.akka.timeout=\"30000\",
-Dspark.worker.timeout=\"30000\",
-Dspark.akka.logLifecycleEvents=\"true\",
-Dlog4j.configuration=log4j-spark-container.properties,
org.apache.spark.deploy.yarn.ApplicationMaster, --class,
org.apache.spark.examples.SparkPi, --jar ,
hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar,
--args  '10' , --executor-memory, 1024, --executor-cores, 1,
--num-executors , 3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/06/20 12:54:43 INFO Client: Submitting application to ASM
14/06/20 12:54:43 INFO YarnClientImpl: Submitted application
application_1403201750110_0060
14/06/20 12:54:44 INFO Client: Application report from ASM:
     application identifier: application_1403201750110_0060
     appId: 60
     clientToAMToken: null
     appDiagnostics:
     appMasterHost: N/A
     appQueue: root.koert
     appMasterRpcPort: -1
     appStartTime: 1403283283505
     yarnAppState: ACCEPTED
     distributedFinalState: UNDEFINED
     appTrackingUrl:
http://cdh5-yarn.tresata.com:8088/proxy/application_1403201750110_0060/
     appUser: koert
14/06/20 12:54:45 INFO Client: Application report from ASM:
     application identifier: application_1403201750110_0060
     appId: 60
     clientToAMToken: null
     appDiagnostics:
     appMasterHost: N/A
     appQueue: root.koert
     appMasterRpcPort: -1
     appStartTime: 1403283283505
     yarnAppState: ACCEPTED
     distributedFinalState: UNDEFINED
     appTrackingUrl:
http://cdh5-yarn.tresata.com:8088/proxy/application_1403201750110_0060/
     appUser: koert
14/06/20 12:54:46 INFO Client: Application report from ASM:
     application identifier: application_1403201750110_0060
     appId: 60
     clientToAMToken: null
     appDiagnostics:
     appMasterHost: N/A
     appQueue: root.koert
     appMasterRpcPort: -1
     appStartTime: 1403283283505
     yarnAppState: ACCEPTED
     distributedFinalState: UNDEFINED
     appTrackingUrl:
http://cdh5-yarn.tresata.com:8088/proxy/application_1403201750110_0060/
     appUser: koert
14/06/20 12:54:47 INFO Client: Application report from ASM:
     application identifier: application_1403201750110_0060
     appId: 60
     clientToAMToken: null
     appDiagnostics: Application application_1403201750110_0060 failed 2
times due to AM Container for appattempt_1403201750110_0060_000002 exited
with  exitCode: -1000 due to: File
file:/home/koert/.sparkStaging/application_1403201750110_0060/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar
does not exist
.Failing this attempt.. Failing the application.
     appMasterHost: N/A
     appQueue: root.koert
     appMasterRpcPort: -1
     appStartTime: 1403283283505
     yarnAppState: FAILED
     distributedFinalState: FAILED
     appTrackingUrl:
cdh5-yarn.tresata.com:8088/cluster/app/application_1403201750110_0060
     appUser: koert
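
The "Uploading ... to file:/home/koert/.sparkStaging/..." line in the log above is consistent with the submitting client resolving its default filesystem to file: rather than hdfs:. One way to narrow this down is to check what the core-site.xml under HADOOP_CONF_DIR actually says. The following is a minimal Python sketch, not part of Spark or Hadoop: the function names and the /etc/hadoop/conf fallback path are assumptions made for illustration, and it reads only the one file (it does not follow XML includes or other config files Hadoop may merge in).

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_default_fs(conf_dir):
    """Return fs.defaultFS (or the deprecated fs.default.name) from
    conf_dir/core-site.xml, or None if the file or the key is missing."""
    path = os.path.join(conf_dir, "core-site.xml")
    if not os.path.isfile(path):
        return None
    root = ET.parse(path).getroot()
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in ("fs.defaultFS", "fs.default.name"):
            found[name] = value
    # Prefer the current key over the deprecated one.
    return found.get("fs.defaultFS") or found.get("fs.default.name")


def describe_default_fs(conf_dir):
    """Summarize which filesystem scheme the config in conf_dir points at."""
    value = read_default_fs(conf_dir)
    if value is None:
        # Hadoop's built-in default for fs.defaultFS is file:///
        return "no fs.defaultFS found in %s; Hadoop falls back to file:///" % conf_dir
    scheme = urlparse(value).scheme or "file"
    return "fs.defaultFS=%s (scheme: %s)" % (value, scheme)


if __name__ == "__main__":
    # /etc/hadoop/conf is only an assumed fallback for this sketch.
    conf_dir = os.environ.get("HADOOP_CONF_DIR", "/etc/hadoop/conf")
    print(describe_default_fs(conf_dir))
```

Running it in the same shell, as the same user, that runs spark-submit shows whether HADOOP_CONF_DIR is visible there and whether the file it points at names an hdfs:// default. It does not prove which configuration the Spark client JVM ends up loading.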




On Fri, Jun 20, 2014 at 12:42 PM, Marcelo Vanzin <van...@cloudera.com>
wrote:

> Hi Koert,
>
> Could you provide more details? Job arguments, log messages, errors, etc.
>
> On Fri, Jun 20, 2014 at 9:40 AM, Koert Kuipers <ko...@tresata.com> wrote:
> > i noticed that when i submit a job to yarn it mistakenly tries to upload
> > files to local filesystem instead of hdfs. what could cause this?
> >
> > in spark-env.sh i have HADOOP_CONF_DIR set correctly (and spark-submit does
> > find yarn), and my core-site.xml has a fs.defaultFS that is hdfs, not local
> > filesystem.
> >
> > thanks! koert
>
>
>
> --
> Marcelo
>
