Re: spark on yarn is trying to use file:// instead of hdfs://

2014-06-20 Thread Marcelo Vanzin
Hi Koert,

Could you provide more details? Job arguments, log messages, errors, etc.

On Fri, Jun 20, 2014 at 9:40 AM, Koert Kuipers wrote:
 i noticed that when i submit a job to yarn it mistakenly tries to upload
 files to local filesystem instead of hdfs. what could cause this?

 in i have HADOOP_CONF_DIR set correctly (and spark-submit does
 find yarn), and my core-site.xml has a fs.defaultFS that is hdfs, not local

 thanks! koert


Re: spark on yarn is trying to use file:// instead of hdfs://

2014-06-20 Thread Koert Kuipers
yeah sure, see below. i strongly suspect it's something i misconfigured
that causes yarn to use the local filesystem by mistake.


[koert@cdh5-yarn ~]$ /usr/local/lib/spark/bin/spark-submit --class
org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3
--executor-cores 1
hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar 10
14/06/20 12:54:40 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
14/06/20 12:54:40 INFO RMProxy: Connecting to ResourceManager at
14/06/20 12:54:41 INFO Client: Got Cluster metric info from
ApplicationsManager (ASM), number of NodeManagers: 1
14/06/20 12:54:41 INFO Client: Queue info ... queueName: root.default,
queueCurrentCapacity: 0.0, queueMaxCapacity: -1.0,
  queueApplicationCount = 0, queueChildQueueCount = 0
14/06/20 12:54:41 INFO Client: Max mem capabililty of a single resource in
this cluster 8192
14/06/20 12:54:41 INFO Client: Preparing Local resources
14/06/20 12:54:41 WARN BlockReaderLocal: The short-circuit local reads
feature cannot be used because libhadoop cannot be loaded.
14/06/20 12:54:41 INFO Client: Uploading
hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar to
14/06/20 12:54:43 INFO Client: Setting up the launch environment
14/06/20 12:54:43 INFO Client: Setting up container launch context
14/06/20 12:54:43 INFO Client: Command for starting the Spark
ApplicationMaster: List($JAVA_HOME/bin/java, -server, -Xmx512m,$PWD/tmp, -Dspark.akka.retry.wait=\3\,\12\,\12\,\org.apache.spark.examples.SparkPi\,
-Dspark.akka.frameSize=\1\, -Dspark.akka.timeout=\3\,
org.apache.spark.deploy.yarn.ApplicationMaster, --class,
org.apache.spark.examples.SparkPi, --jar ,
--args  '10' , --executor-memory, 1024, --executor-cores, 1,
--num-executors , 3, 1, LOG_DIR/stdout, 2, LOG_DIR/stderr)
14/06/20 12:54:43 INFO Client: Submitting application to ASM
14/06/20 12:54:43 INFO YarnClientImpl: Submitted application
14/06/20 12:54:44 INFO Client: Application report from ASM:
 application identifier: application_1403201750110_0060
 appId: 60
 clientToAMToken: null
 appMasterHost: N/A
 appQueue: root.koert
 appMasterRpcPort: -1
 appStartTime: 1403283283505
 yarnAppState: ACCEPTED
 distributedFinalState: UNDEFINED
 appUser: koert
14/06/20 12:54:45 INFO Client: Application report from ASM:
 application identifier: application_1403201750110_0060
 appId: 60
 clientToAMToken: null
 appMasterHost: N/A
 appQueue: root.koert
 appMasterRpcPort: -1
 appStartTime: 1403283283505
 yarnAppState: ACCEPTED
 distributedFinalState: UNDEFINED
 appUser: koert
14/06/20 12:54:46 INFO Client: Application report from ASM:
 application identifier: application_1403201750110_0060
 appId: 60
 clientToAMToken: null
 appMasterHost: N/A
 appQueue: root.koert
 appMasterRpcPort: -1
 appStartTime: 1403283283505
 yarnAppState: ACCEPTED
 distributedFinalState: UNDEFINED
 appUser: koert
14/06/20 12:54:47 INFO Client: Application report from ASM:
 application identifier: application_1403201750110_0060
 appId: 60
 clientToAMToken: null
 appDiagnostics: Application application_1403201750110_0060 failed 2
times due to AM Container for appattempt_1403201750110_0060_02 exited
with  exitCode: -1000 due to: File
does not exist
.Failing this attempt.. Failing the application.
 appMasterHost: N/A
 appQueue: root.koert
 appMasterRpcPort: -1
 appStartTime: 1403283283505
 yarnAppState: FAILED
 distributedFinalState: FAILED
 appUser: koert


Re: spark on yarn is trying to use file:// instead of hdfs://

2014-06-20 Thread bc Wong
Koert, is there any chance that your fs.defaultFS isn't set up right?



Re: spark on yarn is trying to use file:// instead of hdfs://

2014-06-20 Thread Koert Kuipers
ok, solved it. as it happened, in spark/conf i also had a file called (with some tachyon related stuff in it), so that's why it
ignored /etc/hadoop/conf/core-site.xml
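
The shadowing described above can be modelled with a minimal sketch. It assumes a simplified lookup rule: the first core-site.xml found in an ordered list of configuration directories is the only one consulted, and fs.defaultFS falls back to file:/// when that file does not set it. The function names, the directory layout and the property name example.unrelated.setting are all hypothetical and only serve the illustration. This is not Spark's or Hadoop's actual loading code.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumed fallback when the winning file does not set fs.defaultFS
# (the value Hadoop ships as its built-in default).
FALLBACK_DEFAULT_FS = "file:///"


def read_property(xml_path, name):
    """Return the value of property `name` in a Hadoop-style XML file, or None."""
    root = ET.parse(xml_path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def resolve_default_fs(conf_dirs):
    """Simplified model: only the first core-site.xml found in conf_dirs counts.

    Returns (value, source_path); source_path is None if no file was found.
    """
    for conf_dir in conf_dirs:
        candidate = os.path.join(conf_dir, "core-site.xml")
        if os.path.isfile(candidate):
            value = read_property(candidate, "fs.defaultFS")
            return (value or FALLBACK_DEFAULT_FS, candidate)
    return (FALLBACK_DEFAULT_FS, None)


def write_site(directory, properties):
    """Helper for the demo: write a core-site.xml with the given properties."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.ElementTree(root).write(os.path.join(directory, "core-site.xml"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        spark_conf = os.path.join(tmp, "spark-conf")    # stands in for spark/conf
        hadoop_conf = os.path.join(tmp, "hadoop-conf")  # stands in for /etc/hadoop/conf
        os.makedirs(spark_conf)
        os.makedirs(hadoop_conf)
        # Stray file with unrelated settings (hypothetical property name).
        write_site(spark_conf, {"example.unrelated.setting": "x"})
        write_site(hadoop_conf, {"fs.defaultFS": "hdfs://cdh5-yarn"})

        # Stray file first: it wins and hides the hdfs setting.
        print(resolve_default_fs([spark_conf, hadoop_conf]))
        # Stray file removed from the search order: hdfs is picked up.
        print(resolve_default_fs([hadoop_conf]))
```

Under this model, a quick way to check for the problem is to list every directory on the client classpath that holds a core-site.xml and see which one comes first.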

On Fri, Jun 20, 2014 at 3:24 PM, Koert Kuipers wrote:

 i put some logging statements in yarn.Client and that confirms it's using
 the local filesystem:
 14/06/20 15:20:33 INFO Client: fs.defaultFS is file:///

 so somehow fs.defaultFS is not being picked up from
 /etc/hadoop/conf/core-site.xml, but spark does correctly pick up
 yarn.resourcemanager.hostname from /etc/hadoop/conf/yarn-site.xml
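
 A small sketch of why a wrong fs.defaultFS leads to the "File does not exist" failure: a path without a scheme is qualified against the default filesystem, so with file:/// the upload target ends up on the submitting machine's local disk, where the node running the container cannot find it. The function name qualify and the staging path below are hypothetical (the real upload target is cut off in the log), and the function assumes absolute paths only.

```python
from urllib.parse import urlparse


def qualify(path, default_fs):
    """Qualify an absolute path against a default filesystem URI.

    Paths that already carry a scheme (e.g. hdfs://...) are returned unchanged.
    """
    if urlparse(path).scheme:
        return path
    base = urlparse(default_fs)
    return f"{base.scheme}://{base.netloc}{path}"


if __name__ == "__main__":
    # Hypothetical staging location, for illustration only.
    staging = "/user/koert/.sparkStaging/app/app.jar"

    # With the wrongly resolved default, the target is a local path.
    print(qualify(staging, "file:///"))
    # With the intended default, the target lives on HDFS.
    print(qualify(staging, "hdfs://cdh5-yarn"))
    # A fully qualified URI is not affected by the default.
    print(qualify("hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar", "file:///"))
```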


 On Fri, Jun 20, 2014 at 1:26 PM, Koert Kuipers wrote:

 in /etc/hadoop/conf/core-site.xml:

 also hdfs seems the default:
 [koert@cdh5-yarn ~]$ hadoop fs -ls /
 Found 5 items
 drwxr-xr-x   - hdfs supergroup  0 2014-06-19 12:31 /data
 drwxrwxrwt   - hdfs supergroup  0 2014-06-20 12:17 /lib
 drwxrwxrwt   - hdfs supergroup  0 2014-06-18 14:58 /tmp
 drwxr-xr-x   - hdfs supergroup  0 2014-06-18 15:02 /user
 drwxr-xr-x   - hdfs supergroup  0 2014-06-18 14:59 /var

 and in my spark-site.env:
 export HADOOP_CONF_DIR=/etc/hadoop/conf
