Hi,

I am trying to submit a job from a Windows system to a YARN cluster running
on Linux (the HDP2.2 sandbox). I have copied the relevant Hadoop directories
as well as the yarn-site.xml and mapred-site.xml to the Windows file system.
Further, I have added winutils.exe to $HADOOP_HOME/bin.

I can tell that the ApplicationMaster is properly created on YARN (it's
visible in the ResourceManager UI) but the job fails with the following
error:

Diagnostics: File file:/D:/tools/spark-1.2.0-bin-hadoop2.4/lib/spark-assembly-1.2.0-hadoop2.4.0.jar does not exist
java.io.FileNotFoundException: File file:/D:/tools/spark-1.2.0-bin-hadoop2.4/lib/spark-assembly-1.2.0-hadoop2.4.0.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)

When I run the same job from the cluster itself, it works fine. Comparing
the two runs, I noticed a difference in the client log: on the cluster one
entry was "yarn.Client: Uploading resource...", but on the Windows machine
it was "Client: Source and destination file systems are the same. Not
copying".

Looking at the source code (in org.apache.spark.deploy.yarn.Client), I can
see that this happens because the client is led to believe that my Windows
machine and the destination (the cluster running in a VM) use the same file
system. Clearly this is not the case, so the error message above is not
surprising. But I can't figure out how to adjust the configuration to get
this to work. I'm also surprised that when I do the same thing on the
cluster, the Client does upload the resource, even though in that case
source and destination really are on the same FS.
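
To make the comparison concrete, here is a rough sketch of the kind of check
described above, as far as it can be read from the log message: two file
systems are treated as "the same" when their URIs agree on scheme, host and
port. This is a simplified illustration in Python, not the actual Spark code.
The function name same_file_system and the host namenode.example:8020 are
made up for the example, and the sketch does no DNS canonicalization. Under
these assumptions, if both the source and the destination resolve to
file:/// on the client, the check reports "same" and nothing is uploaded.

```python
from urllib.parse import urlsplit


def same_file_system(src_uri: str, dst_uri: str) -> bool:
    """Hypothetical helper: compare two file system URIs by scheme, host, port."""
    src, dst = urlsplit(src_uri), urlsplit(dst_uri)

    # A missing or differing scheme (file vs. hdfs) means different file systems.
    if not src.scheme or src.scheme != dst.scheme:
        return False

    # Hosts must match; a local URI such as file:/// has no host at all.
    if src.hostname != dst.hostname:
        return False

    # Finally the ports must match (None when no port is given).
    return src.port == dst.port


# Both sides resolve to the local file system: treated as the same, no copy.
print(same_file_system("file:///", "file:///"))

# Local source, HDFS destination (hypothetical host): an upload would be needed.
print(same_file_system("file:///", "hdfs://namenode.example:8020"))
```

Seen this way, the "Not copying" message suggests that on the Windows
machine the destination file system was also resolved as a local one, which
would be worth verifying against the client-side Hadoop configuration.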

Is there something in the mapred-site.xml or yarn-site.xml files that I need
to adjust on my Windows machine? What am I missing?

Thanks,

Stefan


