If you log into the cluster, do you see the file if you type:

hdfs dfs -ls hdfs://ipx-x-x-x:8020/user/hadoop/.sparkStaging/application_123344567_0018/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar

(with the correct server address substituted for "ipx-x-x-x")? If not, is
the server address correct and routable from inside the cluster? Recall that
EC2 instances have both public and private host names and IP addresses.

Also, is the port number correct for HDFS in the cluster?
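
A rough sketch of how you might check both points from a shell on the master
node. The helper function names are hypothetical, invented for illustration,
and the URI is just the placeholder address from your error message. It pulls
the host and port out of the URI and, if the hdfs command is available, prints
the configured fs.defaultFS so you can compare the two by eye:

```shell
# Placeholder URI taken from the error message; substitute the real address.
uri="hdfs://ipx-x-x-x:8020/user/hadoop/.sparkStaging"

# Hypothetical helper: return the "host:port" part of an hdfs:// URI.
hdfs_uri_authority() {
  rest="${1#hdfs://}"            # drop the scheme
  printf '%s\n' "${rest%%/*}"    # drop the path
}

# Hypothetical helper: return only the host name.
hdfs_uri_host() {
  authority="$(hdfs_uri_authority "$1")"
  printf '%s\n' "${authority%%:*}"
}

# Hypothetical helper: return only the port, or an empty line if none is given.
hdfs_uri_port() {
  authority="$(hdfs_uri_authority "$1")"
  case "$authority" in
    *:*) printf '%s\n' "${authority##*:}" ;;
    *)   printf '\n' ;;
  esac
}

echo "job expects NameNode at $(hdfs_uri_host "$uri") port $(hdfs_uri_port "$uri")"

# On a cluster node, print the file system URI Hadoop is configured to use.
if command -v hdfs >/dev/null 2>&1; then
  hdfs getconf -confKey fs.defaultFS || echo "could not read fs.defaultFS"
fi
```

If the host or port printed by the last command differs from the one in the
error message, that mismatch is the first thing I would look at.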

dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Wed, Sep 9, 2015 at 9:28 AM, shahab <shahab.mok...@gmail.com> wrote:

> Hi,
> I am using Spark on Amazon EMR. So far I have not managed to submit the
> application successfully, and I am not sure what the problem is. In the log
> file I see the following:
> java.io.FileNotFoundException: File does not exist:
> hdfs://ipx-x-x-x:8020/user/hadoop/.sparkStaging/application_123344567_0018/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
>
> However, even putting spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar in the
> fat jar file didn't solve the problem. I am out of ideas now.
> I want to submit a Spark application as a step, using the AWS web console.
> I submit the application as:
>
> spark-submit --deploy-mode cluster --class mypack.MyMainClass --master yarn-cluster s3://mybucket/MySparkApp.jar
>
> Has anyone had a similar problem with EMR?
>
> best,
> /Shahab
>
