Ewan,
What issue are you having with HDFS when only Spark is installed? I'm not aware of any issue like this.

Thanks,
Jonathan

Sent from Mailbox

On Wed, Sep 9, 2015 at 11:48 PM, Ewan Leith <ewan.le...@realitymine.com> wrote:

> The last time I checked, if you launch EMR 4 with only Spark selected as an
> application, HDFS isn't correctly installed.
>
> Did you select another application like Hive at launch time as well as Spark?
> If not, try that.
>
> Thanks,
> Ewan
>
> ------ Original message ------
> From: Dean Wampler
> Date: Wed, 9 Sep 2015 22:29
> To: shahab;
> Cc: user@spark.apache.org;
> Subject: Re: [Spark on Amazon EMR] : File does not exist:
> hdfs://ip-x-x-x-x:/.../spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
>
> If you log into the cluster, do you see the file if you type:
>
> hdfs dfs -ls
> hdfs://ipx-x-x-x:8020/user/hadoop/.sparkStaging/application_123344567_0018/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
>
> (with the correct server address for "ipx-x-x-x"). If not, is the server
> address correct and routable inside the cluster. Recall that EC2 instances
> have both public and private host names & IP addresses.
>
> Also, is the port number correct for HDFS in the cluster?
>
> dean
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd
> Edition <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Wed, Sep 9, 2015 at 9:28 AM, shahab
> <shahab.mok...@gmail.com> wrote:
>
> Hi,
>
> I am using Spark on Amazon EMR. So far I have not succeeded to submit the
> application successfully, not sure what's problem. In the log file I see the
> followings.
>
> java.io.FileNotFoundException: File does not exist:
> hdfs://ipx-x-x-x:8020/user/hadoop/.sparkStaging/application_123344567_0018/spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar
>
> However, even putting spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar in the fat
> jar file didn't solve the problem. I am out of clue now.
>
> I want to submit a spark application, using aws web console, as a step. I
> submit the application as:
>
> spark-submit --deploy-mode cluster --class mypack.MyMainClass --master yarn-cluster s3://mybucket/MySparkApp.jar
>
> Is there any one who has similar problem with EMR?
>
> best,
> /Shahab
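
For anyone retracing the checks suggested in this thread, here is a rough sketch in shell, not a tested EMR recipe. The function names `staging_jar_uri` and `check_staging_jar` are made up for illustration; the only real commands used are `hdfs getconf -confKey` and `hdfs dfs -test -e`. It assumes you are logged in to a cluster node where the `hdfs` CLI is on the PATH, and that the staging path has the same layout as the one in the exception above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Spark, Hadoop, or EMR.

# Build the staging-jar URI in the same layout as the path in the exception.
# Args: NameNode host, NameNode port, user, YARN application id, jar file name.
staging_jar_uri() {
  printf 'hdfs://%s:%s/user/%s/.sparkStaging/%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Check whether the given URI exists in HDFS.
# Returns 0 if it exists, 1 if it does not, 2 if the hdfs CLI is missing
# (which would itself suggest HDFS is not installed on this node).
check_staging_jar() {
  local uri="$1"
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs CLI not found on this node" >&2
    return 2
  fi
  # Print the configured default filesystem so the host and port in the URI
  # can be compared against what the cluster actually uses.
  hdfs getconf -confKey fs.defaultFS
  # -test -e exits 0 when the path exists, non-zero otherwise.
  hdfs dfs -test -e "$uri"
}
```

With the placeholder values from the thread, the call would look like `check_staging_jar "$(staging_jar_uri ipx-x-x-x 8020 hadoop application_123344567_0018 spark-assembly-1.4.1-hadoop2.6.0-amzn-0.jar)"`, with the real private host name substituted for "ipx-x-x-x". If the printed `fs.defaultFS` shows a different host or port than the one in the exception, that points at the addressing questions Dean raises. If the `hdfs` command is missing altogether, that is consistent with Ewan's report.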