[ 
https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591868#comment-16591868
 ] 

Furcy Pin commented on SPARK-10795:
-----------------------------------

Hi, I came across this ticket with the same issue: my YARN job was failing with 
the error {code:java}java.io.FileNotFoundException: File does not exist{code} 
for a file called *__spark_conf__.zip* or *pyspark.zip* on HDFS, in the 
staging directory.

For me too, the files were uploaded correctly to HDFS, and the error happened 
at shutdown, because something was trying to read them after the staging 
directory had been wiped.

Thanks to Carlos Bribiescas's comment, I found out that I had left a 
hard-coded master
{code:java}
SparkSession.builder.master("local[4]"){code}
in my code. After removing it, everything worked like a charm.
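
For anyone hitting the same thing, here is a minimal sketch of how the session could be built without a hard-coded master, so that the value passed to spark-submit is the one that applies. The environment variable LOCAL_SPARK_MASTER and the helper names are hypothetical, made up for this illustration; they are not part of Spark.

```python
import os


def pick_master(env=os.environ):
    """Return a master URL for local development, or None to defer to spark-submit.

    LOCAL_SPARK_MASTER is a hypothetical variable used only in this sketch.
    """
    return env.get("LOCAL_SPARK_MASTER")


def build_session(app_name="example-job"):
    # Imported lazily so the helper above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    master = pick_master()
    if master:
        # Only set a master explicitly when the developer asked for one.
        builder = builder.master(master)
    # Otherwise the master comes from spark-submit (e.g. --master yarn).
    return builder.getOrCreate()
```

With this layout, a local run would set LOCAL_SPARK_MASTER=local[4] in the shell, and a cluster submission would leave it unset.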

I suggest creating a new ticket to add a check with a clear error message for 
when users make this kind of mistake, and closing this ticket once that is done.
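
To illustrate the kind of check I have in mind, here is a rough sketch in plain Python. It is not how Spark does or would implement it; the function name and the way the two master values are obtained are assumptions for the sake of the example.

```python
def check_master(hardcoded, submitted):
    """Raise a readable error when a local master set in code conflicts
    with a non-local master given at submission time.

    Both arguments are master URLs such as "local[4]" or "yarn", or None
    when the corresponding value was not set. Hypothetical helper.
    """
    if hardcoded is None or submitted is None:
        return  # nothing to compare
    if hardcoded.startswith("local") and not submitted.startswith("local"):
        raise ValueError(
            "The application sets master=%r in code, but it was submitted "
            "with master=%r. Remove the hard-coded master or submit in "
            "local mode." % (hardcoded, submitted)
        )


# Example: this combination is the one described in this comment.
try:
    check_master("local[4]", "yarn")
except ValueError as err:
    print(err)
```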


> FileNotFoundException while deploying pyspark job on cluster
> ------------------------------------------------------------
>
>                 Key: SPARK-10795
>                 URL: https://issues.apache.org/jira/browse/SPARK-10795
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>         Environment: EMR 
>            Reporter: Harshit
>            Priority: Major
>
> I am trying to run a simple Spark job using PySpark. It works standalone, 
> but it fails when I deploy it on the cluster.
> Events:
> 2015-09-24 10:38:49,602 INFO  [main] yarn.Client (Logging.scala:logInfo(59)) 
> - Uploading resource file:/usr/lib/spark/python/lib/pyspark.zip -> 
> hdfs://ip-xxxx.ap-southeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1439967440341_0461/pyspark.zip
> The upload above is successful; I manually checked that the file is 
> present at the specified path. But after a while I get the following error:
> Diagnostics: File does not exist: 
> hdfs://ip-xxx.ap-southeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1439967440341_0461/pyspark.zip
> java.io.FileNotFoundException: File does not exist: 
> hdfs://ip-1xxx.ap-southeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1439967440341_0461/pyspark.zip


