[ https://issues.apache.org/jira/browse/SPARK-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994044#comment-13994044 ]

Marcelo Vanzin commented on SPARK-1616:
---------------------------------------

Hi Prasad,

This doesn't really sound like a bug, but rather a mismatch between your 
expectations and Spark's behavior.

When you tell a Spark job to read data from a file, Spark expects the file to 
be available to all the workers. This can be achieved in several ways:

* Using a distributed file system such as HDFS
* Using a networked file system such as NFS
* Using Spark's file distribution mechanism, which copies the file to the 
workers for you (e.g. spark-submit's --files argument if you are running 
Spark 1.0)
* Manually copying the file to every worker node, as you did

But Spark will not automatically copy data to worker nodes on your behalf.
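For the file distribution option, the invocation would look roughly like the 
sketch below (the master URL, file path, and application name are placeholders, 
not taken from the original report):

```shell
# Ship input.txt to each executor's working directory alongside the job.
# All names below are illustrative placeholders.
spark-submit \
  --master spark://master:7077 \
  --files /local/path/input.txt \
  my_app.py
```

Inside the job, the distributed copy can then be located with 
SparkFiles.get("input.txt") rather than the original local path.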

> input file not found issue 
> ---------------------------
>
>                 Key: SPARK-1616
>                 URL: https://issues.apache.org/jira/browse/SPARK-1616
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 0.9.0
>         Environment: Linux 2.6.18-348.3.1.el5 
>            Reporter: prasad potipireddi
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
