Or you can try mounting that drive on all nodes.

On Fri, Sep 29, 2017 at 6:14 AM Jörn Franke <jornfra...@gmail.com> wrote:
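If a shared mount isn't an option, the copy-to-every-node route means the file must sit at the same absolute path on each worker, since every executor resolves `file://` paths locally. A minimal sketch, assuming hypothetical worker hostnames `worker1` and `worker2` (replace with your own) and passwordless SSH:

```shell
#!/bin/sh
# Hypothetical hostnames -- substitute your actual workers.
for host in worker1 worker2; do
  # Ensure the target directory exists, then copy the CSV data
  # to the SAME path the Spark job will read from.
  ssh "$host" mkdir -p /home/username
  scp -r /home/username/sampleflightdata "$host":/home/username/
done
```

This keeps the original `file:` path in the job working, at the cost of re-copying whenever the data changes, which is why a distributed filesystem is usually preferred.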
> You should use a distributed filesystem such as HDFS. If you want to use
> the local filesystem then you have to copy each file to each node.
>
> > On 29. Sep 2017, at 12:05, Gaurav1809 <gauravhpan...@gmail.com> wrote:
> >
> > Hi All,
> >
> > I have a multi-node (1 master, 2 workers) Spark cluster. The job reads
> > CSV file data, and it works fine when run in local mode (local[*]).
> > However, when the same job is run in cluster mode (spark://HOST:PORT),
> > it is not able to read the file.
> > I want to know how to reference the files, or where to store them.
> > Currently the CSV data file is on the master (from where the job is
> > submitted).
> >
> > The following code works fine in local mode but not in cluster mode:
> >
> > val spark = SparkSession
> >   .builder()
> >   .appName("SampleFlightsApp")
> >   .master("spark://masterIP:7077") // change to .master("local[*]")
> >                                    // for local mode
> >   .getOrCreate()
> >
> > val flightDF =
> >   spark.read.option("header", true).csv("/home/username/sampleflightdata")
> > flightDF.printSchema()
> >
> > Error: FileNotFoundException: File file:/home/username/sampleflightdata
> > does not exist
> >
> > --
> > Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
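For reference, the HDFS route suggested above usually amounts to uploading the data once (e.g. `hdfs dfs -put /home/username/sampleflightdata /data/`) and reading it via an `hdfs://` URI. A minimal sketch of the poster's job adapted this way, where `namenode:8020` and the `/data` path are placeholders for your actual NameNode address and upload location:

```scala
import org.apache.spark.sql.SparkSession

object SampleFlightsApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SampleFlightsApp")
      .master("spark://masterIP:7077")
      .getOrCreate()

    // An hdfs:// URI is visible to every executor in the cluster,
    // unlike a file:// path that exists only on the master node.
    val flightDF = spark.read
      .option("header", true)
      .csv("hdfs://namenode:8020/data/sampleflightdata")

    flightDF.printSchema()
    spark.stop()
  }
}
```

Any shared store with a Hadoop-compatible connector (S3, NFS mounted identically on all nodes, etc.) works the same way: the point is that the URI must resolve to the same data from every worker.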