Place the file in HDFS and reference the HDFS path in your code, so every worker node can read it.

Thanks,
Sathish
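A minimal sketch of that fix, assuming the CSV has already been copied into HDFS and that the namenode is reachable at `namenode:9000` — the host, port, and `/data/sampleflightdata` path are placeholders, not values from the original thread:

```scala
import org.apache.spark.sql.SparkSession

object SampleFlightsApp {
  def main(args: Array[String]): Unit = {
    // Build the session against the standalone master, as in the original code.
    val spark = SparkSession
      .builder()
      .appName("SampleFlightsApp")
      .master("spark://masterIP:7077")
      .getOrCreate()

    // Read from HDFS rather than the master's local filesystem, so that every
    // worker resolves the same path. "hdfs://namenode:9000" stands in for
    // whatever fs.defaultFS is in your Hadoop configuration.
    val flightDF = spark.read
      .option("header", true)
      .csv("hdfs://namenode:9000/data/sampleflightdata")

    flightDF.printSchema()
    spark.stop()
  }
}
```

The file would first be uploaded with something like `hdfs dfs -mkdir -p /data && hdfs dfs -put /home/username/sampleflightdata /data/`. If HDFS is not available, an alternative is to place an identical copy of the file at the same path on every node (or on a shared NFS mount) and keep a `file://` path.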
On Fri, Sep 29, 2017 at 3:31 PM, Gaurav1809 <gauravhpan...@gmail.com> wrote:
> Hi All,
>
> I have a multi-node Spark cluster (1 master, 2 workers). The job reads
> CSV file data and works fine when run in local mode (local[*]). However,
> when the same job is run in cluster mode (spark://HOST:PORT), it is not
> able to read the file. How should the files be referenced, or where
> should they be stored? Currently the CSV data file is on the master
> (from where the job is submitted).
>
> The following code works fine in local mode but not in cluster mode.
>
> val spark = SparkSession
>   .builder()
>   .appName("SampleFlightsApp")
>   .master("spark://masterIP:7077") // change to .master("local[*]") for local mode
>   .getOrCreate()
>
> val flightDF =
>   spark.read.option("header", true).csv("/home/username/sampleflightdata")
> flightDF.printSchema()
>
> Error: FileNotFoundException: File file:/home/gaurav/sampleflightdata does
> not exist
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org