Hi all, I have a multi-node Spark cluster (1 master, 2 workers). The job reads data from a CSV file and works fine when run in local mode (local[*]). However, when the same job is run in cluster mode (spark://HOST:PORT), it is unable to read the file. I want to know how to reference the file, or where to store it. Currently the CSV data file is on the master (from where the job is submitted).
The following code works fine in local mode but not in cluster mode:

val spark = SparkSession
  .builder()
  .appName("SampleFlightsApp")
  .master("spark://masterIP:7077") // change to .master("local[*]") for local mode
  .getOrCreate()

val flightDF = spark.read.option("header", true).csv("/home/username/sampleflightdata")
flightDF.printSchema()

Error:

FileNotFoundException: File file:/home/username/sampleflightdata does not exist
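For reference, here is a minimal sketch of the two approaches I have seen suggested, assuming that in cluster mode each executor resolves a file: path against its own local filesystem. The hdfs:// host, port, and paths below are placeholders, not my actual setup:

import org.apache.spark.sql.SparkSession

object SampleFlightsApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SampleFlightsApp")
      .master("spark://masterIP:7077")
      .getOrCreate()

    // Option A (hypothetical): copy the CSV to the same path on every worker,
    // so that each executor can find it on its own local filesystem.
    val localDF = spark.read
      .option("header", true)
      .csv("file:///home/username/sampleflightdata")

    // Option B (hypothetical): put the CSV on shared storage reachable from
    // all nodes; the hdfs:// URI here is a placeholder for illustration only.
    val sharedDF = spark.read
      .option("header", true)
      .csv("hdfs://namenode:9000/data/sampleflightdata")

    localDF.printSchema()
    sharedDF.printSchema()
    spark.stop()
  }
}

Is one of these the recommended way, or is there a better place to store the file?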