Based on your code:
    sparkContext.addFile("/home/files/data.txt");
    List<String> file = sparkContext.textFile(SparkFiles.get("data.txt")).collect();
I’m assuming the file at “/home/files/data.txt” exists and is readable on the
driver’s filesystem.
Did you try just doing this:

    List<String> file = sparkContext.textFile("/home/files/data.txt").collect();
My understanding of the model is that you’re supposed to execute
SparkFiles.get(…) on each worker node, not on the driver.
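To illustrate that pattern, here is a minimal sketch (class and variable names are hypothetical, and it assumes a configured cluster and a readable `/home/files/data.txt` on the driver): the driver ships the file with `addFile`, and `SparkFiles.get(...)` is called *inside* the task closure, so it resolves the worker-local copy on whichever node runs the task.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class AddFileSketch {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("addfile-demo"));

        // Driver side: ship the local file to every node in the cluster.
        sc.addFile("/home/files/data.txt");

        JavaRDD<Integer> lineCounts = sc.parallelize(Arrays.asList(0, 1, 2))
                .map(i -> {
                    // Worker side: this runs inside the task, so
                    // SparkFiles.get resolves the copy on that worker.
                    String localPath = SparkFiles.get("data.txt");
                    return (int) java.nio.file.Files
                            .lines(java.nio.file.Paths.get(localPath))
                            .count();
                });

        System.out.println(lineCounts.collect());
        sc.stop();
    }
}
```

The key design point is that `SparkFiles.get(...)` appears only inside the lambda passed to `map`, never on the driver, which is where the original snippet goes wrong.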
Since you already know where the files are on the driver, if you want to load
them into an RDD with SparkContext.textFile, then that call will distribute the
data out to the workers.
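As a sketch of that alternative (hypothetical names; note that the path must be readable wherever the tasks actually run, e.g. on a shared or identically laid-out filesystem):

```java
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class TextFileSketch {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("textfile-demo"));

        // textFile builds a distributed RDD: the workers read the
        // partitions, so no SparkFiles.get is needed at all.
        JavaRDD<String> lines = sc.textFile("/home/files/data.txt");

        // collect() pulls every line back to the driver as a List.
        List<String> file = lines.collect();
        System.out.println(file.size() + " lines read");

        sc.stop();
    }
}
```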
I am experimenting a little with Apache Spark in cluster mode.
My cluster consists of a driver on my machine, and a worker and a manager on a
separate host machine.
I send a text file using `sparkContext.addFile(filepath)`, where `filepath`
is the path of the text file on my local machine.