Re: SparkFiles.get() returns with driver path Instead of Worker Path

2016-03-08 Thread Tristan Nixon
Based on your code: `sparkContext.addFile("/home/files/data.txt"); List file = sparkContext.textFile(SparkFiles.get("data.txt")).collect();` I’m assuming the file at “/home/files/data.txt” exists and is readable in the driver’s filesystem. Did you try just doing this: List file

Re: SparkFiles.get() returns with driver path Instead of Worker Path

2016-03-08 Thread Tristan Nixon
My understanding of the model is that you’re supposed to execute SparkFiles.get(…) on each worker node, not on the driver. Since you already know where the files are on the driver, if you want to load these into an RDD with SparkContext.textFile, then this will distribute it out to the
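
To illustrate the point about calling SparkFiles.get(…) on the workers, here is a minimal sketch, not the original poster's code. It assumes the Java API and a Java 8 lambda; the class name `SparkFilesOnWorkers`, the helper `countLinesOnWorkers`, and the use of dummy task ids are made up for illustration. The file is registered on the driver with `addFile`, and `SparkFiles.get` is resolved only inside the task, so the path it returns is the copy on whichever node runs that task.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkFilesOnWorkers {

    // Hypothetical helper: ships the file at driverPath to the cluster,
    // then has each task open its own local copy and count the lines.
    static List<Integer> countLinesOnWorkers(JavaSparkContext sc, String driverPath, int numTasks) {
        // Runs on the driver: register the file for distribution.
        sc.addFile(driverPath);

        // SparkFiles.get expects the bare file name, not the driver-side path.
        String fileName = new File(driverPath).getName();

        // Dummy elements, one per partition, just to get code running in tasks.
        List<Integer> taskIds = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            taskIds.add(i);
        }

        return sc.parallelize(taskIds, numTasks)
                .map(id -> {
                    // Runs inside a task: resolves to the local copy on that node.
                    String localPath = SparkFiles.get(fileName);
                    return Files.readAllLines(Paths.get(localPath)).size();
                })
                .collect();
    }

    public static void main(String[] args) {
        // The path from the thread is used as a default; the master URL is
        // assumed to be supplied externally (for example by spark-submit).
        String driverPath = args.length > 0 ? args[0] : "/home/files/data.txt";
        SparkConf conf = new SparkConf().setAppName("SparkFilesOnWorkers");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            System.out.println(countLinesOnWorkers(sc, driverPath, 2));
        }
    }
}
```

By contrast, evaluating `SparkFiles.get("data.txt")` on the driver and passing the result to `textFile` hands the workers a driver-side path, which would be consistent with the behaviour described in the original question.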

SparkFiles.get() returns with driver path Instead of Worker Path

2016-03-08 Thread ashikvc
I am trying to experiment a little with Apache Spark in cluster mode. My cluster consists of a driver on my machine, and a worker and a manager on a host machine (a separate machine). I send a text file using `sparkContext.addFile(filepath)`, where filepath is the path of my text file on the local machine