I presume that you need to have access to the path of each file you are
reading.
I don't know whether there is a good way to do that for HDFS; I read the
files myself, something like:
def openWithPath(inputPath: String, sc: SparkContext) = {
  val fs = (new Path(inputPath)).getFileSystem(sc.hadoopConfiguration)
  fs.listStatus(new Path(inputPath)).map(f => (f.getPath.toString, fs.open(f.getPath)))
}
Anwar,
I will try this, as it might do exactly what I need. I will follow your
pattern but use sc.textFile() for each file.
I am now thinking that I could start with an RDD of file paths and map it
into (path, content) pairs, provided I could read a file on the server.
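A minimal sketch of that idea, assuming the HDFS paths are reachable from every
executor (the function name and sample paths are illustrative, not from the
thread above):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.SparkContext

// Start from an RDD of path strings and map each one to a (path, content) pair.
// The read happens inside the task, so each executor opens HDFS itself;
// a fresh Configuration is built per task because SparkContext's
// hadoopConfiguration is not serializable into the closure.
def pathsToContents(paths: Seq[String], sc: SparkContext) =
  sc.parallelize(paths).map { p =>
    val path = new Path(p)
    val fs = path.getFileSystem(new Configuration())
    val in = fs.open(path)
    try (p, scala.io.Source.fromInputStream(in).mkString)
    finally in.close()
  }

Note that Spark's built-in sc.wholeTextFiles(dir) also returns an RDD of
(path, content) pairs directly, which may remove the need for the manual
read altogether when the files live under one directory.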
Thank you,
Oleg
On 1 June