If each file is small, you can try SparkContext.wholeTextFiles, which gives you (filename, content) pairs directly. Otherwise you can try something like this:

val filenames: Seq[String] = ...

// Tag every line with the name of the file it came from,
// then union all the per-file RDDs into one.
val combined: RDD[(String, String)] =
  filenames
    .map(name => sc.textFile(name).map(line => name -> line))
    .reduce(_ ++ _)  // note: reduce requires filenames to be non-empty
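For the small-files case, wholeTextFiles already pairs each file's path with its full content, so the timestamp in the filename comes along for free. A minimal sketch, assuming local mode and a placeholder glob path (`data/*.log` and the app name are illustrative, not from the original post):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical setup; in the context of the thread, `sc` already exists.
val sc = new SparkContext(
  new SparkConf().setAppName("tag-lines").setMaster("local[*]"))

// wholeTextFiles returns an RDD[(String, String)] of (path, fileContent),
// so each line can be tagged with its source path in one pass.
val tagged: org.apache.spark.rdd.RDD[(String, String)] =
  sc.wholeTextFiles("data/*.log").flatMap { case (path, content) =>
    content.split("\n").map(line => path -> line)
  }

Keep in mind that wholeTextFiles loads each file as a single string, so it is only suitable when every individual file fits comfortably in memory.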

On 9/26/14 6:45 PM, Shekhar Bansal wrote:

Hi
In one of our use cases, the filename contains a timestamp and we have to append it to each record for aggregation.
How can I access the filename in a map function?

Thanks!
