Re: get and append file name in record being reading
You can use SparkContext.wholeTextFiles(). For example, suppose all your files are under /tmp/ABC_input/:

    val rdd = sc.wholeTextFiles("file:///tmp/ABC_input")
    // Each element of rdd is a (file path, full file content) pair
    val rdd1 = rdd.flatMap { case (path, content) =>
      val fileName = new java.io.File(path).getName
      content.split("\n").map { line => (line, fileName) }
    }
    val df = sqlContext.createDataFrame(rdd1).toDF("line", "file")

> On Jun 2, 2016, at 03:13, Vikash Kumar wrote:
>
> 100,abc,299
> 200,xyz,499
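The reply's code produces two columns ("line" and "file"), while the question asks for the file name appended to each record as one comma-separated line. The per-file step can be sketched in plain Scala, without Spark, to show that last joining step. This is only an illustrative sketch: the object name AppendFileName, the helper tagLines, and the in-memory sample pairs are assumptions made here, not part of the original reply.

```scala
// Plain-Scala sketch of the per-file step; no Spark dependency needed.
object AppendFileName {
  // Hypothetical helper: turns one (path, content) pair into output lines
  // of the form "<record>,<fileName>".
  def tagLines(path: String, content: String): Seq[String] = {
    // Keep only the last path component, e.g. "ABC_input_0528.txt"
    val fileName = new java.io.File(path).getName
    content.split("\n").toSeq
      .map(_.trim)          // drop a stray "\r" or surrounding spaces
      .filter(_.nonEmpty)   // skip blank lines
      .map(line => s"$line,$fileName")
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for what wholeTextFiles returns: (path, content) pairs.
    // The contents are the sample records from the question.
    val files = Seq(
      "file:/tmp/ABC_input/ABC_input_0528.txt" -> "111,abc,234\n222,xyz,456\n",
      "file:/tmp/ABC_input/ABC_input_0531.txt" -> "100,abc,299\n200,xyz,499\n"
    )
    files
      .flatMap { case (path, content) => tagLines(path, content) }
      .foreach(println)
  }
}
```

In a Spark job, the same helper could presumably be passed to flatMap on the RDD returned by sc.wholeTextFiles, and the resulting RDD of strings written out with saveAsTextFile. Note that wholeTextFiles reads each file in full as a single record, so it is best suited to many small files.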
get and append file name in record being reading
How can I get the file name for each record being read?

Suppose the input file ABC_input_0528.txt contains:

111,abc,234
222,xyz,456

and the input file ABC_input_0531.txt contains:

100,abc,299
200,xyz,499

I need to create one final output, using DataFrames, with the file name appended to each record. My output file should look like this:

111,abc,234,ABC_input_0528.txt
222,xyz,456,ABC_input_0528.txt
100,abc,299,ABC_input_0531.txt
200,xyz,499,ABC_input_0531.txt

I need some working code.