Hi,

I have multiple files in JSON format, such as:

/data/test1_data/sub100/test.data
/data/test2_data/sub200/test.data


I can read them with sc.textFile("/data/*/*")

but I want to add {"source": "HDFS_LOCATION"} to each line, where
HDFS_LOCATION is the path of the file the line came from, and then save
the result to a single target HDFS location.

How can I do this? Thanks.
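
One possible approach, sketched here under some assumptions: sc.textFile does not expose which file a line came from, but sc.wholeTextFiles returns (path, content) pairs, so the path can be attached while splitting each file into lines. The helper names tag_lines and tag_and_save are hypothetical, and the sketch assumes every non-empty line is a single JSON object and that each file is small enough to be held in memory as one string.

```python
import json


def tag_lines(path, content):
    """Yield each non-empty JSON line of `content` with a "source" field set to `path`."""
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)  # assumes one JSON object per line
        record["source"] = path
        yield json.dumps(record)


def tag_and_save(sc, input_glob, output_dir):
    """Read all files matching `input_glob`, tag every line, write to `output_dir`.

    `sc` is an existing SparkContext; it is passed in so that this module
    can be imported without Spark being installed.
    """
    # wholeTextFiles yields one (file_path, file_content) pair per file.
    files = sc.wholeTextFiles(input_glob)
    tagged = files.flatMap(lambda kv: tag_lines(kv[0], kv[1]))
    tagged.saveAsTextFile(output_dir)
```

From a PySpark shell this might be called as tag_and_save(sc, "/data/*/*", "/data/merged"), where "/data/merged" is only a placeholder for the target directory. Because wholeTextFiles reads each file as a single record, it is best suited to many small files; for large files, the DataFrame function input_file_name() in pyspark.sql.functions may be a better fit.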






