Hi,

I have multiple files in JSON format, such as:

/data/test1_data/sub100/test.data
/data/test2_data/sub200/test.data

I can read them with sc.textFile("/data/*/*"), but I want to add {"source": "HDFS_LOCATION"} to each line and then save the result to one target HDFS location. How can I do this?

Thanks.
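
One possible approach, sketched in PySpark under a few assumptions: every line of each file is a single JSON object, "HDFS_LOCATION" means the path of the file the line came from, and each file is small enough to be read whole. It uses sc.wholeTextFiles, which returns (path, content) pairs, instead of sc.textFile, which does not expose the file path. The helper names add_source, tag_file and tag_and_save, and the output directory in the usage note, are hypothetical.

```python
import json


def add_source(path, line):
    """Parse one JSON line and return it re-serialized with a "source" field."""
    record = json.loads(line)
    record["source"] = path
    return json.dumps(record)


def tag_file(path_and_content):
    """Turn one (path, content) pair into a list of tagged JSON lines."""
    path, content = path_and_content
    # Skip blank lines so json.loads is never called on an empty string.
    return [add_source(path, line)
            for line in content.splitlines() if line.strip()]


def tag_and_save(sc, input_glob, output_dir):
    """Read all matching files, tag every line, write to one output directory."""
    # wholeTextFiles yields (file path, full file content) pairs, so each
    # file is loaded in one piece; this assumes the files are not huge.
    tagged = sc.wholeTextFiles(input_glob).flatMap(tag_file)
    tagged.saveAsTextFile(output_dir)
```

With an existing SparkContext sc, this could be called as tag_and_save(sc, "/data/*/*/test.data", "/data/merged"), where "/data/merged" is only a placeholder for the target location. The output is a directory of part files; if the files are too large for wholeTextFiles, reading them with the DataFrame JSON reader and adding a column from pyspark.sql.functions.input_file_name() may be a better fit.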