Hi, I have multiple files in JSON format (one JSON record per line), such as:
/data/test1_data/sub100/test.data
/data/test2_data/sub200/test.data
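I can read them all with sc.textFile("/data/*/*"),
but I want to add a {"source": "HDFS_LOCATION"} field to each line, where
HDFS_LOCATION is the path of the file the line came from, and then save the
result to a single target HDFS location.
How can I do this? Thanks.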
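
One possible approach, sketched in PySpark under a few assumptions: each line is a JSON object, the files are small enough to be read whole, and sc is an existing SparkContext. sc.wholeTextFiles returns (path, content) pairs, so the path is available next to every line. The helper names tag_line, tag_file and tag_and_save are made up for illustration.

```python
import json


def tag_line(path, line):
    """Parse one JSON line and add the file path under the "source" key."""
    record = json.loads(line)
    record["source"] = path
    return json.dumps(record)


def tag_file(path, content):
    """Tag every non-empty line of one file with that file's path."""
    return [tag_line(path, line) for line in content.splitlines() if line.strip()]


def tag_and_save(sc, input_glob, output_dir):
    """Read files as (path, content) pairs, tag each line, save to one directory."""
    pairs = sc.wholeTextFiles(input_glob)
    tagged = pairs.flatMap(lambda kv: tag_file(kv[0], kv[1]))
    tagged.saveAsTextFile(output_dir)
```

It could be called as tag_and_save(sc, "/data/*/*/test.data", "/data/merged"), where "/data/merged" is a hypothetical output directory. Note that wholeTextFiles loads each file into memory in one piece, so this fits many small files better than a few very large ones, and that saveAsTextFile writes several part files under the one output directory rather than a single file.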
