Make a function (or lambda) that reads a text file. Make an RDD from the
list of X/Y paths, then map that RDD through the file-reading function. Do
the same with your X/Y/Z directory. You then have RDDs with the content of
each file as a record. Work with those as needed.
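A minimal sketch of that approach, with the reading logic in plain Python so it can run without a cluster; the helper names (`read_serial`, `read_pairs`) and the sample paths are hypothetical, and the Spark calls are shown in comments assuming a live SparkContext `sc`:

```python
from pathlib import Path
import os
import tempfile

def read_serial(path):
    # a.txt-style file: a single serial number on one line
    return Path(path).read_text().strip()

def read_pairs(path):
    # b.txt-style file: "key,value" lines; skip blanks and stray commas
    pairs = []
    for line in Path(path).read_text().splitlines():
        parts = [p for p in line.strip().split(",") if p]
        if len(parts) == 2:
            pairs.append((parts[0], int(parts[1])))
    return pairs

# With Spark, parallelize the path lists and map the readers
# (assuming a SparkContext named `sc`):
#   serials = sc.parallelize(["/X/Y/a.txt"]).map(read_serial)
#   kv      = sc.parallelize(["/X/Y/Z/b.txt"]).map(read_pairs)

# Local demonstration with throwaway files matching the formats above:
d = tempfile.mkdtemp()
a = os.path.join(d, "a.txt")
b = os.path.join(d, "b.txt")
Path(a).write_text("12345\n")
Path(b).write_text("a,1\nb,1\nc,0\n")
print(read_serial(a))  # 12345
print(read_pairs(b))   # [('a', 1), ('b', 1), ('c', 0)]
```

Note that the readers run on the executors, so the files must be reachable from every worker (e.g. on a shared or distributed filesystem), not just the driver.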
On Wed, May 11, 2016 at 2:36 PM
Hi -
I have a fairly unusual problem which I am trying to solve, and I am not
sure if Spark would help here.
I have a directory: /X/Y/a.txt and in the same structure /X/Y/Z/b.txt.
a.txt contains a unique serial number, say:
12345
and b.txt contains key-value pairs:
a,1
b,1
c,0
etc.
Every day you