Hi,
I have 2 files in this format:
file1: (source, target)
file2: (source)
I would like to write a MapReduce job that outputs all records in file1 whose
source isn't in file2. Example:
file1:
1,2
2,9
3,5
file2:
2
7
outcome:
1,2
3,5
Could you help me with this?
Map: output each key/value pair as (source, file_num):
1,1
2,1
3,1
2,2
7,2
Reduce: (1, [1]), (2, [1,2]), (3, [1]), (7, [2])
Output only those keys whose list of values does not contain file 2:
1
3
-Taran
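
For reference, here is a minimal sketch of the map/reduce idea above, simulated in plain Python rather than written against the Hadoop API. It makes one change that is an assumption on my part, not something stated in the reply: the map value carries the target along with the file number, so the reducer can emit the full file1 records (1,2 and 3,5) instead of only the keys. The names map_record, reduce_source and anti_join are hypothetical helpers for illustration.

```python
from collections import defaultdict


def map_record(line, file_num):
    """Emit (source, (file_num, target)); target is None for file2 lines."""
    fields = line.strip().split(",")
    source = fields[0]
    # Assumption: file1 lines are "source,target", file2 lines are "source".
    target = fields[1] if file_num == 1 else None
    return source, (file_num, target)


def reduce_source(source, values):
    """Return the file1 records for this source, or nothing if file2 has it."""
    if any(file_num == 2 for file_num, _ in values):
        return []
    return [f"{source},{target}" for file_num, target in values if file_num == 1]


def anti_join(file1_lines, file2_lines):
    """Simulate the shuffle: group mapped values by key, then reduce each key."""
    groups = defaultdict(list)
    tagged = [(line, 1) for line in file1_lines] + [(line, 2) for line in file2_lines]
    for line, file_num in tagged:
        key, value = map_record(line, file_num)
        groups[key].append(value)
    out = []
    for source in sorted(groups):
        out.extend(reduce_source(source, groups[source]))
    return out


print(anti_join(["1,2", "2,9", "3,5"], ["2", "7"]))  # prints ['1,2', '3,5']
```

In a real job the mapper would have to learn which file a line came from (for example from the input path), and the framework would do the grouping that the dictionary does here.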
On Sun, Mar 15, 2009 at 7:24 AM, Tamir Kamara tamirkam...@gmail.com wrote:
Hi,
I have 2 files