From: Enis Soztutar [mailto:enis@gmail.com]
Sent: Wednesday, March 18, 2009 3:07 PM
To: core-user@hadoop.apache.org
Subject: Re: merging files
Use MultipleInputs with two different mappers, one per input. map1
should be IdentityMapper; mapper 2 should output key, value pairs where
the value is a pseudo marker value (the same for all keys), which marks that
the value is null/empty. In the reducer just output the key/value pairs
which does not
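
A minimal sketch of the marker idea described above, written as a plain-Java simulation rather than an actual Hadoop job (no MultipleInputs or IdentityMapper classes are used). The class name, the MARKER constant, and the in-memory "shuffle" map are hypothetical. Because the message is cut off, the reducer rule here is an assumption: real values are kept only for keys that also received the marker from file2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a marker-based reduce-side filter (not Hadoop code).
final class MarkerJoinSketch {
    // Hypothetical marker; assumed never to occur as a real value in file1.
    static final String MARKER = "\u0000IN_FILE2";

    // "map1": identity over file1 records, passing (key, value) through unchanged.
    static void mapFile1(String key, String value, Map<String, List<String>> shuffle) {
        shuffle.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // "mapper 2": every file2 key is emitted with the same marker value.
    static void mapFile2(String key, Map<String, List<String>> shuffle) {
        shuffle.computeIfAbsent(key, k -> new ArrayList<>()).add(MARKER);
    }

    // "reduce" (assumed rule): emit the real values only if the marker was seen for this key.
    static List<String[]> reduce(String key, List<String> values) {
        List<String[]> out = new ArrayList<>();
        if (!values.contains(MARKER)) {
            return out; // key is absent from file2, so drop all of its records
        }
        for (String v : values) {
            if (!MARKER.equals(v)) {
                out.add(new String[] {key, v});
            }
        }
        return out;
    }

    // Runs both "mappers", groups by key, then applies the reducer to each group.
    static List<String[]> run(List<String[]> file1, List<String> file2) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String[] record : file1) {
            mapFile1(record[0], record[1], shuffle);
        }
        for (String key : file2) {
            mapFile2(key, shuffle);
        }
        List<String[]> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> file1 = new ArrayList<>();
        file1.add(new String[] {"a", "1"});
        file1.add(new String[] {"b", "2"});
        file1.add(new String[] {"a", "3"}); // keys need not be unique
        List<String> file2 = new ArrayList<>();
        file2.add("a");
        file2.add("c");
        for (String[] kv : run(file1, file2)) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

With the sample data in main, both records for key a survive and the record for key b is dropped, since b never receives the marker.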
I would use DistributedCache.
Put file2 in the distributed cache; note that every map task then has to read it.
If you find a better solution, please let me know, because I have a similar
issue.
Rasit
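
A rough sketch of the map-side alternative suggested above, again in plain Java rather than against the real DistributedCache API. It assumes file2 is small enough to fit in memory; the class and method names are hypothetical, and the constructor stands in for whatever per-task setup step would read the cached copy of file2.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java simulation of a map-side filter against a cached key set (not Hadoop code).
final class CachedKeyFilterSketch {
    private final Set<String> allowedKeys = new HashSet<>();

    // Stands in for the per-task setup: load every key of file2 once into memory.
    CachedKeyFilterSketch(Iterable<String> file2Lines) {
        for (String line : file2Lines) {
            String key = line.trim();
            if (!key.isEmpty()) {
                allowedKeys.add(key);
            }
        }
    }

    // The "map" decision for one file1 record: keep it only if its key is in file2.
    boolean keep(String key) {
        return allowedKeys.contains(key);
    }

    // Applies the filter to a batch of (key, value) records, preserving input order.
    List<String[]> filter(List<String[]> file1) {
        List<String[]> out = new ArrayList<>();
        for (String[] record : file1) {
            if (keep(record[0])) {
                out.add(record);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file2 = new ArrayList<>();
        file2.add("a");
        file2.add("c");
        List<String[]> file1 = new ArrayList<>();
        file1.add(new String[] {"a", "1"});
        file1.add(new String[] {"b", "2"});
        file1.add(new String[] {"a", "3"});
        CachedKeyFilterSketch sketch = new CachedKeyFilterSketch(file2);
        for (String[] kv : sketch.filter(file1)) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

No reduce phase is needed in this variant, at the cost of each map task loading all of file2.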
2009/3/18 Nir Zohar
Hi,
I would like your help with the below question.
I have 2 files: file1 (key, value) and file2 (keys only). I need to exclude
all records from file1 whose keys are not in file2.
1. The output format is key-value, not only keys.
2. The key is not primary key; hence it's not possible