> Any help would be highly appreciated.
>
From: Aaron Kimball
To: common-user@hadoop.apache.org
Sent: Mon, July 5, 2010 8:51:44 AM
Subject: Re: Partitioned Datasets Map/Reduce
One possibility: write out all the partition numbers (one per line) to a
single file, then use the NLineInputFormat to make each line its own map
task. Then in your mapper itself, you will receive a key of "0" or "1" or "2"
etc. Then explicitly open /dataset1/part-(n) and /dataset2/part-(n) in your
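A rough sketch of the bookkeeping described above, written as plain Python outside Hadoop so it can be run on its own. The helper names (write_partition_index, partition_paths) and the file name partitions.txt are hypothetical, and the "part-(n)" naming is copied literally from the message; real Hadoop output files are usually zero-padded (for example part-00000), so the format string would need adjusting. Note also that with NLineInputFormat the line text normally arrives as the map value, with the byte offset as the key, so the partition number would be read from the value.

```python
import os
import tempfile


def write_partition_index(path, num_partitions):
    """Write the partition numbers 0..num_partitions-1, one per line.

    This is the single file that would be fed to NLineInputFormat so that
    each line becomes its own map task.
    """
    with open(path, "w") as f:
        for n in range(num_partitions):
            f.write(f"{n}\n")


def partition_paths(line, roots=("/dataset1", "/dataset2")):
    """Turn one line of the index file into the matching file of each dataset.

    Assumption: files are named part-<n> exactly as written in the message;
    adjust the format string if your job emits zero-padded names.
    """
    n = int(line.strip())
    return [f"{root}/part-{n}" for root in roots]


if __name__ == "__main__":
    index_file = os.path.join(tempfile.mkdtemp(), "partitions.txt")
    write_partition_index(index_file, 3)
    with open(index_file) as f:
        for line in f:
            # In the real job, the mapper would open both paths and join them.
            print(partition_paths(line))
```

Inside an actual mapper, the two paths would be opened through the Hadoop FileSystem API rather than printed; that part is left out here because it depends on your cluster setup.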
Hello everyone,
I have written my custom partitioner for partitioning datasets. I want to
partition two datasets using the same partitioner and then in the next
mapreduce job, I want each mapper to handle the same partition from the two
sources and perform some function such as joining et