Hey thanks a lot. That's basically what I needed.

2008/4/11 Zhang, jian <[EMAIL PROTECTED]>:

> Hi,
>
> Please read this; you need to implement a Partitioner.
> It controls which key is sent to which reducer. If you want each unique
> key to end up with a single reducer, you need to implement a Partitioner,
> and the compareTo() function should work properly.
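> The key class should also override hashCode() and equals() consistently
> with compareTo(), because the default HashPartitioner derives the partition
> from key.hashCode(). A rough sketch, with hypothetical fields standing in
> for whatever your key actually contains:
>
> import java.io.DataInput;
> import java.io.DataOutput;
> import java.io.IOException;
> import org.apache.hadoop.io.WritableComparable;
>
> // Hypothetical composite key (word, count); substitute your own fields.
> public class MyKey implements WritableComparable {
>     private String word = "";
>     private int count = 0;
>
>     public void write(DataOutput out) throws IOException {
>         out.writeUTF(word);
>         out.writeInt(count);
>     }
>
>     public void readFields(DataInput in) throws IOException {
>         word = in.readUTF();
>         count = in.readInt();
>     }
>
>     public int compareTo(Object o) {
>         MyKey other = (MyKey) o;
>         int cmp = word.compareTo(other.word);
>         if (cmp != 0) return cmp;
>         return (count < other.count) ? -1 : ((count == other.count) ? 0 : 1);
>     }
>
>     // equals() and hashCode() must agree with compareTo(); hashCode() is
>     // what HashPartitioner uses to decide which reduce task gets the key.
>     public boolean equals(Object o) {
>         if (!(o instanceof MyKey)) return false;
>         MyKey other = (MyKey) o;
>         return word.equals(other.word) && count == other.count;
>     }
>
>     public int hashCode() {
>         return word.hashCode() * 31 + count;
>     }
> }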
> [WIKI]
> Partitioner
>
> Partitioner partitions the key space.
>
> Partitioner controls the partitioning of the keys of the intermediate
> map-outputs. The key (or a subset of the key) is used to derive the
> partition, typically by a hash function. The total number of partitions is
> the same as the number of reduce tasks for the job. Hence this controls
> which of the m reduce tasks the intermediate key (and hence the record) is
> sent to for reduction.
>
> HashPartitioner is the default Partitioner.
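>
> If the default behaviour is not what you want, writing your own Partitioner
> is straightforward. A minimal sketch against the old JobConf-based API
> (MyKey is the hypothetical key class sketched above, and Text as the value
> type is just an example):
>
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapred.JobConf;
> import org.apache.hadoop.mapred.Partitioner;
>
> public class MyKeyPartitioner implements Partitioner<MyKey, Text> {
>
>     public void configure(JobConf job) {
>         // No per-job configuration needed for this sketch.
>     }
>
>     public int getPartition(MyKey key, Text value, int numPartitions) {
>         // Equal keys must always return the same partition number; here
>         // we mimic HashPartitioner and mask to keep the result non-negative.
>         return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
>     }
> }
>
> Register it with conf.setPartitionerClass(MyKeyPartitioner.class) alongside
> conf.setNumReduceTasks(...). Note that this sketch only mirrors the default
> hashing, so it helps nothing unless hashCode() on the key is implemented
> consistently, as above.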
>
>
>
> Best Regards
>
> Jian Zhang
>
>
> -----Original Message-----
> From: Harish Mallipeddi [mailto:[EMAIL PROTECTED]
> Sent: April 11, 2008 19:06
> To: core-user@hadoop.apache.org
> Subject: Problem with key aggregation when number of reduce tasks is more than 1
>
> Hi all,
>
> I wrote a custom key class (implements WritableComparable) and implemented
> the compareTo() method inside this class. Everything works fine when I run
> the m/r job with 1 reduce task (via setNumReduceTasks). Keys are sorted
> correctly in the output files.
>
> But when I increase the number of reduce tasks, keys don't get aggregated
> properly; the same keys seem to end up in separate output files
> (output/part-00000, output/part-00001, etc). Shouldn't this be impossible,
> since right before reduce() gets called, all (k,v) pairs from all map
> outputs with the same 'k' are aggregated, and the reduce function just
> iterates over the values (v1, v2, etc)?
>
> Do I need to implement anything else inside my custom key class other than
> compareTo? I also tried implementing equals() but that didn't help either.
> Then I came across setOutputKeyComparator(). So I added a custom Comparator
> class inside the key class and tried setting this on the JobConf object, but
> that didn't work either. What could be wrong?
>
> Cheers,
>
> --
> Harish Mallipeddi
> circos.com : poundbang.in/blog/
>



-- 
Harish Mallipeddi
circos.com : poundbang.in/blog/
