Re: I want to group "similar" keys in the reducer.

Jim Twensky Mon, 15 Mar 2010 14:25:32 -0700

Hi Raymond,

Take a look at 
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#setGroupingComparatorClass(java.lang.Class).
I think this is what you want. Also make sure to implement a custom
partitioner that only takes into account the first part of the key,
namely the KEY part. You can search for "Secondary Sort" and "Hadoop"
to see some tutorials on this topic.
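
As a rough illustration of the idea, here is a plain-Java sketch with no
Hadoop dependency. The class name, the helper names (groupPart,
partitionFor, GROUPING, group) and the rule that "similar" means "equal
after stripping trailing digits" are all assumptions made for this example.
In a real job the same logic would presumably go into a Partitioner
subclass and a comparator passed to Job.setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SimilarKeyGrouping {

    // Hypothetical similarity rule: "KEY1" and "KEY2" both reduce to "KEY".
    public static String groupPart(String key) {
        return key.replaceAll("\\d+$", "");
    }

    // Partition on the group part only, so similar keys reach the same reducer.
    // The arithmetic mirrors what Hadoop's default HashPartitioner does with
    // the full key.
    public static int partitionFor(String key, int numReduceTasks) {
        return (groupPart(key).hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Grouping comparator: two keys are "the same" if their group parts match.
    public static final Comparator<String> GROUPING =
            Comparator.comparing(SimilarKeyGrouping::groupPart);

    // Simulates the reduce-side grouping: every key that compares equal under
    // GROUPING ends up in one list, filed under the first key seen for that group.
    public static Map<String, List<String>> group(List<String> keys) {
        Map<String, List<String>> groups = new TreeMap<>(GROUPING);
        for (String key : keys) {
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("KEY1", "OTHER7", "KEY2");
        for (Map.Entry<String, List<String>> e : group(keys).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()
                    + " (partition " + partitionFor(e.getKey(), 4) + " of 4)");
        }
    }
}
```

The partitioner and the grouping comparator have to agree on what the
group part is. Otherwise similar keys can land on different reducers and
will never be seen together.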


Cheers,
Jim

2010/3/15 Gang Luo <lgpub...@yahoo.com.cn>:
> You need to define a pattern and implement your own partitioner so that all 
> the similar keys you want to group go to the same reducer. On the reduce 
> side, you may also need to implement secondary sorting so that the keys you 
> want to group are adjacent in the sorted input to the reducer. Because the 
> reduce method processes one key at a time, you also need to maintain a 
> window that buffers all the keys being grouped.
>
> -Gang
>
>
>
> ----- Original Message ----
> From: Raymond Jennings III <raymondj...@yahoo.com>
> To: common-user@hadoop.apache.org
> Sent: 2010/3/15 (Mon) 1:26:09 PM
> Subject: I want to group "similar" keys in the reducer.
>
> Is it possible to override a method in the reducer so that similar keys will 
> be grouped together?  For example I want all keys of value "KEY1" and "KEY2" 
> to be merged together.  (My reducer has a KEY of type TEXT.)  Thanks.
>
>
>
>
