Hi Schubert,

Currently, combiners group keys with the OutputKeyComparator (the sort
comparator), not with the OutputValueGroupingComparator.
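
To make the difference concrete, here is a small plain-Java sketch. It is
not Hadoop code: the class GroupingSketch, its helpers and the sample keys
are made up for illustration, and it assumes the time field is zero-padded
so that string order matches time order. It only shows how the same sorted
keys fall into different groups depending on which comparator decides
where a group ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: none of these names are Hadoop classes.
class GroupingSketch {

    // Sort comparator: orders by the full "city+userid+time" key.
    static final Comparator<String> SORT = Comparator.naturalOrder();

    // Grouping comparator: compares only the "city+userid" prefix.
    static final Comparator<String> GROUP =
            Comparator.comparing(GroupingSketch::prefix);

    // Drops the trailing "+time" field from a key.
    static String prefix(String key) {
        return key.substring(0, key.lastIndexOf('+'));
    }

    // Sorts the keys with `sort`, then starts a new group whenever
    // `group` reports that two adjacent keys differ.
    static List<List<String>> groups(List<String> keys,
                                     Comparator<String> sort,
                                     Comparator<String> group) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(sort);
        List<List<String>> out = new ArrayList<>();
        String previous = null;
        for (String key : sorted) {
            if (previous == null || group.compare(previous, key) != 0) {
                out.add(new ArrayList<>());
            }
            out.get(out.size() - 1).add(key);
            previous = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "nyc+u1+0900", "nyc+u1+0800", "nyc+u2+0700", "sf+u1+0600");
        // Grouping by prefix: one group per "city+userid".
        System.out.println(groups(keys, SORT, GROUP));
        // Grouping by the sort comparator: one group per distinct full key.
        System.out.println(groups(keys, SORT, SORT));
    }
}
```

The first call mimics what the reducer sees with a grouping comparator
set; the second mimics the combiner, where every distinct full key ends
up in its own group.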

If you think this functionality would be very useful to you, you could open
a Jira issue for it and discuss it there.

Needless to say, you could always contribute your patch for the Jira too :)

Jothi


On 5/11/09 5:12 PM, "zsongbo" <zson...@gmail.com> wrote:

> Thanks Jothi,
> For example, I have a dataset with map key="city+userid+time". The output of
> the mapper is sorted by this map key.
> 
> Then I group the reducer's input by "city+userid" by defining my own
> OutputValueGroupingComparator, which compares only the "city+userid" part
> of the map key. I still want the records in each group to be sorted by
> time.
> 
> It works fine.
> 
> But to improve performance, I want to use a combiner that also groups by
> "city+userid" while its input stays sorted by "city+userid+time".
> 
> I do not know if this requirement is reasonable.
> 
> 
> Schubert
> 
> On Thu, May 7, 2009 at 7:53 PM, Jothi Padmanabhan
> <joth...@yahoo-inc.com> wrote:
> 
>> OutputValueGroupingComparator is used only at the reducer. AFAIK, you
>> cannot have a different comparator for combiners.
>> 
>> Jothi
>> 
>> 
>> On 5/7/09 3:32 PM, "zsongbo" <zson...@gmail.com> wrote:
>> 
>>> Hi all,
>>> I have an application that needs sorting and grouping to use
>>> different Comparators.
>>> 
>>> I have tested this on 0.19.1 and 0.20.0, but it does not work for the
>>> Combiner in either version.
>>> 
>>> In 0.19.1, I use job.setOutputValueGroupingComparator(), and
>>> in 0.20.0, I use job.setGroupingComparatorClass()
>>> 
>>> This works for the reduce phase: it groups the keys with the above
>>> Comparator and sorts them with the default comparator of the key class.
>>> 
>>> But I want the combiner to use a separate comparator for grouping,
>>> different from the one used for sorting. Is that possible?
>>> 
>>> Schubert
>> 
>> 
