RE: [jira] Updated: (HADOOP-1535) Wrong comparator used to merge files in Reduce phase

Vivek Ratan Thu, 05 Jul 2007 09:43:38 -0700

 Sorry. I didn't realize it went to the whole list.

-----Original Message-----
From: Vivek Ratan 
Sent: Thursday, July 05, 2007 10:11 PM
To: [email protected]
Subject: RE: [jira] Updated: (HADOOP-1535) Wrong comparator used to merge
files in Reduce phase


Sure, but can you give me a quick overview of writing unit tests for Hadoop?
I've used JUnit before, so I just want the specifics related to Hadoop. I'm
actually in the MC campus, sitting in the visitor cubes on the 8th floor.
Ping me when you have a little bit of time, or let me know who the right
person for this is. Thx.

-----Original Message-----
From: Nigel Daley [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 28, 2007 9:43 PM
To: [email protected]
Subject: Re: [jira] Updated: (HADOOP-1535) Wrong comparator used to merge
files in Reduce phase

Hi Vivek,

Can you include a unit test for this fix?

On Jun 28, 2007, at 2:40 AM, Vivek Ratan (JIRA) wrote:

>
>      [ https://issues.apache.org/jira/browse/HADOOP-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Vivek Ratan updated HADOOP-1535:
> --------------------------------
>
>     Attachment: 1535_01.patch
>
> We use the comparator returned by JobConf.getOutputKeyComparator() for 
> the sort/merge phases of Map and Reduce. We use the comparator 
> returned by JobConf.getOutputValueGroupingComparator() for the 
> iterator across values for a given key. See 1535_01.patch.
>
>> Wrong comparator used to merge files in Reduce phase
>> ----------------------------------------------------
>>
>>                 Key: HADOOP-1535
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-1535
>>             Project: Hadoop
>>          Issue Type: Bug
>>          Components: mapred
>>    Affects Versions: 0.12.3, 0.13.0
>>            Reporter: Vivek Ratan
>>            Assignee: Vivek Ratan
>>             Fix For: 0.14.0
>>
>>         Attachments: 1535_01.patch
>>
>>
>> As per the fix for HADOOP-485, we allow users to optionally provide a 
>> different comparator to group values when calling the user's Reduce 
>> function. Devaraj and I were looking at the code yesterday and we 
>> found that in ReduceTask.java, we use the user-supplied comparator
>> to merge the output files from the Map tasks (we use the 
>> user-supplied comparator when creating a new SequenceFile.Sorter 
>> object). This is incorrect as the comparator used to merge Map output 
>> files should be the same as that used to create those files in the 
>> Map phase. The user-supplied comparator for grouping values should be 
>> used only in the iterator passed to the user's Reduce function (which 
>> is done correctly in the code).
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
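
For readers following the thread, the distinction the issue draws between
the two comparators can be shown with a small sketch. This is plain Java
with no Hadoop dependency, and every name in it (ComparatorSketch, Key,
SORT, GROUP, merge, group, isSorted) is hypothetical and chosen only for
illustration; it is not the code in ReduceTask.java or in 1535_01.patch.
SORT stands in for the comparator returned by
JobConf.getOutputKeyComparator() and GROUP for the one returned by
JobConf.getOutputValueGroupingComparator(). The assumption is a key made
of a name plus a sequence number, sorted on both parts but grouped on the
name alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names come from Hadoop.
class ComparatorSketch {
    // A two-part map output key: a name plus a sequence number.
    static final class Key {
        final String name;
        final int seq;
        Key(String name, int seq) { this.name = name; this.seq = seq; }
        @Override public String toString() { return name + seq; }
    }

    // Stands in for the sort comparator: orders on the full key.
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.name).thenComparingInt(k -> k.seq);

    // Stands in for the grouping comparator: looks at the name only.
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.name);

    // Two-way merge of runs that are each already sorted; ties take from left.
    static List<Key> merge(List<Key> left, List<Key> right, Comparator<Key> cmp) {
        List<Key> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            if (cmp.compare(left.get(i), right.get(j)) <= 0) out.add(left.get(i++));
            else out.add(right.get(j++));
        }
        while (i < left.size()) out.add(left.get(i++));
        while (j < right.size()) out.add(right.get(j++));
        return out;
    }

    // Splits a merged stream into runs of adjacent keys that GROUP treats as equal.
    static List<List<Key>> group(List<Key> merged) {
        List<List<Key>> groups = new ArrayList<>();
        for (Key k : merged) {
            boolean startNew = groups.isEmpty()
                || GROUP.compare(groups.get(groups.size() - 1).get(0), k) != 0;
            if (startNew) groups.add(new ArrayList<>());
            groups.get(groups.size() - 1).add(k);
        }
        return groups;
    }

    // True if the list is in non-decreasing order under SORT.
    static boolean isSorted(List<Key> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (SORT.compare(keys.get(i - 1), keys.get(i)) > 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Two runs, each sorted with SORT, as if written by two Map tasks.
        List<Key> left = Arrays.asList(new Key("a", 1), new Key("a", 3));
        List<Key> right = Arrays.asList(new Key("a", 2), new Key("b", 1));

        List<Key> good = merge(left, right, SORT);  // same comparator as the runs
        List<Key> bad = merge(left, right, GROUP);  // the mistake described above

        System.out.println("merged with SORT:  " + good + " sorted=" + isSorted(good));
        System.out.println("merged with GROUP: " + bad + " sorted=" + isSorted(bad));
        System.out.println("groups handed to reduce: " + group(good));
    }
}
```

In this sketch, merging with GROUP treats every "a" key as equal, so the
merge drains the first run before the second and the sequence numbers for
"a" come out as 1, 3, 2 rather than 1, 2, 3. Merging with SORT keeps the
full order, and GROUP is then applied only when splitting the merged
stream into the per-key value lists.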
