[jira] Commented: (MAPREDUCE-1287) HashPartitioner calls hashCode() when there is only 1 reducer

2009-12-30 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795447#action_12795447
 ] 

Tom White commented on MAPREDUCE-1287:
--

Is there a reason the old partitioner uses {{1 - numPartitions}} and the new 
one uses {{partitions - 1}}? It shouldn't make any difference, since the 
partitioner is not actually used in the zero-partition case, but it would be 
good to make the code consistent.

> Clearly, any application that depends on the partitioner for correctness can 
> be rewritten, but is it worth calling out?

I think so - put a comment in the release notes.

> HashPartitioner calls hashCode() when there is only 1 reducer
> -
>
> Key: MAPREDUCE-1287
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1287
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Ed Mazur
>Assignee: Ed Mazur
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: M1287-4.patch, MAPREDUCE-1287.2.patch, 
> MAPREDUCE-1287.3.patch, MAPREDUCE-1287.patch
>
>
> HashPartitioner could be optimized to not call the key's hashCode() if there 
> is only 1 reducer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1287) HashPartitioner calls hashCode() when there is only 1 reducer

2009-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792932#action_12792932
 ] 

Hadoop QA commented on MAPREDUCE-1287:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428502/M1287-4.patch
  against trunk revision 892479.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/228/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/228/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/228/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/228/console

This message is automatically generated.





[jira] Commented: (MAPREDUCE-1287) HashPartitioner calls hashCode() when there is only 1 reducer

2009-12-11 Thread Ed Mazur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789323#action_12789323
 ] 

Ed Mazur commented on MAPREDUCE-1287:
-

I haven't run this, but here's a quick analysis:

- Cost: (_number of map output pairs_)*(_cost of "reducers == 1" check_)
- Gain: (_number of map output pairs_)*(_cost of key's hashCode()_), but only 
in the case of 1 reducer (no gain otherwise)

Your suggestion of moving this into the framework makes a lot of sense. That 
way you only have to check for the 1 reducer case when you assign the 
partitioner and not for every map output, essentially eliminating the cost of 
the optimization.
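
A minimal sketch of that idea, not taken from any of the attached patches: the reducer count is checked once, when the partitioner is chosen, so the per-pair path carries no extra branch. The names PartitionFn, PartitionerChoice and choose() are hypothetical stand-ins so the example compiles without Hadoop on the classpath; the hash formula mirrors the one HashPartitioner is known to use.

```java
// Hypothetical stand-in for a partitioner; not a Hadoop type.
interface PartitionFn<K> {
  int partition(K key, int numPartitions);
}

class PartitionerChoice {
  // Hash-based choice: calls key.hashCode() for every map output pair.
  static <K> PartitionFn<K> hashBased() {
    return (key, n) -> (key.hashCode() & Integer.MAX_VALUE) % n;
  }

  // Single-reducer choice: never looks at the key.
  static <K> PartitionFn<K> constantZero() {
    return (key, n) -> 0;
  }

  // The "reducers == 1" check runs once here, not once per map output pair.
  static <K> PartitionFn<K> choose(int numReducers) {
    if (numReducers == 1) {
      return constantZero();
    }
    return hashBased();
  }
}

class PartitionerChoiceDemo {
  public static void main(String[] args) {
    PartitionFn<String> single = PartitionerChoice.choose(1);
    PartitionFn<String> several = PartitionerChoice.choose(4);
    System.out.println(single.partition("some key", 1));
    System.out.println(several.partition("some key", 4));
  }
}
```

With this shape, the cost term in the analysis above drops from one check per map output pair to one check per task.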

> HashPartitioner calls hashCode() when there is only 1 reducer
> -
>
> Key: MAPREDUCE-1287
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1287
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Ed Mazur
>Assignee: Ed Mazur
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1287.2.patch, MAPREDUCE-1287.3.patch, 
> MAPREDUCE-1287.patch
>
>
> HashPartitioner could be optimized to not call the key's hashCode() if there 
> is only 1 reducer.




[jira] Commented: (MAPREDUCE-1287) HashPartitioner calls hashCode() when there is only 1 reducer

2009-12-10 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788792#action_12788792
 ] 

Tom White commented on MAPREDUCE-1287:
--

How large a performance gain does this change give?

This might be better done in the framework, by using a special partitioner in 
the single reduce case: for example, a class called, say, 
SinglePartitionPartitioner, whose getPartition() method always returns 0.
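
As a rough illustration of the suggestion above, such a class could look like the sketch below. It is an assumption, not code from any attached patch: SketchPartitioner is a hypothetical local interface that only imitates the getPartition(key, value, numPartitions) shape of the Hadoop Partitioner, so the example compiles on its own.

```java
// Hypothetical stand-in for the Hadoop partitioner contract.
interface SketchPartitioner<K, V> {
  int getPartition(K key, V value, int numPartitions);
}

// Class name taken from the suggestion above; the body is an assumption.
class SinglePartitionPartitioner<K, V> implements SketchPartitioner<K, V> {
  @Override
  public int getPartition(K key, V value, int numPartitions) {
    // With a single reducer every record goes to partition 0,
    // so the key (and its hashCode()) is never consulted.
    return 0;
  }
}

class SinglePartitionDemo {
  public static void main(String[] args) {
    SketchPartitioner<String, Integer> p = new SinglePartitionPartitioner<>();
    System.out.println(p.getPartition("any key", 42, 1));
  }
}
```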


> HashPartitioner calls hashCode() when there is only 1 reducer
> -
>
> Key: MAPREDUCE-1287
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1287
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Ed Mazur
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1287.2.patch, MAPREDUCE-1287.3.patch, 
> MAPREDUCE-1287.patch
>
>
> HashPartitioner could be optimized to not call the key's hashCode() if there 
> is only 1 reducer.




[jira] Commented: (MAPREDUCE-1287) HashPartitioner calls hashCode() when there is only 1 reducer

2009-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788433#action_12788433
 ] 

Hadoop QA commented on MAPREDUCE-1287:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12427524/MAPREDUCE-1287.3.patch
  against trunk revision 888761.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/312/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/312/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/312/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/312/console

This message is automatically generated.





[jira] Commented: (MAPREDUCE-1287) HashPartitioner calls hashCode() when there is only 1 reducer

2009-12-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788320#action_12788320
 ] 

Todd Lipcon commented on MAPREDUCE-1287:


- Can you also make this change to the old API HashPartitioner? 
src/java/org/apache/hadoop/mapred/lib/HashPartitioner.java
- Putting the "else" on a separate line from the '}' differs from the usual 
Hadoop style. A space after 'if' is also the usual style.
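
For illustration only, the two style points might look like the sketch below. The patch itself is not quoted in this thread, so the method body is an assumption about the kind of per-call check being reviewed; StyledHashPartitionSketch is a hypothetical name, and the hash formula mirrors the one HashPartitioner is known to use.

```java
class StyledHashPartitionSketch {
  static int getPartition(Object key, int numReduceTasks) {
    // Space after 'if', and 'else' on the same line as the closing '}'.
    if (numReduceTasks == 1) {
      return 0;
    } else {
      return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
  }

  public static void main(String[] args) {
    System.out.println(getPartition("some key", 1));
    System.out.println(getPartition("some key", 3));
  }
}
```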

> HashPartitioner calls hashCode() when there is only 1 reducer
> -
>
> Key: MAPREDUCE-1287
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1287
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Ed Mazur
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1287.patch
>
>
> HashPartitioner could be optimized to not call the key's hashCode() if there 
> is only 1 reducer.
