[jira] [Updated] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-12-17 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4827:
---

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Patch rejected for backward compatibility reasons.

 Increase hash quality of HashPartitioner
 

 Key: MAPREDUCE-4827
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Radim Kolar
 Attachments: betterhash1.txt, betterhash2.txt


 HashPartitioner uses Object.hashCode() to split keys into partitions. This 
 results in uneven distributions because hashCode() quality is often poor. 
 These hashCode() functions are sometimes written by hand (very poor quality) 
 and sometimes generated by Commons Lang code (poor quality). Applying 
 some transformation on top of hashCode() provides better distribution.
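
 To illustrate the idea, here is a minimal Java sketch. It is not the 
 attached patch: the mixing step (the 32-bit finalizer from MurmurHash3) is 
 an assumption chosen for illustration, and the class name, method names, 
 and sample hash codes are hypothetical. Only the default formula, 
 (hashCode & Integer.MAX_VALUE) % numPartitions, mirrors what HashPartitioner 
 does today.

```java
import java.util.Arrays;

class PartitionSketch {
    // Same formula as the default HashPartitioner: clear the sign bit,
    // then take the remainder by the number of partitions.
    static int defaultPartition(int hashCode, int numPartitions) {
        return (hashCode & Integer.MAX_VALUE) % numPartitions;
    }

    // Assumed transformation (not taken from the patch): the fmix32
    // finalizer from MurmurHash3, which spreads input bits across the word.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Partition on the mixed hash instead of the raw hashCode().
    static int mixedPartition(int hashCode, int numPartitions) {
        return defaultPartition(mix(hashCode), numPartitions);
    }

    // Counts keys per partition for hypothetical hash codes 0, 16, 32, ...
    // which stand in for a weak hand-written hashCode().
    static int[] histogram(boolean useMix, int numPartitions, int numKeys) {
        int[] counts = new int[numPartitions];
        for (int i = 0; i < numKeys; i++) {
            int hashCode = i * 16;
            int p = useMix
                    ? mixedPartition(hashCode, numPartitions)
                    : defaultPartition(hashCode, numPartitions);
            counts[p]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("default: " + Arrays.toString(histogram(false, 16, 1000)));
        System.out.println("mixed:   " + Arrays.toString(histogram(true, 16, 1000)));
    }
}
```

 With 16 partitions and hash codes that are all multiples of 16, the default 
 formula sends every key to partition 0, while the mixed variant spreads the 
 keys over several partitions. Note that any such transformation changes 
 which partition an existing key maps to.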

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-12-05 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4827:
---

Attachment: betterhash2.txt

Changed it for the old mapred API as well.

 Increase hash quality of HashPartitioner
 

 Key: MAPREDUCE-4827
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Radim Kolar
 Attachments: betterhash1.txt, betterhash2.txt





[jira] [Updated] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-12-04 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4827:
---

Status: Patch Available  (was: Open)

 Increase hash quality of HashPartitioner
 

 Key: MAPREDUCE-4827
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Radim Kolar
 Attachments: betterhash1.txt





[jira] [Updated] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-11-28 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4827:
---

Attachment: betterhash1.txt

 Increase hash quality of HashPartitioner
 

 Key: MAPREDUCE-4827
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Radim Kolar
 Attachments: betterhash1.txt


