[jira] [Commented] (MAPREDUCE-4839) TextPartioner for hashing Text with good hashing function to get better distribution

Robert Joseph Evans (JIRA) Tue, 18 Dec 2012 07:26:15 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534965#comment-13534965 ]


Robert Joseph Evans commented on MAPREDUCE-4839:
------------------------------------------------

I have not looked at this in much detail, but in the partitioner you are 
converting the key to a String only to convert it back to UTF-8 bytes.  Text 
was designed to store its data as UTF-8 internally, instead of UTF-16 like 
String does.  I think you could just call key.getBytes() directly, hashing 
only the first key.getLength() bytes, since the returned backing array can be 
longer than the valid data.  I think the only difference is that going through 
String replaces malformed and unmappable bytes, whereas getBytes() leaves the 
bytes as is.
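
To make the difference concrete, here is a minimal, self-contained sketch. It 
does not use Hadoop: a plain byte[] plus a length stands in for what 
Text.getBytes() and Text.getLength() return, and hashBytes() is a hypothetical 
placeholder for whichever util.Hash implementation the patch picks.  The class 
and method names are made up for illustration and are not taken from the 
attached patches.

```java
import java.nio.charset.StandardCharsets;

public class RawBytesPartitionSketch {

    // Hypothetical stand-in for a util.Hash function: hashes only the
    // first `length` bytes, ignoring any slack at the end of the buffer.
    static int hashBytes(byte[] bytes, int length) {
        int h = 1;
        for (int i = 0; i < length; i++) {
            h = 31 * h + bytes[i];
        }
        return h;
    }

    // Partition straight from the UTF-8 bytes, with no String round trip.
    static int partitionFromRawBytes(byte[] buffer, int length, int numPartitions) {
        // Mask the sign bit so the result is always in [0, numPartitions).
        return (hashBytes(buffer, length) & Integer.MAX_VALUE) % numPartitions;
    }

    // Decode to String and re-encode; malformed input is replaced with U+FFFD.
    static byte[] roundTripThroughString(byte[] buffer, int length) {
        String decoded = new String(buffer, 0, length, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Partition the slower way, via the String round trip.
    static int partitionViaString(byte[] buffer, int length, int numPartitions) {
        byte[] reencoded = roundTripThroughString(buffer, length);
        return partitionFromRawBytes(reencoded, reencoded.length, numPartitions);
    }

    public static void main(String[] args) {
        // Well-formed key in a buffer with two bytes of unused capacity.
        byte[] padded = {'k', 'e', 'y', 0, 0};
        System.out.println("raw:        " + partitionFromRawBytes(padded, 3, 7));
        System.out.println("via String: " + partitionViaString(padded, 3, 7));

        // Malformed key: 0xFF can never appear in valid UTF-8.
        byte[] malformed = {'a', (byte) 0xFF, 'b'};
        System.out.println("raw length:        " + malformed.length);
        System.out.println("round-trip length: "
                + roundTripThroughString(malformed, malformed.length).length);
    }
}
```

For well-formed keys both paths hash the same bytes, so they pick the same 
partition; the raw path just skips a decode and an encode per record.  For 
malformed keys the round trip changes the bytes before hashing, which is the 
one behavioural difference mentioned above.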
                
> TextPartioner for hashing Text with good hashing function to get better 
> distribution
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4839
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4839
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Radim Kolar
>         Attachments: textpartitioner1.txt, textpartitioner2.txt, 
> textpartitioner3.txt
>
>
> partitioner for Text keys using util.Hash framework for hashing function

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
