[ 
https://issues.apache.org/jira/browse/HADOOP-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689842#action_12689842
 ] 

Klaas Bosteels commented on HADOOP-5528:
----------------------------------------

Chris,

* 3:-1 indeed does not refer to the last two bytes, but:
** that's how it works in Python as well:
{code}
>>> l = [1,2,3,4,5]
>>> l[1:-2], l[1:3], l[3:-1]
([2, 3], [2, 3], [4])
{code}
** you can specify "the last n" bytes by setting only the left offset (because 
{{LastIndexer}} is the default right indexer), which is also how you do it in 
Python:
{code}
>>> l[-2:]
[4, 5]
{code}
** because of the {{min}} in the {{PosOffsetIndexer}}, you can also just set 
the right offset to a large enough number to get "the last n" bytes.
* I don't think that -1 should be the default right offset, since that would 
mean that the last byte is ignored by default.
* It might indeed be possible to use 
{{(index + key.getLength()) % key.getLength()}} for both negative and positive 
offsets, but we need a separate indexer to implement the default right index 
anyway, and I think it makes sense to minimize the required computations by 
using more specialized indexers.
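
The offset semantics discussed in the list above can be sketched in Python. 
This is only an illustration under stated assumptions, not the code from the 
patch: the names {{resolve_range}} and {{select}} are hypothetical, the right 
offset is assumed to be exclusive as in a Python slice, and a missing right 
offset stands in for the default right indexer.

```python
def resolve_range(length, left=0, right=None):
    """Map a (left, right) offset pair to a half-open [start, end) range.

    Hypothetical helper: assumes Python-slice semantics with an exclusive
    right offset, and right=None meaning "up to the last byte".
    """
    def resolve(offset):
        if offset >= 0:
            # Positive offsets are clamped to the key length (the "min").
            return min(offset, length)
        # Negative offsets count from the end and are clamped at 0 (the "max").
        return max(length + offset, 0)

    start = resolve(left)
    end = length if right is None else resolve(right)
    # Never return an end that lies before the start.
    return start, max(start, end)


def select(key, left=0, right=None):
    """Return the part of the key that would be hashed (hypothetical)."""
    start, end = resolve_range(len(key), left, right)
    return key[start:end]


key = bytes([1, 2, 3, 4, 5])
print(list(select(key, 3, -1)))   # [4]: 3:-1 is not "the last two bytes"
print(list(select(key, -2)))      # [4, 5]: only the left offset is set
print(list(select(key, 3, 100)))  # [4, 5]: large right offset is clamped
```

Under these assumptions, a single modulo expression cannot stand in for the 
default right index: for a key of length 5, {{(5 + 5) % 5}} evaluates to 0 
rather than 5, so "up to the end" still needs separate handling.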

So, personally, I think that:

* we need the indexer classes (and cannot use -1 as default right index),
* the max/min games are useful (and not merely a way of preventing exceptions),
* the semantics are correct,

which leaves me with nothing to change in the latest patch *smile*. Do you 
agree with this, or is there still something you would like me to change?

> Binary partitioner
> ------------------
>
>                 Key: HADOOP-5528
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5528
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Klaas Bosteels
>            Assignee: Klaas Bosteels
>         Attachments: HADOOP-5528.patch, HADOOP-5528.patch, HADOOP-5528.patch, 
> HADOOP-5528.patch
>
>
> It would be useful to have a {{BinaryPartitioner}} that partitions 
> {{BinaryComparable}} keys by hashing a configurable part of the bytes array 
> corresponding to each key.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
