[ 
https://issues.apache.org/jira/browse/HBASE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143812#comment-14143812
 ] 

Niels Basjes commented on HBASE-11990:
--------------------------------------

In almost all cases where this filter is combined with MR, a "large" number 
of mappers will be started and stopped without ever having seen any data.
Starting these needless tasks can be avoided up front by using the original 
approach of my patch.

I don't yet see when setting the startRow and stopRow would require changing 
getSplits. The way the stopRow value is calculated from the prefix should not 
cause such an effect.
Can you give me an example where you expect to hit such an edge case?
I'll include it as an additional test.

> Make setting the start and stop row for a specific prefix easier
> ----------------------------------------------------------------
>
>                 Key: HBASE-11990
>                 URL: https://issues.apache.org/jira/browse/HBASE-11990
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client
>            Reporter: Niels Basjes
>         Attachments: 11990v4.txt, HBASE-11990-20140916-v2.patch, 
> HBASE-11990-20140916-v3.patch, HBASE-11990-20140916-v5.patch, 
> HBASE-11990-20140916-v6.patch, HBASE-11990-20140916.patch, 
> HBASE-11990-20140917-v7.patch, HBASE-11990-20140919-v8.patch, 
> HBASE-11990-20140921-v9.patch
>
>
> If you want to set up a scan from your application for a specific row 
> prefix, this is actually quite hard.
> As described in several places, you can set the startRow to the prefix, but 
> the stopRow should be set to the prefix '+1'.
> If the prefix is 'ASCII' put into a byte[], this is easy because you can 
> simply increment the last byte of the array. 
> But if your application uses real binary rowids, you may run into the 
> scenario that your prefix is something like 
> {code}{ 0x12, 0x23, 0xFF, 0xFF }{code}
> Then the increment should be 
> {code}{ 0x12, 0x24 }{code}
> I have prepared a proposed patch that makes setting these values correctly a 
> lot easier.
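
For illustration, the prefix '+1' rule from the description can be sketched in 
plain Java as follows. This is a minimal sketch and not the code from the 
attached patches: the class name PrefixStopRow and the method name 
stopRowForPrefix are hypothetical. It assumes that trailing 0xFF bytes are 
dropped before the last remaining byte is incremented, and that an empty stop 
row means "scan to the end of the table".

```java
import java.util.Arrays;

// Hypothetical helper, not taken from the HBASE-11990 patches.
final class PrefixStopRow {

    // Returns the smallest row key that sorts after every key starting with prefix.
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip over them.
        int offset = prefix.length;
        while (offset > 0 && prefix[offset - 1] == (byte) 0xFF) {
            offset--;
        }
        if (offset == 0) {
            // Empty prefix or all bytes 0xFF: no finite stop row exists.
            // Assumption: an empty stop row means "scan to the end of the table".
            return new byte[0];
        }
        // Keep the bytes up to and including the last non-0xFF byte,
        // then increment that byte.
        byte[] stopRow = Arrays.copyOf(prefix, offset);
        stopRow[offset - 1]++;
        return stopRow;
    }

    public static void main(String[] args) {
        byte[] prefix = { 0x12, 0x23, (byte) 0xFF, (byte) 0xFF };
        // Prints [18, 36], i.e. { 0x12, 0x24 }
        System.out.println(Arrays.toString(stopRowForPrefix(prefix)));
    }
}
```

With the startRow set to the prefix itself and the stopRow set to this value, 
the scan range covers exactly the rows that start with the prefix.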



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
