[jira] [Updated] (HBASE-11990) Make setting the start and stop row for a specific prefix easier

Niels Basjes (JIRA) Tue, 23 Sep 2014 02:43:06 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Niels Basjes updated HBASE-11990:
---------------------------------
    Attachment: HBASE-11990-20140923-v10.patch

This patch reverts to the approach of calculating the correct startRow and 
stopRow to scan for the desired rows.
The method indicated by Lars is easier to understand, yet a multithreaded 
query of the data (e.g. MapReduce) would start many mappers that are not 
needed at all and that could have been avoided beforehand. 

I have revised both the documentation and the method naming to make clearer 
the distinction between the start/stop *Row* and the fact that we are 
looking for rowKey *Prefixes*.

Just to be clear: doing a prefix scan right is a hard problem. Just look at all 
the good discussion that this ticket triggered about the ways it can be done 
and what their pros and cons are. My intent with this patch has always been to 
solve this problem in such a way that doing a prefix scan will be a trivial 
task in the future. 

> Make setting the start and stop row for a specific prefix easier
> ----------------------------------------------------------------
>
>                 Key: HBASE-11990
>                 URL: https://issues.apache.org/jira/browse/HBASE-11990
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client
>            Reporter: Niels Basjes
>         Attachments: 11990v4.txt, HBASE-11990-20140916-v2.patch, 
> HBASE-11990-20140916-v3.patch, HBASE-11990-20140916-v5.patch, 
> HBASE-11990-20140916-v6.patch, HBASE-11990-20140916.patch, 
> HBASE-11990-20140917-v7.patch, HBASE-11990-20140919-v8.patch, 
> HBASE-11990-20140921-v9.patch, HBASE-11990-20140923-v10.patch
>
>
> If you want to set a scan from your application to scan for a specific row 
> prefix this is actually quite hard.
> As described in several places you can set the startRow to the prefix; yet 
> the stopRow should be set to the prefix '+1'.
> If the prefix is 'ASCII' put into a byte[] then this is easy because you can 
> simply increment the last byte of the array. 
> But if your application uses real binary rowids you may run into the scenario 
> that your prefix is something like 
> {code}{ 0x12, 0x23, 0xFF, 0xFF }{code}
> Then the increment should be
> {code}{ 0x12, 0x24 }{code}
> I have prepared a proposed patch that makes setting these values correctly a 
> lot easier.
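
To illustrate the prefix '+1' rule from the description above, here is a 
minimal sketch in plain Java. It is not the code from the attached patch; the 
class and method names (PrefixStopRow, stopRowForPrefix) are hypothetical and 
chosen only for this example. It assumes row keys are compared as unsigned 
bytes, and that an empty stop row means "scan to the end of the table".

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not taken from the patch.
final class PrefixStopRow {

    /**
     * Returns the smallest row key that sorts after every key starting
     * with the given prefix. Returns an empty array when no such key
     * exists (the prefix is empty or consists only of 0xFF bytes).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            // Nothing left to increment: there is no upper bound.
            return new byte[0];
        }
        // Copy the remaining bytes and increment the last one.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }
}
```

With the prefix { 0x12, 0x23, 0xFF, 0xFF } from the description this yields 
{ 0x12, 0x24 }; the startRow is simply the prefix itself.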



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
