cuijianwei created HBASE-14397:
----------------------------------

             Summary: PrefixFilter fail to filter all remainings if the prefix 
is longer than compared rowkey
                 Key: HBASE-14397
                 URL: https://issues.apache.org/jira/browse/HBASE-14397
             Project: HBase
          Issue Type: Improvement
          Components: Filters
    Affects Versions: 2.0.0
            Reporter: cuijianwei
            Priority: Minor


PrefixFilter currently filters rowkeys as follows:
{code}
  public boolean filterRowKey(Cell firstRowCell) {
    ...
    int length = firstRowCell.getRowLength();
    if (length < prefix.length) return true; // ===> return directly if the prefix is longer
    ....
    if ((!isReversed() && cmp > 0) || (isReversed() && cmp < 0)) {
      passedPrefix = true;
    }
    filterRow = (cmp != 0);
    return filterRow;
  }
{code}
If the prefix is longer than the current rowkey, PrefixFilter#filterRowKey 
filters the rowkey directly without comparing, so the 'passedPrefix' flag is 
never set, even when the current row is larger than the prefix.
For example, if there are three rows 'a', 'b' and 'c' in the table, and we 
issue a scan request as:
{code}
hbase(main):001:0> scan 'test_table', {STARTROW => 'a', FILTER => 
"(PrefixFilter ('aa'))"}
{code}
The region server will check all three rows before returning, even though no 
row from 'b' onward can match the prefix. In our production cluster, a user 
issued a scan with a PrefixFilter whose prefix was longer than the rowkeys of 
the following millions of rows, so the region server kept checking rows until 
it hit a rowkey at least as long as the prefix. This easily makes the client 
time out. To fix this case, it seems we need to compare the prefix with the 
rowkey even when the prefix is longer.
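
The proposed comparison could be sketched roughly as below. This is a 
standalone illustration, not a patch: the class PrefixFilterSketch, its 
constructor and the compare helper are hypothetical names, it works on plain 
byte arrays rather than Cell, and it assumes filterAllRemaining() simply 
reports the 'passedPrefix' flag.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for PrefixFilter; not HBase code.
public class PrefixFilterSketch {
  private final byte[] prefix;
  private final boolean reversed;
  private boolean passedPrefix = false;

  public PrefixFilterSketch(String prefix, boolean reversed) {
    this.prefix = prefix.getBytes(StandardCharsets.UTF_8);
    this.reversed = reversed;
  }

  // Unsigned lexicographic comparison of a[0..aLen) with b[0..bLen).
  static int compare(byte[] a, int aLen, byte[] b, int bLen) {
    int n = Math.min(aLen, bLen);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return aLen - bLen;
  }

  // Returns true if the row should be filtered out.
  public boolean filterRowKey(byte[] row) {
    // No early return for short rows: compare at most prefix.length bytes of
    // the row with the whole prefix, so a short row still yields an ordering.
    int cmp = compare(row, Math.min(row.length, prefix.length),
        prefix, prefix.length);
    if ((!reversed && cmp > 0) || (reversed && cmp < 0)) {
      passedPrefix = true;
    }
    return cmp != 0;
  }

  // Assumption: the scan may stop once this returns true.
  public boolean filterAllRemaining() {
    return passedPrefix;
  }

  public static void main(String[] args) {
    PrefixFilterSketch f = new PrefixFilterSketch("aa", false);
    for (String row : new String[] {"a", "b", "c"}) {
      boolean filtered = f.filterRowKey(row.getBytes(StandardCharsets.UTF_8));
      System.out.println(row + " filtered=" + filtered
          + " allRemaining=" + f.filterAllRemaining());
    }
  }
}
```

With rows 'a', 'b' and 'c' and prefix 'aa', row 'a' compares as smaller than 
the prefix and is only filtered, while row 'b' compares as larger and sets 
'passedPrefix', so under this sketch a forward scan could stop at 'b' instead 
of walking the remaining rows.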



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
