[jira] Updated: (HBASE-1647) Filter#filterRow is called too often, filters rows it shouldn't have

JIRA Wed, 15 Jul 2009 06:09:43 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Doğacan Güney updated HBASE-1647:
---------------------------------

    Attachment: HBASE-1647-v4.patch

v4 of patch.

* I have removed all test methods in TestStoreScanner, as most of the filter 
methods are now called in RegionScanner. Should I also move those tests to 
TestScanner?

* I have made a small change in TestScanner. The javadoc for RegionScanner#next 
says:

{code}

    /**
     * Get the next row of results from this region.
     * @param results list to append results to
     * @return true if there are more rows, false if scanner is done
     */

{code}

And in TestScanner#testStopRow:

{code}

      InternalScanner s = r.getScanner(scan);
      int count = 0;
      while (s.next(results)) {
        count++;
      }

{code}

In trunk, count is 1. However, there is only one row to scan ("abc"). Once we 
call next (and put that row's KeyValue-s in results), there are no more rows, 
so I think next must return false (and thus count is 0). Please correct me if 
I am wrong here.
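
To make the counting argument concrete, here is a minimal sketch, not HBase 
code. OneRowScannerSketch and countLoopIterations are hypothetical names, and 
rows are plain strings instead of KeyValue-s. It only assumes the javadoc 
contract quoted above: next returns true if there are more rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a region scanner; NOT the HBase class.
class OneRowScannerSketch {
    private final Iterator<String> rows;

    OneRowScannerSketch(List<String> rows) {
        this.rows = rows.iterator();
    }

    /** Appends the next row (if any) to results; true only if more rows remain. */
    boolean next(List<String> results) {
        if (rows.hasNext()) {
            results.add(rows.next());
        }
        return rows.hasNext();
    }

    /** Same loop shape as in testStopRow: counts calls for which next returned true. */
    static int countLoopIterations(OneRowScannerSketch s, List<String> results) {
        int count = 0;
        while (s.next(results)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>();
        OneRowScannerSketch s = new OneRowScannerSketch(Arrays.asList("abc"));
        int count = countLoopIterations(s, results);
        // Prints: count=0, results=[abc]
        System.out.println("count=" + count + ", results=" + results);
    }
}
```

Under this reading, the single row still ends up in results; only the loop 
counter stays at 0, because the first call already reports that nothing is left.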

* There was a possibly serious bug in v3 in RegionScanner. It implicitly 
assumed that the caller clears the results list between calls to 
RegionScanner#next. If the caller doesn't do that, we may delete results from 
older rows or even get stuck in an infinite loop. So I added a new field to 
RegionScanner. KeyValue-s are initially accumulated (or filtered) in this new 
field. Upon completion of next, they are added to outResults. I am not sure 
if this is necessary (no code in hbase reuses results).
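
The buffering idea can be sketched roughly as follows. This is an illustration 
under assumptions, not the patch itself: BufferedNextSketch, pending and 
rowIsFiltered are hypothetical names, cells are plain strings, and the row 
filter is a stand-in that rejects any row containing a cell named "skip".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; names and types are illustrative, NOT HBase's.
class BufferedNextSketch {
    private final Iterator<List<String>> rows;               // each row is a list of cells
    private final List<String> pending = new ArrayList<>();  // scanner-owned buffer

    BufferedNextSketch(List<List<String>> rows) {
        this.rows = rows.iterator();
    }

    /** Appends the next unfiltered row to outResults; true if more rows remain. */
    boolean next(List<String> outResults) {
        while (rows.hasNext()) {
            pending.clear();
            pending.addAll(rows.next());
            if (rowIsFiltered(pending)) {
                continue; // discard only the buffer; outResults is never touched
            }
            outResults.addAll(pending); // publish the completed row in one step
            break;
        }
        pending.clear();
        return rows.hasNext();
    }

    // Stand-in row-level filter (assumption): rejects rows with a "skip" cell.
    private static boolean rowIsFiltered(List<String> cells) {
        return cells.contains("skip");
    }

    public static void main(String[] args) {
        BufferedNextSketch s = new BufferedNextSketch(Arrays.asList(
            Arrays.asList("r1c1", "r1c2"),
            Arrays.asList("skip", "r2c2"),
            Arrays.asList("r3c1")));
        List<String> results = new ArrayList<>(); // deliberately never cleared
        while (s.next(results)) {
            // keep scanning; results accumulates across calls
        }
        // Prints: [r1c1, r1c2, r3c1]
        System.out.println(results);
    }
}
```

Because a filtered row is dropped from the private buffer rather than from the 
caller's list, cells from earlier rows survive even when the caller reuses the 
same list across calls.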

> Filter#filterRow is called too often, filters rows it shouldn't have
> --------------------------------------------------------------------
>
>                 Key: HBASE-1647
>                 URL: https://issues.apache.org/jira/browse/HBASE-1647
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1647-v2.patch, HBASE-1647-v3.patch, 
> HBASE-1647-v4.patch, ScanBug.java, scanfilter.patch
>
>
> Filter#filterRow is called from ScanQueryMatcher#filterEntireRow which is 
> called from StoreScanner.next. However, if I understood the code correctly, 
> StoreScanner processes KeyValue-s in a column-oriented order (i.e. after 
> row1-col1 comes row2-col1, not row1-col2). Thus, when filterEntireRow is 
> called, the filter has in reality processed (via filterKeyValue) only one 
> column of a row.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
