[ https://issues.apache.org/jira/browse/HBASE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney updated HBASE-1647:
---------------------------------
Attachment: HBASE-1647-v4.patch
v4 of patch.
* I have removed all test methods from TestStoreScanner, since most of the
filter methods are now called in RegionScanner. Should I also move those
tests into TestScanner?
* I have made a small change in TestScanner. RegionScanner#next's javadoc:
{code}
/**
 * Get the next row of results from this region.
 * @param results list to append results to
 * @return true if there are more rows, false if scanner is done
 */
{code}
And in TestScanner#testStopRow:
{code}
InternalScanner s = r.getScanner(scan);
int count = 0;
while (s.next(results)) {
  count++;
}
{code}
In trunk, count is 1. However, there is only one row to scan ("abc"). Once we
call next (and put that row's KeyValue-s in results), there are no more rows,
so I think the first call must return false (and thus count is 0). Please
correct me if I am wrong here.
* There was a possibly serious bug in RegionScanner in v3. It implicitly
assumed that the caller clears the results list between calls to
RegionScanner#next. If the caller doesn't do that, we may delete results from
older rows or even get stuck in an infinite loop. So I added a new field to
RegionScanner. KeyValue-s are first accumulated (or filtered) in this new
field. When next completes, they are added to outResults. I am not sure
whether this is necessary (no code in hbase reuses results).
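
To illustrate the last two points, here is a minimal, self-contained sketch.
It is not code from the patch and does not use any HBase classes: ToyScanner,
its buffer field and countTrueReturns are hypothetical names made up for this
example, and rows are plain lists of strings. It only assumes the contract
from the javadoc above (next appends one row and returns true if more rows
remain) and the idea of collecting a row in a private field before appending
it to the caller's list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ScannerContractSketch {

    /** Hypothetical stand-in for RegionScanner; a row is a list of cell strings. */
    static class ToyScanner {
        private final Iterator<List<String>> rows;
        // Scratch buffer: the current row is collected (or filtered) here first,
        // so the caller's list is never cleared or inspected.
        private final List<String> buffer = new ArrayList<>();

        ToyScanner(List<List<String>> rows) {
            this.rows = rows.iterator();
        }

        /**
         * Appends the next row's cells to outResults.
         * @return true if there are more rows, false if scanner is done
         */
        boolean next(List<String> outResults) {
            buffer.clear();              // only our own scratch state is reset
            if (rows.hasNext()) {
                buffer.addAll(rows.next());
            }
            outResults.addAll(buffer);   // append only; older entries stay intact
            return rows.hasNext();
        }
    }

    /** Same loop shape as in testStopRow: counts how often next returns true. */
    static int countTrueReturns(ToyScanner s, List<String> results) {
        int count = 0;
        while (s.next(results)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> oneRow =
            Arrays.asList(Arrays.asList("abc/col1", "abc/col2"));
        List<String> results = new ArrayList<>();
        results.add("left-over");        // caller did not clear the list
        int count = countTrueReturns(new ToyScanner(oneRow), results);
        System.out.println("count = " + count);
        System.out.println("results = " + results);
    }
}
```

With a single row, the first call to next already returns false, so the loop
body never runs and count stays 0, while the left-over entry in results is
preserved and the row's cells are appended after it.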
> Filter#filterRow is called too often, filters rows it shouldn't have
> --------------------------------------------------------------------
>
> Key: HBASE-1647
> URL: https://issues.apache.org/jira/browse/HBASE-1647
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Doğacan Güney
> Fix For: 0.20.0
>
> Attachments: HBASE-1647-v2.patch, HBASE-1647-v3.patch,
> HBASE-1647-v4.patch, ScanBug.java, scanfilter.patch
>
>
> Filter#filterRow is called from ScanQueryMatcher#filterEntireRow, which is
> called from StoreScanner.next. However, if I understood the code correctly,
> StoreScanner processes KeyValue-s in a column-oriented order (i.e. after
> row1-col1 comes row2-col1, not row1-col2). Thus, when filterEntireRow is
> called, the filter has in reality processed (via filterKeyValue) only one
> column of a row.