[jira] Commented: (HBASE-1647) Filter#filterRow is called too often, filters rows it shouldn't have

JIRA Fri, 17 Jul 2009 04:19:42 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732469#action_12732469
 ]


Doğacan Güney commented on HBASE-1647:
--------------------------------------

>  # results is now a field for no reason. This reduces GC efficiency and 
> performance.

I explained why in my previous comment, though I am not sure my concern is a 
valid one. It seems results is always cleared in internal HBase usage, so my 
extra safeguard there may be pointless.

> RegionScanner#next is a mess now. Too many boolean flags, I don't detect a 
> sense of clear minded purpose. 
> Unbalanced and uncertain flags and filter.reset calls make me concerned about 
> bugs.

I see your point, but in other ways the code is also clearer now. All the 
extra logic that was outside the while loop has moved into the loop, and the 
stop row comparison code is now in one place.

I reduced the boolean flags to one (filterCurrentRow). It is an optimization 
flag, like stickyNextRow in the underlying scanners.
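
To make the idea of a single optimization flag concrete, here is a minimal 
sketch of one possible reading of it. This is not the code from the patch: 
SingleFlagScanSketch, scan, dropRowOn and filterCalls are hypothetical names, 
and the assumption is that the flag only records "this row is already 
rejected, skip its remaining cells without consulting the filter again".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SingleFlagScanSketch {
    /**
     * cells are {row, column} pairs, assumed to be sorted by row.
     * dropRowOn is a stand-in for a filter: it returns true when the
     * column it is shown means the whole row should be dropped.
     * filterCalls[0] counts how often the filter was consulted.
     */
    static List<String> scan(String[][] cells, Predicate<String> dropRowOn,
                             int[] filterCalls) {
        List<String> emitted = new ArrayList<>();
        String currentRow = null;
        boolean filterCurrentRow = false; // the single flag (assumed meaning)

        for (String[] cell : cells) {
            String row = cell[0];
            if (!row.equals(currentRow)) {
                // Row boundary: emit the finished row unless it was rejected,
                // then clear the flag for the new row.
                if (currentRow != null && !filterCurrentRow) {
                    emitted.add(currentRow);
                }
                currentRow = row;
                filterCurrentRow = false;
            }
            if (filterCurrentRow) {
                continue; // row already rejected: skip without asking the filter
            }
            filterCalls[0]++;
            if (dropRowOn.test(cell[1])) {
                filterCurrentRow = true;
            }
        }
        // Emit the last row if it was not rejected.
        if (currentRow != null && !filterCurrentRow) {
            emitted.add(currentRow);
        }
        return emitted;
    }

    public static void main(String[] args) {
        String[][] cells = {
            {"r1", "a"}, {"r1", "bad"}, {"r1", "c"}, {"r2", "a"}, {"r2", "b"}
        };
        int[] calls = {0};
        System.out.println(scan(cells, col -> col.equals("bad"), calls)); // [r2]
        System.out.println(calls[0]); // 4: the cell r1/c is never shown to the filter
    }
}
```

In this sketch the flag is set and cleared in exactly two places, which is 
the property that keeps it easy to reason about.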

I also refactored the code a bit. Let me know if it is clearer now.

> # The last bug one is tests were deleted, instead of migrated. We lose test 
> coverage with this patch.

I added tests to TestScanner.

> Filter#filterRow is called too often, filters rows it shouldn't have
> --------------------------------------------------------------------
>
>                 Key: HBASE-1647
>                 URL: https://issues.apache.org/jira/browse/HBASE-1647
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1647-v2.patch, HBASE-1647-v3.patch, 
> HBASE-1647-v4.patch, HBASE-1647-v5.patch, ScanBug.java, scanfilter.patch
>
>
> Filter#filterRow is called from ScanQueryMatcher#filterEntireRow, which is 
> called from StoreScanner.next. However, if I understood the code correctly, 
> StoreScanner processes KeyValues in a column-oriented order (i.e. after 
> row1-col1 comes row2-col1, not row1-col2). Thus, when filterEntireRow is 
> called, the filter has in reality processed (via filterKeyValue) only one 
> column of the row.
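
The ordering problem in the description above can be reproduced with a toy 
model. This is a minimal sketch, not HBase code: Cell, RequireColumnFilter, 
survivorsPerColumn and survivorsPerRow are hypothetical names made up for 
illustration, and the column-oriented cell order is taken from the 
description's own tentative reading of StoreScanner. The toy filter keeps a 
row only if it contains a given column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FilterRowSketch {
    /** Toy cell: just a row key and a column name. */
    static final class Cell {
        final String row, column;
        Cell(String row, String column) { this.row = row; this.column = column; }
    }

    /** Toy filter: a row survives only if the required column was seen. */
    static final class RequireColumnFilter {
        private final String required;
        private boolean seen;
        RequireColumnFilter(String required) { this.required = required; }
        void reset() { seen = false; }
        void filterKeyValue(Cell c) { if (c.column.equals(required)) seen = true; }
        boolean filterRow() { return !seen; } // true = drop the row
    }

    /** Column-oriented order: row1-col1, row2-col1, then row1-col2. */
    static List<Cell> sampleCells() {
        return Arrays.asList(new Cell("row1", "col1"),
                             new Cell("row2", "col1"),
                             new Cell("row1", "col2"));
    }

    /** Asks filterRow after every single cell, i.e. after one column only. */
    static List<String> survivorsPerColumn(List<Cell> cells, RequireColumnFilter f) {
        Set<String> rows = new LinkedHashSet<>();
        Set<String> dropped = new HashSet<>();
        for (Cell c : cells) {
            rows.add(c.row);
            f.reset();
            f.filterKeyValue(c);
            if (f.filterRow()) dropped.add(c.row);
        }
        rows.removeAll(dropped);
        return new ArrayList<>(rows);
    }

    /** Asks filterRow once per row, after all of its cells were processed. */
    static List<String> survivorsPerRow(List<Cell> cells, RequireColumnFilter f) {
        Map<String, List<Cell>> byRow = new LinkedHashMap<>();
        for (Cell c : cells) {
            byRow.computeIfAbsent(c.row, k -> new ArrayList<>()).add(c);
        }
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, List<Cell>> e : byRow.entrySet()) {
            f.reset();
            for (Cell c : e.getValue()) f.filterKeyValue(c);
            if (!f.filterRow()) kept.add(e.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        RequireColumnFilter f = new RequireColumnFilter("col2");
        System.out.println(survivorsPerColumn(sampleCells(), f)); // []
        System.out.println(survivorsPerRow(sampleCells(), f));    // [row1]
    }
}
```

In the per-column variant row1 is dropped as soon as its col1 cell is seen, 
even though it does contain col2, which matches the "filters rows it 
shouldn't have" symptom in the issue title.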

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
