[jira] Commented: (HBASE-1647) Filter#filterRow is called too often, filters rows it shouldn't have

ryan rawson (JIRA) Thu, 16 Jul 2009 17:27:41 -0700

    [ https://issues.apache.org/jira/browse/HBASE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732262#action_12732262 ]


ryan rawson commented on HBASE-1647:
------------------------------------

There are some issues that need to be addressed before this can go in:

- results is now a field for no reason. This reduces GC efficiency and 
performance.
- RegionScanner#next is a mess now. There are too many boolean flags, and I 
don't detect a clear-minded sense of purpose. The unbalanced, uncertain flags 
and filter.reset calls make me concerned about bugs.
- The last one is that tests were deleted instead of migrated. We lose test 
coverage with this patch.

I'm poking at it more, but the RegionScanner#next and test issues are 
showstoppers.

> Filter#filterRow is called too often, filters rows it shouldn't have
> --------------------------------------------------------------------
>
>                 Key: HBASE-1647
>                 URL: https://issues.apache.org/jira/browse/HBASE-1647
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1647-v2.patch, HBASE-1647-v3.patch, 
> HBASE-1647-v4.patch, ScanBug.java, scanfilter.patch
>
>
> Filter#filterRow is called from ScanQueryMatcher#filterEntireRow which is 
> called from StoreScanner.next. However, if I understood the code correctly, 
> StoreScanner processes KeyValue-s in a column-oriented order (i.e. after 
> row1-col1 comes row2-col1, not row1-col2). Thus, when filterEntireRow is 
> called, the filter has in reality processed (via filterKeyValue) only one 
> column of a row.
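
To illustrate the behaviour described in the quoted report, here is a minimal 
sketch. It is a toy model, not HBase code: the Cell class, RequireCol2Filter, 
and both keptWhen... methods are hypothetical names made up for this example, 
and the filter is reduced to three methods that only loosely mirror 
filterKeyValue, filterRow, and reset. It assumes the report's reading of the 
code is right, namely that the row-level decision is requested after the 
filter has seen only one column of the row.

```java
import java.util.Arrays;
import java.util.List;

public class FilterRowSketch {

    /** Hypothetical stand-in for a KeyValue: just a row key and a column name. */
    public static final class Cell {
        final String row;
        final String column;

        public Cell(String row, String column) {
            this.row = row;
            this.column = column;
        }
    }

    /** Toy filter: a row is filtered out unless one of its cells is in "col2". */
    public static final class RequireCol2Filter {
        private boolean sawCol2 = false;

        void reset() {
            sawCol2 = false;
        }

        void filterKeyValue(Cell cell) {
            if (cell.column.equals("col2")) {
                sawCol2 = true;
            }
        }

        /** Returns true when the whole row should be filtered out. */
        boolean filterRow() {
            return !sawCol2;
        }
    }

    /** Asks for the row decision after every column: the pattern the report describes. */
    public static boolean keptWhenAskedPerColumn(List<Cell> rowCells) {
        RequireCol2Filter filter = new RequireCol2Filter();
        filter.reset();
        for (Cell cell : rowCells) {
            filter.filterKeyValue(cell);
            if (filter.filterRow()) {
                return false; // decided after seeing only part of the row
            }
        }
        return true;
    }

    /** Feeds every cell of the row to the filter, then asks for the decision once. */
    public static boolean keptWhenAskedOncePerRow(List<Cell> rowCells) {
        RequireCol2Filter filter = new RequireCol2Filter();
        filter.reset();
        for (Cell cell : rowCells) {
            filter.filterKeyValue(cell);
        }
        return !filter.filterRow();
    }

    public static void main(String[] args) {
        List<Cell> row1 = Arrays.asList(new Cell("row1", "col1"), new Cell("row1", "col2"));
        System.out.println(keptWhenAskedPerColumn(row1));  // false: row wrongly dropped
        System.out.println(keptWhenAskedOncePerRow(row1)); // true: row kept
    }
}
```

In this model row1 does contain col2, so it should pass the filter. Asking 
for the row decision after col1 alone drops it, which matches the "filters 
rows it shouldn't have" symptom in the issue title.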

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
