[jira] Commented: (HADOOP-1531) Add RowFilter to HRegion.HScanner

stack (JIRA) Wed, 27 Jun 2007 10:47:46 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508601 ]


stack commented on HADOOP-1531:
-------------------------------

Nice addition, James.

Do you think HADOOP-1439 should be done as a filter?

In RowFilterInterface.java and elsewhere, you open the class javadoc comment 
with a '<p>'.  Superfluous?

Why have this constructor:

+  /** Default constructor, filters nothing. */
+  public RowFilter() {
+    // nada
+  }

Is it needed?

Why pass row regexes on construction but use a setter for adding column filters 
rather than pass both to the constructor?

Should these additions go into a filter subpackage? (It's getting a little 
crowded in the hbase home directory.)

Fix the auto-formatting in HRegion.next (the tests are split over lines, which 
makes them harder to follow).

As you state, looks like a feature that would benefit from basic unit tests.

Patch applied cleanly for me to r551038.



> Add RowFilter to HRegion.HScanner
> ---------------------------------
>
>                 Key: HADOOP-1531
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>    Affects Versions: 0.14.0
>            Reporter: James Kennedy
>            Assignee: James Kennedy
>         Attachments: RowFilter.patch
>
>
> I've implemented a RowFilterInterface and a RowFilter implementation.  This 
> is passed to the HRegion.HScanner via HClient.openScanner() though it is an 
> entirely optional parameter.
> HScanner applies the filter in the next() call by iterating until it 
> encounters a row that is not filtered by the RowFilter.  The filter applies 
> criteria based on row keys and/or column data values.
> Null values are a little tricky since the resultSet in that loop may represent 
> nulls as absent columns or as DELETED_BYTES.  Nevertheless, null cases are 
> taken care of by the filter, and you can, for example, retrieve all rows where 
> column X = null.
> The initial RowFilter implementation is limited in several ways:
> * Equality test only with literal values. No !=, <, >, etc. No col1 == col2. 
> This is a straight-up byte[] comparison.
> * Multiple column criteria are treated as an implicit conjunction, no 
> disjunction possible.
> * row key criteria is a regular expression only
> * row key criteria is independent of column criteria. No "if 
> rowkey.matches(A)  and col1==B"  although the interface is created to allow 
> for that.
> But it should be easy to write an improved RowFilterInterface implementation 
> to take care of most of the above without having to change code elsewhere.
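
For readers without the patch at hand, the behaviour described above can be 
sketched roughly as follows. This is not the code from RowFilter.patch: the 
class RowFilterSketch, SimpleRowFilter, addColumnCriterion, the column names, 
and the value used for DELETED_BYTES are all hypothetical stand-ins, and the 
table is a plain in-memory map. It only illustrates the stated semantics: an 
optional row key regex, byte[] equality on columns combined as a conjunction, 
nulls represented either as absent columns or as a deletion marker, and a 
next() that keeps iterating until a row passes the filter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class RowFilterSketch {
    // Hypothetical stand-in for the DELETED_BYTES marker; the real value is not given here.
    public static final byte[] DELETED_BYTES = bytes("<deleted>");

    /** Hypothetical filter: optional row-key regex plus "column == literal" criteria, ANDed. */
    public static final class SimpleRowFilter {
        private final Pattern rowKeyPattern; // null means "accept any row key"
        private final Map<String, byte[]> columnCriteria = new LinkedHashMap<>();

        public SimpleRowFilter(String rowKeyRegex) {
            this.rowKeyPattern = rowKeyRegex == null ? null : Pattern.compile(rowKeyRegex);
        }

        /** Require column == expected; a null expected value means "column must be null". */
        public SimpleRowFilter addColumnCriterion(String column, byte[] expected) {
            columnCriteria.put(column, expected);
            return this;
        }

        public boolean accepts(String rowKey, Map<String, byte[]> columns) {
            if (rowKeyPattern != null && !rowKeyPattern.matcher(rowKey).matches()) {
                return false;
            }
            for (Map.Entry<String, byte[]> c : columnCriteria.entrySet()) {
                byte[] actual = columns.get(c.getKey());
                // A null is either an absent column or the deletion marker.
                boolean isNull = actual == null || Arrays.equals(actual, DELETED_BYTES);
                if (c.getValue() == null) {
                    if (!isNull) return false;
                } else if (isNull || !Arrays.equals(actual, c.getValue())) {
                    return false; // straight byte[] comparison, equality only
                }
            }
            return true; // every criterion held (implicit conjunction)
        }
    }

    /** Advance until a row passes the filter; return null when the rows run out. */
    public static Map.Entry<String, Map<String, byte[]>> next(
            Iterator<Map.Entry<String, Map<String, byte[]>>> rows, SimpleRowFilter filter) {
        while (rows.hasNext()) {
            Map.Entry<String, Map<String, byte[]>> row = rows.next();
            // The filter is optional: a null filter lets every row through.
            if (filter == null || filter.accepts(row.getKey(), row.getValue())) {
                return row;
            }
        }
        return null;
    }

    /** Collect the keys of all rows that next() returns for the given filter. */
    public static List<String> scan(Map<String, Map<String, byte[]>> table, SimpleRowFilter filter) {
        List<String> keys = new ArrayList<>();
        Iterator<Map.Entry<String, Map<String, byte[]>>> it = table.entrySet().iterator();
        for (Map.Entry<String, Map<String, byte[]>> row = next(it, filter);
                row != null; row = next(it, filter)) {
            keys.add(row.getKey());
        }
        return keys;
    }

    public static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    private static void addRow(Map<String, Map<String, byte[]>> t, String key, String city, byte[] x) {
        Map<String, byte[]> cols = new LinkedHashMap<>();
        cols.put("info:city", bytes(city));
        if (x != null) cols.put("info:x", x); // a null argument leaves the column absent
        t.put(key, cols);
    }

    /** Made-up sample data, in insertion order. */
    public static Map<String, Map<String, byte[]>> sampleTable() {
        Map<String, Map<String, byte[]>> t = new LinkedHashMap<>();
        addRow(t, "user1", "Paris", bytes("1"));
        addRow(t, "user2", "Paris", null);
        addRow(t, "user3", "Rome", DELETED_BYTES);
        addRow(t, "admin1", "Paris", DELETED_BYTES);
        return t;
    }

    public static void main(String[] args) {
        Map<String, Map<String, byte[]>> table = sampleTable();
        SimpleRowFilter parisUsers =
            new SimpleRowFilter("user.*").addColumnCriterion("info:city", bytes("Paris"));
        System.out.println(scan(table, parisUsers)); // [user1, user2]
        SimpleRowFilter xIsNull = new SimpleRowFilter(null).addColumnCriterion("info:x", null);
        System.out.println(scan(table, xIsNull));    // [user2, user3, admin1]
    }
}
```

Note that in this sketch the row key regex and the column criteria are 
evaluated independently and then ANDed, which mirrors the limitation listed 
above; a richer implementation of the same accepts() contract could combine 
them differently without touching the scanning loop.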

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
