[jira] Commented: (HADOOP-1531) Add RowFilter to HRegion.HScanner

stack (JIRA) Tue, 03 Jul 2007 10:31:26 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509948 ]


stack commented on HADOOP-1531:
-------------------------------

Regarding filterAllRemaining above: should I add another version of the regex 
filter if I just want a scanner to return all row keys that match apache.com 
(and to give up scanning once it leaves the row keys that contain that 
domain)?  Thanks for setting me right on final(Text, Text, byte []).  In 
RegExpRowFilter.filter(Text), it's returning true if the regex matches.  
Shouldn't that be inverted?
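
To make the question concrete, here is a minimal sketch of a row-key filter 
that gives up once the scan has left the keys of interest.  It is not taken 
from the patch: the class name PrefixStopFilter, the use of String rather than 
Text, prefix matching instead of a regex, and the assumption that the scanner 
hands over keys in sorted order are all hypothetical.  It also assumes the 
convention that filter(rowKey) returning true means "filter this row out".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the RowFilterInterface from the patch.
public class PrefixStopFilter {
    private final String prefix;
    private boolean pastRange = false;

    public PrefixStopFilter(String prefix) {
        this.prefix = prefix;
    }

    /** Assumed convention: true means the row is filtered OUT. */
    public boolean filter(String rowKey) {
        if (rowKey.startsWith(prefix)) {
            return false; // keep rows that carry the prefix
        }
        // Keys arrive sorted (assumption), so a non-matching key that sorts
        // after the prefix means no later key can match either.
        if (rowKey.compareTo(prefix) > 0) {
            pastRange = true;
        }
        return true;
    }

    /** True once the scanner may stop asking about further rows. */
    public boolean filterAllRemaining() {
        return pastRange;
    }

    /** Stand-in for a scanner loop: stops early when the filter says so. */
    public static List<String> scan(List<String> sortedKeys,
                                    PrefixStopFilter f) {
        List<String> kept = new ArrayList<>();
        for (String key : sortedKeys) {
            if (f.filterAllRemaining()) {
                break;
            }
            if (!f.filter(key)) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

With this shape the "inverted" sense is explicit: a matching key makes 
filter() return false, because false means "do not filter".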

Regarding exporting my Eclipse formatter settings, I would not inflict them on 
others.  I lose interest in configuring Eclipse after about the tenth dialog 
box, so at a minimum they are incomplete.  IIRC, others have posted 'hadoop' 
Eclipse formatters to the mailing list.

On wrapping after the operator, it does not appear as an option in the Eclipse 
formatter settings (I'm looking at Eclipse 3.3); it always wants to wrap 
before.  Odd, because if you use Eclipse to break a long string, it leaves the 
'+' as the last character on the wrapped line (which is what I want).  I 
believe I first read of hanging operators as an indicator of continuation when 
wrapping a line in 'Java Elements of Style'.  It made sense to me.
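
For illustration only, the two wrapping styles look like this.  The class and 
method names are made up for the example; both methods build the same string 
and differ only in where the '+' sits.

```java
// Hypothetical example class; only the line-wrapping style matters here.
public class WrapStyle {
    // Hanging operator: the trailing '+' signals that the expression
    // continues on the next line.
    static String hanging(String table, String row) {
        return "Scanning table " + table + " starting at row " +
            row;
    }

    // Wrap before the operator: the continuation line starts with '+'.
    static String leading(String table, String row) {
        return "Scanning table " + table + " starting at row "
            + row;
    }
}
```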

> Add RowFilter to HRegion.HScanner
> ---------------------------------
>
>                 Key: HADOOP-1531
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>    Affects Versions: 0.14.0
>            Reporter: James Kennedy
>            Assignee: James Kennedy
>         Attachments: eclipse.preferences, RowFilter-v2.patch, 
> RowFilter-v3.patch, RowFilter.patch
>
>
> I've implemented a RowFilterInterface and a RowFilter implementation.  This 
> is passed to the HRegion.HScanner via HClient.openScanner() though it is an 
> entirely optional parameter.
> HScanner applies the filter in the next() call by iterating until it 
> encounters a row that is not filtered by the RowFilter.  The filter applies 
> criteria based on row keys and/or column data values.
> Null values are a little tricky since the resultSet in that loop may represent 
> nulls as absent columns or as DELETED_BYTES.  Nevertheless null cases are 
> taken care of by the filter and you can for example retrieve all rows where 
> column X = null.
> The initial RowFilter implementation is limited in several ways:
> * Equality test only with literal values. No !=, <, >, etc. No col1 == col2. 
> This is a straight-up byte[] comparison.
> * Multiple column criteria are treated as an implicit conjunction, no 
> disjunction possible.
> * row key criteria is a regular expression only
> * row key criteria is independent of column criteria. No "if 
> rowkey.matches(A)  and col1==B"  although the interface is created to allow 
> for that.
> But it should be easy to write an improved RowFilterInterface implementation 
> to take care of most of the above without having to change code elsewhere.
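
The behaviour described in the quoted summary can be sketched roughly as 
follows.  This is an illustration under stated assumptions, not the code in 
the attached patches: the class FilteredScanSketch, the DELETED marker standing 
in for DELETED_BYTES, and the use of plain maps for rows and criteria are all 
hypothetical.  It shows column criteria combined as an implicit conjunction of 
byte[] equality tests, a null criterion matching either an absent column or 
the deleted marker, and a next-style loop that skips filtered rows.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch; names and data shapes are assumptions.
public class FilteredScanSketch {
    // Stand-in for the DELETED_BYTES marker mentioned in the description.
    static final byte[] DELETED = "DELETED".getBytes(StandardCharsets.UTF_8);

    /** A value counts as null if the column is absent or marked deleted. */
    static boolean isNull(byte[] value) {
        return value == null || Arrays.equals(value, DELETED);
    }

    /**
     * Returns true when the row should be filtered out. Every criterion
     * must hold (implicit conjunction); each is a literal byte[] equality
     * test, and a null expected value means "column is null".
     */
    static boolean filterRow(Map<String, byte[]> criteria,
                             Map<String, byte[]> columns) {
        for (Map.Entry<String, byte[]> c : criteria.entrySet()) {
            byte[] expected = c.getValue();
            byte[] actual = columns.get(c.getKey());
            boolean matches = (expected == null)
                ? isNull(actual)
                : !isNull(actual) && Arrays.equals(expected, actual);
            if (!matches) {
                return true;
            }
        }
        return false;
    }

    /** Iterates until a row passes the filter; null when rows run out. */
    static Map<String, byte[]> nextMatch(Iterator<Map<String, byte[]>> rows,
                                         Map<String, byte[]> criteria) {
        while (rows.hasNext()) {
            Map<String, byte[]> row = rows.next();
            if (!filterRow(criteria, row)) {
                return row;
            }
        }
        return null;
    }
}
```

Because the comparison is a straight byte[] equality, anything beyond 
"equals a literal" (such as <, >, or col1 == col2) would need a different 
filterRow in this sketch, which mirrors the limitations listed above.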

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
