[ 
https://issues.apache.org/jira/browse/ACCUMULO-403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-403:
----------------------------------

    Fix Version/s:     (was: 1.5.0)
                   1.4.1
    
> Create general row selection iterator
> -------------------------------------
>
>                 Key: ACCUMULO-403
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-403
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client, tserver
>            Reporter: Keith Turner
>            Assignee: Billie Rinaldi
>             Fix For: 1.4.1
>
>
> The WholeRowIterator supports filtering rows that meet certain criteria.
> However, it reads the entire row into memory.  It is possible to select rows
> efficiently without reading them into memory by using two iterators: one
> for selection and one for reading.  When the selection iterator determines
> that a row is not needed, seek the read iterator past the row.
> This pattern could be made into an easy-to-use iterator that users extend.
> The iterator could have an abstract method that users implement to decide
> whether to select or filter out a row.  It could look something like the
> following.
> {noformat}
> abstract class RowSelectionIterator extends WrappingIterator {
>    public abstract boolean selectRow(SortedKeyValueIterator<Key,Value> row) throws IOException;
> }
> {noformat}
> Below is a simple example of a row selection iterator that returns rows that 
> have the columns foo and bar.
> {noformat}
> class FooBarRowSelector extends RowSelectionIterator {
>    public boolean selectRow(SortedKeyValueIterator<Key,Value> rowIter) throws IOException {
>       Text row = rowIter.getTopKey().getRow();
>       // An empty set with inclusive=false means no column family filtering.
>       Set<ByteSequence> allFamilies = Collections.emptySet();
>       // Seek instead of scanning; this is more efficient for large rows with
>       // lots of columns. If the row only has a few columns, scanning is
>       // probably faster. Seeking the columns in sorted order is also more
>       // efficient.
>       rowIter.seek(Range.exact(row, new Text("bar")), allFamilies, false);
>       boolean sawBar = rowIter.hasTop();
>       if (!sawBar)
>         return false;
>       rowIter.seek(Range.exact(row, new Text("foo")), allFamilies, false);
>       boolean sawFoo = rowIter.hasTop();
>       return sawFoo;
>    }
> }
> {noformat}
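
The two-iterator pattern described in the issue can be tried out without an Accumulo cluster. The following is a minimal sketch in plain Java, assuming a flat sorted map stands in for the sorted key/value stream. Every name in it (RowSelectionSketch, FooBarSelector, RowSelectionDemo, the "row + separator + column" key layout) is hypothetical and is not part of the Accumulo API. A sub-map view limited to one row plays the role of the selection iterator, and a jump to the first key after the row plays the role of seeking the read iterator past a rejected row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical plain-Java model of the pattern; none of these names are Accumulo API. */
abstract class RowSelectionSketch {
    static final char SEP = '\u0000';      // separates row from column in a flat sorted key
    static final char ROW_END = '\u0001';  // row + ROW_END sorts after every key of that row

    static String key(String row, String column) {
        return row + SEP + column;
    }

    /** Decide, from a view limited to one row, whether that row should be returned. */
    abstract boolean selectRow(String row, NavigableMap<String, String> rowView);

    /** Returns "row:column" for every entry of every selected row, in sorted order. */
    List<String> scan(NavigableMap<String, String> data) {
        List<String> out = new ArrayList<>();
        String k = data.isEmpty() ? null : data.firstKey();
        while (k != null) {
            String row = k.substring(0, k.indexOf(SEP));
            String rowEnd = row + ROW_END;
            // Selection cursor: a view restricted to the current row only.
            NavigableMap<String, String> rowView =
                data.subMap(key(row, ""), true, rowEnd, false);
            if (selectRow(row, rowView)) {
                // Read cursor: emit the entries of the selected row.
                for (String rowKey : rowView.keySet()) {
                    out.add(rowKey.replace(SEP, ':'));
                }
            }
            // Continue at the first key of the next row. For a rejected row this
            // is the "seek" that skips its entries without reading them.
            k = data.ceilingKey(rowEnd);
        }
        return out;
    }
}

/** Selects rows that contain both column "bar" and column "foo". */
class FooBarSelector extends RowSelectionSketch {
    @Override
    boolean selectRow(String row, NavigableMap<String, String> rowView) {
        // Point lookups stand in for seeks; "bar" is checked first (sorted order).
        return rowView.containsKey(key(row, "bar"))
            && rowView.containsKey(key(row, "foo"));
    }
}

class RowSelectionDemo {
    /** Sample data: r1 and r3 have both columns, r2 only has "bar". */
    static NavigableMap<String, String> sample() {
        NavigableMap<String, String> data = new TreeMap<>();
        String[][] entries = {
            {"r1", "bar"}, {"r1", "foo"}, {"r1", "zap"},
            {"r2", "bar"},
            {"r3", "foo"}, {"r3", "bar"}, {"r3", "baz"}
        };
        for (String[] e : entries) {
            data.put(RowSelectionSketch.key(e[0], e[1]), "v");
        }
        return data;
    }

    public static void main(String[] args) {
        System.out.println(new FooBarSelector().scan(sample()));
    }
}
```

In this model, row r2 is rejected after a single failed lookup for "foo" and its entries are never iterated. A real iterator would also have to handle scan ranges and re-seeks, which this sketch leaves out.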

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
