[ 
https://issues.apache.org/jira/browse/ACCUMULO-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552917#comment-13552917
 ] 

Keith Turner commented on ACCUMULO-625:
---------------------------------------

For the unique column case, I think it would be OK if the iterator considered 
more than one row.  The iterator could drop any key that contains a column it 
has already seen.  It would start with an empty set of seen columns each time 
it is initialized, so the same column can still come through once per iterator 
session.  The M/R job would therefore still need to do the final unique; the 
iterator would just do a lot of the filtering.  If we supported stateful 
iterators, then maybe the 100 most recently seen columns could be maintained 
in that state across iterator sessions.
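
A rough sketch of the filtering idea, in plain Java rather than against the 
real Accumulo iterator interfaces.  Everything named here is hypothetical: the 
class SeenColumnFilter, its accept() and scan() methods, and the choice to 
represent a column as a String.  The bound on the seen set stands in for the 
"100 most recently seen columns" idea; how that state would actually be saved 
across sessions is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a filtering iterator; not the Accumulo API. */
final class SeenColumnFilter {
    private final int maxColumns;
    // Access-ordered map used as a bounded, least-recently-seen-first set.
    private final LinkedHashMap<String, Boolean> seen;

    /** Starts with an empty seen set, as a freshly initialized iterator would. */
    SeenColumnFilter(int maxColumns) {
        this.maxColumns = maxColumns;
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                // Evict the least recently seen column once over the bound.
                return size() > SeenColumnFilter.this.maxColumns;
            }
        };
    }

    /** Returns true if a key with this column should be passed along. */
    boolean accept(String column) {
        // get() also refreshes the column's position in the access order.
        if (seen.get(column) != null) {
            return false; // column seen before in this session: drop the key
        }
        seen.put(column, Boolean.TRUE);
        return true;
    }

    /** Applies the filter to the columns of keys in scan order, across rows. */
    static List<String> scan(SeenColumnFilter filter, List<String> columns) {
        List<String> emitted = new ArrayList<>();
        for (String column : columns) {
            if (filter.accept(column)) {
                emitted.add(column);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("cf:a", "cf:b", "cf:a", "cf:c", "cf:b");
        // One session: repeated columns are dropped.
        System.out.println(scan(new SeenColumnFilter(100), columns));
        // A new session starts empty, so "cf:a" is emitted again.
        System.out.println(scan(new SeenColumnFilter(100), Arrays.asList("cf:a")));
    }
}
```

Because each new SeenColumnFilter starts empty, a column emitted in one 
session can be emitted again in the next, which is why the final unique in the 
M/R job is still needed.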
                
> consider augmenting session state with "breadcrumbs"
> ----------------------------------------------------
>
>                 Key: ACCUMULO-625
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-625
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Eric Newton
>            Assignee: Keith Turner
>
> Presently, the iterator stack can be created and destroyed at the whim of the 
> tserver and its buffering needs.  In complex iterations, lower-level 
> iterators can make significant progress which is not inherently obvious in 
> any returned key.  When the iterator stack is re-created to continue a query, 
> the last key returned is used to {{seek()}} the iterators.  Lower-level 
> iterators must re-scan their data to move back to the old position.
> Consider a mechanism to save progress beyond the last key returned.
>   
