[ 
https://issues.apache.org/jira/browse/ACCUMULO-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864219#comment-15864219
 ] 

Christopher Tubbs commented on ACCUMULO-4586:
---------------------------------------------

Currently, {{RowIterator}} provides grouping in the same way that the Linux 
command {{uniq}} does. That is to say, it groups only adjacent items that 
match, rather than grouping all matching items in the stream. I think that, 
just as with {{uniq}}, partial grouping of unsorted data is still a valid use case.
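
As a rough illustration of the {{uniq}}-style behavior described above, here is 
a minimal sketch in plain Java. It is not the actual {{RowIterator}} code: the 
class {{AdjacentGrouping}} and the method {{groupAdjacent}} are hypothetical 
names, and plain strings stand in for row IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of Accumulo.
class AdjacentGrouping {
    // Groups consecutive equal items, the way uniq does. It never looks
    // back, so equal items separated by other items land in separate groups.
    static List<List<String>> groupAdjacent(Iterator<String> rows) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = null;
        while (rows.hasNext()) {
            String row = rows.next();
            // Start a new group whenever the row differs from the current group's row.
            if (current == null || !current.get(0).equals(row)) {
                current = new ArrayList<>();
                groups.add(current);
            }
            current.add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Unsorted input: "a" appears again after "b".
        System.out.println(groupAdjacent(Arrays.asList("a", "a", "b", "a").iterator()));
        // prints [[a, a], [b], [a]]
    }
}
```

With unsorted input, the second run of "a" becomes its own group. That is the 
partial grouping in question: surprising if sorted input was assumed, but 
well defined.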

The proposed "fix" removes a perfectly valid use of {{RowIterator}}, changing 
the behavior to favor a different use case. This is exactly the kind of thing 
we complain about Thrift doing, as in THRIFT-1805. I don't consider the current 
behavior to be a bug, but I do recognize that it is prone to being used 
incorrectly, or with invalid assumptions.

Rather than change the behavior of the existing API to accommodate a subset of 
use cases, I would prefer a new, alternate API to replace it, which imposes the 
desired restrictions.
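
One possible shape for such an alternate API, sketched under assumptions: a 
wrapper that rejects out-of-order input and could be placed in front of the 
existing grouping by callers who want the restriction. 
{{SortedCheckingIterator}} is a hypothetical name, not an existing Accumulo 
class, and the sketch uses generic {{Comparable}} elements rather than 
Accumulo's key types.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical wrapper, not part of Accumulo: passes elements through
// unchanged, but fails as soon as an element sorts before its predecessor.
class SortedCheckingIterator<T extends Comparable<T>> implements Iterator<T> {
    private final Iterator<T> source;
    private T previous; // last element returned, or null before the first call

    SortedCheckingIterator(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        T current = source.next();
        // Equal neighbors are allowed; only a strictly smaller element is an error.
        if (previous != null && previous.compareTo(current) > 0) {
            throw new IllegalStateException(
                "Unsorted input: " + current + " seen after " + previous);
        }
        previous = current;
        return current;
    }

    public static void main(String[] args) {
        Iterator<String> it =
            new SortedCheckingIterator<>(Arrays.asList("a", "b", "a").iterator());
        it.next(); // "a"
        it.next(); // "b"
        it.next(); // throws IllegalStateException because "a" sorts before "b"
    }
}
```

Keeping the check in a separate type leaves the existing behavior available 
to callers who rely on partial grouping.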

> Make rowiterator fail when unsorted data is observed
> ----------------------------------------------------
>
>                 Key: ACCUMULO-4586
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4586
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.6.6, 1.7.1, 1.8.0
>            Reporter: Keith Turner
>             Fix For: 2.0.0
>
>
> A batchscanner was used as a row iterator data source.  The rowiterator 
> expects data in sorted order and the batch scanner does not supply data in 
> sorted order.  The row iterator should have a sanity check to ensure source 
> data is in sorted order.
> https://lists.apache.org/thread.html/c24448d171d8414321bccfc778c7fc8b53e45892cae9daafa220503f@%3Cuser.accumulo.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
