Dylan,
If I recall correctly (which I give about 30% odds), the original purpose
of the side channel was to split up things like delete tombstone entries
from regular entries so that other iterators sitting on top of a
bifurcating iterator wouldn't have to handle the special tombstone
why you want to use a side channel instead of implementing the merge in
your own iterator
Here is a picture showing the difference:
Fig. A: Using a side channel to add a top-level iterator.
RfileIter1 RfileIter2 InjectIterator ...
| / /
|_/ /
o__*(3-way
top-level with respect to the side channel description is inverted with
respect to your diagram. Fig. A should be more like this:
RfileIter1 RfileIter2
| /
|_/
Merge
|
VersioningIterator
|
OtherIterators InjectIterator
| /
|__/
Merge
|
v
Thus,
If you can do a merge sort insertion, then you can guarantee order and
it's fine.
Yep, I guarantee the iterator we add as a side channel will emit tuples in
sorted order.
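
To make the sorted-order point concrete, here is a minimal sketch of the
two-way merge being discussed. It is not Accumulo code: plain Java
iterators over strings stand in for SortedKeyValueIterator and its
Key/Value pairs, and the class and method names (SideChannelMergeSketch,
merge) are made up for illustration. It only shows why the merged output
stays sorted as long as both inputs are sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only: plain Java iterators stand in for
// Accumulo's SortedKeyValueIterator, and strings stand in for Key/Value pairs.
class SideChannelMergeSketch {

    // Merge two iterators that each emit entries in sorted order.
    // The result is sorted only if both inputs are sorted.
    // Assumes neither input contains null (null marks an exhausted input).
    static List<String> merge(Iterator<String> main, Iterator<String> side) {
        List<String> out = new ArrayList<>();
        String a = main.hasNext() ? main.next() : null;
        String b = side.hasNext() ? side.next() : null;
        while (a != null || b != null) {
            // On ties, take from the main stream first.
            if (b == null || (a != null && a.compareTo(b) <= 0)) {
                out.add(a);
                a = main.hasNext() ? main.next() : null;
            } else {
                out.add(b);
                b = side.hasNext() ? side.next() : null;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "stored" plays the role of the regular iterator stack,
        // "injected" the iterator added as a side channel.
        List<String> stored = Arrays.asList("row1", "row3", "row5");
        List<String> injected = Arrays.asList("row2", "row4");
        System.out.println(merge(stored.iterator(), injected.iterator()));
    }
}
```

If the injected iterator emitted its entries out of order, this merge
would pass the disorder straight through, which is the ordering concern
raised in this thread.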
On a suggestion from David Medinets, I modified my testing code to use a
MiniAccumuloCluster set to 2 tablet servers. I
Hello all,
I've been toying with the registerSideChannel(iter) method
(https://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/IteratorEnvironment.html#registerSideChannel(org.apache.accumulo.core.iterators.SortedKeyValueIterator))
on the IteratorEnvironment passed to iterators
The main issue with adding data in an iterator is order. If you can do
a merge sort insertion, then you can guarantee order and it's fine. But if
you are inserting based on input, you cannot guarantee order, and it can
only be a scan iterator.
On Feb 15, 2015 8:03 PM, Dylan Hutchison