Re: Maintaining State in an Iterator

Keith Turner Wed, 12 Jul 2023 09:51:03 -0700

Made an inline comment below in an attempt to improve one of my comments.

On Wed, Jul 12, 2023 at 12:31 PM Keith Turner <ke...@deenlo.com> wrote:


> Responses inline below.
>
> On Thu, Jul 6, 2023 at 7:02 AM Logan Jones <lo...@codescratch.com> wrote:
>
>> Hi Keith:
>>
>> Thanks so much for the response. We will base things off the
>> RowEncodingIterator then.
>>
>> A few follow up questions out of curiosity:
>>
>>
>>    1. It is likely that my iterator will not return very many records
>>    because I'm hoping we don't have much invalid data. Should I be worried
>>    about the fact that it's not going to return much data? I guess I
>> should
>>
>
> When the in memory data is flushed/minor compacted if there are long
> running scans, Accumulo may copy the in memory data verbatim to a tmp file
> and transparently switch the scan to that in lieu of the in memory
> snapshot.  The scan cannot be switched to the minor compacted files
> because iterators may have run on it and the snapshot behavior could not be
> maintained.
>
>
>>    expect long running scans. And then, if a t-server dies, it just won't
>> know
>>    where to pick up from so entries might get re-scanned?
>>
>
> Correct, any progress would be lost.
>
>
>>    2. What are the criteria Accumulo uses to decide it's time to re-build
>>    an entire iterator stack? Feel free to point me at code and I can read
>> it
>>    from there.
>>
>
> There are three conditions I can think of.
>
> One condition is that Accumulo places a SourceSwitching[1] iterator at the
> lowest levels of the iterator stack which uses the ScanDataSource[2] to
> determine when to switch.  That in turn uses an atomic counter[3] that is
> incremented when files or the in memory map changes to determine if a
> switch is needed[4].
>

I did not describe that very well; the terms lowest/highest/top/bottom/etc.
could be ambiguous.  When a tserver reads data it has an iterator stack
that looks like this
"sourceSwitchingIter(userIters(systemIters(dataSources())))" in terms of
wrapping.  Data is read from the outer sourceSwitchingIter.  The outer
sourceSwitchingIter could possibly rebuild the inner
"userIters(systemIters(dataSources()))" that it wraps after they return a
key value.
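
A minimal sketch of that wrapping, in plain Java with no Accumulo dependency. Every name here (SourceSwitchingSketch, DataSource, SwitchingIterator, innerStack) is hypothetical and made up for illustration; the real logic lives in SourceSwitchingIterator and ScanDataSource. The point it tries to show is only that the outer iterator rebuilds what it wraps between returned entries, never in the middle of producing one.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

/** Toy model of sourceSwitchingIter(userIters(systemIters(dataSources()))); not Accumulo code. */
final class SourceSwitchingSketch {

    /** Hypothetical stand-in for a tablet's data sources plus its update counter. */
    static final class DataSource {
        private final TreeMap<String, String> data = new TreeMap<>();
        private int updateCount = 0;

        void put(String key, String value) {
            data.put(key, value);
            updateCount++; // bumped whenever the underlying data changes
        }

        int updateCount() {
            return updateCount;
        }

        /** Copy of the current data, so an open iterator never sees later writes. */
        TreeMap<String, String> snapshot() {
            return new TreeMap<>(data);
        }
    }

    /** Outer iterator: may rebuild the inner stack, but only between returned entries. */
    static final class SwitchingIterator implements Iterator<Map.Entry<String, String>> {
        private final DataSource source;
        // Assumption: the user and system iterators are modeled as one wrapping function.
        private final UnaryOperator<Iterator<Map.Entry<String, String>>> innerStack;
        private Iterator<Map.Entry<String, String>> inner;
        private int seenCount;
        private String lastKey; // last key handed to the caller, null before the first one
        int rebuilds = 0;

        SwitchingIterator(DataSource source,
                          UnaryOperator<Iterator<Map.Entry<String, String>>> innerStack) {
            this.source = source;
            this.innerStack = innerStack;
            this.seenCount = source.updateCount();
            this.inner = innerStack.apply(source.snapshot().entrySet().iterator());
        }

        @Override
        public boolean hasNext() {
            switchIfNeeded();
            return inner.hasNext();
        }

        @Override
        public Map.Entry<String, String> next() {
            switchIfNeeded();
            Map.Entry<String, String> entry = inner.next();
            lastKey = entry.getKey();
            return entry;
        }

        /** Rebuilds the inner stack on a fresh snapshot, resuming after the last returned key. */
        private void switchIfNeeded() {
            if (source.updateCount() == seenCount) {
                return;
            }
            seenCount = source.updateCount();
            NavigableMap<String, String> snap = source.snapshot();
            NavigableMap<String, String> rest =
                lastKey == null ? snap : snap.tailMap(lastKey, false);
            inner = innerStack.apply(rest.entrySet().iterator());
            rebuilds++;
        }
    }
}
```

Any state the inner iterators held is discarded when they are rebuilt, which is why per-row state only survives if nothing is returned until the row is complete.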


>
> Another condition is that Accumulo buffers scan data for scan[5] and batch
> scan[6] and when the buffer fills up it will send the batch of key values
> back to the client.  When the client gets the batch and requests another
> batch that will create a new iterator stack, unless it's an isolated
> scan[7]. When getting the next batch I think Accumulo will create a new
> Range where the first key is the last key seen non inclusive.
>
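
The batch behavior described above can be sketched in plain Java, with a sorted map standing in for a tablet. This is an illustration under assumptions, not Accumulo code: BatchContinuationSketch, nextBatch and scanAll are hypothetical names, and keys are plain strings rather than Accumulo Key objects. The only thing carried from one batch to the next is the last key seen, used as a non-inclusive start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of batch-by-batch scanning; not Accumulo code. */
final class BatchContinuationSketch {

    /**
     * Returns up to batchSize keys that sort strictly after lastKey.
     * A null lastKey means "start from the beginning".
     */
    static List<String> nextBatch(NavigableMap<String, String> table,
                                  String lastKey, int batchSize) {
        // Each call models a fresh iterator stack: nothing survives except lastKey.
        NavigableMap<String, String> remaining =
            lastKey == null ? table : table.tailMap(lastKey, false);
        List<String> batch = new ArrayList<>();
        for (String key : remaining.keySet()) {
            if (batch.size() == batchSize) {
                break; // buffer is full, hand the batch back to the client
            }
            batch.add(key);
        }
        return batch;
    }

    /** Client-side loop: keeps requesting batches until an empty one comes back. */
    static List<String> scanAll(NavigableMap<String, String> table, int batchSize) {
        List<String> all = new ArrayList<>();
        String lastKey = null;
        while (true) {
            List<String> batch = nextBatch(table, lastKey, batchSize);
            if (batch.isEmpty()) {
                return all;
            }
            all.addAll(batch);
            lastKey = batch.get(batch.size() - 1);
        }
    }

    /** Small sample table with two rows, used for illustration only. */
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("row1:a", "v1");
        table.put("row1:b", "v2");
        table.put("row2:a", "v3");
        return table;
    }
}
```

Note that a batch boundary can fall in the middle of a row in this model, which is the case an iterator keeping per-row state has to survive.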
> Another condition is when a tserver dies mid scan.
>
> [1]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/SourceSwitchingIterator.java#L45
> [2]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java#L56
> [3]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java#L158
> [4]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java#L113-L115
> [5]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/server/tserver/src/main/java/org/apache/accumulo/tserver/scan/NextBatchTask.java#L78
> [6]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/server/tserver/src/main/java/org/apache/accumulo/tserver/scan/LookupTask.java#L77
> [7]:
> https://github.com/apache/accumulo/blob/d4846d407e5b28482394e2c0baa16932ae35e086/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Scanner.java#L101-L110
>
>
>
>>
>> Thanks,
>>
>> - Logan
>>
>> On Wed, Jul 5, 2023 at 7:13 PM Keith Turner <ke...@deenlo.com> wrote:
>>
>> > There are two options for this.  One is to buffer the row in memory and
>> > encode it in your iterator like the whole row iterator does.  The other
>> is
>> > to use the isolated scanner[1][2], but this does not work for batch
>> scans.
>> >
>> > Accumulo should not tear iterators down until after they return
>> something,
>> > this is the behavior that the wholerowiterator relies on.  So if your
>> > iterator reads the entire row from its source iterator without returning
>> > anything then Accumulo will not do anything to the iterator or its data
>> > sources.  An iterator's data sources are the files and optionally a
>> snapshot
>> > of the in memory map.   After the top level iterator has returned a key
>> > value, it's possible that Accumulo could rebuild the iterator stack with
>> new
>> > data sources (like new files that arrived or a new snapshot of the in
>> > memory map).  This means you can use the trick of having the top level
>> > iterator not return anything until a row boundary is seen.
>> >
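
The row-boundary trick can be sketched in plain Java for the missing or duplicated column family case from the original question. This is a toy model under assumptions, not an Accumulo iterator: RowBufferingSketch, Entry, RowCheckIterator and flaggedRows are hypothetical names, and entries are reduced to a row and a column family. It reads an entire row from its source before returning anything, so the per-row count never has to survive a returned result.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy model of "return nothing until a row boundary is seen"; not Accumulo code. */
final class RowBufferingSketch {

    /** Hypothetical simplified entry: just a row id and a column family. */
    static final class Entry {
        final String row;
        final String cf;

        Entry(String row, String cf) {
            this.row = row;
            this.cf = cf;
        }
    }

    /** Emits the id of every row whose count of targetCf is not exactly one. */
    static final class RowCheckIterator implements Iterator<String> {
        private final Iterator<Entry> source; // assumed sorted by row
        private final String targetCf;
        private Entry pending;     // first unread entry, or null when the source is exhausted
        private String nextResult; // row waiting to be returned, or null

        RowCheckIterator(Iterator<Entry> source, String targetCf) {
            this.source = source;
            this.targetCf = targetCf;
            this.pending = source.hasNext() ? source.next() : null;
        }

        @Override
        public boolean hasNext() {
            while (nextResult == null && pending != null) {
                String row = pending.row;
                int count = 0;
                // Consume the whole row before deciding whether to return it.
                while (pending != null && pending.row.equals(row)) {
                    if (pending.cf.equals(targetCf)) {
                        count++;
                    }
                    pending = source.hasNext() ? source.next() : null;
                }
                if (count != 1) {
                    nextResult = row; // CF missing (0) or duplicated (more than 1)
                }
            }
            return nextResult != null;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = nextResult;
            nextResult = null;
            return result;
        }
    }

    /** Convenience helper: collects every flagged row from a sorted list of entries. */
    static List<String> flaggedRows(List<Entry> entries, String targetCf) {
        List<String> flagged = new ArrayList<>();
        RowCheckIterator it = new RowCheckIterator(entries.iterator(), targetCf);
        while (it.hasNext()) {
            flagged.add(it.next());
        }
        return flagged;
    }
}
```

The "invalid" check from the question is left out here since it needs no per-row state.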
>> > For isolated scans Accumulo will only tear down iterators and use new
>> data
>> > sources on row boundaries.   Enabling isolation on scanner[1] will cause
>> > the scanner to throw an isolation exception if a tablet server dies
>> while
>> > the client scanner is in the middle of reading a row.  The
>> > IsolatedScanner[2] wraps a scanner and hides the isolation exception by
>> > buffering rows and rereading them when an isolation exception occurs,
>> > making it easy to use isolated scans.
>> >
>> > The wholerowiterator handles a tablet server dying or data source
>> changing
>> > well because it encodes the entire row as a single key value, so if the
>> > client gets it then it will not request that row again.
>> >
>> > [1]:
>> >
>> >
>> https://accumulo.apache.org/docs/2.x/apidocs/org/apache/accumulo/core/client/Scanner.html#enableIsolation()
>> > [2]:
>> >
>> >
>> https://accumulo.apache.org/docs/2.x/apidocs/org/apache/accumulo/core/client/IsolatedScanner.html
>> >
>> >
>> >
>> > On Wed, Jul 5, 2023 at 10:53 AM Logan Jones <lo...@codescratch.com>
>> wrote:
>> >
>> > > Hello Mailing List:
>> > >
>> > > I have an iterator that will scan an entire table and only return keys
>> > that
>> > > match these criteria:
>> > >
>> > >    1. If a specific CF is "invalid" according to some criteria
>> > >    2. If a specific CF is missing on a row
>> > >    3. If there are multiple entries for a specific CF
>> > >
>> > > #1 would be easy to accomplish with a Filter, however #2 and #3 have
>> > proven
>> > > to be more tricky. As I understand the problem, Accumulo can, at any
>> > point,
>> > > destroy an iterator and re-call init. I am keeping some internal state
>> > > related to a row (namely a count of how many times I've seen that
>> > specific
>> > > CF).
>> > >
>> > > How can I keep the state I need for an entire row?
>> > >
>> > > I've looked at the RowEncodingIterator along with the
>> WholeRowIterator,
>> > but
>> > > based on my understanding, it feels like Accumulo should be allowed to
>> > > destroy their state at any time and cause them to effectively break.
>> Is
>> > > there a guarantee that an iterator won't get destroyed mid row?
>> > >
>> > > Thanks,
>> > >
>> > > - Logan
>> > >
>> >
>>
>
