OK - thanks for the update.

So, just to check my understanding: except in cases of failure, regular ole 
Scanners will provide a consistent view of atomic mutations. If consistent 
rows are required even in the presence of failures, then one should use the 
IsolatedScanner, which adds restart semantics upon detection of a failure 
that could threaten row consistency. Is that right?
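
To make sure we mean the same thing by "restart semantics", here is a rough sketch of the idea as I understand it. It does not use the real Accumulo classes: IsolationFault, Source, FlakySource, Entry, and readRows are made-up stand-ins for illustration, and the actual IsolatedScanner may well work differently (for example, it can buffer rows on disk). The sketch assumes row ids are non-empty strings in sorted order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class IsolatedReadSketch {

    /** Hypothetical stand-in for the exception raised on an isolation fault. */
    static final class IsolationFault extends RuntimeException {}

    /** Minimal key/value entry: only the row id and a value. */
    static final class Entry {
        final String row;
        final String value;
        Entry(String row, String value) { this.row = row; this.value = value; }
    }

    /** Hypothetical scan source: iterates all entries whose row is >= startRow. */
    interface Source {
        Iterator<Entry> scanFrom(String startRow);
    }

    /** Test source that throws one IsolationFault when it first reaches entry index faultAt. */
    static final class FlakySource implements Source {
        final List<Entry> entries;
        final int faultAt;
        boolean faulted = false;

        FlakySource(List<Entry> entries, int faultAt) {
            this.entries = entries;
            this.faultAt = faultAt;
        }

        @Override
        public Iterator<Entry> scanFrom(String startRow) {
            int first = 0;
            while (first < entries.size() && entries.get(first).row.compareTo(startRow) < 0) {
                first++;
            }
            final int start = first;
            return new Iterator<Entry>() {
                int pos = start;
                @Override public boolean hasNext() { return pos < entries.size(); }
                @Override public Entry next() {
                    if (pos == faultAt && !faulted) {
                        faulted = true;               // fail only once
                        throw new IsolationFault();
                    }
                    return entries.get(pos++);
                }
            };
        }
    }

    /** Buffers one row at a time; on a fault, drops the partial row and rescans from its start. */
    static List<List<Entry>> readRows(Source source) {
        List<List<Entry>> rows = new ArrayList<>();
        String restartRow = "";                       // "" sorts before every real row id
        while (true) {
            List<Entry> buffer = new ArrayList<>();   // a fresh buffer discards any partial row
            try {
                Iterator<Entry> it = source.scanFrom(restartRow);
                while (it.hasNext()) {
                    Entry e = it.next();
                    if (!e.row.equals(restartRow)) {  // crossed into a new row
                        if (!buffer.isEmpty()) rows.add(buffer);
                        buffer = new ArrayList<>();
                        restartRow = e.row;
                    }
                    buffer.add(e);
                }
                if (!buffer.isEmpty()) rows.add(buffer);
                return rows;
            } catch (IsolationFault f) {
                // Fall through: the loop rescans starting at restartRow.
            }
        }
    }

    public static void main(String[] args) {
        List<Entry> data = List.of(
            new Entry("r1", "a"), new Entry("r1", "b"),
            new Entry("r2", "c"), new Entry("r2", "d"), new Entry("r2", "e"));
        // Inject a fault in the middle of row r2.
        for (List<Entry> row : readRows(new FlakySource(data, 3))) {
            System.out.println(row.get(0).row + " has " + row.size() + " entries");
        }
    }
}
```

The point of the buffering is that a row is only handed to the caller once it has been read in full, so a fault in the middle of a row never exposes a partial or mixed view of it.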

On Dec 21, 2011, at 12:24 PM, Adam Fuchs wrote:

> We have a bunch of rows that don't fit into memory when using some of the
> table design patterns we like to use on Accumulo. Having row-level
> isolation without requiring rows to fit in memory was important to us.
> However, this is not trivial, especially under failures.
> 
> The basic technique we use involves keeping a mutation counter for all
> active scans on a tablet, writing the mutation counter with entries in the
> in-memory map, and keeping all of the data we need to provide a snapshot
> isolation view for the existing scans. The tricky part here is that if a
> tablet server fails then the recovery of a tablet on another tablet server
> doesn't include a recovery of the list of active scans. The tablet server
> might decide to minor compact, and the data needed to provide the row-level
> snapshot-isolation view might be lost when the entries flow through the
> iterator tree.
> 
> We allow for many ways of dealing with this isolation fault. The Scanner
> ignores it by default. Users can also turn on the isolation exception via
> Scanner.enableIsolation(), resulting in the possibility of an
> IsolationException (subclass of RuntimeException) being thrown by the
> ScannerIterator. The IsolatedScanner wraps a Scanner, enables isolation on
> that scanner, buffers rows on the client side (possibly on disk), and can
> handle the IsolationException by restarting at the beginning of a row.
> Handling isolation without buffering is also possible by using a checkpoint
> and restart design that propagates through the application code, so we
> wanted to support that behavior by letting applications handle the
> exception in their own way.
> 
> Sorry about the lack of documentation! We'll get working on it.
> 
> Adam
> 
> 
> On Wed, Dec 21, 2011 at 11:45 AM, Aaron Cordova <[email protected]> wrote:
> 
>> I'm looking over the IsolatedScanner and wondering, since you've all
>> probably thought more about it than I, whether loading a row entirely into
>> memory is required to provide row isolation, or whether it simply makes it
>> easier to implement.
>> 
>> The BigTable paper says it makes the rows in the memtable copy-on-write.
>> Does this imply copying the entire row into memory first? That would seem
>> to make read-modify-write operations simpler, but it doesn't seem a
>> necessary condition for just writes ...
>> 
>> In the future, is the intention to provide row-isolation upon request (via
>> using the IsolatedScanner), thereby making non-atomic reads (via the
>> Scanner) the default?
>> 
>> Aaron
