This is related to https://github.com/apache/fluo/issues/660

I've noticed this error crop up a handful of times on smaller development
clusters, but it is happening increasingly often on larger, bare-metal
clusters (think hundreds of CPUs and terabytes of memory, running dozens of
workers).
It's difficult to reproduce without some manual agitation, but I noticed
that if a transaction gets aborted in a few spots, it won't roll back any
locks it
held. There are some steps in `commitAsync` that won't roll back anything
on failure, but it's also possible an error could abort that flow
prematurely. It's also possible that a JVM failure in there would stop a
transaction in its tracks without any recovery/rollback.

More importantly for my use case, I think that state can raise an
IllegalStateException, which kills the worker process and restarts it, and
all further writes/scans will fail if they encounter a transaction in an
UNKNOWN state.

I added a quick little step at the end of `DeleteLocksStep` that has a 1%
chance of failing a transaction
(https://gist.github.com/wjsl/01000d7c3efe5cf271d47547e0320bd4).
Eventually I run into an error similar to the one described in #660. This
blocks pretty much all reads and writes into my cluster until I go in and
remove the underlying Accumulo keys that represent the lock graph.
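
For illustration, the injected failure amounts to something like the sketch
below. This is not the code from the gist; `FaultInjector` and `maybeFail`
are made-up names, and the only thing taken from my real setup is the 1%
probability.

```java
import java.util.Random;

// Hypothetical fault injector for testing; not part of Fluo.
final class FaultInjector {
  private final Random random;
  private final double failureProbability;

  FaultInjector(Random random, double failureProbability) {
    if (failureProbability < 0.0 || failureProbability > 1.0) {
      throw new IllegalArgumentException("probability must be in [0, 1]");
    }
    this.random = random;
    this.failureProbability = failureProbability;
  }

  // Throws with the configured probability, otherwise returns normally.
  void maybeFail(String stepName) {
    if (random.nextDouble() < failureProbability) {
      throw new IllegalStateException("Injected failure in " + stepName);
    }
  }
}
```

Calling `new FaultInjector(new Random(), 0.01).maybeFail("DeleteLocksStep")`
at the end of a step would then abort roughly one transaction in a hundred.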

What should we do in this scenario? The two things that jump out to me are:

1. Always rolling back locks on failure. This doesn't appear to happen in
some default implementations of BatchWriterStep (DeleteLocksStep,
WriteNotificationsStep). LockOtherStep also doesn't seem to handle unknowns,
given the comments. I think this leads into point 2.

2. Other transactions notice a dangling or dead transaction. If a JVM goes
away, how do I go about resolving/rolling back all the locks that the dead
transaction held? We clearly halt when we can't find the primary, but we
need to go through and resolve all the locks that are pointing to that
primary. Would this require a full table scan of the underlying Accumulo
table?
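
To make option 1 concrete, here is a minimal sketch of the pattern I have in
mind, using plain `CompletableFuture` and an in-memory set as a stand-in for
the lock entries. None of these names (`RollbackOnFailureSketch`,
`lockTable`, `laterSteps`) come from Fluo. It also assumes nothing has been
committed yet when the failure hits; past the primary commit, the locks
would presumably need to be rolled forward rather than removed.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; none of these names exist in Fluo.
final class RollbackOnFailureSketch {
  // In-memory stand-in for the lock entries stored in the table.
  private final Set<String> lockTable = ConcurrentHashMap.newKeySet();

  Set<String> lockTable() {
    return lockTable;
  }

  // Locks every cell, runs the remaining steps, then deletes the locks.
  // If any stage fails, the locks this transaction acquired are released
  // before the failure is propagated to the caller.
  CompletableFuture<Void> commitAsync(List<String> cells, Runnable laterSteps) {
    List<String> held = new CopyOnWriteArrayList<>();
    return CompletableFuture
        .runAsync(() -> {
          for (String cell : cells) {
            if (!lockTable.add(cell)) {
              throw new IllegalStateException("already locked: " + cell);
            }
            held.add(cell); // remember only the locks we actually took
          }
        })
        .thenRun(laterSteps)
        .thenRun(() -> lockTable.removeAll(held)) // normal lock deletion
        .whenComplete((ignored, error) -> {
          if (error != null) {
            lockTable.removeAll(held); // roll back on any failure
          }
        });
  }
}
```

The point is that the cleanup hangs off the end of the whole chain, so it
runs no matter which step threw, and it only touches locks this transaction
took itself. It does nothing for the dead-JVM case, which is point 2.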

Part of the problem may be our design: certain pieces of our transactions
seem to update the same portions of a table (we keep a per-partition count
around), which could exacerbate this issue.

Any advice is appreciated!

Thanks,
Bill
