"Jeffrey W. Baker" wrote:

> Machine A is controlling a transaction across Machine X and Machine Y.  A
> modifies a row in X and adds a row to Y.  A commits X, which succeeds.  A
> commits Y, which fails.
>
> A cannot guarantee a recovery on machine X because there might already be
> other transactions in flight on that record in that database.  A cannot
> just try to put the record back the way it used to be, because now the
> commit might fail on X.  The data is inconsistent.

As a couple of others have noted, two-phase (prepare-commit) commits solve the
above problem.  But two-phase commits are not a panacea.  They just move
Jeffrey's problem elsewhere.  Suppose A (in a two-phase commit implementation,
the transaction coordinator) prepares X and Y, but then dies just after
committing X, but before committing Y.  Y is now stuck prepared, holding its
locks.  How does Y know that X has committed, and that he should commit too
rather than roll back?  He doesn't, unless we cons up some magic secondary
communication channel between X and Y.
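
To make the window concrete, here is a toy sketch in Python.  The Participant
class and its "active"/"prepared"/"committed"/"aborted" states are invented for
illustration; a real coordinator talks to remote servers over the network and
logs its decision durably.

    # A toy Participant stands in for a remote database; in real life,
    # prepare/commit/rollback would each be a network round trip.
    class Participant:
        def __init__(self, name):
            self.name = name
            self.state = "active"

        def prepare(self):
            # A real participant makes the transaction durable here, so
            # it can honor either outcome, then votes yes.
            self.state = "prepared"
            return True

        def commit(self):
            self.state = "committed"

        def rollback(self):
            self.state = "aborted"

    def two_phase_commit(participants):
        # Phase 1: collect votes.  Any "no" vote aborts everyone.
        for p in participants:
            if not p.prepare():
                for q in participants:
                    q.rollback()
                return False
        # Phase 2: the decision is now commit.  If the coordinator dies
        # partway through this loop, some participants are committed and
        # the rest are stuck prepared, holding locks, unable to decide.
        for p in participants:
            p.commit()
        return True

Run it as two_phase_commit([Participant("X"), Participant("Y")]); the comment
in phase two marks exactly where A's death strands Y.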

One way to recover from the "A dies after telling X to commit" problem is to
shadow A with a mirror, A'.  A' takes over for A, interrogates the servers to
find out who has committed the last transaction and who hasn't, and if necessary
completes the transaction or rolls it back (assuming all writes are serialized
through {A, A'}, so there hasn't been any intervening activity to confuse
things).  Steady-state synchronization between A and A' is straightforward, as
are failure detection and failover.  In fact, if this is properly
implemented, the transaction processing system can keep going without a hitch
through any one failure, and the distributed dataset stays consistent.
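
Reusing the toy classes above, A' might patch things up along these lines.
This is only a sketch, under the stated assumption that all writes are
serialized through {A, A'}, so the one in-doubt transaction is the only thing
to settle:

    def recover(participants):
        # A' interrogates every server for its state on the in-flight
        # transaction.
        states = [p.state for p in participants]
        if "committed" in states:
            # Someone already committed, so A's decision must have been
            # commit: drive the prepared stragglers forward.
            for p in participants:
                if p.state == "prepared":
                    p.commit()
        else:
            # Nobody committed yet, so it is safe to roll everyone back.
            for p in participants:
                if p.state == "prepared":
                    p.rollback()

The point is that the commit decision is recoverable by interrogating the
participants, which is exactly what a lone X or Y cannot manage without that
secondary channel.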

Unfortunately, in many transaction middleware systems, you discover in the fine
print that A' is actually a semi-automated recovery process under the control of
a human administrator.  Human?  That would be Dork, down the hall, the Certified
Microsoft Solutions Fuckwad.  Feel safe.

But let's go back to the example, and stipulate that a reasonable A' exists.  Are
we now 100% consistent 100% of the time?  No, in fact we're not, because after
X commits and A dies, but before A' realizes that A has died and patches things
up, any reader of X and Y could see an inconsistent view of the data.  Do
we therefore serialize our reads through the transaction monitor, too?  With a
distributed database, we have to, if we want a guaranteed-consistent view.  Of
course, we could choose not to, for performance reasons.
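
In the same toy vocabulary, "serialize reads through the monitor" amounts to
refusing to serve a read while any participant is in doubt.  A single-threaded
sketch; in real life the wait would block on the monitor, with recovery running
elsewhere:

    import time

    def consistent_read(participants, read_fn):
        # Refuse to read during the window between X's commit and
        # recovery settling Y; here "prepared" marks an in-doubt server.
        while any(p.state == "prepared" for p in participants):
            time.sleep(0.01)  # toy stand-in for blocking on the monitor
        return read_fn()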

Does all of this make your head spin just a bit?  Hence Jeffrey's point.  There's
a lot of margin for error, and the more that's buried in mysterious middleware,
the less confident you should be.  If you can get away with a single server, you
dodge all these bullets.


