On 10/31/2014 2:31 PM, H. S. Teoh via Digitalmars-d wrote:
On Fri, Oct 31, 2014 at 09:11:53PM +0000, Kagamin via Digitalmars-d wrote:
On Friday, 31 October 2014 at 20:33:54 UTC, H. S. Teoh via Digitalmars-d
wrote:
You are misrepresenting Walter's position. His whole point was that
once a single component has detected a consistency problem within
itself, it can no longer be trusted to continue operating and
therefore must be shut down. That, in turn, leads to the conclusion
that your system design must include multiple, redundant, independent
modules that perform that one function. *That* is the real answer to
system reliability.

In server software, such a component is a transaction/request. They are
independent.

You're using a different definition of "component". An inconsistency in
a transaction is a problem with the input, not a problem with the
program logic itself. If something is wrong with the input, the program
can detect it and recover by aborting the transaction (roll back the
wrong data). But if something is wrong with the program logic itself
(e.g., it committed the transaction instead of rolling back when it
detected a problem) there is no way to recover within the program
itself.


Pretending that a failed component can somehow fix itself is a
fantasy.

Traditionally a failed transaction is indeed rolled back. It's more a
business logic requirement because a partially completed operation
would confuse the user.

Again, you're using a different definition of "component".

A failed transaction is a problem with the data -- this is recoverable
to some extent (that's why we have the ACID requirement of databases,
for example). For this purpose, you vet the data before trusting that it
is correct. If the data verification fails, you reject the request. This
is why you should never use assert to verify data -- assert is for
checking the program's own consistency, not for checking the validity of
data that came from outside.
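
To make the distinction concrete, here is a minimal sketch in Python (chosen only for illustration; the thread itself is about D). The names `InvalidRequest`, `apply_withdrawal`, and `handle_request` are hypothetical. External input is validated and rejected with an ordinary exception, while `assert` is reserved for a condition that can only fail if the function's own logic is wrong.

```python
class InvalidRequest(Exception):
    """Raised when externally supplied data fails validation."""


def apply_withdrawal(balance, amount):
    # External input: validate it and reject the request if it is bad.
    # This is an expected, recoverable failure, so no assert here.
    if not isinstance(amount, int) or amount <= 0:
        raise InvalidRequest("amount must be a positive integer")
    if amount > balance:
        raise InvalidRequest("insufficient funds")

    new_balance = balance - amount

    # Internal consistency check: given the validation above, this can
    # only fail if the logic of this function itself is broken.
    assert 0 <= new_balance < balance, "withdrawal logic is inconsistent"
    return new_balance


def handle_request(balance, amount):
    try:
        return apply_withdrawal(balance, amount), "ok"
    except InvalidRequest as exc:
        # Bad input: reject the request and leave the state unchanged.
        return balance, f"rejected: {exc}"
    # AssertionError is deliberately not caught: a failed self-check
    # means this code cannot be trusted to recover on its own.
```

A rejected request leaves the balance untouched, for example `handle_request(100, 500)` returns `(100, "rejected: insufficient funds")`, whereas a failed assertion propagates out of the handler instead of being treated as just another bad request.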

A failed component, OTOH, is a problem with program logic. You cannot
recover from that within the program itself, since its own logic has
been compromised. You *can* roll back the wrong changes made to data by
that malfunctioning program, of course, but the rollback must be done by
a decoupled entity outside of that program. Otherwise you might end up
causing even more problems (for example, due to the compromised /
malfunctioning logic, the program commits the data instead of reverting
it, thus turning an intermittent problem into a permanent one).
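
One way to picture a decoupled entity is a supervisor process that owns the commit decision. The following is only a sketch under assumptions: the worker script, the staging-file scheme, and the `inject_fault` switch (which simulates a logic bug) are all hypothetical and written in Python for illustration. The worker never commits anything itself; the supervisor promotes the staged output only if the worker exits cleanly and discards it otherwise.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical worker: stages a result, then checks its own consistency.
# The third argument is a fault-injection switch that simulates a logic bug.
WORKER_SOURCE = """
import sys
staging_path, value, fault = sys.argv[1], int(sys.argv[2]), sys.argv[3]
doubled = value + value + (1 if fault == "1" else 0)
with open(staging_path, "w") as f:
    f.write(str(doubled))
# Self-check: failure means the worker's own logic is broken, so it dies
# here and makes no attempt to clean up after itself.
assert doubled == 2 * value, "internal inconsistency detected"
"""


def run_supervised(value, committed_path, inject_fault=False):
    """Run the worker in a separate process; commit only on a clean exit."""
    target_dir = os.path.dirname(committed_path) or "."
    fd, staging_path = tempfile.mkstemp(dir=target_dir)
    os.close(fd)
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE,
         staging_path, str(value), "1" if inject_fault else "0"],
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        # Commit: the supervisor, not the worker, promotes the staged data.
        os.replace(staging_path, committed_path)
        return True
    # Rollback: performed outside the failed worker.
    os.remove(staging_path)
    return False
```

The design choice here is that the failed worker is never asked to undo its own work: its exit status is the only thing the supervisor trusts from it.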

This is a good summation of the situation.
