dsimcha wrote:
But the point is that redundancy is probably the **cheapest, most efficient** way
to get ultra-high reliability.

It also works incredibly well. Airliners use a dual path system, which means that no single failure can bring the airplane down. If it didn't work, the skies would be raining airplanes.

Triple path systems add a great deal of cost, with only a negligible increase in safety.


Yes, cost matters even when people's lives are at stake.

If the FAA required a triple path system in airliners, likely the airplane would be so heavy it could barely lift itself, let alone have a payload.


If, on the other hand, you have redundancy, you're at least as strong as your
strongest link because only one system needs to work.  Assuming the systems were
designed by independent teams and have completely different weak points, we can
assume their failures are statistically independent.  Assume we have m redundant
systems each with probability p_s of failing in some way or another.  Then the
probability of the whole thing failing is:

\prod_{i=1}^{m} p_s = p_s^m, or the probability that ALL of your redundant systems
fail.  Assuming they're all decent and designed independently, with different weak
points, they probably aren't going to fail at the same time.

For example, if each of two redundant systems really sucks and has a 5% chance of
failure, then the probability that they both fail and you're up the creek is only
0.25%.
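
To make the arithmetic concrete, here is a minimal Python sketch of that calculation. The function name is made up for illustration, and it only holds under the independence assumption stated above.

```python
import math

def joint_failure_probability(failure_probs):
    """Probability that every redundant system fails at once.

    Assumes the failures are statistically independent, so the
    individual failure probabilities simply multiply.
    """
    return math.prod(failure_probs)

# Two redundant systems, each with a 5% chance of failure.
p_both = joint_failure_probability([0.05, 0.05])
print(f"Both fail: {p_both:.2%}")

# A third equally poor system drives the joint probability down further.
p_three = joint_failure_probability([0.05, 0.05, 0.05])
print(f"All three fail: {p_three:.4%}")
```

Going from one path to two cuts the failure probability from 5% to 0.25%; a third path only shaves off a further sliver in absolute terms.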

Yup. The important thing to make this work is to avoid coupling, where a failure in one path causes the other path to fail as well. This can be tricky to get right.
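
A rough way to see why coupling matters is the sketch below. It uses a deliberately simplified model that is an assumption of this example, not something from the post above: with probability c a single common-cause event takes out every path at once, and otherwise the paths fail independently with probability p each.

```python
def coupled_failure_probability(p, m, c):
    """Probability that all m paths fail under a toy common-cause model.

    Assumed model (hypothetical, for illustration only):
      - with probability c, one shared event fails every path together;
      - otherwise each path fails independently with probability p.
    """
    return c + (1.0 - c) * p ** m

# No coupling: reduces to the independent case, 0.05 * 0.05.
independent = coupled_failure_probability(p=0.05, m=2, c=0.0)

# A 1% chance of a shared failure dominates the result.
coupled = coupled_failure_probability(p=0.05, m=2, c=0.01)

print(f"independent: {independent:.4%}")
print(f"coupled:     {coupled:.4%}")
```

Under these assumed numbers, even a 1% chance of a shared failure makes the overall failure probability about five times worse than the independent estimate, and adding more paths cannot push it below c.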

Coupling is why a process attempting to diagnose and fix its own bugs is doomed.
