On Monday, 29 September 2014 at 03:04:11 UTC, Walter Bright wrote:
You've clearly got a tough job to do, and I understand you're doing the best you can with it. I know I'm hardcore and uncompromising on this issue, but that's where I came from (the aviation industry).

I know what works (airplanes are incredibly safe) and what doesn't work (Toyota's approach was in the news not too long ago). Deepwater Horizon and Fukushima are also prime examples of not dealing properly with modest failures that cascaded into disaster.

Are you interpreting airplane safety correctly? As I understand it, airplanes are safe precisely because they recover from assert failures and continue operating. Your suggestion amounts to shutting down the whole airplane when seat 2A creaks. In reality, airplanes keep operating until there are no physical resources left to operate with. Fukushima became a disaster because nobody tried to handle the failure. But it is your idea that nothing meaningful can be done on failure, and Fukushima did just that: nothing.

Terminating the process is the safe default, especially for client software, but servers should probably terminate the failed request, clean up gracefully, and continue operating, like airplanes.
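
To make the server case concrete, here is a minimal sketch of per-request failure isolation. It is in Python only for brevity, and the names `handle_request` and `serve` are hypothetical. It assumes a failed check in one request leaves no shared state corrupted, which is exactly the point under dispute in this thread.

```python
def handle_request(payload):
    # Hypothetical handler: an internal consistency check fails on negative input.
    if payload < 0:
        raise AssertionError("negative payload")
    return payload * 2


def serve(requests):
    """Process each request in isolation; one failure does not stop the loop."""
    results = []
    cleaned_up = 0
    for payload in requests:
        try:
            results.append(("ok", handle_request(payload)))
        except Exception as exc:
            # Terminate only the failed request and record what went wrong.
            results.append(("failed", type(exc).__name__))
        finally:
            # Per-request cleanup runs whether the request succeeded or not.
            cleaned_up += 1
    return results, cleaned_up


if __name__ == "__main__":
    print(serve([1, -1, 3]))
```

The second request fails, is cleaned up, and the third is still served. Whether that is safe depends on how well requests are actually isolated from one another.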
