On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> For example:
>
> int[3] arr;
> arr[3] = 5;
>
> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served.

In this case it is fairly obvious where the bad index is coming from... but in general it is impossible to say.

So how much of your program is mad?

You need to reset to some safe / correct point to continue.

Which point?

It is impossible for the compiler to determine that.

Personally I would say the design fault is trying to build _everything_ into a single OS process.

The mechanism that is guaranteed, enforced by the hardware, to recover all resources and reset to a sane point is OS process exit.

i.e. if you need "bug" tolerance, decompose your system into multiple processes. This actually has a large number of other benefits (e.g. you get concurrency automagically).

Of course, you then need to encode some common sense in the harness: if something keeps starting up and dying within a very short period of time, stop restarting it.

Crashing is just one (of many) ways that a program bug can screw up a system. For example, it can start chewing up way too many resources.

So your harness needs to be able to limit that.

And of course if you are going to decompose into processes, a process may spawn many more, so you need to shepherd all the subprocesses sanely...

...and start the herd of processes in appropriate order, and shut them down appropriately....

Sounds like quite an intelligent harness...

Fortunately one exists, and its authors have thought through all these issues really carefully.

It's called systemd and works very well.
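
To make that concrete, here is roughly what the harness requirements above look like as a unit file. Treat it as a sketch, not a recommended configuration: the unit name, binary path and all the numbers are made up for illustration, and it assumes a reasonably recent systemd running with the unified cgroup hierarchy (MemoryMax= in particular depends on that).

```ini
# /etc/systemd/system/mywebapp.service  (hypothetical name and paths)
[Unit]
Description=Example web worker (hypothetical)
# Ordering: start after the network target; shutdown happens in reverse order.
After=network.target
# Common sense: allow at most 5 starts within 60 seconds, then stop restarting.
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
ExecStart=/usr/local/bin/mywebapp
# Restart after a crash or non-zero exit, but not after a clean stop.
Restart=on-failure
RestartSec=1
# Resource limits, enforced through the service's control group.
MemoryMax=512M
CPUQuota=50%
TasksMax=64
# On stop, signal every process left in the control group, not just the main one.
KillMode=control-group

[Install]
WantedBy=multi-user.target
```

After dropping a file like this in place, "systemctl daemon-reload" followed by "systemctl enable --now mywebapp.service" should pick it up. A bounds error in one worker then costs you that one process, and systemd brings it back unless it keeps dying.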
