Tom Lane wrote:
Right.  Depending on your OS you may be able to catch a signal that
would kill a thread and keep it from killing the whole process, but
this still leaves you with a process memory space that may or may not
be corrupted.  Continuing in that situation is not cool, at least not
according to the Postgres project's notions of reliable software design.

There can't be any "may or may not" involved. You must of course know what went wrong.

It is very common that you either get a null pointer exception (attempt to access address zero), that your stack hits a write-protected page (stack overflow), or that you get some sort of arithmetic exception. These conditions can be trapped and handled gracefully. The signal handler must be able to check the cause of the exception. This usually involves unwinding the stack and investigating the state of the CPU at the point where the signal was generated. The process must be terminated if the reason is not a recognized one.
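To make the idea concrete, here is a minimal sketch in C of a handler that checks the cause before deciding what to do. It assumes a POSIX system with sigaction() and SA_SIGINFO. The names classify_fault, on_fault, install_fault_handler and recovery_point are hypothetical, and so is the 4096-byte threshold used to recognize an access near address zero. Stack overflow detection is left out, since it would also need an alternate signal stack and knowledge of the stack bounds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

typedef enum { FAULT_UNKNOWN, FAULT_NULL_DEREF, FAULT_ARITHMETIC } fault_kind;

/* Hypothetical classifier: decide whether a fault is one we recognize. */
fault_kind classify_fault(int signo, int code, const void *addr)
{
    if (signo == SIGFPE && (code == FPE_INTDIV || code == FPE_INTOVF))
        return FAULT_ARITHMETIC;
    /* Assumption: an unmapped address inside the first 4096 bytes
     * is treated as a null pointer dereference. */
    if (signo == SIGSEGV && code == SEGV_MAPERR && (uintptr_t) addr < 4096)
        return FAULT_NULL_DEREF;
    return FAULT_UNKNOWN;
}

/* One global recovery point keeps the sketch short; a threaded
 * server would need one per thread. */
sigjmp_buf recovery_point;
volatile sig_atomic_t last_fault = FAULT_UNKNOWN;

void on_fault(int signo, siginfo_t *info, void *ucontext)
{
    (void) ucontext;
    fault_kind kind = classify_fault(signo, info->si_code, info->si_addr);

    if (kind == FAULT_UNKNOWN)
        _exit(70);              /* unrecognized cause: terminate the process */

    last_fault = kind;
    siglongjmp(recovery_point, 1);  /* recognized cause: unwind to the caller */
}

/* Returns 0 on success, -1 if either handler could not be installed. */
int install_fault_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGFPE, &sa, NULL);
}
```

A thread would call sigsetjmp(recovery_point, 1) before the risky work and clean up its own state when that call returns nonzero. Whether the rest of the address space can still be trusted at that point is exactly the question under discussion; the sketch only shows that the cause can be inspected.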

Out of memory can be managed using thread-local allocation areas (similar to MemoryContext) and killing a thread based on some criterion when no more memory is available. The criterion could be the thread that encountered the problem, the thread that consumes the most memory, the thread that was least recently active, or something else.
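The selection step could look roughly like the following C sketch. Everything in it is hypothetical: the thread_arena bookkeeping struct, the victim_policy enum and choose_victim are illustration only and are not taken from the Postgres source. It assumes each thread's allocation area tracks its own byte count and a last-activity tick.

```c
#include <stddef.h>

/* Hypothetical per-thread allocation area bookkeeping. */
typedef struct {
    int           thread_id;
    size_t        bytes_in_use;
    unsigned long last_active;   /* larger value = more recently active */
} thread_arena;

typedef enum {
    VICTIM_REQUESTER,      /* the thread that hit the allocation failure */
    VICTIM_LARGEST,        /* the thread that consumes the most memory */
    VICTIM_LEAST_RECENT    /* the thread that was least recently active */
} victim_policy;

/* Returns the index of the arena whose thread should be killed.
 * The caller must guarantee requester < n. */
size_t choose_victim(const thread_arena *arenas, size_t n,
                     size_t requester, victim_policy policy)
{
    size_t best = requester;

    if (policy == VICTIM_REQUESTER)
        return requester;

    for (size_t i = 0; i < n; i++) {
        if (policy == VICTIM_LARGEST &&
            arenas[i].bytes_in_use > arenas[best].bytes_in_use)
            best = i;
        if (policy == VICTIM_LEAST_RECENT &&
            arenas[i].last_active < arenas[best].last_active)
            best = i;
    }
    return best;
}
```

Once a victim is chosen, releasing its whole allocation area in one step is what makes the thread-local arrangement attractive, much as resetting a MemoryContext does.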

It should be pointed out that when we get a hard backend crash, Postgres
will forcibly terminate all the backends and reinitialize; which means
that in terms of letting concurrent sessions keep going, we are not any
more forgiving than a single-address-space multithreaded server.  The
real bottom line here is that we have good prospects of confining the
damage done by the failed process: it's unlikely that anything bad will
happen to already-committed data on disk or that any other sessions will
return wrong answers to their clients before we are able to kill them.
It'd be a lot harder to say that with any assurance for a multithreaded
server.

I'm not sure I follow. You will be able to bring all threads of one process to a halt much faster than you can kill a number of external processes. Killing the multithreaded process is more like pulling the plug.

Regards,
Thomas Hallgren
