I wrote:
> ... But Robert's question remains: how does
> PANIC'ing after a while make anything better?  I flat out don't
> believe the idea that having a backend stuck on a spinlock
> would otherwise go undetected.

Oh, wait.  After thinking a bit longer I believe I recall the argument
for this behavior: it automates recovery from a genuinely stuck
spinlock.  If we waited forever, the only way out of that is for a
DBA to kill -9 the stuck process, which has exactly the same end
result as a PANIC, except that it takes a lot longer to put the system
back in service and perhaps rousts somebody, or several somebodies,
out of their warm beds to fix it.  If you don't have a DBA on-call
24x7 then that answer looks even worse.

So there's that.  But that's not an argument that we need to be in a
hurry to time out; even with a built-in reaction time of perhaps
10 minutes you're still miles ahead of the manual solution.
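
To make the shape of that argument concrete, here is a minimal sketch
of a spin-with-deadline acquire whose caller treats a timeout as fatal.
It is not the actual s_lock code; every name in it
(spin_acquire_with_timeout, spin_acquire_or_die, the 1 ms nap) is
hypothetical, and abort() merely stands in for a PANIC.  It assumes
C11 atomics and POSIX clock_gettime/nanosleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical illustration only; not PostgreSQL's spinlock code. */

/* Monotonic clock reading in seconds. */
static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/*
 * Try to take the lock, napping briefly between attempts.
 * Returns true on success, false once timeout_sec has elapsed.
 */
static bool spin_acquire_with_timeout(atomic_flag *lock, double timeout_sec)
{
    assert(timeout_sec >= 0.0);

    double deadline = now_seconds() + timeout_sec;
    struct timespec nap = {0, 1000000};     /* 1 ms; an arbitrary choice */

    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
    {
        if (now_seconds() >= deadline)
            return false;                   /* lock looks stuck */
        nanosleep(&nap, NULL);
    }
    return true;
}

static void spin_release(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/*
 * Caller policy: a timeout is fatal.  abort() stands in for a PANIC
 * that would force a restart without waiting for a human.
 */
static void spin_acquire_or_die(atomic_flag *lock, double timeout_sec)
{
    if (!spin_acquire_with_timeout(lock, timeout_sec))
    {
        fprintf(stderr, "stuck spinlock detected, aborting\n");
        abort();
    }
}
```

The only knob that matters for the point above is timeout_sec: passing
600.0 instead of a few seconds changes nothing about the recovery
path, only how long we wait before taking it.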

On the third hand, it's still true that we have no comparable
behavior for any other source of system lockups, and it's difficult
to make a case that stuck spinlocks really need more concern than
other kinds of bugs.

                        regards, tom lane

