Re: [HACKERS] quickdie doing memory allocations (was atomic pin/unpin causing errors)

Andres Freund Thu, 05 May 2016 13:23:46 -0700

Hi,

On 2016-05-05 15:56:45 -0400, Tom Lane wrote:
> Andres Freund <[email protected]> writes:
> >> #0  0x00000008014321d7 in sbrk () from /lib/libc.so.7
> >> #1  0x0000000801431ddd in sbrk () from /lib/libc.so.7
> >> #2  0x000000080142e5bb in sbrk () from /lib/libc.so.7
> >> #3  0x000000080142e085 in sbrk () from /lib/libc.so.7
> >> #4  0x000000080142de28 in sbrk () from /lib/libc.so.7
> >> #5  0x000000080142e1cf in sbrk () from /lib/libc.so.7
> >> #6  0x0000000801439815 in free () from /lib/libc.so.7
> >> #7  0x000000080149e3d6 in nsdispatch () from /lib/libc.so.7
> >> #8  0x00000008014a41c6 in __cxa_finalize () from /lib/libc.so.7
> >> #9  0x000000080144525c in exit () from /lib/libc.so.7
> >> #10 0x00000000008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
> >> #11 <signal handler called>
> >> #12 0x0000000801431847 in sbrk () from /lib/libc.so.7
> 
> > That looks like an independent issue, namely that we're triggering memory
> > allocations from a signal handler (see frames 12, 11, 10, 9). Presumably
> > due to system-registered atexit handlers.  I suspect we should be using
> > _exit() here?  Tom?
> 
> I don't think that would improve matters.  In the first place, if we use
> _exit() here that might encourage third-party extension authors to believe
> they should use _exit(), which would be bad.


The source tree already has a number of _exit() calls, so I don't think
that'd make a meaningful difference.


> In the second place, we don't know what it is we're skipping by not
> running atexit handlers, and again that could be bad.

I've a hard time coming up with a scenario where that'd be a problem in
a PANIC case. Isn't it pretty common to use _exit after fatal errors
(and forks)?
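
To make the exit() vs. _exit() difference concrete, here's a minimal
standalone sketch (not PostgreSQL code; on_exit_handler, report_fd and
handler_ran_in_child are names made up for the illustration). A forked
child registers an atexit handler that writes one byte to a pipe, then
terminates either way; the parent checks whether the byte arrived.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe used by the child; hypothetical name. */
static int report_fd = -1;

/* Stands in for whatever libc or a library registered via atexit(). */
static void on_exit_handler(void)
{
    const char mark = 'A';
    ssize_t rc = write(report_fd, &mark, 1);
    (void) rc;
}

/*
 * Fork a child that registers an atexit handler and then terminates
 * with exit() (use_underscore_exit == 0) or _exit() (otherwise).
 * Returns 1 if the handler ran, 0 if it did not, -1 on error.
 */
static int handler_ran_in_child(int use_underscore_exit)
{
    int fds[2];
    pid_t pid;
    char buf;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: keep only the write end. */
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(on_exit_handler) != 0)
            _exit(2);
        if (use_underscore_exit)
            _exit(0);           /* skips atexit handlers */
        exit(0);                /* runs atexit handlers */
    }

    /* Parent: one byte means the handler ran, EOF means it did not. */
    close(fds[1]);
    n = read(fds[0], &buf, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    if (n == 1)
        return 1;
    return (n == 0) ? 0 : -1;
}
```

Calling handler_ran_in_child(0) should report that the handler ran, and
handler_ran_in_child(1) that it was skipped, which is exactly the
behaviour in question for the quickdie case.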


> In the third place, by the time we
> get to the exit() call we've already exposed ourselves to a whole lot of
> such hazards by running ereport() (including sending a message to the
> client!).

True. And that's not good. But the magic of ErrorContext shields us from
a fair amount of issues.


> In the fourth place, if we've received a quickdie interrupt,
> it doesn't actually matter if the process crashes; we just want it to
> quit ASAP.

If it were always crashing, that'd be somewhat fine. But sbrk internally
uses mutexes, so this can result in processes getting stuck. And that is
a problem.  There've actually been reports about that every now and then.

Greetings,

Andres Freund

