So, the reason I started the thread about postmaster dying on OOM is that somebody asked me on IM what could have caused a backend to die with this backtrace:

    libc.so.1`_ndoprnt+0x14()
    libc.so.1`fprintf+0x11d()
    AllocSetStats+0x15d()
    MemoryContextStatsInternal+0x1c()
    MemoryContextStats+0xb()
    AllocSetAlloc+0x1c0()
    MemoryContextAllocZeroAligned+0x57()
    makeTypeNameFromNameList+0x20()
    SystemTypeName+0x40()
    base_yyparse+0xcd42()
    raw_parser+0x29()
    pg_parse_query+0x23()
    exec_simple_query+0x6d()
    PostgresMain+0xf6a()
    BackendRun+0x254()
    BackendStartup+0xf8()
    ServerLoop+0x116()
    PostmasterMain+0xd98()
    main+0x18a()
    0x4e08ec()

The postmaster logged only this for it:

    2009-04-06 16:33:48 EDT::@:[13741]: LOG: server process (PID 12146) was terminated by signal 11

and there's no indication of any activity from that process in the log at all. Several other processes seem to be exiting or terminating transactions with errno "Not enough space".

His question was: is it possible that we're handing a NULL pointer to a %s in fprintf? The code involved looks like this:

    fprintf(stderr,
            "%s: %lu total in %ld blocks; %lu free (%ld chunks); %lu used\n",
            set->header.name, totalspace, nblocks, freespace, nchunks,
            totalspace - freespace);

And since this is being called from AllocSetAlloc, which is always handed a complete memory context (and not something that has only been partially set up), I think the answer is that it's not possible, and that the bug must be in libc, which is perhaps not handling out-of-memory very cleanly in its fprintf implementation.

Am I all wet?

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
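
For reference, a minimal sketch of what guarding the %s argument could look like if one wanted to rule out the NULL case regardless. Passing NULL to %s is undefined behavior in C: some libcs print "(null)", others crash. The names safe_name and format_stats are hypothetical and are not PostgreSQL functions; the sketch formats into a caller-supplied buffer rather than stderr only so that the result is easy to inspect.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: substitute a placeholder when the name is NULL,
 * so that %s never receives a NULL pointer. */
static const char *
safe_name(const char *name)
{
    return name != NULL ? name : "(unnamed)";
}

/* Hypothetical stand-in for the stats line quoted above.  Writes into
 * buf and returns snprintf's result (characters that would be written). */
static int
format_stats(char *buf, size_t buflen, const char *name,
             unsigned long totalspace, long nblocks,
             unsigned long freespace, long nchunks)
{
    return snprintf(buf, buflen,
                    "%s: %lu total in %ld blocks; %lu free (%ld chunks); %lu used\n",
                    safe_name(name), totalspace, nblocks,
                    freespace, nchunks, totalspace - freespace);
}
```

This only addresses the NULL-name hypothesis; it would not help if the crash comes from fprintf itself failing to allocate memory internally.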