[BUGS] Postgresql 8.4.1 segfault, backtrace

Richard Neill Wed, 23 Sep 2009 23:21:49 -0700

Dear All,

I've just upgraded from 8.4.0 to 8.4.1 because of a segfault in 8.4, and we've found that this is still happening repeatedly in 8.4.1. We're in a bit of a bind, as this is a production system, and we get segfaults every few hours.

[It's a testament to how good the postgres crash recovery is that, with a reasonably small value of checkpoint_segments = 4, recovery happens in 30 seconds, and the warehouse systems seem to continue OK].

The version I'm using is 8.4.1, in the source package provided for Ubuntu Karmic, compiled by me on a 64-bit server (running Ubuntu Jaunty).

I'm not sufficiently expert to debug it very far, but I wonder whether the following info from GDB would help one of the hackers here (I've trimmed out the uninteresting bits):


------------
$ gdb /usr/lib/postgresql/8.4/bin/postgres core.200909030901
GNU gdb 6.8-debian

This GDB was configured as "x86_64-linux-gnu"...

Core was generated by `postgres: fensys fswcs [local] startup'.

Program terminated with signal 11, Segmentation fault.
[New process 14965]
#0  RelationCacheInitializePhase2 () at relcache.c:2654

2654            if (relation->rd_rel->relhasrules && relation->rd_rules == NULL)

(gdb) bt
#0  RelationCacheInitializePhase2 () at relcache.c:2654
#1  0x00007f61355a1021 in InitPostgres (in_dbname=0x7f613788c610 "fswcs", dboid=0, username=0x7f6137889450 "fensys", out_dbname=0x0) at postinit.c:576
#2  0x00007f61354dbcc5 in PostgresMain (argc=4, argv=0x7f6137889480, username=0x7f6137889450 "fensys") at postgres.c:3334
#3  0x00007f61354aefdd in ServerLoop () at postmaster.c:3447
#4  0x00007f61354afecc in PostmasterMain (argc=5, argv=0x7f6137885140) at postmaster.c:1040
#5  0x00007f61354568ce in main (argc=5, argv=0x7f6137885140) at main.c:188
(gdb) quit
-------------

A few more bits of info:

The backtrace points to line 2654 in relcache.c, in
  RelationCacheInitializePhase2()

There is a NULL dereference of "relation"

 => needNewCacheFile = false
    criticalRelcachesBuilt = true

=> nothing is happening before it enters the failure code block.
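
To make the failure mode concrete, here is a minimal sketch in C of the condition on line 2654. It is not the PostgreSQL source: the struct and function names (FormSketch, RelationSketch, needs_rules_loaded, sketch_self_check) are hypothetical, and only the fields visible in the gdb output are modelled. The expression reads through two pointers, relation and relation->rd_rel, so a NULL in either one would fault on that line; printing both in gdb at frame 0 should show which it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structs; only the fields that
 * appear in the gdb output are modelled. */
typedef struct FormSketch {
    bool relhasrules;
} FormSketch;

typedef struct RelationSketch {
    FormSketch *rd_rel;
    void       *rd_rules;
} RelationSketch;

/* Same test as the line 2654 condition, but each pointer is checked
 * before it is dereferenced. Returns false instead of crashing. */
bool needs_rules_loaded(const RelationSketch *relation)
{
    if (relation == NULL || relation->rd_rel == NULL)
        return false;
    return relation->rd_rel->relhasrules && relation->rd_rules == NULL;
}

/* Small self-check covering the three cases of interest. */
void sketch_self_check(void)
{
    FormSketch     form = { true };
    RelationSketch rel  = { &form, NULL };

    assert(needs_rules_loaded(&rel));    /* has rules, none loaded yet */
    assert(!needs_rules_loaded(NULL));   /* relation itself is NULL */

    rel.rd_rel = NULL;
    assert(!needs_rules_loaded(&rel));   /* relation->rd_rel is NULL */
}
```

A guard like this would only hide the symptom. The real question is why an entry reached during backend startup has a NULL pointer in the first place.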

I can give you a core dump if anyone would like to see it, but it's 405MB after bzipping.

One last observation: a dump and restore of the DB seems to prevent it crashing for about a day.


Thank you for your help,

Richard

