[HACKERS] Postgres server goes in recovery mode repeatedly

2009-09-29 Thread kunal sharma
Hi,
We are using Postgres 8.4 and it has gone into recovery mode a couple of
times. The server process seems to fork another child process, which is
another postgres server running under the same data directory, and after
some time it goes away while the old server is still running. There were a
few load issues on the server, but the load didn't go above 32.

We are running openSUSE 10.2 x86_64 with 32 GB of physical memory.
Checking the logs, I found that there is a segmentation fault:


Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0030 rip 0066ba8c rsp 7fffd364da30 error 4

The gdb output shows this:

Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
0x2ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
(gdb)


Any suggestions as to what is causing this segmentation fault?
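
The kernel log line above can be read field by field. The following is a minimal Python sketch, not part of the original post, that splits such a line and decodes the `error` value using the x86 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name `decode_segfault` and the dictionary keys are hypothetical, and the addresses are taken exactly as they appear in the message.

```python
import re

# x86 page-fault error-code bits as reported in the kernel's segfault message.
PF_PROT = 0x1   # set: protection violation; clear: page not present
PF_WRITE = 0x2  # set: write access; clear: read access
PF_USER = 0x4   # set: fault happened in user mode

# Matches lines of the form seen in the post (hex values, no 0x prefix).
LINE_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

def decode_segfault(line):
    """Parse one kernel segfault line into a dict (hypothetical helper)."""
    # Collapse line wrapping introduced by mail clients.
    flat = " ".join(line.split())
    m = LINE_RE.search(flat)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m.group("err"), 16)
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "instruction_pointer": int(m.group("rip"), 16),
        "stack_pointer": int(m.group("rsp"), 16),
        "page_present": bool(err & PF_PROT),
        "write": bool(err & PF_WRITE),
        "user_mode": bool(err & PF_USER),
    }

info = decode_segfault(
    "Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0030\n"
    "rip 0066ba8c rsp 7fffd364da30 error 4"
)
```

Read this way, `error 4` describes a user-mode read of an unmapped page, and a fault address as small as 0x30 is consistent with (though not proof of) reading a field through a NULL pointer in process 28694.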


Re: [HACKERS] Postgres server goes in recovery mode repeatedly

2009-09-29 Thread kunal sharma
gdb backtrace:


(gdb) bt full
#0  0x2ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
No symbol table info available.
#1  0x005a39bc in ServerLoop () at postmaster.c:1304
timeout = {tv_sec = 55, tv_usec = 352000}
rmask = {fds_bits = {24, 0 <repeats 15 times>}}
selres = <value optimized out>
readmask = {fds_bits = {24, 0 <repeats 15 times>}}
nSockets = 5
now = 1254241068
last_touch_time = 1254238950
__func__ = ServerLoop
#2  0x005a4dba in PostmasterMain (argc=3, argv=0xb1e3d0) at postmaster.c:1040
fpidfile = (FILE *) 0x3
opt = <value optimized out>
status = <value optimized out>
userDoption = 0x1 <Address 0x1 out of bounds>
__func__ = PostmasterMain
#3  0x00553b5e in main (argc=3, argv=0xb1e3d0) at main.c:188
No locals.
(gdb)

2009/9/29 Andrew Dunstan and...@dunslane.net



 Please try to get a backtrace from gdb.

 cheers

 andrew
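
A note on the backtrace request: the trace posted in this thread shows PostmasterMain and ServerLoop waiting in select(), which looks like a running postmaster rather than the backend that crashed. One way to get a trace of the crash itself is to load a core file from the crashed backend into gdb. The sketch below only builds such a gdb command line in Python; it assumes core dumps are enabled for the server, and both paths are hypothetical placeholders, not values from the thread.

```python
import shlex

def gdb_backtrace_command(binary, core):
    """Build a gdb invocation that prints a full backtrace and exits.

    -batch makes gdb exit after running its commands, and
    -ex "bt full" runs the same command used interactively above.
    """
    return ["gdb", "-batch", "-ex", "bt full", binary, core]

# Hypothetical paths: substitute the real postgres binary and core file.
cmd = gdb_backtrace_command("/path/to/bin/postgres", "/path/to/core")
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` unchanged.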