[this time from the right address - apologies if we get a duplicate]
PostgreSQL Experts Inc has a client with a 9.0.2 streaming replication server that somehow becomes wedged after running for some time. The server is running as a warm standby, and the client's application tries to connect to both the master and the slave, accepting whichever lets it connect (hence hot standby is not turned on). Archive files are being shipped as well as WAL streaming. The symptom is that the recovery process blocks forever on a semaphore. We've crashed it and got the following backtrace: #0 0x0000003493ed5337 in semop () from /lib64/libc.so.6 #1 0x00000000005bd103 in PGSemaphoreLock (sema=0x2b14986aec38, interruptOK=1 '\001') at pg_sema.c:420 #2 0x00000000005de645 in LockBufferForCleanup () at bufmgr.c:2432 #3 0x0000000000463733 in heap_xlog_clean (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4168 #4 heap2_redo (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4858 #5 0x0000000000488780 in StartupXLOG () at xlog.c:6250 #6 0x000000000048a888 in StartupProcessMain () at xlog.c:9254 #7 0x00000000004a11ef in AuxiliaryProcessMain (argc=2, argv=<value optimized out>) at bootstrap.c:412 #8 0x00000000005c66c9 in StartChildProcess (type=StartupProcess) at postmaster.c:4427 #9 0x00000000005c8ab7 in PostmasterMain (argc=1, argv=0x17858bb0) at postmaster.c:1088 #10 0x00000000005725fe in main (argc=1, argv=<value optimized out>) at main.c:188 The platform is CentOS 5.5 x86-64, kernel version 2.6.18-194.11.4.el5 I'm not quite sure where to start digging. Has anyone else seen something similar? Our consultant reports having seen a similar problem elsewhere, at a client who was running hot standby on 9.0.1, but the problem did not recur, as it does fairly regularly with this client. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers