We use PostgreSQL 7.4 running on a modified Red Hat Linux system as our database to store network-related data. The tables have millions of rows, and user queries typically require several joins across them. The database itself takes about 40 GB of disk space. Our application uses libpq++.
Recently I found what appears to be a Postgres bug. The database was running fine and at some point stopped accepting connections. I logged onto the system and found 2 postmaster processes in S state:

[root]# ps -aeuwx | grep post
postgres 2080 0.0 0.0 26752 2360 ? S Jan28 0:00 postgres: postgres dbname 1.0.0.5 idle
postgres 2081 0.0 0.0 26744 2380 ? S Jan28 0:00 postgres: postgres dbname 1.0.0.5 idle

Both postmaster processes showed the same stack trace when I attached gdb (see below), and both were child processes of init (how could it have started twice?).

After shutting off all possible clients, I tried to do a postgresql stop. That didn't work. Neither did pg_ctl (using fast or immediate mode). Then a killall -9 postmaster followed by a postgresql start got it reading XLOGs for 5 minutes or so, after which it was back up without any loss or corruption of data.

Any ideas? Is it possible that our application (through libpq++) somehow caused postmaster to hang?

Thanks,
Prem

[root]# su -l postgres -s /bin/sh -c "/usr/bin/pg_ctl stop -D /var/lib/pgsql/data -s -m fast"
/usr/bin/pg_ctl: line 274: kill: (1066) - No such process
pg_ctl: postmaster does not shut down

#0  0x18364c26 in recv () from /lib/libc.so.6
#1  0x080feaa5 in secure_read (port=0x82873b8, ptr=0x8240580, len=8192) at /root/src/postgres/src/backend/libpq/be-secure.c:304
#2  0x08103b83 in pq_recvbuf () at /root/src/postgres/src/backend/libpq/pqcomm.c:662
#3  0x08103c59 in pq_getbyte () at /root/src/postgres/src/backend/libpq/pqcomm.c:704
#4  0x0814c935 in SocketBackend (inBuf=0xbfffec10) at /root/src/postgres/src/backend/tcop/postgres.c:275
#5  0x0814cb17 in ReadCommand (inBuf=0xfffffe00) at /root/src/postgres/src/backend/tcop/postgres.c:397
#6  0x0814f018 in PostgresMain (argc=4, argv=0x8279590, username=0x8279560 "postgres") at /root/src/postgres/src/backend/tcop/postgres.c:2832
#7  0x0812f24b in BackendFork (port=0x82873b8) at /root/src/postgres/src/backend/postmaster/postmaster.c:2558
#8  0x0812ed3e in BackendStartup (port=0x82873b8) at /root/src/postgres/src/backend/postmaster/postmaster.c:2201
#9  0x0812d5ff in ServerLoop () at /root/src/postgres/src/backend/postmaster/postmaster.c:1113
#10 0x0812cfa4 in PostmasterMain (argc=4, argv=0x82786e8) at /root/src/postgres/src/backend/postmaster/postmaster.c:891
#11 0x08104d74 in main (argc=4, argv=0xbffffb94) at /root/src/postgres/src/backend/main/main.c:214
#12 0x182a55cd in __libc_start_main () from /lib/libc.so.6