Alvaro Herrera <alvhe...@commandprompt.com> writes:
> Today we got a report in the Spanish list about the message in $subject.
> The server is 8.4 running on Windows.

I accidentally managed to reproduce this in HEAD just now, by kill -9'ing
a backend that was in the midst of a COPY IN operation (I was trying to
reproduce Neil Best's unrelated issue...).  The server log is

LOG:  server process (PID 23846) was terminated by signal 9
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted; last known up at 2009-08-07 11:27:36 EDT
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 0/1B9D7790
LOG:  unexpected pageaddr 0/1532E000 in log file 0, segment 28, offset 3334144
LOG:  redo done at 0/1C32D200
PANIC:  cannot make new WAL entries during recovery
LOG:  startup process (PID 23883) was terminated by signal 6
LOG:  aborting startup due to startup process failure

and the stack trace of the panic'd startup process looks like

#4  0x4b6e20 in errfinish (dummy=1) at elog.c:503
#5  0x4b86a0 in elog_finish (elevel=1073803952, fmt=0x7b0394b0 "") at elog.c:1142
#6  0x1f722c in XLogInsert (rmid=11 '\013', info=114 'r', rdata=0xc004d07c) at xlog.c:555
#7  0x1df290 in _bt_insertonpg (rel=0x4006cf28, buf=70, stack=0x3, itup=0x4006d150, newitemoff=38,
    split_only_page=0) at nbtinsert.c:833
#8  0x1e0898 in _bt_insert_parent (rel=0x4006cf28, buf=304, rbuf=854, stack=0x7b03b9d8, is_root=0, is_only=0)
    at nbtinsert.c:1627
#9  0x1ef098 in btree_xlog_cleanup () at nbtxlog.c:927
#10 0x201c44 in StartupXLOG () at xlog.c:5767
#11 0x206134 in StartupProcessMain () at xlog.c:8034
#12 0x228d0c in AuxiliaryProcessMain (argc=2, argv=0x7b03b6d8) at bootstrap.c:433
#13 0x39bb68 in StartChildProcess (type=StartupProcess) at postmaster.c:4243

So that confirms my speculation that btree index cleanup is the source
of the message.  We have two basic approaches to dealing with it:

1. Decide that the check added to XLogInsert is wrong and take it out.

2. Arrange for some sort of explicit state transition between the
WAL-reading and cleanup phases of recovery, and make sure XLogInsert
knows about it.
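
To make option 2 a bit more concrete, here is a rough standalone sketch of
what an explicit phase transition might look like.  This is not code from
the tree: RecoveryPhase, set_recovery_phase, wal_insert_allowed and
sketch_xlog_insert are all hypothetical names, and the stand-in insert
function returns false where the real XLogInsert would PANIC.  The only
point is that the check distinguishes "still replaying WAL" from "replay
done, cleanup running".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical recovery phases; these names are NOT from the PostgreSQL source. */
typedef enum RecoveryPhase
{
    PHASE_WAL_REPLAY,   /* still reading and replaying WAL records */
    PHASE_CLEANUP,      /* replay finished; cleanup routines may complete pending actions */
    PHASE_NORMAL        /* recovery complete, normal operation */
} RecoveryPhase;

static RecoveryPhase recovery_phase = PHASE_WAL_REPLAY;

/* Explicit state transition; phases may only move forward. */
static void
set_recovery_phase(RecoveryPhase next)
{
    assert(next >= recovery_phase);
    recovery_phase = next;
}

/* The check the insert routine would consult: only WAL replay forbids new records. */
static bool
wal_insert_allowed(void)
{
    return recovery_phase != PHASE_WAL_REPLAY;
}

/* Stand-in for XLogInsert: returns false where the real code would PANIC. */
static bool
sketch_xlog_insert(const char *what)
{
    if (!wal_insert_allowed())
    {
        fprintf(stderr, "cannot make new WAL entries during recovery (%s)\n", what);
        return false;
    }
    return true;
}
```

Under this sketch, the startup process would call
set_recovery_phase(PHASE_CLEANUP) after the last record is replayed and
before invoking the cleanup routines, so an insert reached via
btree_xlog_cleanup would pass the check while one attempted during replay
would still be rejected.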

Thoughts?

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
