No, it's super frustrating. While doing the recovery it says it reaches a
consistent recovery state, and I just cannot find a way to convince pg to
stop at that state (there is a sketch of what I mean below the log):



2013-08-02 09:23:25 GMT DEBUG:  postgres: PostmasterMain: initial
environment dump:
2013-08-02 09:23:25 GMT DEBUG:  -----------------------------------------
2013-08-02 09:23:25 GMT DEBUG:          PG_GRANDPARENT_PID=9077
2013-08-02 09:23:25 GMT DEBUG:          PGLOCALEDIR=/usr/share/locale
2013-08-02 09:23:25 GMT DEBUG:          PGSYSCONFDIR=/etc/postgresql-common
2013-08-02 09:23:25 GMT DEBUG:          LANG=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:          PWD=/var/lib/postgresql
2013-08-02 09:23:25 GMT DEBUG:          PGDATA=/var/lib/postgresql/9.1/main
2013-08-02 09:23:25 GMT DEBUG:          LC_COLLATE=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:          LC_CTYPE=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:          LC_MESSAGES=en_US.utf8
2013-08-02 09:23:25 GMT DEBUG:          LC_MONETARY=C
2013-08-02 09:23:25 GMT DEBUG:          LC_NUMERIC=C
2013-08-02 09:23:25 GMT DEBUG:          LC_TIME=C
2013-08-02 09:23:25 GMT DEBUG:  -----------------------------------------


2013-08-02 11:23:26 CEST DEBUG:  invoking IpcMemoryCreate(size=32399360)
2013-08-02 11:23:26 CEST DEBUG:  removing file "pg_notify/0000"
2013-08-02 11:23:26 CEST DEBUG:  max_safe_fds = 982, usable_fds = 1000,
already_open = 8
2013-08-02 11:23:26 CEST LOG:  database system was interrupted while in
recovery at log time 2013-07-29 11:45:24 CEST
2013-08-02 11:23:26 CEST HINT:  If this has occurred more than once some
data might be corrupted and you might need to choose an earlier recovery
target.
2013-08-02 11:23:26 CEST DEBUG:  restore_command = 'cp
/home/validad-pg-backups/pgmaster/wal_files/%f.gz /tmp/%f.gz && gunzip
/tmp/%f.gz && mv /tmp/%f %p'
2013-08-02 11:23:26 CEST DEBUG:  trigger_file =
'/var/lib/postgresql/9.1/main/stop_replication_trigger'
2013-08-02 11:23:26 CEST LOG:  starting archive recovery
2013-08-02 11:23:26 CEST DEBUG:  executing restore command "cp
/home/validad-pg-backups/pgmaster/wal_files/000000010000027A0000002C.gz
/tmp/000000010000027A0000002C.gz && gunzip /tmp/000000010000027A0000002C.gz
&& mv /tmp/000000010000027A0000002C pg_xlog/RECOVERYXLOG"
2013-08-02 11:23:26 CEST DEBUG:  forked new backend, pid=9090 socket=9
2013-08-02 11:23:26 CEST LOG:  incomplete startup packet
2013-08-02 11:23:26 CEST DEBUG:  shmem_exit(0): 0 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  proc_exit(0): 1 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  exit(0)
2013-08-02 11:23:26 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  proc_exit(-1): 0 callbacks to make
2013-08-02 11:23:26 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:26 CEST DEBUG:  server process (PID 9090) exited with exit
code 0
2013-08-02 11:23:26 CEST LOG:  restored log file "000000010000027A0000002C"
from archive
2013-08-02 11:23:26 CEST DEBUG:  got WAL segment from archive
2013-08-02 11:23:26 CEST DEBUG:  checkpoint record is at 27A/2CB77750
2013-08-02 11:23:26 CEST DEBUG:  redo record is at 27A/2CB77750; shutdown
TRUE
2013-08-02 11:23:26 CEST DEBUG:  next transaction ID: 0/381985248; next
OID: 1201662
2013-08-02 11:23:26 CEST DEBUG:  next MultiXactId: 130079; next
MultiXactOffset: 272843
2013-08-02 11:23:26 CEST DEBUG:  oldest unfrozen transaction ID: 197713560,
in database 331065
2013-08-02 11:23:26 CEST DEBUG:  transaction ID wrap limit is 2345197207,
limited by database with OID 331065
2013-08-02 11:23:26 CEST DEBUG:  resetting unlogged relations: cleanup 1
init 0
2013-08-02 11:23:26 CEST LOG:  redo starts at 27A/2CB777A8
2013-08-02 11:23:26 CEST DEBUG:  executing restore command "cp
/home/validad-pg-backups/pgmaster/wal_files/000000010000027A0000002D.gz
/tmp/000000010000027A0000002D.gz && gunzip /tmp/000000010000027A0000002D.gz
&& mv /tmp/000000010000027A0000002D pg_xlog/RECOVERYXLOG"
2013-08-02 11:23:27 CEST DEBUG:  forked new backend, pid=9098 socket=9
2013-08-02 11:23:27 CEST FATAL:  the database system is starting up
2013-08-02 11:23:27 CEST DEBUG:  shmem_exit(1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  proc_exit(1): 1 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  exit(1)
2013-08-02 11:23:27 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  proc_exit(-1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:27 CEST DEBUG:  server process (PID 9098) exited with exit
code 1
2013-08-02 11:23:27 CEST LOG:  restored log file "000000010000027A0000002D"
from archive
2013-08-02 11:23:27 CEST DEBUG:  got WAL segment from archive
2013-08-02 11:23:27 CEST DEBUG:  executing restore command "cp
/home/validad-pg-backups/pgmaster/wal_files/000000010000027A0000002E.gz
/tmp/000000010000027A0000002E.gz && gunzip /tmp/000000010000027A0000002E.gz
&& mv /tmp/000000010000027A0000002E pg_xlog/RECOVERYXLOG"
2013-08-02 11:23:27 CEST DEBUG:  forked new backend, pid=9105 socket=9
2013-08-02 11:23:27 CEST FATAL:  the database system is starting up
2013-08-02 11:23:27 CEST DEBUG:  shmem_exit(1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  proc_exit(1): 1 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  exit(1)
2013-08-02 11:23:27 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  proc_exit(-1): 0 callbacks to make
2013-08-02 11:23:27 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:27 CEST DEBUG:  server process (PID 9105) exited with exit
code 1
2013-08-02 11:23:28 CEST LOG:  restored log file "000000010000027A0000002E"
from archive
2013-08-02 11:23:28 CEST DEBUG:  got WAL segment from archive
2013-08-02 11:23:28 CEST LOG:  consistent recovery state reached at
27A/2E3F42E8
2013-08-02 11:23:28 CEST PANIC:  _bt_restore_page: cannot add item to page
2013-08-02 11:23:28 CEST CONTEXT:  xlog redo split_r: rel 1663/16405/797541
left 4743, right 18008, next 9681, level 0, firstright 194
2013-08-02 11:23:28 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:28 CEST LOG:  startup process (PID 9086) was terminated by
signal 6: Aborted
2013-08-02 11:23:28 CEST LOG:  terminating any other active server processes
2013-08-02 11:23:28 CEST DEBUG:  sending SIGQUIT to process 9092
2013-08-02 11:23:28 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:28 CEST DEBUG:  proc_exit(-1): 0 callbacks to make
2013-08-02 11:23:28 CEST DEBUG:  reaping dead processes
2013-08-02 11:23:28 CEST DEBUG:  shmem_exit(1): 3 callbacks to make
2013-08-02 11:23:28 CEST DEBUG:  proc_exit(1): 3 callbacks to make
2013-08-02 11:23:28 CEST DEBUG:  exit(1)
2013-08-02 11:23:28 CEST DEBUG:  shmem_exit(-1): 0 callbacks to make
2013-08-02 11:23:28 CEST DEBUG:  proc_exit(-1): 0 callbacks to make


That is the output with log_min_messages set to debug5.
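To illustrate what I mean, this is roughly the recovery.conf I would expect
to be able to use to stop near that point (the restore_command is the same
one from the log; the target time is just a placeholder, not a value I have
verified):

  restore_command = 'cp /home/validad-pg-backups/pgmaster/wal_files/%f.gz /tmp/%f.gz && gunzip /tmp/%f.gz && mv /tmp/%f %p'
  recovery_target_time = '2013-07-29 11:40:00'   # placeholder: some point before the record that PANICs
  pause_at_recovery_target = true                # as I understand it, this only pauses if hot_standby = on

As far as I can tell, 9.1 only lets me stop at a time/xid/named target;
there is no "stop as soon as consistency is reached" option
(recovery_target = 'immediate' only exists in later releases), which is
exactly what I would need here.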

I have three different copies of the WAL file, but they were all generated
by pg in the same way: the backup script always takes the original and
copies it to three different locations, and the sha1sum of the copies is
always the same :(
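To be concrete about that check, this is the kind of comparison I mean
(only the first path is the real one; the other two backup locations are
placeholders here):

  SEG=000000010000027A0000002E
  for d in /home/validad-pg-backups/pgmaster/wal_files /backup2/wal_files /backup3/wal_files
  do
      sha1sum "$d/$SEG.gz"    # the checksums of all three copies match
      gunzip -t "$d/$SEG.gz"  # gzip integrity test, just to rule out a damaged archive copy
  done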

I don't mind that so much; we have recovered some backups and replayed from
another standby system, but I need to understand why I cannot reactivate
this cluster at this state:

'2013-08-02 11:23:28 CEST LOG:  consistent recovery state reached at
27A/2E3F42E8'
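If it helps with diagnosing this, I can also post the recovery-related
fields from pg_controldata; something along these lines (the binary path is
the usual Debian one, adjust as needed):

  /usr/lib/postgresql/9.1/bin/pg_controldata /var/lib/postgresql/9.1/main \
      | grep -iE 'cluster state|checkpoint location|recovery'

I would expect the 'Minimum recovery ending location' there to match, or at
least be close to, the LSN quoted above.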

I am posting to the bugs list, too, as I am starting to think this is one.

Regards, k
