OK,

So what am I doing wrong here?

Installed PG 8.3.7 on Slave machine

Restored from last evening's backup from the master DB to make the rsync across 
the network finish sooner.

Shut down the PG instance on the slave machine

Ran a script that does the following:

select pg_start_backup('Master_Backup');
rsync -rvlpogtz ${masterdb}/* ${slave_dbus...@${slave_host}:${slavedb}
select pg_stop_backup();
ssh ${slave_dbus...@${slave_host} rm ${slavedb}/postmaster.pid 2>/dev/null
ssh ${slave_dbus...@${slave_host} rm ${slave_backup_path}/0* 2>/dev/null
ssh ${slave_dbus...@${slave_host} ${PSQL_BIN}/pg_ctl -D ${slavedb} -l logfile 
start

When the slave PG database attempts to come up in recovery mode, it aborts 
because it is looking for a log file that is extremely old and does not exist 
on the master DB server.  I believe the Master PG instance was restarted on 
12/28/09 and has been running ever since.  Is there a way to reset the "last 
completed transaction" on a DB?  Why is PG looking so far back for a WAL log to 
begin recovery when so much has been done since the 28th including daily 
backups?

<2010-01-07 10:54:23 MST>LOG:  received immediate shutdown request
/mxl/var/pgsql/data/ha_copy.sh: line 103: 13976 Quit                    sleep 5
File /mxl/var/pgsql/data/stopslave found. Aborting Process.
<2010-01-07 10:54:28 MST>LOG:  could not open file 
"pg_xlog/00000001000000F6000000E9" (log file 246, segment 233): N
o such file or directory
<2010-01-07 10:54:28 MST>LOG:  redo done at F6/E8FFE378
<2010-01-07 10:54:28 MST>LOG:  last completed transaction was at log time 
2009-12-28 10:18:04.893307-07
Waiting for log: 00000001000000F6000000E8
<2010-01-07 11:24:49 MST>FATAL:  could not restore file 
"00000001000000F6000000E8" from archive: return code 15

Again, nothing was changed with the scripts or the replication process and this 
worked just fine under 8.1.4.

Thanks!



On 1/8/10 8:10 AM, "Keaton Adams" <kad...@mxlogic.com> wrote:

I did find some references to a fix of last-completed transaction time and I 
looked in the postgresql-bugs archive, but I'm not having any luck confirming 
that this is a problem in 8.3.7 and an upgrade to 8.3.9 would fix the issue.

postgresql 8.3.7 .... Fix incorrect logging of last-completed-transaction time 
during PITR ..... Last transaction end time is now logged at end of recovery 
and at each logged restart point (Simon) ...


On 1/7/10 12:53 PM, "Keaton Adams" <kad...@mxlogic.com> wrote:

We had WAL Log shipping (warm standby) working fine under 8.1.4 but under 8.3.7 
we can't get the slave to come up properly.  Nothing has changed in our process 
with regard to start_backup, rsync, stop_backup, bring up the warm standby 
server in continuous recovery mode, but the failover DB won't start with the 
following error:

<2010-01-07 10:54:23 MST>LOG:  received immediate shutdown request
/mxl/var/pgsql/data/ha_copy.sh: line 103: 13976 Quit                    sleep 5
File /mxl/var/pgsql/data/stopslave found. Aborting Process.
<2010-01-07 10:54:28 MST>LOG:  could not open file 
"pg_xlog/00000001000000F6000000E9" (log file 246, segment 233): No such file or 
directory
<2010-01-07 10:54:28 MST>LOG:  redo done at F6/E8FFE378
<2010-01-07 10:54:28 MST>LOG:  last completed transaction was at log time 
2009-12-28 10:18:04.893307-07
Waiting for log: 00000001000000F6000000E8
<2010-01-07 11:24:49 MST>FATAL:  could not restore file 
"00000001000000F6000000E8" from archive: return code 15

The log file in reference is very old and is not on the Master PG server in 
pg_xlogs and the "last completed transaction" can't be right either.  Is this a 
bug or it is something we are doing wrong?

Thanks,

Keaton


psql (PostgreSQL) 8.3.7
contains support for command-line editing
RHEL 5 64 Bit
Linux ourservername 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 
x86_64 x86_64 GNU/Linux



Reply via email to