On 31.10.2010 23:31, Greg Smith wrote:
LOG: replication connection authorized: user=rep host=127.0.0.1 port=52571
FATAL: requested WAL segment 000000010000000000000000 has already been removed
Which is confusing because that file is certainly still on the master,
and hasn't even been considered archived yet, much less removed:
[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog
-rw------- 1 master master 16777216 Oct 31 16:29 000000010000000000000000
drwx------ 2 master master 4096 Oct 4 12:28 archive_status
[mas...@pyramid pg_log]$ ls -l $PGDATA/pg_xlog/archive_status/
total 0
So why isn't SR handing that data over? Is there some weird unhandled
corner case this exposes, but that wasn't encountered by the systems the
tutorial was tried out on?
Yes, indeed there is a corner-case bug when you try to stream the very
first WAL segment, with log==seg==0. We keep track of the last removed
WAL segment, and before a piece of WAL is sent to the standby, walsender
checks that the requested WAL segment is > the last removed one. Until
some WAL segment has been removed after postmaster startup, the last
removed segment is initialized to 0/0, with the idea that 0/0 precedes
any valid WAL segment. That's clearly not true, though: it does not
precede the very first WAL segment after initdb, which is itself 0/0.
So a request for that segment fails the check and is reported as
already removed.
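To illustrate the ambiguity, here is a minimal sketch in C. It is a simplified model of the check described above, not the actual walsender code: the type SegNo, the variable last_removed, and the functions seg_after and segment_is_available are hypothetical names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL segment position: (log id, segment number). */
typedef struct { uint32_t log; uint32_t seg; } SegNo;

/* Last removed segment. 0/0 is meant to say "nothing removed yet". */
static SegNo last_removed = {0, 0};

/* True if segment a sorts strictly after segment b. */
bool seg_after(SegNo a, SegNo b)
{
    return a.log > b.log || (a.log == b.log && a.seg > b.seg);
}

/* The check as described: the requested segment must be > the last removed. */
bool segment_is_available(SegNo requested)
{
    return seg_after(requested, last_removed);
}
```

With nothing removed yet, segment_is_available((SegNo){0, 0}) returns false, because 0/0 is not strictly greater than the 0/0 sentinel, while segment_is_available((SegNo){0, 1}) returns true. In this model the very first segment is the only one that is wrongly rejected.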
Seems that we need to change the meaning of the last removed WAL segment
to avoid the ambiguity of 0/0. Let's store the (last removed)+1 in the
global variable instead.
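A rough sketch of that idea in C follows. It is an assumption-laden model, not a patch: the log/seg pair is flattened into a single 64-bit counter for simplicity, and first_kept, record_removed, and segment_is_available are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Holds (last removed segment) + 1, i.e. the oldest segment that may still
 * exist. 0 now unambiguously means "nothing has been removed yet".
 * Assumption: segments are numbered by one flat counter starting at 0.
 */
static uint64_t first_kept = 0;

/* Record that the given segment has been removed. */
void record_removed(uint64_t removed)
{
    first_kept = removed + 1;
}

/* A segment can be sent if it is not older than the first kept segment. */
bool segment_is_available(uint64_t requested)
{
    return requested >= first_kept;
}
```

Before any removal, segment_is_available(0) is true, so the first segment after initdb can be streamed. After record_removed(0), segment 0 is rejected and segment 1 is still accepted.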
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com