On Fri, Sep 12, 2008 at 7:41 PM, Simon Riggs <[EMAIL PROTECTED]> wrote:
>
> On Thu, 2008-09-11 at 18:17 +0300, Heikki Linnakangas wrote:
>> Fujii Masao wrote:
>> > I think that this case would often happen. So, we should establish a
>> > certain solution or procedure to the case where TLI of the master doesn't
>> > match TLI of the slave. If we only allow the case where TLI of both servers
>> > is the same, the configuration after failover always needs to get the base
>> > backup on the new master. It's unacceptable for many users. But, I think
>> > that it's the role of admin or external tools to copy history files to the
>> > slave from the master.
>>
>> Hmm. There's more problems than the TLI with that. For the original
>> master to catch up by replaying WAL from the new slave, without
>> restoring from a full backup, the original master must not write to disk
>> *any* WAL that hasn't made it to the slave yet. That is certainly not
>> true for asynchronous replication, but it also throws off the idea of
>> flushing the WAL concurrently to the local disk and to the slave in
>> synchronous mode.
>>
>> I agree that having to get a new base backup to get the old master catch
>> up with the new master sucks, so I hope someone sees a way around that.
>
> If we were going to recover from failed-over standby back to original
> master just via WAL logs we would need all of the WAL files from the
> point of failover. So you'd need to be storing all WAL file just in case
> the old master recovers. I can't believe doing that would be the common
> case, because its so impractical and most people would run out of disk
> space and need to delete WAL files.

No. The original master doesn't need all WAL files. It needs only the WAL
file containing the latest checkpoint location recorded in its pg_control,
plus all subsequent files.
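
As a rough illustration of which files that means, here is a minimal Python
sketch. It assumes the default 16 MB segment size and the usual 24-hex-digit
segment file naming (timeline, log id, segment number). The function names
are hypothetical, and timeline switches are deliberately ignored, since the
TLI mismatch is a separate problem in this thread.

```python
# Sketch only: assumes the default 16 MB WAL segment size.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(tli: int, lsn: str) -> str:
    """Return the name of the WAL segment file containing `lsn`.

    `lsn` is a location in the "X/Y" hex form, e.g. the latest
    checkpoint location stored in pg_control.
    """
    log_id_hex, offset_hex = lsn.split("/")
    log_id = int(log_id_hex, 16)
    seg_no = int(offset_hex, 16) // WAL_SEGMENT_SIZE
    return f"{tli:08X}{log_id:08X}{seg_no:08X}"


def segments_needed(checkpoint_segment: str, available: list[str]) -> list[str]:
    """Return the available segments at or after the checkpoint segment.

    Hypothetical helper: compares only the log id and segment number
    (the last 16 hex digits) and ignores the timeline prefix.
    """
    start = checkpoint_segment[8:]
    return sorted(
        (s for s in available if len(s) == 24 and s[8:] >= start),
        key=lambda s: s[8:],
    )
```

For example, a checkpoint location of 0/3000020 on timeline 1 falls in
segment 000000010000000000000003, so that file and everything after it
would be required, while older segments could be discarded.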

> It should be clear that to make this work you must run with a base
> backup that was derived correctly on the current master. You can do that
> by re-copying everything, or you can do that by just shipping changed
> blocks (rsync etc). So I don't see a problem in the first place.

PITR doesn't always need a base backup. We can do PITR from the data files
as they are just after a crash, provided they aren't corrupted (i.e. the
crash was not a media failure).

Depending on the situation, most users would want to choose the setup
procedure that has the smallest adverse effect on the cluster. They would
choose the procedure without a base backup if there are only a few WAL files
to replay, and would use a base backup if the required WAL files have already
been deleted. Even in that case, though, they might not take a new base
backup and might instead use an old one (e.g. taken 2 days earlier).
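
The choice described above could be sketched roughly as follows. This is
only an illustration: the function name, the return values and the
`max_replay` threshold are all hypothetical, and a real tool would also have
to deal with the timeline history files.

```python
def choose_setup(segments_to_replay, segments_available, max_replay=64):
    """Pick a setup procedure for bringing the old master back.

    segments_to_replay: WAL segment names needed from the latest
        checkpoint onward.
    segments_available: WAL segment names that still exist.
    max_replay: hypothetical limit on what counts as "few" WAL files.
    """
    available = set(segments_available)
    missing = [s for s in segments_to_replay if s not in available]
    if missing:
        # Required WAL has already been deleted: a base backup is needed.
        return "base_backup"
    if len(segments_to_replay) <= max_replay:
        # Few files to replay: skip the base backup and just replay WAL.
        return "wal_replay"
    # Everything is available, but replay would take too long.
    return "base_backup"
```

Whether "base_backup" then means a fresh backup or an older one plus more
WAL replay is left to the admin, as noted above.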

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
