On 2013-04-10 20:39:25 +0200, Boszormenyi Zoltan wrote:
> On 2013-04-10 18:46, Fujii Masao wrote:
> >On Wed, Apr 10, 2013 at 11:16 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> >>On 2013-04-10 10:10:31 -0400, Tom Lane wrote:
> >>>Amit Kapila <amit.kap...@huawei.com> writes:
> >>>>On Wednesday, April 10, 2013 3:42 PM Samrat Revagade wrote:
> >>>>>Sorry, this is incorrect. Streaming replication continues, master is not
> >>>>>waiting, whenever the master writes the data page it checks that the WAL
> >>>>>record is written in standby till that LSN.
> >>>>I am not sure it will resolve the problem completely as your old-master can
> >>>>have some WAL extra than new-master for same timeline. I don't remember
> >>>>exactly will timeline switch feature
> >>>>take care of this extra WAL, Heikki can confirm this point?
> >>>>Also I think this can serialize flush of data pages in checkpoint/bgwriter
> >>>>which is currently not the case.
> >>>Yeah. TBH this entire discussion seems to be "let's cripple performance
> >>>in the normal case so that we can skip doing an rsync when resurrecting
> >>>a crashed, failed-over master". This is not merely optimizing for the
> >>>wrong thing, it's positively hazardous. After a fail-over, you should
> >>>be wondering whether it's safe to resurrect the old master at all, not
> >>>about how fast you can bring it back up without validating its data.
> >>>IOW, I wouldn't consider skipping the rsync even if I had a feature
> >>>like this.
> >>Agreed. Especially as in situations where you fail over in a planned
> >>way, e.g. for a hardware upgrade, you can avoid the need to resync with
> >>a little bit of care.
> >It's really worth documenting that way.
> >
> >>So it's mostly in catastrophic situations this
> >>becomes a problem and in those you really should resync - and it's a good
> >>idea not to use a normal rsync but a rsync --checksum or similar.
> >If database is very large, rsync --checksum takes very long. And I'm concerned
> >that most of data pages in master has the different checksum from those in the
> >standby because of commit hint bit. I'm not sure how rsync --checksum can
> >speed up the backup after failover.

It's not about speed, it's about correctness.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
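
To make the correctness point concrete: by default rsync decides whether a file needs
transferring with a "quick check" on size and modification time, while --checksum
compares file contents. The Python sketch below is only a model of those two decisions,
not rsync's implementation. The helper names (quick_check_equal, content_equal, demo)
and the file names are hypothetical, and SHA-256 stands in for whatever checksum rsync
actually uses. It shows two files that the quick check treats as identical although
their contents differ.

```python
import hashlib
import os
import tempfile


def quick_check_equal(a, b):
    """Model of a size + mtime comparison (assumed stand-in for rsync's default)."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def content_equal(a, b):
    """Model of a full-content comparison, in the spirit of rsync --checksum."""
    def digest(path):
        h = hashlib.sha256()  # assumption: any strong hash works for the demo
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()
    return digest(a) == digest(b)


def demo():
    """Build two same-size, same-mtime files that differ in one byte."""
    with tempfile.TemporaryDirectory() as d:
        master = os.path.join(d, "master_page")    # hypothetical file names
        standby = os.path.join(d, "standby_page")
        with open(master, "wb") as f:
            f.write(b"A" * 8192)
        with open(standby, "wb") as f:
            f.write(b"A" * 8191 + b"B")
        # Force identical timestamps so only the contents differ.
        os.utime(master, (1_000_000_000, 1_000_000_000))
        os.utime(standby, (1_000_000_000, 1_000_000_000))
        return quick_check_equal(master, standby), content_equal(master, standby)


if __name__ == "__main__":
    same_by_quick_check, same_by_content = demo()
    print("quick check says identical:", same_by_quick_check)
    print("content check says identical:", same_by_content)
```

The content comparison has to read every byte on both sides, which is why it is slow on
a large database, but it is the only one of the two that notices the differing file.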