> What Samrat is proposing here is that WAL is not flushed to the OS before
> it is acked by a synchronous replica so recovery won't go past the
> timeline change made in failover, making it necessary to take a new
> base backup to resync with the new master.

Actually, we are proposing that the data page on the master is not
written out (committed) until the master receives an ACK from the standby.
The WAL files can be flushed to disk on both the master and the standby
before the standby sends its ACK to the master. The end objective is the
same: avoiding a fresh base backup when resyncing the old master with the
new master.

> Why do you think that the inconsistent data after failover happens is
> problem? Because it's one of the reasons why a fresh base backup is
> required when starting old master as new standby? If yes, I agree with
> you. I've often heard the complaints about a backup when restarting new
> standby. That's really big problem.

Yes, taking a backup is a major problem when the database is larger than
several TB. It would take a very long time to ship the backup data over a
slow WAN link.

>> One solution to avoid this situation is have the master send WAL
>> records to standby and wait for ACK from standby committing WAL files
>> to disk and only after that commit data page related to this
>> transaction on master.

> You mean to make the master wait the data page write until WAL has been
> not only flushed to disk but also replicated to the standby?

Yes. The master should not write the data page before the corresponding
WAL records have been replicated to the standby, that is, until those WAL
records have been flushed to disk on both the master and the standby.

>> The main drawback would be increased wait time for the client due to
>> extra round trip to standby before master sends ACK to client. Are
>> there any other issues with this approach?

> I think that you can introduce GUC specifying whether this extra check
> is required to avoid a backup when failback

That would be a better idea. We can disable it whenever taking a fresh
backup is not a problem.
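To make the proposed ordering concrete, here is a rough toy model in
Python. It is only a sketch of the idea, not PostgreSQL code: the classes
Master and Standby, the wait_for_standby switch (standing in for the
suggested GUC), and the helper master_is_ahead are all hypothetical names
invented for illustration, and an LSN is modelled as a plain integer.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Standby:
    """Toy standby: remembers the highest WAL position it has flushed."""
    flushed_lsn: int = 0
    reachable: bool = True

    def receive_and_flush(self, lsn: int) -> int:
        # If the link is down, nothing new is flushed; the old position is returned.
        if self.reachable:
            self.flushed_lsn = max(self.flushed_lsn, lsn)
        return self.flushed_lsn


@dataclass
class Master:
    """Toy master that can hold back data pages until the standby has ACKed."""
    standby: Standby
    wait_for_standby: bool = True  # hypothetical switch, like the suggested GUC
    wal_flushed_lsn: int = 0
    standby_acked_lsn: int = 0
    dirty_pages: Dict[str, int] = field(default_factory=dict)    # page -> LSN
    pages_on_disk: Dict[str, int] = field(default_factory=dict)  # page -> LSN

    def commit(self, page_id: str, lsn: int) -> None:
        # 1. Flush the commit WAL record locally.
        self.wal_flushed_lsn = lsn
        self.dirty_pages[page_id] = lsn
        # 2. Ship WAL to the standby and record how far it has flushed.
        self.standby_acked_lsn = self.standby.receive_and_flush(lsn)
        # 3. Only now try to write data pages.
        self.write_dirty_pages()

    def write_dirty_pages(self) -> None:
        for page_id, lsn in list(self.dirty_pages.items()):
            if self.wait_for_standby and lsn > self.standby_acked_lsn:
                continue  # WAL for this page is not on the standby yet
            self.pages_on_disk[page_id] = lsn
            del self.dirty_pages[page_id]


def master_is_ahead(master: Master) -> bool:
    """True if the master's on-disk pages reflect WAL the standby never got."""
    newest = max(master.pages_on_disk.values(), default=0)
    return newest > master.standby.flushed_lsn
```

With wait_for_standby enabled, a commit issued while the standby is
unreachable leaves the page dirty in memory, so the old master's data
files never get ahead of the standby. With it disabled, the page is
written immediately and master_is_ahead() returns True, which is the
inconsistent state we want to avoid. The sketch ignores the client-side
commit wait and crash recovery entirely.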
Regards,
Samrat

On Mon, Apr 8, 2013 at 10:40 PM, Fujii Masao <masao.fu...@gmail.com> wrote:

> On Mon, Apr 8, 2013 at 7:34 PM, Samrat Revagade
> <revagade.sam...@gmail.com> wrote:
> >
> > Hello,
> >
> > We have been trying to figure out possible solutions to the following
> > problem in streaming replication. Consider following scenario:
> >
> > If master receives commit command, it writes and flushes commit WAL
> > records to the disk, It also writes and flushes data page related to
> > this transaction.
> >
> > The master then sends WAL records to standby up to the commit WAL
> > record. But before sending these records if failover happens then, old
> > master is ahead of standby which is now the new master in terms of DB
> > data leading to inconsistent data.
>
> Why do you think that the inconsistent data after failover happens is
> problem? Because it's one of the reasons why a fresh base backup is
> required when starting old master as new standby? If yes, I agree with
> you. I've often heard the complaints about a backup when restarting new
> standby. That's really big problem.
>
> The timeline mismatch after failover was one of the reasons why a
> backup is required. But, thanks to Heikki's recent work, that's solved,
> i.e., the timeline mismatch would be automatically resolved when
> starting replication in 9.3. So, the remaining problem is an
> inconsistent database.
>
> > One solution to avoid this situation is have the master send WAL
> > records to standby and wait for ACK from standby committing WAL files
> > to disk and only after that commit data page related to this
> > transaction on master.
>
> You mean to make the master wait the data page write until WAL has been
> not only flushed to disk but also replicated to the standby?
>
> > The main drawback would be increased wait time for the client due to
> > extra round trip to standby before master sends ACK to client. Are
> > there any other issues with this approach?
>
> I think that you can introduce GUC specifying whether this extra check
> is required to avoid a backup when failback.
>
> Regards,
>
> --
> Fujii Masao