On Sun, Jul 7, 2013 at 4:19 PM, Sawada Masahiko <sawada.m...@gmail.com> wrote:
> On Mon, Jun 17, 2013 at 8:48 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> On 17 June 2013 09:03, Pavan Deolasee <pavan.deola...@gmail.com> wrote:
>>
>>> I agree. We should probably find a better name for this. Any suggestions ?
>>
>> err, I already made one...
>>
>>>> But that's not the whole story. I can see some utility in a patch that
>>>> makes all WAL transfer synchronous, rather than just commits. Some
>>>> name like synchronous_transfer might be appropriate. e.g.
>>>> synchronous_transfer = all | commit (default).
>>
>>> Since commits are more foreground in nature and this feature
>>> does not require us to wait during common foreground activities, we want a
>>> configuration where master can wait for synchronous transfers at other than
>>> commits. Maybe we can solve that by having more granular control to the said
>>> parameter ?
>>>
>>>>
>>>> The idea of another slew of parameters that are very similar to
>>>> synchronous replication but yet somehow different seems weird. I can't
>>>> see a reason why we'd want a second lot of parameters. Why not just
>>>> use the existing ones for sync rep? (I'm surprised the Parameter
>>>> Police haven't visited you in the night...) Sure, we might want to
>>>> expand the design for how we specify multi-node sync rep, but that is
>>>> a different patch.
>>>
>>>
>>> How would we then distinguish between synchronous and the new kind of
>>> standby ?
>>
>> That's not the point. The point is "Why would we have a new kind of
>> standby?" and therefore why do we need new parameters?
>>
>>> I am told, one of the very popular setups for DR is to have one
>>> local sync standby and one async (may be cascaded by the local sync). Since
>>> this new feature is more useful for DR because taking a fresh backup on a
>>> slower link is even more challenging, IMHO we should support such setups.
>>
>> ...which still doesn't make sense to me. Let's look at that in detail.
>>
>> Take 3 servers, A, B, C with A and B being linked by sync rep, and C
>> being safety standby at a distance.
>>
>> Either A or B is master, except in disaster. So if A is master, then B
>> would be the failover target. If A fails, then you want to failover to
>> B. Once B is the target, you want to failback to A as the master. C
>> needs to follow the new master, whichever it is.
>>
>> If you set up sync rep between A and B and this new mode between A and
>> C. When B becomes the master, you need to failback from B to A, but
>> you can't because the new mode applied between A and C only, so you
>> have to failback from C to A. So having the new mode not match with
>> sync rep means you are forcing people to failback using the slow link
>> in the common case.
>>
>> You might observe that having the two modes match causes problems if A
>> and B fail, so you are forced to go to C as master and then eventually
>> failback to A or B across a slow link. That case is less common and
>> could be solved by extending sync transfer to more/multi nodes.
>>
>> It definitely doesn't make sense to have sync rep on anything other
>> than a subset of sync transfer. So while it may be sensible in the
>> future to make sync transfer a superset of sync rep nodes, it makes
>> sense to make them the same config for now.
>
> I have updated the patch.
>
> We support the following 2 cases.
> 1. SYNC server that is also the failback safe standby server
> 2. ASYNC server that is also the failback safe standby server
>
> 1. changed name of parameter
>   Gave up the 'failback_safe_standby_names' parameter from the first patch,
>   and changed the name of the parameter from 'failback_safe_mode' to
>   'synchronous_transfer'.
>   This parameter accepts 'all', 'data_flush' and 'commit'.
>
>   -'commit'
>     'commit' means that master waits for the corresponding WAL to be flushed
>     to disk of the standby server on commits,
>     but master doesn't wait for replicated data pages.
>
>   -'data_flush'
>     'data_flush' means that master waits for replicated data page
>     (e.g., CLOG, pg_control) before flushing it to disk on the master server.
>     But if the user sets this parameter to 'data_flush', the
>     'synchronous_commit' value is ignored even if the user sets
>     'synchronous_commit'.
>
>   -'all'
>     'all' means that master waits for replicated WAL and data page.
>
> 2. put SyncRepWaitForLSN() function into XLogFlush() function
>   We have put the SyncRepWaitForLSN() function into the XLogFlush() function,
>   and changed the arguments of XLogFlush().
>
> The setup cases and the parameters that need to be set are as follows.
>
> - SYNC server and also make same failback safe standby server (case 1)
>   synchronous_transfer = all
>   synchronous_commit = remote_write/on
>   synchronous_standby_names = <ServerName>
>
> - ASYNC server and also make same failback safe standby server (case 2)
>   synchronous_transfer = data_flush
>   (synchronous_commit value is ignored)
>
> - default SYNC replication
>   synchronous_transfer = commit
>   synchronous_commit = on
>   synchronous_standby_names = <ServerName>
>
> - default ASYNC replication
>   synchronous_transfer = commit
>
> ToDo
> 1. Currently this patch supports synchronous transfer, but we can't set a
>   different synchronous transfer mode for each server.
>   We need to improve the patch to support the following cases.
>   - SYNC standby and make separate ASYNC failback safe standby
>   - ASYNC standby and make separate ASYNC failback safe standby
>
> 2. We have not measured performance yet. We need to measure performance.
>
> Please give me your feedback.
>
> Regards,
>
> -------
> Sawada Masahiko

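For readers following the thread, the wait rules of the three synchronous_transfer modes quoted above can be summarised in a small sketch. This is only an illustration in Python, not code from the patch (which is C): the names SyncTransfer and master_waits, the event labels, and the sync_rep_configured flag (standing in for "synchronous_commit and synchronous_standby_names are set for sync rep") are all hypothetical. It also leaves out how the standby is identified.

```python
from enum import Enum


class SyncTransfer(Enum):
    """The three values the quoted description lists for synchronous_transfer."""
    COMMIT = "commit"
    DATA_FLUSH = "data_flush"
    ALL = "all"


def master_waits(mode, event, sync_rep_configured):
    """Return True if, per the quoted description, the master would wait
    for the standby at this event.

    event is a hypothetical label: "commit" or "data_page_flush".
    sync_rep_configured is an assumption standing in for the existing
    sync rep settings (synchronous_commit, synchronous_standby_names).
    """
    if event == "commit":
        # 'data_flush' ignores synchronous_commit, so commits never wait.
        if mode is SyncTransfer.DATA_FLUSH:
            return False
        # 'commit' and 'all' wait at commit only when sync rep is set up.
        return sync_rep_configured
    if event == "data_page_flush":
        # Only 'data_flush' and 'all' wait before the master flushes a page.
        return mode in (SyncTransfer.DATA_FLUSH, SyncTransfer.ALL)
    raise ValueError("unknown event: %r" % (event,))


if __name__ == "__main__":
    # Print the wait decisions for the four setups listed in the message.
    setups = [
        ("SYNC + failback safe (case 1)", SyncTransfer.ALL, True),
        ("ASYNC + failback safe (case 2)", SyncTransfer.DATA_FLUSH, False),
        ("default SYNC replication", SyncTransfer.COMMIT, True),
        ("default ASYNC replication", SyncTransfer.COMMIT, False),
    ]
    for name, mode, sync_rep in setups:
        print(name,
              "| wait at commit:", master_waits(mode, "commit", sync_rep),
              "| wait at page flush:",
              master_waits(mode, "data_page_flush", sync_rep))
```

Under these assumptions, only case 1 waits at both points, and case 2 waits only before data page flushes, which matches the setup list in the quoted message.
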
I'm sorry, I forgot to attach the patch.
Please see the attached file.

Regards,

-------
Sawada Masahiko

failback_safe_standby_v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers