Re: [HACKERS] Configuring synchronous replication
On 09/23/2010 10:09 PM, Robert Haas wrote: I think maybe you missed Tom's point, or else you just didn't respond to it. If the master is wedged because it is waiting for a standby, then you cannot commit transactions on the master. Therefore you cannot update the system catalog which you must update to unwedge it. Failing over in that situation is potentially a huge nuisance and extremely undesirable. Well, Simon is arguing that there's no need to wait for a disconnected standby. So that's not much of an issue. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Simon, On 09/24/2010 12:11 AM, Simon Riggs wrote: As I keep pointing out, waiting for an acknowledgement from something that isn't there might just take a while. The only guarantee that provides is that you will wait a long time. Is my data more safe? No. By now I agree that waiting for disconnected standbys is useless in master-slave replication. However, it makes me wonder where you draw the line between just temporarily unresponsive and disconnected. To get zero data loss *and* continuous availability, you need two standbys offering sync rep and reply-to-first behaviour. You don't need standby registration to achieve that. Well, if your master reaches the false conclusion that both standbys are disconnected and happily continues without their ACKs (and the idiot admin being happy about having boosted database performance with whatever measure he recently took), you certainly no longer have a zero data loss guarantee. So for one, this needs a big fat warning that gets slapped on the admin's forehead in case of a disconnect. And second, the timeout for considering a standby to be disconnected should rather be large enough to not get false negatives. IIUC the master still waits for an ACK during that timeout. An infinite timeout doesn't have either of these issues, because there's no such distinction between temporarily unresponsive and disconnected. Regards Markus Wanner
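Markus's distinction between "temporarily unresponsive" and "disconnected" comes down to a timeout in the master's ACK wait loop. A minimal sketch of that loop (all names are invented for illustration; this is not PostgreSQL internals):

```python
import time

def wait_for_ack(standby_acked, timeout=None, poll_interval=0.01):
    """Wait for a standby ACK before acknowledging a commit.

    timeout=None models the 'infinite timeout' case: the standby is
    never declared disconnected, so the distinction Markus worries
    about never arises. A finite timeout forces the master to decide
    when 'temporarily unresponsive' becomes 'disconnected' -- and a
    false negative here silently drops the zero-data-loss guarantee.
    """
    start = time.monotonic()
    while not standby_acked():
        if timeout is not None and time.monotonic() - start >= timeout:
            # Commit proceeds without the ACK; warn the admin loudly.
            return "disconnected"
        time.sleep(poll_interval)
    return "acked"
```

The debate is essentially about the default value of `timeout` and whether it should be finite at all.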
Re: [HACKERS] Configuring synchronous replication
On 24/09/10 01:11, Simon Riggs wrote: But that's not what I call synchronous replication, it doesn't give you the guarantees that textbook synchronous replication does. Which textbook? I was using that word metaphorically, but for example: Wikipedia http://en.wikipedia.org/wiki/Replication_%28computer_science%29 (includes a caveat that many commercial systems skimp on it) Oracle docs http://download.oracle.com/docs/cd/B10500_01/server.920/a96567/repoverview.htm Scroll to Synchronous Replication Googling for "synchronous replication textbook" also turns up the actual textbook Database Management Systems by R. Ramakrishnan and others, which uses synchronous replication with this meaning, although in the context of multi-master replication. Interestingly, Transaction Processing: Concepts and Techniques by Gray and Reuter, chapter 12.6.3, defines three levels: 1-safe - what we call asynchronous 2-safe - commit is acknowledged after the slave acknowledges it, but if the slave is down, fall back to asynchronous mode. 3-safe - commit is acknowledged only after the slave acknowledges it. If it is down, refuse to commit. In the context of multi-master replication, eager replication seems to be commonly used to mean synchronous replication. If we just want *something* that's useful, and want to avoid the hassle of registration and all that, I proposed a while back (http://archives.postgresql.org/message-id/4c7e29bc.3020...@enterprisedb.com) that we could aim for behavior that would be useful for distributing read-only load to slaves. The use case is specifically that you have one master and one or more hot standby servers. You also have something like pgpool that distributes all read-only queries across all the nodes, and routes updates to the master server. In this scenario, you want the master node not to acknowledge a commit to the client until all currently connected standby servers have replayed the commit.
Furthermore, you want a standby server to stop accepting queries if it loses connection to the master, to avoid giving out-of-date responses. With suitable timeouts in the master and the standby, it seems possible to guarantee that you can connect to any node in the system and get an up-to-date result. It does not give zero data loss like synchronous replication does, but it keeps hot standby servers trustworthy for queries. It bothers me that no-one seems to have a clear use case in mind. People want synchronous replication, but don't seem to care much what guarantees it should provide. I wish the terminology was better standardized in this area. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
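The behaviour Heikki describes, where the master acknowledges a commit only once every *currently connected* standby has replayed it, can be sketched as a simple predicate (hypothetical names, not actual PostgreSQL code). Note the caveat built into it: with zero connected standbys the check passes vacuously, which is exactly why the standby-side rule of refusing queries after a disconnect is also needed:

```python
def commit_acknowledged(commit_lsn, connected_standbys):
    """Return True once the commit may be acknowledged to the client.

    connected_standbys maps each currently connected standby's name to
    the LSN it has replayed up to; disconnected standbys are simply
    absent and are not consulted. Illustrative sketch only.
    """
    return all(replay_lsn >= commit_lsn
               for replay_lsn in connected_standbys.values())
```

With an empty map the commit sails through, so this mode gives trustworthy hot standby reads rather than a zero-data-loss guarantee, matching Heikki's framing.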
Re: [HACKERS] Configuring synchronous replication
On 24/09/10 01:11, Simon Riggs wrote: On Thu, 2010-09-23 at 20:42 +0300, Heikki Linnakangas wrote: If you want the behavior where the master doesn't acknowledge a commit to the client until the standby (or all standbys, or one of them etc.) acknowledges it, even if the standby is not currently connected, the master needs to know what standby servers exist. *That's* why synchronous replication needs a list of standby servers in the master. If you're willing to downgrade to a mode where commit waits for acknowledgment only from servers that are currently connected, then you don't need any new configuration files. As I keep pointing out, waiting for an acknowledgement from something that isn't there might just take a while. The only guarantee that provides is that you will wait a long time. Is my data more safe? No. It provides zero data loss, at the expense of availability. That's what synchronous replication is all about. To get zero data loss *and* continuous availability, you need two standbys offering sync rep and reply-to-first behaviour. Yes, that is a good point. I'm starting to understand what your proposal was all about. It makes sense when you think of a three node system configured for high availability with zero data loss like that. The use case of keeping hot standby servers up to date in a cluster where read-only queries are distributed across all nodes seems equally important though. What's the simplest method of configuration that supports both use cases? You don't need standby registration to achieve that. Not necessarily I guess, but it creeps me out that a standby can just connect to the master and act as a synchronous slave, and there are no controls in the master over what standby servers there are. More complicated scenarios with quorums and different numbers of votes get increasingly complicated if there is no central place to configure it. But maybe we can ignore the more complicated setups for now.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 14:26 +0200, Csaba Nagy wrote: Unfortunately it was quite a long time ago we last tried, and I don't remember exactly what was bottlenecked. Our application is quite write-intensive, the ratio of writes to reads which actually reach the disk is about 50-200% (according to the disk stats - yes, sometimes we write more to the disk than we read, probably due to the relatively large RAM installed). If I remember correctly, the standby was about the same regarding IO/CPU power as the master, but it was not able to process the WAL files as fast as they were coming in, which excludes at least the network as a bottleneck. What I actually suppose happens is that the one single process applying the WAL on the slave is not able to match the full IO the master is able to do with all its processors. If you're interested, I could try to set up another try, but it would be on 8.3.7 (that's what we still run). On 9.x would be also interesting... Substantial performance improvements came in 8.4 with the bgwriter running in recovery. That meant that the startup process didn't need to spend time doing restartpoints and could apply changes continuously. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 16:09 -0400, Robert Haas wrote: On Thu, Sep 23, 2010 at 3:46 PM, Simon Riggs si...@2ndquadrant.com wrote: Well, its not at all hard to see how that could be configured, because I already proposed a simple way of implementing parameters that doesn't suffer from those problems. My proposal did not give roles to named standbys and is symmetrical, so switchovers won't cause a problem. I know you proposed a way, but my angst is all around whether it was actually simple. I found it somewhat difficult to understand, so possibly other people might have the same problem. Let's go back to Josh's 12 server example. This current proposal requires 12 separate and different configuration files each containing many parameters that require manual maintenance. I doubt that people looking at that objectively will decide that is the best approach. We need to arrange a clear way for people to decide for themselves. I'll work on that. Earlier you argued that centralizing parameters would make this nice and simple. Now you're pointing out that we aren't centralizing this at all, and it won't be simple. We'll have to have a standby.conf set up that is customised in advance for each standby that might become a master. Plus we may even need multiple standby.confs in case that we have multiple nodes down. This is exactly what I was seeking to avoid and exactly what I meant when I asked for an analysis of the failure modes. If you're operating on the notion that no reconfiguration will be necessary when nodes go down, then we have very different notions of what is realistic. I think that copy the new standby.conf file in place is going to be the least of the fine admin's problems. Earlier you argued that setting parameters on each standby was difficult and we should centralize things on the master. Now you tell us that actually we do need lots of settings on each standby and that to think otherwise is not realistic. That's a contradiction. 
The chain of argument used to support this as being a sensible design choice is broken or contradictory in more than one place. I think we should be looking for a design using the KISS principle, while retaining sensible tuning options. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
Tom Lane t...@sss.pgh.pa.us writes: Oh, I thought part of the objective here was to try to centralize that stuff. If we're assuming that slaves will still have local replication configuration files, then I think we should just add any necessary info to those files and drop this entire conversation. We're expending a tremendous amount of energy on something that won't make any real difference to the overall complexity of configuring a replication setup. AFAICS the only way you make a significant advance in usability is if you can centralize all the configuration information in some fashion. +1, but for real usability you have to make it so that this central setup can be edited from any member of the replication. HINT: plproxy. Regards, -- dim
Re: [HACKERS] Configuring synchronous replication
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: If you want the behavior where the master doesn't acknowledge a commit to the client until the standby (or all standbys, or one of them etc.) acknowledges it, even if the standby is not currently connected, the master needs to know what standby servers exist. *That's* why synchronous replication needs a list of standby servers in the master. And this list can be maintained in a semi-automatic fashion:

- adding to the list is done by the master as soon as a standby connects (maybe we need to add a notion of fqdn in the standby setup?)
- service level, current weight, and any other knob that comes from the standby are changed on the fly by the master if that changes on the standby (default async, 1, but SIGHUP please)
- the current standby position (LSN for recv, fsync and replayed) of the standby, as received in the feedback loop, is changed on the fly by the master
- removing a standby has to be done manually, using an admin function; that's the only way to sort out permanent vs transient unavailability
- checking the current values in this list is done on the master by using some system view based on a SRF, as already said

Regards, -- dim
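The semi-automatic list Dimitri describes might look roughly like this in-memory (every name below is invented for illustration; nothing here is a proposed API):

```python
class StandbyRegistry:
    """Sketch of a master-side standby list maintained semi-automatically.

    Entries appear when a standby connects, are refreshed from its
    feedback messages, survive a disconnect (transient unavailability),
    and disappear only via an explicit admin call (permanent removal).
    """
    def __init__(self):
        self.standbys = {}

    def on_connect(self, name, service_level="async", weight=1):
        # Added automatically by the master as soon as a standby connects.
        self.standbys[name] = {"level": service_level, "weight": weight,
                               "recv_lsn": 0, "fsync_lsn": 0,
                               "replay_lsn": 0, "connected": True}

    def on_feedback(self, name, recv, fsync, replay):
        # Positions updated on the fly from the standby's feedback loop.
        self.standbys[name].update(recv_lsn=recv, fsync_lsn=fsync,
                                   replay_lsn=replay)

    def on_disconnect(self, name):
        # Transient: the entry is kept, so the master can keep waiting.
        self.standbys[name]["connected"] = False

    def admin_remove(self, name):
        # The only way to declare a standby permanently gone.
        del self.standbys[name]
```

The key design point is the asymmetry between `on_disconnect` (automatic, reversible) and `admin_remove` (manual, final), which is how permanent and transient unavailability get told apart.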
Re: [HACKERS] Configuring synchronous replication
On Fri, 2010-09-24 at 11:08 +0300, Heikki Linnakangas wrote: On 24/09/10 01:11, Simon Riggs wrote: But that's not what I call synchronous replication, it doesn't give you the guarantees that textbook synchronous replication does. Which textbook? I was using that word metaphorically, but for example: Wikipedia http://en.wikipedia.org/wiki/Replication_%28computer_science%29 (includes a caveat that many commercial systems skimp on it) Yes, I read that. The example it uses shows only one standby, which does suffer from the problem/caveat it describes. Two standbys resolve that problem, yet there is no mention of multiple standbys in Wikipedia. Oracle docs http://download.oracle.com/docs/cd/B10500_01/server.920/a96567/repoverview.htm Scroll to Synchronous Replication That document refers to sync rep *only* in the context of multimaster replication. We aren't discussing that here and so that link is not relevant at all. Oracle Data Guard in Maximum Availability mode is roughly where I think we should be aiming http://download.oracle.com/docs/cd/B10500_01/server.920/a96653/concepts.htm#1033871 But I disagree with consulting other companies' copyrighted material, and I definitely don't like their overcomplicated configuration. And they have not yet thought of per-transaction controls. So I believe we should learn many lessons from them, but actually ignore and surpass them. Easily. Googling for "synchronous replication textbook" also turns up the actual textbook Database Management Systems by R. Ramakrishnan and others, which uses synchronous replication with this meaning, although in the context of multi-master replication. Interestingly, Transaction Processing: Concepts and Techniques by Gray and Reuter, chapter 12.6.3, defines three levels: 1-safe - what we call asynchronous 2-safe - commit is acknowledged after the slave acknowledges it, but if the slave is down, fall back to asynchronous mode. 3-safe - commit is acknowledged only after the slave acknowledges it.
If it is down, refuse to commit. Which again is a one-standby viewpoint on the problem. Wikipedia is right that there is a problem when using just one server. 3-safe mode is not more safe than 2-safe mode when you have 2 standbys. If you want high availability you need N+1 redundancy. If you want a standby server, that is N=1. If you want a highly available standby configuration then N+1 = 2. Show me the textbook that describes what happens with 2 standbys. If one exists, I'm certain it would agree with my analysis. (I'll read and comment on your other points later today.) -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
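Gray and Reuter's three safety levels, quoted above, can be summarized as a decision function. This is a sketch of the textbook definitions for a single slave, not of any proposed PostgreSQL behaviour:

```python
def commit_decision(level, slave_up, slave_ack):
    """What the master does after writing the commit record locally.

    level: one of '1-safe', '2-safe', '3-safe' (Gray & Reuter, ch. 12.6.3).
    slave_up: whether the slave is reachable.
    slave_ack: callable returning True once the slave has acknowledged;
               only consulted when the slave is up.
    """
    if level == "1-safe":
        return "ack"                     # asynchronous: don't wait at all
    if level == "2-safe":
        if not slave_up:
            return "ack"                 # fall back to asynchronous mode
        return "ack" if slave_ack() else "wait"
    if level == "3-safe":
        if not slave_up:
            return "refuse"              # refuse to commit without the slave
        return "ack" if slave_ack() else "wait"
    raise ValueError(level)
```

Simon's point is that this one-slave framing is exactly what breaks down with two standbys: with N+1 redundancy, losing one standby need not force either the 2-safe degradation or the 3-safe refusal.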
Re: [HACKERS] Configuring synchronous replication
On Fri, 2010-09-24 at 11:43 +0300, Heikki Linnakangas wrote: To get zero data loss *and* continuous availability, you need two standbys offering sync rep and reply-to-first behaviour. Yes, that is a good point. I'm starting to understand what your proposal was all about. It makes sense when you think of a three node system configured for high availability with zero data loss like that. The use case of keeping hot standby servers up to date in a cluster where read-only queries are distributed across all nodes seems equally important though. What's the simplest method of configuration that supports both use cases? That is definitely the right question. (More later) -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
Robert Haas robertmh...@gmail.com writes: I think maybe you missed Tom's point, or else you just didn't respond to it. If the master is wedged because it is waiting for a standby, then you cannot commit transactions on the master. Therefore you cannot update the system catalog which you must update to unwedge it. Failing over in that situation is potentially a huge nuisance and extremely undesirable. All Wrong. You might remember that Simon's proposal begins with per-transaction synchronous replication behavior? Regards, -- dim
Re: [HACKERS] Configuring synchronous replication
On 24/09/10 13:57, Simon Riggs wrote: If you want high availability you need N+1 redundancy. If you want a standby server that is N=1. If you want a highly available standby configuration then N+1 = 2. Yep. Synchronous replication with one standby gives you zero data loss. When you add a 2nd standby as you described, then you have a reasonable level of high availability as well, as you can continue processing transactions in the master even if one slave dies. Show me the textbook that describes what happens with 2 standbys. If one exists, I'm certain it would agree with my analysis. I don't disagree with your analysis about multiple standbys and high availability. What I'm saying is that in a two standby situation, if you're willing to continue operation as usual in the master even if the standby is down, you're not doing synchronous replication. Extending that to a two standby situation, my claim is that if you're willing to continue operation as usual in the master when both standbys are down, you're not doing synchronous replication. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
On Fri, 2010-09-24 at 14:12 +0300, Heikki Linnakangas wrote: What I'm saying is that in a two standby situation, if you're willing to continue operation as usual in the master even if the standby is down, you're not doing synchronous replication. Oracle and I disagree with you on that point, but I am more interested in behaviour than semantics. If you have two standbys and one is down, please explain how data loss has occurred. Extending that to a two standby situation, my claim is that if you're willing to continue operation as usual in the master when both standbys are down, you're not doing synchronous replication. Agreed. But you still need to decide how you will act. I choose pragmatism in that case. Others have voiced that they would like the database to shutdown or have all sessions hang. I personally doubt their employers would feel the same way. Arguing technical correctness would seem unlikely to allow a DBA to keep his job if they stood and watched the app become unavailable. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
On Fri, Sep 24, 2010 at 6:37 AM, Simon Riggs si...@2ndquadrant.com wrote: Earlier you argued that centralizing parameters would make this nice and simple. Now you're pointing out that we aren't centralizing this at all, and it won't be simple. We'll have to have a standby.conf set up that is customised in advance for each standby that might become a master. Plus we may even need multiple standby.confs in case that we have multiple nodes down. This is exactly what I was seeking to avoid and exactly what I meant when I asked for an analysis of the failure modes. If you're operating on the notion that no reconfiguration will be necessary when nodes go down, then we have very different notions of what is realistic. I think that copy the new standby.conf file in place is going to be the least of the fine admin's problems. Earlier you argued that setting parameters on each standby was difficult and we should centralize things on the master. Now you tell us that actually we do need lots of settings on each standby and that to think otherwise is not realistic. That's a contradiction. You've repeatedly accused me and others of contradicting ourselves. I don't think that's helpful in advancing the debate, and I don't think it's what I'm doing. The point I'm trying to make is that when failover happens, lots of reconfiguration is going to be needed. There is just no getting around that. Let's ignore synchronous replication entirely for a moment. You're running 9.0 and you have 10 slaves. The master dies. You promote a slave. Guess what? You need to look at each slave you didn't promote and adjust primary_conninfo. You also need to check whether the slave has received an xlog record with a higher LSN than the one you promoted. If it has, you need to take a new base backup. Otherwise, you may have data corruption - very possibly silent data corruption. Do you dispute this? If so, on which point? 
The reason I think that we should centralize parameters on the master is because they affect *the behavior of the master*. Controlling whether the master will wait for the slave on the slave strikes me (and others) as spooky action at a distance. Configuring whether the master will retain WAL for a disconnected slave on the slave is outright byzantine. Of course, configuring these parameters on the master means that when the master changes, you're going to need a configuration (possibly the same, possibly different) for said parameters on the new master. But since you may be doing a lot of other adjustment at that point anyway (e.g. new base backups, changes in the set of synchronous slaves) that doesn't seem like a big deal. The chain of argument used to support this as being a sensible design choice is broken or contradictory in more than one place. I think we should be looking for a design using the KISS principle, while retaining sensible tuning options. The KISS principle is exactly what I am attempting to apply. Configuring parameters that affect the master on some machine other than the master isn't KISS, to me. You may find that broken or contradictory, but I disagree. I am attempting to disagree respectfully, but statements like the above make me feel like you're flaming, and that's getting under my skin. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
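Robert's earlier point about checking each remaining slave after a promotion can be stated as a predicate over the last LSN each node has received. This is an illustrative sketch, assuming LSNs are represented as comparable integers; it is not how the check is actually performed:

```python
def needs_new_base_backup(promoted_last_lsn, slave_last_lsn):
    """After promoting one slave to master, decide whether another
    slave must be rebuilt from a fresh base backup.

    If a slave has already received WAL *beyond* the last LSN of the
    promoted node, it has applied records the new master never had;
    reconnecting it as-is risks (possibly silent) data corruption.
    """
    return slave_last_lsn > promoted_last_lsn
```

In other words, regardless of where the sync rep parameters live, failover already involves per-slave work: adjusting primary_conninfo and running a check like this one on every slave that was not promoted.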
Re: [HACKERS] Configuring synchronous replication
On Fri, Sep 24, 2010 at 7:47 AM, Simon Riggs si...@2ndquadrant.com wrote: On Fri, 2010-09-24 at 14:12 +0300, Heikki Linnakangas wrote: What I'm saying is that in a two standby situation, if you're willing to continue operation as usual in the master even if the standby is down, you're not doing synchronous replication. Oracle and I disagree with you on that point, but I am more interested in behaviour than semantics. I *think* he meant s/two standby/two server/. That's taken from the two references: *the* master, *the* slave. In that case, if the master is committing with no slave connected, it *isn't* replication, synchronous or not. Useful, likely, but not replication, at least not at that PIT. If you have two standbys and one is down, please explain how data loss has occurred. Right, of course. But I think he meant 2 servers (1 standby), not 3 servers (2 standbys). But even with only 2 servers, if the standby is down and the master is up, there isn't data loss. There's *potential* for data loss. But you still need to decide how you will act. I choose pragmatism in that case. Others have voiced that they would like the database to shutdown or have all sessions hang. I personally doubt their employers would feel the same way. Arguing technical correctness would seem unlikely to allow a DBA to keep his job if they stood and watched the app become unavailable. Again, it all depends on the business. Synchronous replication can give you two things: 1) High Availability (Just answer my queries, dammit!) 2) High Durability (Don't give me an answer unless you're damn well sure it's the right one) and its goal is to do that in the face of catastrophic failure (for some level of catastrophic).
It's the trade-off between: 1) The cost of delaying/refusing transactions being greater than the potential cost of a lost transaction 2) The cost of a lost transaction being greater than the cost of delaying/refusing transactions So there are people who want to use PostgreSQL in a situation where they'd much rather not say they have done something unless they are sure it's safely written in 2 different systems, in 2 different locations (and yes, the distance between those two locations will be a trade-off wrt performance, and the business will need to decide on their risk levels). I understand it's not optimal, desirable, or even practical for the vast majority of cases. I don't want it to be impossible, or, if it's decided that it will be impossible, hopefully not just because you decided nobody ever needs it, but because it's not feasible due to code/implementation complexities ;-)
Re: [HACKERS] Configuring synchronous replication
On 24/09/10 14:47, Simon Riggs wrote: On Fri, 2010-09-24 at 14:12 +0300, Heikki Linnakangas wrote: What I'm saying is that in a two standby situation, if you're willing to continue operation as usual in the master even if the standby is down, you're not doing synchronous replication. Oracle and I disagree with you on that point, but I am more interested in behaviour than semantics. If you have two standbys and one is down, please explain how data loss has occurred. Sorry, that was a typo. As Aidan guessed, I meant even in a two server situation, ie. one master and one slave. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
Hi, I'm defending my ideas so that they're not put in the bag you're wanting to throw away. We have more than 2 proposals lying around here. I'm one of the guys with a proposal and no code, but still trying to be clear. Robert Haas robertmh...@gmail.com writes: The reason I think that we should centralize parameters on the master is because they affect *the behavior of the master*. Controlling whether the master will wait for the slave on the slave strikes me (and others) as spooky action at a distance. I hope it's clear that I didn't propose anything like this in the related threads. What you set up on the slave is related only to what the slave has to offer to the master. What happens on the master wrt waiting etc. is set up on the master, and is controlled per-transaction. As my ideas come in good part from understanding Simon's work and proposal, my feeling is that stating them here will help the thread. Configuring whether the master will retain WAL for a disconnected slave on the slave is outright byzantine. Again, I can't remember having proposed such a thing. Of course, configuring these parameters on the master means that when the master changes, you're going to need a configuration (possibly the same, possibly different) for said parameters on the new master. But since you may be doing a lot of other adjustment at that point anyway (e.g. new base backups, changes in the set of synchronous slaves) that doesn't seem like a big deal. Should we take some time and define the behaviors we expect in the cluster, and the ones we want to provide in each error case we can think about, we'd be able to define the set of parameters that we need to operate the system. Then, some of us are betting that the best setup is a unique central one that you edit in only one place at failover time, while others think the best way to manage the setup is having it distributed.
Granted, given how it currently works, it looks like you will have to edit the primary_conninfo on a bunch of standbys at failover time. I'd like us to now follow Josh Berkus' (and some others') advice and start a new thread to decide what we mean by synchronous replication, what kind of normal behaviour we want, and what responses to errors we expect to be able to deal with in what (optional) ways. Because the longer we stay on this thread, the clearer it is that no two of us are talking about the same synchronous replication feature set. Regards, -- dim
Re: [HACKERS] Configuring synchronous replication
On Fri, 2010-09-24 at 16:01 +0200, Dimitri Fontaine wrote: I'd like us to now follow Josh Berkus' (and some others') advice and start a new thread to decide what we mean by synchronous replication, what kind of normal behaviour we want and what responses to errors we expect to be able to deal with in what (optional) ways. What I intend to do from here is make a list of all desired use cases, then ask for people to propose ways of configuring those. Hopefully we don't need to discuss the meaning of the phrase sync rep, we just need to look at the use cases. That way we will be able to directly compare the flexibility/complexity/benefits of configuration between different proposals. I think this will allow us to rapidly converge on something useful. If multiple solutions exist, we may then be able to decide/vote on a prioritisation of use cases to help resolve any difficulty. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
On 24/09/10 17:13, Simon Riggs wrote: On Fri, 2010-09-24 at 16:01 +0200, Dimitri Fontaine wrote: I'd like that we now follow Josh Berkus (and some other) advice now, and start a new thread to decide what we mean by synchronous replication, what kind of normal behaviour we want and what responses to errors we expect to be able to deal with in what (optional) ways. What I intend to do from here is make a list of all desired use cases, then ask for people to propose ways of configuring those. Hopefully we don't need to discuss the meaning of the phrase sync rep, we just need to look at the use cases. Yes, that seems like a good way forward. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
On Fri, Sep 24, 2010 at 10:01 AM, Dimitri Fontaine dfonta...@hi-media.com wrote: Configuring whether the master will retain WAL for a disconnected slave on the slave is outright byzantine. Again, I can't remember having proposed such a thing. No one has, but I keep hearing we don't need the master to have a list of standbys and a list of properties for each standby... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 18:24 -0400, Robert Haas wrote: I feel like that's really nice and simple. There are already 5 separate places to configure to make streaming rep work in a 2 node cluster (master.pg_hba.conf, master.postgresql.conf, standby.postgresql.conf, standby.recovery.conf, password file/ssh key). I haven't heard anyone say we would be removing controls from those existing areas, so it isn't clear to me how adding a 6th place will make things nice and simple. Put simply, Standby registration is not required for most use cases. If some people want it, I'm happy that it can be optional. Personally, I want to make very sure that any behaviour that involves waiting around indefinitely can be turned off and should be off by default. ISTM very simple to arrange things so you can set parameters on the master OR on the standby, whichever is most convenient or desirable. Passing parameters around at handshake is pretty trivial. I do also understand that some parameters *must* be set in certain locations to gain certain advantages. Those can be documented. I would be happier if we could separate the *list* of control parameters we need from the issue of *where* we set those parameters. I would be even happier if we could agree on the top 3-5 parameters so we can implement those first. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
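[Editor's note: Simon's count of five places can be made concrete. A hypothetical minimal 9.0-era streaming-replication setup for a two-node cluster touches roughly the following spots; hostnames, user names and addresses are illustrative only, not a tested recipe.]

```
# 1. master pg_hba.conf - allow the standby's replication connection
host  replication  repl_user  192.0.2.10/32  md5

# 2. master postgresql.conf
wal_level = hot_standby
max_wal_senders = 3

# 3. standby postgresql.conf
hot_standby = on

# 4. standby recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=repl_user'

# 5. password file (~/.pgpass on the standby) or an ssh key for base backups
# hostname:port:database:username:password
master.example.com:5432:replication:repl_user:secret
```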
Re: [HACKERS] Configuring synchronous replication
On 23/09/10 11:34, Csaba Nagy wrote: In the meantime our DBs are not able to keep in sync via WAL replication, that would need some kind of parallel WAL restore on the slave I guess, or I'm not able to configure it properly - in any case now we use slony which is working. It would be interesting to debug that case a bit more. Was it bottlenecked by CPU or I/O, or network capacity perhaps? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
Hi all, Some time ago I was also interested in this feature, and at that time I also thought about a complete setup possibility via postgres connections, meaning the transfer of the files and all configuration/slave registration to be done through normal backend connections. In the meantime our DBs are not able to keep in sync via WAL replication, that would need some kind of parallel WAL restore on the slave I guess, or I'm not able to configure it properly - in any case now we use slony which is working. In fact the way slony is doing the configuration could be a good place to look... On Wed, 2010-09-22 at 13:16 -0400, Robert Haas wrote: I guarantee you there is a way around the cascade slave problem. And that would be...?

* restrict the local file configuration to a replication ID;
* make all configuration refer to the replica ID;
* keep all configuration in a shared catalog: it can be kept exactly the same on all replicas, as each replication node will only care about the configuration concerning its own replica ID;
* added advantage: after take-over the slave will change the configured master to its own replica ID, and if the old master would ever connect again, it could easily notice that and give up;

Cheers, Csaba.
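[Editor's note: Csaba's replica-ID scheme might look something like the following as a shared catalog. This is purely a hypothetical sketch of the idea - the table name, columns, and values are invented for illustration, not proposed DDL.]

```sql
-- Hypothetical shared catalog, identical on every node.
-- Each node only acts on the rows matching its own replica_id.
CREATE TABLE replication_config (
    replica_id   name PRIMARY KEY,  -- stable identity of the node
    upstream_id  name,              -- which replica it pulls WAL from
    sync_mode    text,              -- e.g. 'async', 'recv', 'fsync', 'apply'
    keep_wal     boolean            -- retain WAL for it while disconnected?
);

-- After takeover, the promoted slave makes itself the upstream; a
-- returning old master sees it is no longer anyone's upstream and gives up.
UPDATE replication_config
   SET upstream_id = 'node_b'
 WHERE replica_id IN ('node_a', 'node_c');
```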
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 12:02 +0300, Heikki Linnakangas wrote: On 23/09/10 11:34, Csaba Nagy wrote: In the meantime our DBs are not able to keep in sync via WAL replication, that would need some kind of parallel WAL restore on the slave I guess, or I'm not able to configure it properly - in any case now we use slony which is working. It would be interesting to debug that case a bit more. Was bottlenecked by CPU or I/O, or network capacity perhaps? Unfortunately it was quite a long time ago that we last tried, and I don't remember exactly what was bottlenecked. Our application is quite write-intensive, the ratio of writes to reads which actually reaches the disk is about 50-200% (according to the disk stats - yes, sometimes we write more to the disk than we read, probably due to the relatively large RAM installed). If I remember correctly, the standby was about the same regarding IO/CPU power as the master, but it was not able to process the WAL files as fast as they were coming in, which excludes at least the network as a bottleneck. What I actually suppose happens is that the one single process applying the WAL on the slave is not able to match the full IO the master is able to do with all its processors. If you're interested, I could try to set up another try, but it would be on 8.3.7 (that's what we still run). On 9.x would be also interesting, but that would be a test system and I can't possibly get there the load we have on production... Cheers, Csaba.
Re: [HACKERS] Configuring synchronous replication
On 23/09/10 15:26, Csaba Nagy wrote: Unfortunately it was quite long time ago we last tried, and I don't remember exactly what was bottlenecked. Our application is quite write-intensive, the ratio of writes to reads which actually reaches the disk is about 50-200% (according to the disk stats - yes, sometimes we write more to the disk than we read, probably due to the relatively large RAM installed). If I remember correctly, the standby was about the same regarding IO/CPU power as the master, but it was not able to process the WAL files as fast as they were coming in, which excludes at least the network as a bottleneck. What I actually suppose happens is that the one single process applying the WAL on the slave is not able to match the full IO the master is able to do with all it's processors. There's a program called pg_readahead somewhere on pgfoundry by NTT that will help if it's the single-threadedness of I/O. Before handing the WAL file to the server, it scans it through and calls posix_fadvise for all the blocks that it touches. When the server then replays it, the data blocks are already being fetched by the OS, using the whole RAID array. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: I think it should be a separate config file, and I think it should be a config file that can be edited using DDL commands as you propose. But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. ISTM that we can have a system catalog and still have cascading slaves. If we administer the catalog via the master, why can't we administer all slaves, however they cascade, via the master too? What other problems are there that mean we *must* have a file? I can't see any. Elsewhere, we've established that we can have unregistered standbys, so max_wal_senders cannot go away. If we do have a file, it will be a problem after failover since the file will be either absent or potentially out of date. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
Simon Riggs si...@2ndquadrant.com writes: ISTM that we can have a system catalog and still have cascading slaves. If we administer the catalog via the master, why can't we administer all slaves, however they cascade, via the master too? What other problems are there that mean we *must* have a file? Well, for one thing, how do you add a new slave? If its configuration comes from a system catalog, it seems that it has to already be replicating before it knows what its configuration is. regards, tom lane
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 11:43 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: ISTM that we can have a system catalog and still have cascading slaves. If we administer the catalog via the master, why can't we administer all slaves, however they cascade, via the master too? What other problems are there that mean we *must* have a file? Well, for one thing, how do you add a new slave? If its configuration comes from a system catalog, it seems that it has to already be replicating before it knows what its configuration is. At the moment, I'm not aware of any proposed parameters that need to be passed from master to standby, since that was one of the arguments for standby registration in the first place. If that did occur, when the standby connects it would get told what parameters to use by the master as part of the handshake. It would have to work exactly that way with standby.conf on the master also. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
On Thu, Sep 23, 2010 at 11:32 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: I think it should be a separate config file, and I think it should be a config file that can be edited using DDL commands as you propose. But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. ISTM that we can have a system catalog and still have cascading slaves. If we administer the catalog via the master, why can't we administer all slaves, however they cascade, via the master too? Well, I guess we could, but is that really convenient? My gut feeling is no, but of course it's subjective. What other problems are there that mean we *must* have a file? I can't see any. Elsewhere, we've established that we can have unregistered standbys, so max_wal_senders cannot go away. If we do have a file, it will be a problem after failover since the file will be either absent or potentially out of date. I'm not sure about that. I wonder if we can actually turn this into a feature, with careful design. Suppose that you have the common configuration of two machines, A and B. At any given time, one is the master and one is the slave. And let's say you've opted for sync rep, apply mode, don't wait for disconnected standbys. Well, you can have a config file on A that defines B as the slave, and a config file on B that defines A as the slave. When failover happens, you still have to worry about taking a new base backup, removing recovery.conf from the new master and adding it to the slave, and all that stuff, but the standby config just works. Now, admittedly, in more complex topologies, and especially if you're using configuration options that pertain to the behavior of disconnected standbys (e.g. wait for them, or retain WAL for them), you're going to need to adjust the configs. But I think that's likely to be true anyway, even with a catalog. 
If A is doing sync rep and waiting for B even when B is disconnected, and the machines switch roles, it's hard to see how any configuration isn't going to need some adjustment. One thing that's nice about the flat file system is that you can make the configuration changes on the new master before you promote it (perhaps you had A replicating synchronously to B and B replicating asynchronously to C, but now that A is dead and B is promoted, you want the latter replication to become synchronous). Being able to make those kinds of changes before you start processing live transactions is possibly useful to some people. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
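[Editor's note: to make Robert's two-machine example concrete, here is a hypothetical standby.conf that is literally identical on A and B, so the standby configuration "just works" after the roles swap. The syntax and parameter names are invented for illustration; no such file format exists.]

```
# Same file on both machines. Whichever node is currently master
# consults the entry for the *other* node; the entry naming the
# node itself is simply ignored.

[standby "A"]
sync = apply               # wait for apply on the standby before acking commits
wait_if_disconnected = off # don't block commits while it is down

[standby "B"]
sync = apply
wait_if_disconnected = off
```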
Re: [HACKERS] Configuring synchronous replication
Simon Riggs si...@2ndquadrant.com writes: On Thu, 2010-09-23 at 11:43 -0400, Tom Lane wrote: Well, for one thing, how do you add a new slave? If its configuration comes from a system catalog, it seems that it has to already be replicating before it knows what its configuration is. At the moment, I'm not aware of any proposed parameters that need to be passed from master to standby, since that was one of the arguments for standby registration in the first place. If that did occur, when the standby connects it would get told what parameters to use by the master as part of the handshake. It would have to work exactly that way with standby.conf on the master also. Um ... so how does this standby know what master to connect to, what password to offer, etc? I don't think that pass down parameters after connecting is likely to cover anything but a small subset of the configuration problem. regards, tom lane
Re: [HACKERS] Configuring synchronous replication
On Thu, Sep 23, 2010 at 12:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2010-09-23 at 11:43 -0400, Tom Lane wrote: Well, for one thing, how do you add a new slave? If its configuration comes from a system catalog, it seems that it has to already be replicating before it knows what its configuration is. At the moment, I'm not aware of any proposed parameters that need to be passed from master to standby, since that was one of the arguments for standby registration in the first place. If that did occur, when the standby connects it would get told what parameters to use by the master as part of the handshake. It would have to work exactly that way with standby.conf on the master also. Um ... so how does this standby know what master to connect to, what password to offer, etc? I don't think that pass down parameters after connecting is likely to cover anything but a small subset of the configuration problem. Huh? We have that stuff already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
Robert Haas robertmh...@gmail.com writes: Now, admittedly, in more complex topologies, and especially if you're using configuration options that pertain to the behavior of disconnected standbys (e.g. wait for them, or retain WAL for them), you're going to need to adjust the configs. But I think that's likely to be true anyway, even with a catalog. If A is doing sync rep and waiting for B even when B is disconnected, and the machines switch roles, it's hard to see how any configuration isn't going to need some adjustment. One thing that's nice about the flat file system is that you can make the configuration changes on the new master before you promote it Actually, that's the killer argument in this whole thing. If the configuration information is in a system catalog, you can't change it without the master being up and running. Let us suppose for example that you've configured hard synchronous replication such that the master can't commit without slave acks. Now your slaves are down and you'd like to change that setting. Guess what. regards, tom lane
Re: [HACKERS] Configuring synchronous replication
Robert Haas robertmh...@gmail.com writes: On Thu, Sep 23, 2010 at 12:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Um ... so how does this standby know what master to connect to, what password to offer, etc? I don't think that pass down parameters after connecting is likely to cover anything but a small subset of the configuration problem. Huh? We have that stuff already. Oh, I thought part of the objective here was to try to centralize that stuff. If we're assuming that slaves will still have local replication configuration files, then I think we should just add any necessary info to those files and drop this entire conversation. We're expending a tremendous amount of energy on something that won't make any real difference to the overall complexity of configuring a replication setup. AFAICS the only way you make a significant advance in usability is if you can centralize all the configuration information in some fashion. regards, tom lane
Re: [HACKERS] Configuring synchronous replication
On Thu, Sep 23, 2010 at 1:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Sep 23, 2010 at 12:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Um ... so how does this standby know what master to connect to, what password to offer, etc? I don't think that pass down parameters after connecting is likely to cover anything but a small subset of the configuration problem. Huh? We have that stuff already. Oh, I thought part of the objective here was to try to centralize that stuff. If we're assuming that slaves will still have local replication configuration files, then I think we should just add any necessary info to those files and drop this entire conversation. We're expending a tremendous amount of energy on something that won't make any real difference to the overall complexity of configuring a replication setup. AFAICS the only way you make a significant advance in usability is if you can centralize all the configuration information in some fashion. Well, it's quite fanciful to suppose that the slaves aren't going to need to have local configuration for how to connect to the master. The configuration settings we're talking about here are the things that affect either the behavior of the master-slave system as a unit (like what kind of ACK the master needs to get from the slave before ACKing the commit back to the user) or the master alone (like tracking how much WAL needs to be retained for a particular disconnected slave, rather than as presently always retaining a fixed amount). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 16:18 +0300, Heikki Linnakangas wrote: There's a program called pg_readahead somewhere on pgfoundry by NTT that will help if it's the single-threadedness of I/O. Before handing the WAL file to the server, it scans it through and calls posix_fadvise for all the blocks that it touches. When the server then replays it, the data blocks are already being fetched by the OS, using the whole RAID array. That sounds useful, thanks for the hint! But couldn't this also be built directly into the WAL recovery process? It would probably help a lot for recovering from a crash too. We recently had a crash and it took hours to recover. I will try it out as soon as I get the time to set it up... [searching pgfoundry] Unfortunately I can't find it, and google is also not very helpful. Do you happen to have some links to it? Cheers, Csaba.
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 11:43 -0400, Tom Lane wrote: What other problems are there that mean we *must* have a file? Well, for one thing, how do you add a new slave? If its configuration comes from a system catalog, it seems that it has to already be replicating before it knows what its configuration is. Or the slave gets a connection string to the master, and reads the configuration from there - it has to connect there anyway... The ideal bootstrap for slave creation would be: get the params to connect to the master + the replica ID, and the rest should be done by connecting to the master and getting all the needed things from there, including configuration. Maybe you see some merit in this idea: it wouldn't hurt to get the interfaces done so that the master could be impersonated by some WAL repository serving a PITR snapshot, and so that the same WAL repository could connect as a slave to the master and, instead of recovering the WAL stream, archive it. Such a WAL repository could possibly connect to multiple masters and could also get regular snapshots too. This would provide a nice complement to WAL replication as a PITR solution using the same protocols as the WAL standby. I have no idea if this would be easy to implement or useful for anybody. Cheers, Csaba.
Re: [HACKERS] Configuring synchronous replication
On 23/09/10 20:03, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Sep 23, 2010 at 12:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Um ... so how does this standby know what master to connect to, what password to offer, etc? I don't think that pass down parameters after connecting is likely to cover anything but a small subset of the configuration problem. Huh? We have that stuff already. Oh, I thought part of the objective here was to try to centralize that stuff. If we're assuming that slaves will still have local replication configuration files, then I think we should just add any necessary info to those files and drop this entire conversation. We're expending a tremendous amount of energy on something that won't make any real difference to the overall complexity of configuring a replication setup. AFAICS the only way you make a significant advance in usability is if you can centralize all the configuration information in some fashion. If you want the behavior where the master doesn't acknowledge a commit to the client until the standby (or all standbys, or one of them etc.) acknowledges it, even if the standby is not currently connected, the master needs to know what standby servers exist. *That's* why synchronous replication needs a list of standby servers in the master. If you're willing to downgrade to a mode where commit waits for acknowledgment only from servers that are currently connected, then you don't need any new configuration files. But that's not what I call synchronous replication, it doesn't give you the guarantees that textbook synchronous replication does. (Gosh, I wish the terminology was more standardized in this area) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 13:07 -0400, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Now, admittedly, in more complex topologies, and especially if you're using configuration options that pertain to the behavior of disconnected standbys (e.g. wait for them, or retain WAL for them), you're going to need to adjust the configs. But I think that's likely to be true anyway, even with a catalog. If A is doing sync rep and waiting for B even when B is disconnected, and the machines switch roles, it's hard to see how any configuration isn't going to need some adjustment. Well, it's not at all hard to see how that could be configured, because I already proposed a simple way of implementing parameters that doesn't suffer from those problems. My proposal did not give roles to named standbys and is symmetrical, so switchovers won't cause a problem. Earlier you argued that centralizing parameters would make this nice and simple. Now you're pointing out that we aren't centralizing this at all, and it won't be simple. We'll have to have a standby.conf set up that is customised in advance for each standby that might become a master. Plus we may even need multiple standby.confs in case we have multiple nodes down. This is exactly what I was seeking to avoid and exactly what I meant when I asked for an analysis of the failure modes. This proposal is a configuration nightmare, no question, and that is not the right way to go if you want high availability that works when you need it to. One thing that's nice about the flat file system is that you can make the configuration changes on the new master before you promote it Actually, that's the killer argument in this whole thing. If the configuration information is in a system catalog, you can't change it without the master being up and running. Let us suppose for example that you've configured hard synchronous replication such that the master can't commit without slave acks. 
Now your slaves are down and you'd like to change that setting. Guess what. If we have standby registration, and I respect that some people want it, a table seems to be the best place for it. In a table the parameters are passed through from master to slave automatically without needing to synchronize multiple files manually. They can only be changed on a master, true. But since they only affect the behaviour of a master (commits = writes) then that doesn't matter at all. As soon as you promote a new master you'll be able to change them again, if required. Configuration options that differ on each node, depending upon the current state of other nodes, are best avoided. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
On Thu, Sep 23, 2010 at 3:46 PM, Simon Riggs si...@2ndquadrant.com wrote: Well, its not at all hard to see how that could be configured, because I already proposed a simple way of implementing parameters that doesn't suffer from those problems. My proposal did not give roles to named standbys and is symmetrical, so switchovers won't cause a problem. I know you proposed a way, but my angst is all around whether it was actually simple. I found it somewhat difficult to understand, so possibly other people might have the same problem. Earlier you argued that centralizing parameters would make this nice and simple. Now you're pointing out that we aren't centralizing this at all, and it won't be simple. We'll have to have a standby.conf set up that is customised in advance for each standby that might become a master. Plus we may even need multiple standby.confs in case that we have multiple nodes down. This is exactly what I was seeking to avoid and exactly what I meant when I asked for an analysis of the failure modes. If you're operating on the notion that no reconfiguration will be necessary when nodes go down, then we have very different notions of what is realistic. I think that copy the new standby.conf file in place is going to be the least of the fine admin's problems. One thing that's nice about the flat file system is that you can make the configuration changes on the new master before you promote it Actually, that's the killer argument in this whole thing. If the configuration information is in a system catalog, you can't change it without the master being up and running. Let us suppose for example that you've configured hard synchronous replication such that the master can't commit without slave acks. Now your slaves are down and you'd like to change that setting. Guess what. If we have standby registration and I respect that some people want it, a table seems to be the best place for them. 
In a table the parameters are passed through from master to slave automatically without needing to synchronize multiple files manually. They can only be changed on a master, true. But since they only effect the behaviour of a master (commits = writes) then that doesn't matter at all. As soon as you promote a new master you'll be able to change them again, if required. Configuration options that differ on each node, depending upon the current state of others nodes are best avoided. I think maybe you missed Tom's point, or else you just didn't respond to it. If the master is wedged because it is waiting for a standby, then you cannot commit transactions on the master. Therefore you cannot update the system catalog which you must update to unwedge it. Failing over in that situation is potentially a huge nuisance and extremely undesirable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
On Wed, 2010-09-22 at 15:31 -0700, Josh Berkus wrote: The above case is one where I can see your point and it does sound easier in that case. But I then think: What happens after failover? We would then need to have 12 different standby.conf files, one on each standby that describes what the setup would look like if that standby became the master. And guess what, every time we made a change on the master, you'd need to re-edit all 12 standby.conf files to reflect the new configuration. So we're still back to having to edit in multiple places, ISTM. Unless we can make the standby.conf files identical on all servers in the group. If we can do that, then conf file management utilities, fileshares, or a simple automated rsync could easily take care of things. Would prefer table. But ... any setup which involves each standby being *required* to have a different configuration on each standby server, which has to be edited separately, is going to be fatally difficult to manage for anyone who has more than a couple of standbys. So I'd like to look at what it takes to get away from that. Agreed. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
On Thu, 2010-09-23 at 20:42 +0300, Heikki Linnakangas wrote: If you want the behavior where the master doesn't acknowledge a commit to the client until the standby (or all standbys, or one of them etc.) acknowledges it, even if the standby is not currently connected, the master needs to know what standby servers exist. *That's* why synchronous replication needs a list of standby servers in the master. If you're willing to downgrade to a mode where commit waits for acknowledgment only from servers that are currently connected, then you don't need any new configuration files. As I keep pointing out, waiting for an acknowledgement from something that isn't there might just take a while. The only guarantee that provides is that you will wait a long time. Is my data more safe? No. To get zero data loss *and* continuous availability, you need two standbys offering sync rep and reply-to-first behaviour. You don't need standby registration to achieve that. But that's not what I call synchronous replication, it doesn't give you the guarantees that textbook synchronous replication does. Which textbook? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 22/09/10 03:25, Joshua D. Drake wrote: Why is this in the config file at all. It should be: synchronous_replication = TRUE/FALSE Umm, what does this do? then ALTER CLUSTER ENABLE REPLICATION FOR FOO; ALTER CLUSTER SET keep_connect ON FOO TO TRUE; Or some such thing. I like a configuration file more because you can easily add comments, comment out lines, etc. It also makes it easier to have a different configuration in master and standby. We don't support cascading slaves, yet, but you might still want a different configuration in master and slave, waiting for the moment that the slave is promoted to a new master. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 21/09/10 18:12, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21/09/10 11:52, Thom Brown wrote: My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. I'm not a big fan of XML either. ... Then again, maybe we should go with something like json or yaml The fundamental problem with all those machine editable formats is that they aren't people editable. If you have to have a tool (other than a text editor) to change a config file, you're going to be very unhappy when things are broken at 3AM and you're trying to fix it while ssh'd in from your phone. I'm not very familiar with any of those formats, but I agree it needs to be easy to edit by hand first and foremost. I think the ini file format suggestion is probably a good one; it seems to fit this problem, and it's something that people are used to. We could probably shoehorn the info into a pg_hba-like format, but I'm concerned about whether we'd be pushing that format beyond what it can reasonably handle. The ini file format seems to be enough for the features proposed thus far, but I'm a bit concerned that even that might not be flexible enough for future features. I guess we'll cross that bridge when we get there and go with an ini file for now. It should be possible to extend it in various ways, and in the worst case that we have to change to a completely different format, we can provide a how-to guide on converting existing config files to the new format. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
Hi, On 09/21/2010 08:05 PM, Simon Riggs wrote: Hmm, no reason? The reason is that the alternative is that the session would hang until a standby arrived that offered that level of service. Why would you want that behaviour? Would you really request that option? I think I now agree with Simon on that point. It's only an issue in multi-master replication, where continued operation would lead to a split-brain situation. With master-slave, you only need to make sure your master stays the master even if the standby crash(es) are followed by a master crash. If your cluster-ware is too clever and tries a fail-over on a slave that's quicker to come up, you get the same split-brain situation. Put another way: if you let your master continue, don't ever try a fail-over after a full-cluster crash. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 09/22/2010 04:18 AM, Heikki Linnakangas wrote: On 21/09/10 18:12, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: On 21/09/10 11:52, Thom Brown wrote: My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. I'm not a big fan of XML either. ... Then again, maybe we should go with something like json or yaml The fundamental problem with all those machine editable formats is that they aren't people editable. If you have to have a tool (other than a text editor) to change a config file, you're going to be very unhappy when things are broken at 3AM and you're trying to fix it while ssh'd in from your phone. I'm not very familiar with any of those formats, but I agree it needs to be easy to edit by hand first and foremost. I think the ini file format suggestion is probably a good one; it seems to fit this problem, and it's something that people are used to. We could probably shoehorn the info into a pg_hba-like format, but I'm concerned about whether we'd be pushing that format beyond what it can reasonably handle. The ini file format seems to be enough for the features proposed this far, but I'm a bit concerned that even that might not be flexible enough for future features. I guess we'll cross the bridge when we get there and go with an ini file for now. It should be possible to extend it in various ways, and in the worst case that we have to change to a completely different format, we can provide a how to guide on converting existing config files to the new format. The ini file format is not flexible enough, IMNSHO. If we're going to adopt a new config file format it should have these characteristics, among others: * well known (let's not invent a new one) * supports hierarchical structure * reasonably readable I realize that the last is very subjective. 
Personally, I'm very comfortable with XML, but then I do a *lot* of work with it, and have for many years. I know I'm in a minority on that, and some people just go bananas when they see it. Since we're just about to add a JSON parser to the backend, by the look of it, that looks like a reasonable bet. Maybe it uses a few too many quotes, but that's not really so hard to get your head around, even if it offends you a bit aesthetically. And it is certainly fairly widely known. cheers andrew
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 9:47 AM, Andrew Dunstan and...@dunslane.net wrote: The ini file format is not flexible enough, IMNSHO. If we're going to adopt a new config file format it should have these characteristics, among others: well known (let's not invent a new one) supports hierarchical structure reasonably readable The ini format meets all of those requirements - and it's certainly far more readable/editable than XML and friends. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 09/22/2010 04:54 AM, Dave Page wrote: On Wed, Sep 22, 2010 at 9:47 AM, Andrew Dunstan and...@dunslane.net wrote: The ini file format is not flexible enough, IMNSHO. If we're going to adopt a new config file format it should have these characteristics, among others: well known (let's not invent a new one) supports hierarchical structure reasonably readable The ini format meets all of those requirements - and it's certainly far more readable/editable than XML and friends. No, it's really not hierarchical. It only goes one level deep. cheers andrew
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 12:07 PM, Andrew Dunstan and...@dunslane.net wrote: On 09/22/2010 04:54 AM, Dave Page wrote: On Wed, Sep 22, 2010 at 9:47 AM, Andrew Dunstan and...@dunslane.net wrote: The ini file format is not flexible enough, IMNSHO. If we're going to adopt a new config file format it should have these characteristics, among others: well known (let's not invent a new one) supports hierarchical structure reasonably readable The ini format meets all of those requirements - and it's certainly far more readable/editable than XML and friends. No, it's really not hierarchical. It only goes one level deep. I guess pgAdmin/wxWidgets are broken then :-)

[Servers]
Count=5

[Servers/1]
Server=localhost
Description=PostgreSQL 8.3
ServiceID=
DiscoveryID=/PostgreSQL/8.3
Port=5432
StorePwd=true
Restore=false
Database=postgres
Username=postgres
LastDatabase=postgres
LastSchema=public
DbRestriction=
Colour=#FF
SSL=0
Group=PPAS
Rolename=

[Servers/1/Databases]

[Servers/1/Databases/postgres]
SchemaRestriction=

[Servers/1/Databases/pphq]
SchemaRestriction=

[Servers/1/Databases/template_postgis]
SchemaRestriction=

[Servers/2]
...
...

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
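For what it's worth, the "hierarchy" in the pgAdmin-style layout above exists only in the section names: an ini parser sees flat, unrelated sections, and any nesting has to be rebuilt by splitting the names on the slashes. A minimal illustration using Python's stdlib configparser (the trimmed-down section content is hypothetical, not the full pgAdmin file):

```python
# Illustration only: the pgAdmin-style ini "hierarchy" lives entirely in the
# section names, so nesting must be rebuilt by splitting them on "/".
import configparser

INI = """\
[Servers]
Count=5

[Servers/1]
Server=localhost
Port=5432

[Servers/1/Databases/postgres]
SchemaRestriction=
"""

cp = configparser.ConfigParser()
cp.read_string(INI)

# configparser itself sees three flat, unrelated sections:
print(cp.sections())  # ['Servers', 'Servers/1', 'Servers/1/Databases/postgres']

# Rebuild the implied tree by hand.
tree = {}
for section in cp.sections():
    node = tree
    for part in section.split("/"):  # descend, creating levels as needed
        node = node.setdefault(part, {})
    node.update(cp[section])         # option names are lower-cased by default

print(tree["Servers"]["1"]["server"])  # localhost
```

This is exactly the point of contention in the thread: the format can *spell* a hierarchy, but the nesting is a naming convention the application has to implement itself.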
Re: [HACKERS] Configuring synchronous replication
On ons, 2010-09-22 at 12:20 +0100, Dave Page wrote: No, it's really not hierarchical. It only goes one level deep. I guess pgAdmin/wxWidgets are broken then :-) [Servers] Count=5 [Servers/1] Server=localhost Well, by that logic, even what we have now for postgresql.conf is hierarchical. I think the criterion was rather meant to be: can represent hierarchies without repeating intermediate node names. (Note: no opinion on which format is better for the task at hand)
Re: [HACKERS] Configuring synchronous replication
On 09/22/2010 07:20 AM, Dave Page wrote: On Wed, Sep 22, 2010 at 12:07 PM, Andrew Dunstan and...@dunslane.net wrote: On 09/22/2010 04:54 AM, Dave Page wrote: On Wed, Sep 22, 2010 at 9:47 AM, Andrew Dunstan and...@dunslane.net wrote: The ini file format is not flexible enough, IMNSHO. If we're going to adopt a new config file format it should have these characteristics, among others: well known (let's not invent a new one) supports hierarchical structure reasonably readable The ini format meets all of those requirements - and it's certainly far more readable/editable than XML and friends. No, it's really not hierarchical. It only goes one level deep. I guess pgAdmin/wxWidgets are broken then :-) [Servers] Count=5 [Servers/1] Server=localhost Description=PostgreSQL 8.3 ServiceID= DiscoveryID=/PostgreSQL/8.3 Port=5432 StorePwd=true Restore=false Database=postgres Username=postgres LastDatabase=postgres LastSchema=public DbRestriction= Colour=#FF SSL=0 Group=PPAS Rolename= [Servers/1/Databases] [Servers/1/Databases/postgres] SchemaRestriction= [Servers/1/Databases/pphq] SchemaRestriction= [Servers/1/Databases/template_postgis] SchemaRestriction= [Servers/2] ... ... Well, that's not what I'd call a hierarchy, in any sane sense. I've often had to dig all over the place in ini files to find related bits of information in disparate parts of the file. Compared to a meaningful tree structure this is utterly woeful. In a sensible hierarchical format, all the information relating to, say, Servers/1 above, would be under a stanza with that heading, instead of having separate and unnested stanzas like Servers/1/Databases/template_postgis. If you could nest stanzas in ini file format it would probably do, but you can't, leading to the above major ugliness. cheers andrew
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 12:50 PM, Peter Eisentraut pete...@gmx.net wrote: On ons, 2010-09-22 at 12:20 +0100, Dave Page wrote: No, it's really not hierarchical. It only goes one level deep. I guess pgAdmin/wxWidgets are broken then :-) [Servers] Count=5 [Servers/1] Server=localhost Well, by that logic, even what we have now for postgresql.conf is hierarchical. Well, yes - if you consider add-in GUCs which use prefixing like foo.setting=... I think the criterion was rather meant to be: can represent hierarchies without repeating intermediate node names If this were data, I could understand that as it could lead to tremendous bloat, but as a config file, I'd rather have the readability of the ini format, despite the repeated node names, than have to hack XML files by hand. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
On 09/22/2010 07:57 AM, Dave Page wrote: On Wed, Sep 22, 2010 at 12:50 PM, Peter Eisentraut pete...@gmx.net wrote: On ons, 2010-09-22 at 12:20 +0100, Dave Page wrote: No, it's really not hierarchical. It only goes one level deep. I guess pgAdmin/wxWidgets are broken then :-) [Servers] Count=5 [Servers/1] Server=localhost Well, by that logic, even what we have now for postgresql.conf is hierarchical. Well, yes - if you consider add-in GUCs which use prefixing like foo.setting=... I think the criterion was rather meant to be: can represent hierarchies without repeating intermediate node names If this were data, I could understand that as it could lead to tremendous bloat, but as a config file, I'd rather have the readability of the ini format, despite the repeated node names, than have to hack XML files by hand. XML is not the only alternative - please don't use it as a straw man. For example, here is a fragment from the Bacula docs using their hierarchical format:

FileSet {
  Name = Test
  Include {
    File = /home/xxx/test
    Options {
      regex = .*\.c$
    }
  }
}

Or here is a piece from the buildfarm client config (which is in fact perl, but could also be JSON or similar fairly easily):

mail_events = {
  all = [],
  fail = [],
  change = ['f...@bar.com', 'b...@blurfl.org' ],
  green = [],
},
build_env = {
  CCACHE_DIR = /home/andrew/pgfarmbuild/ccache/$branch,
},

cheers andrew
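As a concrete point of comparison, the buildfarm-style structure above maps almost one-to-one onto JSON, one of the formats the thread floats as an alternative. A small sketch using Python's stdlib json module (the addresses and path are placeholders, not the elided originals):

```python
# Sketch: the nested buildfarm-style config expressed as JSON.  The
# addresses and path below are placeholders, not the (elided) originals.
import json

CONFIG = """
{
  "mail_events": {
    "all": [],
    "fail": [],
    "change": ["user@example.com", "other@example.org"],
    "green": []
  },
  "build_env": {
    "CCACHE_DIR": "/home/user/pgfarmbuild/ccache"
  }
}
"""

cfg = json.loads(CONFIG)

# Related settings stay grouped under one stanza, unlike flat ini sections.
print(cfg["mail_events"]["change"])    # ['user@example.com', 'other@example.org']
print(cfg["build_env"]["CCACHE_DIR"])  # /home/user/pgfarmbuild/ccache
```

The trade-off the thread is circling: this nests cleanly, but whether it is comfortable to hand-edit at 3AM over ssh is exactly the disputed point.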
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 1:25 PM, Andrew Dunstan and...@dunslane.net wrote: XML is not the only alternative - please don't use it as a straw man. For example, here is a fragment from the Bacula docs using their hierarchical format: FileSet { Name = Test Include { File = /home/xxx/test Options { regex = .*\.c$ } } } Or here is a piece from the buildfarm client config (which is in fact perl, but could also be JSON or similar fairly easily): mail_events = { all = [], fail = [], change = ['f...@bar.com', 'b...@blurfl.org' ], green = [], }, build_env = { CCACHE_DIR = /home/andrew/pgfarmbuild/ccache/$branch, }, Both of which I've also used in the past, and also find uncomfortable and awkward for configuration files. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 09/22/2010 08:32 AM, Dave Page wrote: On Wed, Sep 22, 2010 at 1:25 PM, Andrew Dunstanand...@dunslane.net wrote: XML is not the only alternative - please don't use it as a straw man. For example, here is a fragment from the Bacula docs using their hierarchical format: FileSet { Name = Test Include { File = /home/xxx/test Options { regex = .*\.c$ } } } Or here is a piece from the buildfarm client config (which is in fact perl, but could also be JSON or similar fairly easily): mail_events = { all = [], fail = [], change = ['f...@bar.com', 'b...@blurfl.org' ], green = [], }, build_env = { CCACHE_DIR = /home/andrew/pgfarmbuild/ccache/$branch, }, Both of which I've also used in the past, and also find uncomfortable and awkward for configuration files. I can't imagine trying to configure Bacula using ini file format - the mind just boggles. Frankly, I'd rather stick with our current config format than change to something as inadequate as ini file format. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 9:01 AM, Andrew Dunstan and...@dunslane.net wrote: I can't imagine trying to configure Bacula using ini file format - the mind just boggles. Frankly, I'd rather stick with our current config format than change to something as inadequate as ini file format. Perhaps we need to define a little better what information we think we might eventually need to represent in the config file. With one exception, nobody has suggested anything that would actually require hierarchical structure. The exception is defining the policy for deciding when a commit has been sufficiently acknowledged by an adequate quorum of standbys, and it seems to me that doing that in its full generality is going to require not so much a hierarchical structure as a small programming language. The efforts so far have centered around reducing the use cases that $AUTHOR cares about to a set of GUCs which would satisfy that person's needs, but not necessarily everyone else's needs. I think efforts to encode arbitrary algorithms using configuration settings are doomed to failure, so I'm unimpressed by the argument that we should design the config file to support our attempts to do so. For everything else, no one has suggested that we need anything more complex than, essentially, a group of GUCs per server. So we could do: [server] guc=value or server.guc=value ...or something else. Designing this to support: server.hypothesis.experimental.unproven.imaginary.what-in-the-world-could-this-possibly-be = 42 ...seems pretty speculative at this point, unless someone can imagine what we'd want it for. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Tue, 2010-09-21 at 17:04 -0700, Josh Berkus wrote: That said, the timeout option also feels a bit wishy-washy to me. With a timeout, acknowledgment of a commit means your transaction is safely committed in the master and slave. Or not, if there was some glitch with the slave. That doesn't seem like a very useful guarantee; if you're happy with that why not just use async replication? Ah, I wasn't clear. My thought was that a standby which exceeds the timeout would be marked as nonresponsive and no longer included in the list of standbys which needed to be synchronized. That is, the timeout would be a timeout which says this standby is down. So the only case where standby registration is required is where you deliberately choose to *not* have N+1 redundancy and yet still require all N standbys to acknowledge. That is a suicidal config and nobody would sanely choose that. It's not a large or useful use case for standby reg. (But it does raise the question again of whether we need quorum commit). This is becoming very confusing. Some people advocating standby registration have claimed it allows capabilities which aren't possible any other way; all but one of those claims has so far been wrong - the remaining case is described above. If I'm the one that is wrong, please tell me where I erred. Thinking of this as a sysadmin, what I want is to have *one place* I can go and troubleshoot my standby setup. If I have 12 synch standbys and they're creating too much load on the master, and I want to change half of them to async, I don't want to have to ssh into 6 different machines to do so. If one standby needs to be taken out of the network because it's too slow, I want to be able to log in to the master and instantly identify which standby is lagging and remove it there. The above case is one where I can see your point and it does sound easier in that case. But I then think: What happens after failover? 
We would then need to have 12 different standby.conf files, one on each standby that describes what the setup would look like if that standby became the master. And guess what, every time we made a change on the master, you'd need to re-edit all 12 standby.conf files to reflect the new configuration. So we're still back to having to edit in multiple places, ISTM. Please, please, somebody write down what the design proposal is *before* we make a decision on whether it is a sensible way to proceed. It would be good to see a few options written down and some objective analysis of which way is best to let people decide. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Configuring synchronous replication
Robert Haas wrote: [server] guc=value or server.guc=value Yes, this was my idea too. It uses our existing config file format. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Configuring synchronous replication
On 22 September 2010 17:23, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: [server] guc=value or server.guc=value Yes, this was my idea too. It uses our existing config file format. So... sync_rep_services = {critical: recv=2, fsync=2, replay=1; important: fsync=3; reporting: recv=2, apply=1} becomes ...

sync_rep_services.critical.recv = 2
sync_rep_services.critical.fsync = 2
sync_rep_services.critical.replay = 1
sync_rep_services.important.fsync = 3
sync_rep_services.reporting.recv = 2
sync_rep_services.reporting.apply = 1

I actually started to give this example to demonstrate how cumbersome it would look... but now that I've just typed it out, I've changed my mind. I actually like it! -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935
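The dotted form also round-trips mechanically back to the nested structure it encodes. A hypothetical sketch in plain Python (not PostgreSQL's GUC machinery), using the values from the braced original:

```python
# Hypothetical sketch: folding dotted GUC-style names back into the nested
# structure they encode.  Plain Python, not PostgreSQL's GUC machinery.
flat = {
    "sync_rep_services.critical.recv": 2,
    "sync_rep_services.critical.fsync": 2,
    "sync_rep_services.critical.replay": 1,
    "sync_rep_services.important.fsync": 3,
    "sync_rep_services.reporting.recv": 2,
    "sync_rep_services.reporting.apply": 1,
}

nested = {}
for dotted, value in flat.items():
    *path, leaf = dotted.split(".")
    node = nested
    for part in path:                 # walk/create intermediate levels
        node = node.setdefault(part, {})
    node[leaf] = value

print(nested["sync_rep_services"]["critical"])
# {'recv': 2, 'fsync': 2, 'replay': 1}
```

That reversibility is why the flat dotted spelling and the braced spelling carry the same information; the argument is only over which one humans should edit.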
Re: [HACKERS] Configuring synchronous replication
Thom Brown wrote: On 22 September 2010 17:23, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: [server] guc=value or server.guc=value Yes, this was my idea too. It uses our existing config file format. So... sync_rep_services = {critical: recv=2, fsync=2, replay=1; important: fsync=3; reporting: recv=2, apply=1} becomes ... sync_rep_services.critical.recv = 2 sync_rep_services.critical.fsync = 2 sync_rep_services.critical.replay = 1 sync_rep_services.important.fsync = 3 sync_rep_services.reporting.recv = 2 sync_rep_services.reporting.apply = 1 I actually started to give this example to demonstrate how cumbersome it would look... but now that I've just typed it out, I've changed my mind. I actually like it! It can be prone to mistyping, but it seems simple enough. We already throw a nice error for mistypes in the server logs. :-) I don't think we support 3rd-level specifications, but we could. Looks very Java-ish. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Configuring synchronous replication
On Wed, 2010-09-22 at 17:43 +0100, Thom Brown wrote: So... sync_rep_services = {critical: recv=2, fsync=2, replay=1; important: fsync=3; reporting: recv=2, apply=1} becomes ... sync_rep_services.critical.recv = 2 sync_rep_services.critical.fsync = 2 sync_rep_services.critical.replay = 1 sync_rep_services.important.fsync = 3 sync_rep_services.reporting.recv = 2 sync_rep_services.reporting.apply = 1 I actually started to give this example to demonstrate how cumbersome it would look... but now that I've just typed it out, I've changed my mind. I actually like it! With respect, this is ugly. Very ugly. Why do we insist on cryptic parameters within a config file, when this should be set within the database by a superuser? I mean really?

ALTER CLUSTER ENABLE [SYNC] REPLICATION ON db.foobar.com PORT 5432 ALIAS CRITICAL;
ALTER CLUSTER SET REPLICATION CRITICAL RECEIVE FOR 2;
ALTER CLUSTER SET REPLICATION CRITICAL FSYNC FOR 2;
ALTER CLUSTER SET REPLICATION CRITICAL REPLAY FOR 2;

Or some such thing. I saw Heikki's reply but really the idea that we are shoving this all into the postgresql.conf is cumbersome. Sincerely, Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 12:51 PM, Joshua D. Drake j...@commandprompt.com wrote: On Wed, 2010-09-22 at 17:43 +0100, Thom Brown wrote: So... sync_rep_services = {critical: recv=2, fsync=2, replay=1; important: fsync=3; reporting: recv=2, apply=1} becomes ... sync_rep_services.critical.recv = 2 sync_rep_services.critical.fsync = 2 sync_rep_services.critical.replay = 1 sync_rep_services.important.fsync = 3 sync_rep_services.reporting.recv = 2 sync_rep_services.reporting.apply = 1 I actually started to give this example to demonstrate how cumbersome it would look... but now that I've just typed it out, I've changed my mind. I actually like it! With respect, this is ugly. Very ugly. Why do we insist on cryptic parameters within a config file which should be set within the database by a superuser? I mean really? ALTER CLUSTER ENABLE [SYNC] REPLICATION ON db.foobar.com PORT 5432 ALIAS CRITICAL; ALTER CLUSTER SET REPLICATION CRITICAL RECEIVE FOR 2; ALTER CLUSTER SET REPLICATION CRITICAL FSYNC FOR 2; ALTER CLUSTER SET REPLICATION CRITICAL REPLAY FOR 2; Or some such thing. I saw Heikki's reply but really the idea that we are shoving this all into the postgresql.conf is cumbersome. I think it should be a separate config file, and I think it should be a config file that can be edited using DDL commands as you propose. But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
On 22/09/10 20:00, Robert Haas wrote: But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. FWIW it could be a system catalog backed by a flat file. But I'm not in favor of that for the other reasons I stated earlier. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 8:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Not speaking to the necessity of standby registration, but... Thinking of this as a sysadmin, what I want is to have *one place* I can go and troubleshoot my standby setup. If I have 12 synch standbys and they're creating too much load on the master, and I want to change half of them to async, I don't want to have to ssh into 6 different machines to do so. If one standby needs to be taken out of the network because it's too slow, I want to be able to log in to the master and instantly identify which standby is lagging and remove it there. The above case is one where I can see your point and it does sound easier in that case. But I then think: What happens after failover? We would then need to have 12 different standby.conf files, one on each standby that describes what the setup would look like if that standby became the master. And guess what, every time we made a change on the master, you'd need to re-edit all 12 standby.conf files to reflect the new configuration. So we're still back to having to edit in multiple places, ISTM. An interesting option here might be to have replication.conf (instead of standby.conf) which would list all servers, plus a postgresql.conf setting giving the local server's name; each server would then ignore its own entry. Then all PG servers (master+slave) would be able to have identical replication.conf files, only having to know their own name. Their own name could be a GUC, from postgresql.conf, or from command line options, or default to hostname, whatever.
Re: [HACKERS] Configuring synchronous replication
On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: I mean really? ALTER CLUSTER ENABLE [SYNC] REPLICATION ON db.foobar.com PORT 5432 ALIAS CRITICAL; ALTER CLUSTER SET REPLICATION CRITICAL RECEIVE FOR 2; ALTER CLUSTER SET REPLICATION CRITICAL FSYNC FOR 2; ALTER CLUSTER SET REPLICATION CRITICAL REPLAY FOR 2; Or some such thing. I saw Heiiki's reply but really the idea that we are shoving this all into the postgresql.conf is cumbersome. I think it should be a separate config file, and I think it should be a config file that can be edited using DDL commands as you propose. But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. I guarantee you there is a way around the cascade slave problem. I believe there will be some postgresql.conf pollution. I don't see any other way around that but the conf should be limited to things that literally have to be expressed in a conf for specific static purposes. I was talking with Bruce on Jabber and one of his concerns with my approach is polluting the SQL space for non-admins. I certainly appreciate that my solution puts code in more places and that it may be more of a burden for the hackers. However, we aren't building this for hackers. Most hackers don't even use the product. We are building it for our community, which are by far user space developers and dbas. Sincerely, Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Heikki Linnakangas wrote: On 22/09/10 20:00, Robert Haas wrote: But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. FWIW it could be a system catalog backed by a flat file. But I'm not in favor of that for the other reasons I stated earlier. I thought we just eliminated flat file backing store for tables to improve replication behavior --- I don't see returning to that as a win. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 22/09/10 20:02, Heikki Linnakangas wrote: On 22/09/10 20:00, Robert Haas wrote: But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. FWIW it could be a system catalog backed by a flat file. But I'm not in favor of that for the other reasons I stated earlier. Huh, I just realized that my reply didn't make any sense. For some reason I thought you were saying that it can't be a catalog because backends need to access it without attaching to a database. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Wed, Sep 22, 2010 at 1:09 PM, Joshua D. Drake j...@commandprompt.com wrote: On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: I mean really? ALTER CLUSTER ENABLE [SYNC] REPLICATION ON db.foobar.com PORT 5432 ALIAS CRITICAL; ALTER CLUSTER SET REPLICATION CRITICAL RECEIVE FOR 2; ALTER CLUSTER SET REPLICATION CRITICAL FSYNC FOR 2; ALTER CLUSTER SET REPLICATION CRITICAL REPLAY FOR 2; Or some such thing. I saw Heikki's reply but really the idea that we are shoving this all into the postgresql.conf is cumbersome. I think it should be a separate config file, and I think it should be a config file that can be edited using DDL commands as you propose. But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. I guarantee you there is a way around the cascade slave problem. And that would be...? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Robert Haas robertmh...@gmail.com writes: On Wed, Sep 22, 2010 at 1:09 PM, Joshua D. Drake j...@commandprompt.com wrote: On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. I guarantee you there is a way around the cascade slave problem. And that would be...? Indeed. If it's a catalog then it has to be exactly the same on the master and every slave; which is probably a constraint we don't want for numerous reasons, not only cascade arrangements. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Wed, 2010-09-22 at 13:26 -0400, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Sep 22, 2010 at 1:09 PM, Joshua D. Drake j...@commandprompt.com wrote: On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. I guarantee you there is a way around the cascade slave problem. And that would be...? Indeed. If it's a catalog then it has to be exactly the same on the master and every slave; which is probably a constraint we don't want for numerous reasons, not only cascade arrangements. Unless I am missing something the catalog only needs information for its specific cluster. E.g; My Master is, I am master for. Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Joshua D. Drake j...@commandprompt.com writes: Unless I am missing something the catalog only needs information for its specific cluster. E.g; My Master is, I am master for. I think the cluster here is composed of all and any server partaking into the replication network, whatever its role and cascading level, because we only support one master. As soon as the setup is replicated too, you can edit the setup from the one true master and from nowhere else, so the single authority must contain the whole setup. Now that doesn't mean all lines in the setup couldn't refer to a provider which could be different from the master in the case of cascading. What I don't understand is why the replication network topology can't get serialized into a catalog? Then again, assuming that a catalog ain't possible, I guess any file based setup will mean manual syncing of the whole setup at all the servers participating in the replication? If that's the case, I'll say it again, it looks like a nightmare to admin and I'd much prefer having a distributed setup, where any standby's setup is simple and directed to a single remote node, its provider. Please note also that such an arrangement doesn't preclude from having a way to register the standbys (automatically please) and requiring some action to enable the replication from their provider, and possibly from the master. But as there's already the hba to setup, I'd think paranoid sites are covered already. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Sep 22, 2010 at 1:09 PM, Joshua D. Drake j...@commandprompt.com wrote: On Wed, 2010-09-22 at 13:00 -0400, Robert Haas wrote: But it CAN'T be a system catalog, because, among other problems, that rules out cascading slaves, which are a feature a lot of people probably want to eventually have. I guarantee you there is a way around the cascade slave problem. And that would be...? Indeed. If it's a catalog then it has to be exactly the same on the master and every slave; which is probably a constraint we don't want for numerous reasons, not only cascade arrangements. It might be an idea to store the replication information outside of all clusters involved in the replication, to not depend on any failure of the master or any of the slaves. We've been using Apache's zookeeper http://hadoop.apache.org/zookeeper/ to keep track of configuration-like knowledge that must be distributed over a number of servers. While Zookeeper itself is probably not fit (java) to use in core Postgres to keep track of configuration information, what it provides seems like the perfect solution, especially group membership and a replicated directory-like database (with per directory node a value). regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
The above case is one where I can see your point and it does sound easier in that case. But I then think: What happens after failover?. We would then need to have 12 different standby.conf files, one on each standby that describes what the setup would look like if that standby became the master. And guess what, every time we made a change on the master, you'd need to re-edit all 12 standby.conf files to reflect the new configuration. So we're still back to having to edit in multiple places, ISTM. Unless we can make the standby.conf files identical on all servers in the group. If we can do that, then conf file management utilities, fileshares, or a simple automated rsync could easily take care of things. But ... any setup which involves each standby being *required* to have a different configuration on each standby server, which has to be edited separately, is going to be fatally difficult to manage for anyone who has more than a couple of standbys. So I'd like to look at what it takes to get away from that. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 22:42 +0100, Thom Brown wrote: On 20 September 2010 22:14, Robert Haas robertmh...@gmail.com wrote: Well, if you need to talk to all the other standbys and see who has the furthest-advanced xlog pointer, it seems like you have to have a list somewhere of who they all are. When they connect to the master to get the stream, don't they, in effect, already talk to the primary with the XLogRecPtr being relayed? Can the connection IP, port, XLogRecPtr and request time of the standby be stored from this communication to track the states of each standby? They would in effect be registering upon WAL stream request... and no doubt this is a horrifically naive view of how it works. It's not viable to record information at the chunk level in that way. But the overall idea is fine. We can track who was connected and how to access their LSNs. They don't need to be registered ahead of time on the master to do that. They can register and deregister each time they connect. This discussion is reminiscent of the discussion we had when Fujii first suggested that the standby should connect to the master. At first I thought "don't be stupid, the master needs to connect to the standby!". It stood everything I had thought about on its head and that hurt, but there was no logical reason to oppose. We could have used standby registration on the master to handle that, but we didn't. I'm happy that we have a more flexible system as a result. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Sat, Sep 18, 2010 at 4:36 AM, Dimitri Fontaine dfonta...@hi-media.com wrote: Simon Riggs si...@2ndquadrant.com writes: On Fri, 2010-09-17 at 21:20 +0900, Fujii Masao wrote: What synchronization level does each combination of sync_replication and sync_replication_service lead to? There are only 4 possible outcomes. There is no combination, so we don't need a table like that above. The service specifies the highest request type available from that specific standby. If someone requests a higher service than is currently offered by this standby, they will either a) get that service from another standby that does offer that level b) automatically downgrade the sync rep mode to the highest available. I like the a) part, I can't say the same about the b) part. There's no reason to accept to COMMIT a transaction when the requested durability is known not to have been reached, unless the user said so. Yep, I can imagine that some people want to ensure that *all* the transactions are synchronously replicated to the synchronous standby, without regard to sync_replication. So I'm not sure that automatic downgrade/upgrade of the mode makes sense. Should we introduce a new parameter specifying whether to allow automatic downgrade/upgrade or not? It seems complicated, though. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Robert Haas robertmh...@gmail.com writes: So, here, we have two quite different things to be concerned about. First is the configuration, and I say that managing a distributed setup will be easier for the DBA. Yeah, I disagree with that, but I suppose it's a question of opinion. I'd be willing to share your thoughts if it was only for the initial setup. This one is hard enough to sketch on the paper that you prefer an easy way to implement it afterwards, and in some cases a central setup would be just that. The problem is that I'm concerned with upgrading the setup once the system is live. Not at the best time for that in the project, either, but when you finally get the budget to expand the number of servers. From experience with skytools, no manual registering works best. But… I think that without standby registration it will be tricky to display information like the last time that standby foo was connected. Yeah, you could set a standby name on the standby server and just have the master remember details for every standby name it's ever seen, but then how do you prune the list? … I now realize there are 2 parts under the registration bit. What I don't see helping is manual registration. For some use cases you're talking about maintaining a list of known servers sounds important, and that's also what londiste is doing. Pruning the list would be done with some admin function. You need one to see the current state already, add some other one to unregister a known standby. In londiste, that's how it works, and events are kept in the queues for all known subscribers. For the ones that won't ever connect again, that's of course a problem, so you SELECT pgq.unregister_consumer(…);. Heikki mentioned another application for having a list of the current standbys only (rather than every standby that has ever existed) upthread: you can compute the exact amount of WAL you need to keep around. 
Well, either way, the system cannot decide on its own whether a currently unavailable standby is going to join the party again later on. Now it seems to me that all you need here is the master sending one more piece of information with each WAL segment, the currently fsync'ed position, which pre-9.1 is implied as being the current LSN from the stream, right? I don't see how that would help you. I think you want to refrain from applying any WAL segment you receive at the standby and instead only advance as far as the master is known to have reached. And you want this information to be safe against slave restart, too: don't replay any WAL you have in pg_xlog or in the archive. The other part of your proposal is another story (having slaves talk to each other at master crash). Well, if you need to talk to all the other standbys and see who has the furthest-advanced xlog pointer, it seems like you have to have a list somewhere of who they all are. Ah sorry, I was thinking of the other part of the proposal only (sending WAL segments that have not been fsync'ed yet on the master). So, yes. But I thought you were saying that replicating a (shared?) catalog of standbys is technically hard (or impossible), so how would you go about it? As it's all about making things simpler for the users, you're not saying that they should keep the main setup in sync manually on all the standby servers, right? Maybe there's some way to get this to work without standby registration, but I don't really understand the resistance to the idea. In fact I'm now realising what I don't like is having to do the registration work manually: as I already have to set up the slaves, it only appears like a useless burden on me, giving information the system already has. Automatic registration I'm fine with, I now realize. 
Regards, -- Dimitri Fontaine PostgreSQL DBA, Architecte -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
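Heikki's point referenced in the exchange above — that a list of the *current* standbys (rather than every standby that has ever existed) lets the master compute exactly how much WAL it needs to keep around — can be illustrated with a toy sketch. The helper name and the representation of LSNs as plain integers are assumptions made for this example only:

```python
# Illustrative only: with a registry of currently-connected standbys and
# each one's last-received WAL position, the master must retain WAL from
# the furthest-behind standby onward; everything older can be recycled.

def oldest_needed_lsn(standby_positions):
    """Smallest received-LSN among registered standbys. WAL older than
    this is no longer needed by anyone. With no standbys registered,
    return None, meaning no WAL must be kept for replication."""
    if not standby_positions:
        return None
    return min(standby_positions.values())

# LSNs modeled as plain byte offsets for the sake of the sketch.
standbys = {"standby1": 0x5A000000, "standby2": 0x59C00000}
print(hex(oldest_needed_lsn(standbys)))
```

Without such a list, the master can only guess (e.g. via a fixed wal_keep_segments-style cushion) how much WAL a disconnected standby might still ask for.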
Re: [HACKERS] Configuring synchronous replication
On Sun, Sep 19, 2010 at 7:20 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Sep 18, 2010 at 5:42 PM, Josh Berkus j...@agliodbs.com wrote: There are considerable benefits to having a standby registry with a table-like interface. Particularly, one where we could change replication via UPDATE (or ALTER STANDBY) statements. I think that using a system catalog for this is going to be a non-starter, but we could use a flat file that is designed to be machine-editable (and thus avoid repeating the mistake we've made with postgresql.conf). Yep, the standby registration information should be accessible and changeable while the server is not running. So using only a system catalog is not an answer. My patch has implemented standbys.conf, which was proposed before. This format is almost the same as pg_hba.conf. Is this machine-editable, do you think? If not, should we change the format to something like XML? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 21 September 2010 09:29, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Sep 19, 2010 at 7:20 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Sep 18, 2010 at 5:42 PM, Josh Berkus j...@agliodbs.com wrote: There are considerable benefits to having a standby registry with a table-like interface. Particularly, one where we could change replication via UPDATE (or ALTER STANDBY) statements. I think that using a system catalog for this is going to be a non-starter, but we could use a flat file that is designed to be machine-editable (and thus avoid repeating the mistake we've made with postgresql.conf). Yep, the standby registration information should be accessible and changeable while the server is not running. So using only a system catalog is not an answer. My patch has implemented standbys.conf, which was proposed before. This format is almost the same as pg_hba.conf. Is this machine-editable, do you think? If not, should we change the format to something like XML? I really don't think an XML config would improve anything. In fact it would just introduce more ways to break the config by the mere fact that it has to be well-formed. I'd be in favour of one similar to pg_hba.conf, because then, at least, we'd still only have 2 formats of configuration. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Tue, Sep 21, 2010 at 9:34 AM, Thom Brown t...@linux.com wrote: I really don't think an XML config would improve anything. In fact it would just introduce more ways to break the config by the mere fact it has to be well-formed. I'd be in favour of one similar to pg_hba.conf, because then, at least, we'd still only have 2 formats of configuration. Want to spend a few days hacking on a config editor for pgAdmin, and then re-evaluate that comment? :-) -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 21 September 2010 09:37, Dave Page dp...@pgadmin.org wrote: On Tue, Sep 21, 2010 at 9:34 AM, Thom Brown t...@linux.com wrote: I really don't think an XML config would improve anything. In fact it would just introduce more ways to break the config by the mere fact it has to be well-formed. I'd be in favour of one similar to pg_hba.conf, because then, at least, we'd still only have 2 formats of configuration. Want to spend a few days hacking on a config editor for pgAdmin, and then re-evaluate that comment? It would be quicker to add in support for a config format we don't use yet than to duplicate support for a new config in the same format as an existing one? Plus it's a compromise between user-screw-up-ability and machine-readability. My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 3:27 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: However, the wait forever behavior becomes useful if you have a monitoring application outside the DB that decides when enough is enough and tells the DB that the slave can be considered dead. So wait forever actually means wait until I tell you that you can give up. The monitoring application can STONITH to ensure that the slave stays down, before letting the master proceed with the commit. This is also useful for preventing a failover from causing some data loss by promoting the lagged standby to the master. To avoid any data loss, we must STONITH the standby before any transactions resume on the master, when replication connection is terminated or the crash of the standby happens. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 21/09/10 11:52, Thom Brown wrote: My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. I'm not a big fan of XML either. That said, the format could use some hierarchy. If we add many more per-server options, one server per line will quickly become unreadable. Perhaps something like the ini-file syntax Robert Haas just made up elsewhere in this thread:

---
globaloption1 = value

[servername1]
synchronization_level = async
option1 = value

[servername2]
synchronization_level = replay
option2 = value1
---

I'm not sure I like the ini-file style much, but the two-level structure it provides seems like a perfect match. Then again, maybe we should go with something like json or yaml that would allow deeper hierarchies for the sake of future expandability. Oh, and there is Dimitri's idea of service levels for per-transaction control (http://archives.postgresql.org/message-id/m2sk1868hb@hi-media.com):

sync_rep_services = {critical: recv=2, fsync=2, replay=1; important: fsync=3; reporting: recv=2, apply=1}

We'll need to accommodate something like that too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
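As a rough illustration of the "it's something people are used to" argument, the ini-style standby.conf sketched above can be read with a stock ini parser. This is a sketch only: the section and option names come from the example in the message, and Python's configparser rejects options placed before the first section header, so the global option from the example is omitted here:

```python
# Hedged sketch: reading an ini-style standby.conf with Python's stdlib
# configparser. One section per standby; option names are illustrative.
import configparser

SAMPLE = """
[servername1]
synchronization_level = async
option1 = value

[servername2]
synchronization_level = replay
option2 = value1
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Each standby becomes a section; unspecified options can fall back to
# a default, which matches the "many optional attributes" concern above.
for standby in parser.sections():
    level = parser.get(standby, "synchronization_level", fallback="async")
    print(standby, level)
```

The two-level structure (sections for standbys, key/value pairs for their attributes) is exactly what a flat pg_hba-like columnar format struggles to express once many options are optional.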
Re: [HACKERS] Configuring synchronous replication
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21/09/10 11:52, Thom Brown wrote: My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. I'm not a big fan of XML either. ... Then again, maybe we should go with something like json or yaml The fundamental problem with all those machine editable formats is that they aren't people editable. If you have to have a tool (other than a text editor) to change a config file, you're going to be very unhappy when things are broken at 3AM and you're trying to fix it while ssh'd in from your phone. I think the ini file format suggestion is probably a good one; it seems to fit this problem, and it's something that people are used to. We could probably shoehorn the info into a pg_hba-like format, but I'm concerned about whether we'd be pushing that format beyond what it can reasonably handle. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Tue, Sep 21, 2010 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21/09/10 11:52, Thom Brown wrote: My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. I'm not a big fan of XML either. ... Then again, maybe we should go with something like json or yaml The fundamental problem with all those machine editable formats is that they aren't people editable. If you have to have a tool (other than a text editor) to change a config file, you're going to be very unhappy when things are broken at 3AM and you're trying to fix it while ssh'd in from your phone. Agreed. Although, if things are broken at 3AM and I'm trying to fix it while ssh'd in from my phone, I reserve the right to be VERY unhappy no matter what format the file is in. :-) I think the ini file format suggestion is probably a good one; it seems to fit this problem, and it's something that people are used to. We could probably shoehorn the info into a pg_hba-like format, but I'm concerned about whether we'd be pushing that format beyond what it can reasonably handle. It's not clear how many attributes we'll want to associate with a server. Simon seems to think we can keep it to zero; I think it's positive but I can't say for sure how many there will eventually be. It may also be that a lot of the values will be optional things that are frequently left unspecified. Both of those make me think that a columnar format is probably not best. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Tue, 2010-09-21 at 16:58 +0900, Fujii Masao wrote: On Sat, Sep 18, 2010 at 4:36 AM, Dimitri Fontaine dfonta...@hi-media.com wrote: Simon Riggs si...@2ndquadrant.com writes: On Fri, 2010-09-17 at 21:20 +0900, Fujii Masao wrote: What synchronization level does each combination of sync_replication and sync_replication_service lead to? There are only 4 possible outcomes. There is no combination, so we don't need a table like that above. The service specifies the highest request type available from that specific standby. If someone requests a higher service than is currently offered by this standby, they will either a) get that service from another standby that does offer that level b) automatically downgrade the sync rep mode to the highest available. I like the a) part, I can't say the same about the b) part. There's no reason to accept to COMMIT a transaction when the requested durability is known not to have been reached, unless the user said so. Hmm, no reason? The reason is that the alternative is that the session would hang until a standby arrived that offered that level of service. Why would you want that behaviour? Would you really request that option? Yep, I can imagine that some people want to ensure that *all* the transactions are synchronously replicated to the synchronous standby, without regard to sync_replication. So I'm not sure if automatic downgrade/upgrade of the mode makes sense. We should introduce new parameter specifying whether to allow automatic degrade/upgrade or not? It seems complicated though. I agree, but I'm not against any additional parameter if people say they really want them *after* the consequences of those choices have been highlighted. IMHO we should focus on the parameters that deliver key use cases. 
-- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Robert Haas wrote: On Tue, Sep 21, 2010 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 21/09/10 11:52, Thom Brown wrote: My fear would be standby.conf would be edited by users who don't really know XML and then we'd have 3 different styles of config to tell the user to edit. I'm not a big fan of XML either. ... Then again, maybe we should go with something like json or yaml The fundamental problem with all those machine editable formats is that they aren't people editable. If you have to have a tool (other than a text editor) to change a config file, you're going to be very unhappy when things are broken at 3AM and you're trying to fix it while ssh'd in from your phone. Agreed. Although, if things are broken at 3AM and I'm trying to fix it while ssh'd in from my phone, I reserve the right to be VERY unhappy no matter what format the file is in. :-) I think the ini file format suggestion is probably a good one; it seems to fit this problem, and it's something that people are used to. We could probably shoehorn the info into a pg_hba-like format, but I'm concerned about whether we'd be pushing that format beyond what it can reasonably handle. It's not clear how many attributes we'll want to associate with a server. Simon seems to think we can keep it to zero; I think it's positive but I can't say for sure how many there will eventually be. It may also be that a lot of the values will be optional things that are frequently left unspecified. Both of those make me think that a columnar format is probably not best. Crazy idea, but could we use a format like postgresql.conf by extending postgresql.conf syntax, e.g.:

server1.failover = false
server1.keep_connect = true

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. 
+ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
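Bruce's dotted postgresql.conf-style suggestion above could be parsed by grouping `server1.failover = false`-style lines by their server prefix. The parser below and the option names are illustrative only, not an actual PostgreSQL facility:

```python
# Hypothetical sketch: parse "server.option = value" lines from a
# postgresql.conf-style file into one settings dict per server.

def parse_dotted(conf_text):
    servers = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        server, _, option = name.partition(".")
        servers.setdefault(server, {})[option] = value
    return servers

conf = """
server1.failover = false
server1.keep_connect = true  # per-server option, dotted prefix
server2.failover = true
"""
print(parse_dotted(conf))
```

The appeal of this scheme is that it reuses the existing `name = value` GUC syntax, so no third configuration format is introduced; the cost is that hierarchy is only simulated by naming convention.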
Re: [HACKERS] Configuring synchronous replication
That said, the timeout option also feels a bit wishy-washy to me. With a timeout, acknowledgment of a commit means your transaction is safely committed in the master and slave. Or not, if there was some glitch with the slave. That doesn't seem like a very useful guarantee; if you're happy with that, why not just use async replication? Ah, I wasn't clear. My thought was that a standby which exceeds the timeout would be marked as nonresponsive and no longer included in the list of standbys which needed to be synchronized. That is, the timeout would be a timeout which says "this standby is down". So the only case where standby registration is required is where you deliberately choose to *not* have N+1 redundancy and yet still require all N standbys to acknowledge. That is a suicidal config and nobody would sanely choose that. It's not a large or useful use case for standby reg. (But it does raise the question again of whether we need quorum commit). Thinking of this as a sysadmin, what I want is to have *one place* I can go and troubleshoot my standby setup. If I have 12 synch standbys and they're creating too much load on the master, and I want to change half of them to async, I don't want to have to ssh into 6 different machines to do so. If one standby needs to be taken out of the network because it's too slow, I want to be able to log in to the master and instantly identify which standby is lagging and remove it there. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Crazy idea, but could we use a format like postgresql.conf by extending the postgresql.conf syntax, e.g.: server1.failover = false server1.keep_connect = true

Why is this in the config file at all? It should be: synchronous_replication = TRUE/FALSE then ALTER CLUSTER ENABLE REPLICATION FOR FOO; ALTER CLUSTER SET keep_connect ON FOO TO TRUE; Or some such thing.

Sincerely, Joshua D. Drake -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 19/09/10 01:20, Robert Haas wrote: On Sat, Sep 18, 2010 at 5:42 PM, Josh Berkus j...@agliodbs.com wrote: There are considerable benefits to having a standby registry with a table-like interface. Particularly, one where we could change replication via UPDATE (or ALTER STANDBY) statements. I think that using a system catalog for this is going to be a non-starter, but we could use a flat file that is designed to be machine-editable (and thus avoid repeating the mistake we've made with postgresql.conf).

Yeah, that needs some careful design. We also need to record transient information about each slave, like how far it has received WAL already. Ideally that information would survive database restart too, but maybe we can live without that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Hi, On 09/17/2010 01:56 PM, Fujii Masao wrote: And standby registration is required when we support wait forever when synchronous standby isn't connected at the moment option that Heikki explained upthread.

That requirement can be reduced to say that the master only needs to know how many synchronous standbys *should* be connected. IIUC that's pretty much exactly the quorum_commit GUC that Simon proposed, because it doesn't make sense to have more synchronous standbys connected than quorum_commit (as Simon pointed out downthread). I'm unsure about what's better, the full list (giving a good overview, but more to configure) or the single sum GUC (being very flexible and closer to how things work internally). But that seems to be a UI question exclusively.

Regarding the wait forever option: I don't think continuing is a viable alternative, as it silently ignores the requested level of persistence. The only alternative I can see is to abort with an error. As far as comparison is allowed, that's what Postgres-R currently does if there's no majority of nodes. It allows emitting an error message and helpful hints, as opposed to letting the admin figure out what and where it's hanging. Not throwing false errors has the same requirements as waiting forever, so that's an orthogonal issue, IMO.

Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
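For concreteness, the "single sum GUC" alternative described above might look something like this on the master (quorum_commit is Simon's proposed-but-unimplemented GUC; the companion switch and the values are assumptions for illustration only):

```ini
# Hypothetical master settings under the quorum proposal -- not implemented syntax.
synchronous_replication = on   # assumed on/off switch for sync rep
quorum_commit = 2              # commit waits for ACKs from any 2 standbys;
                               # with fewer than 2 connected, commits wait
                               # (or abort with an error, per the alternative above)
```

The point of the sum form is that the master never needs a roster of standby names, only a count it must hear back from.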
Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 09:27 +0300, Heikki Linnakangas wrote: On 18/09/10 22:59, Robert Haas wrote: On Sat, Sep 18, 2010 at 4:50 AM, Simon Riggs si...@2ndquadrant.com wrote: Waiting might sound attractive. In practice, waiting will make all of your connections lock up and it will look to users as if their master has stopped working as well. (It has!). I can't imagine why anyone would ever want an option to select that; it's the opposite of high availability. Just sounds like a serious footgun.

Nevertheless, it seems that some people do want exactly that behavior, no matter how crazy it may seem to you.

Yeah, I agree with both of you. I have a hard time imagining a situation where you would actually want that. It's not high availability, it's high durability. When a transaction is acknowledged as committed, you know it's never ever going to disappear even if a meteor strikes the current master server within the next 10 milliseconds. In practice, people want high availability instead. That said, the timeout option also feels a bit wishy-washy to me. With a timeout, acknowledgment of a commit means your transaction is safely committed in the master and slave. Or not, if there was some glitch with the slave. That doesn't seem like a very useful guarantee; if you're happy with that why not just use async replication? However, the wait forever behavior becomes useful if you have a monitoring application outside the DB that decides when enough is enough and tells the DB that the slave can be considered dead. So wait forever actually means wait until I tell you that you can give up. The monitoring application can STONITH to ensure that the slave stays down, before letting the master proceed with the commit.

err... what is the difference between a timeout and stonith? None. We still proceed without the slave in both cases after the decision point.
In all cases, we would clearly have a user-accessible function to stop particular sessions, or all sessions, from waiting for standby to return. You would have 3 choices:
* set automatic timeout
* set wait forever and then wait for manual resolution
* set wait forever and then trust to external clusterware

Many people have asked for timeouts and I agree it's probably the easiest thing to do if you just have 1 standby. With that in mind, we have to make sure that a transaction that's waiting for acknowledgment of the commit from a slave is woken up if the configuration changes.

There's a misunderstanding here of what I've said and it's a subtle one. My patch supports a timeout of 0, i.e. wait forever. Which means I agree that functionality is desired and should be included. This operates by saying that if a currently-connected standby goes down we will wait until the timeout. So I agree all 3 choices should be available to users.

Discussion has been about what happens to ought-to-have-been-connected standbys. Heikki had argued we need standby registration because if a server *ought* to have been there, yet isn't currently there when we wait for sync rep, we would still wait forever for it to return. To do this you require standby registration. But there is a hidden issue there: If you care about high availability AND sync rep you have two standbys. If one goes down, the other is still there. In general, if you want high availability on N servers then you have N+1 standbys. If one goes down, the other standbys provide the required level of durability and we do not wait. So the only case where standby registration is required is where you deliberately choose to *not* have N+1 redundancy and yet still require all N standbys to acknowledge. That is a suicidal config and nobody would sanely choose that. It's not a large or useful use case for standby reg. (But it does raise the question again of whether we need quorum commit).
My take is that if the above use case occurs it is because one standby has just gone down and the system is, for a hopefully short period, in a degraded state that the service responds to. So in my proposal, if a standby is not there *now* we don't wait for it. Which cuts out a huge bag of code, specification and suchlike that isn't required to support sane use cases. More stuff to get wrong and regret in later releases. The KISS principle, just like we apply in all other cases.

If we did have standby registration, then I would implement it in a table, not in an external config file. That way when we performed a failover the data would be accessible on the new master. But I don't suggest we have CREATE/ALTER STANDBY syntax. We already have CREATE/ALTER SERVER if we wanted to do it in SQL. If we did that, ISTM we should choose functions. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
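Under the timeout-based proposal Simon describes, his three user-facing choices collapse into a single setting; a sketch (the 0-means-wait-forever convention follows his description of the patch, but the parameter name and exact syntax are assumptions):

```ini
# Sketch of the timeout-based proposal -- the GUC name is assumed, not real.
replication_timeout = 30s  # choice 1: automatic timeout after 30 seconds

# replication_timeout = 0  # choices 2 and 3: wait forever; resolution comes
                           # from manual intervention (the user-accessible
                           # wake-up function) or from external clusterware
```

Note this only governs currently-connected standbys; a standby that is not there now is simply not waited for, which is exactly the point under debate.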
Re: [HACKERS] Configuring synchronous replication
On 20/09/10 12:17, Simon Riggs wrote: err... what is the difference between a timeout and stonith? STONITH (Shoot The Other Node In The Head) means that the other node is somehow disabled so that it won't unexpectedly come back alive. A timeout means that the slave hasn't been seen for a while, but it might reconnect just after the timeout has expired. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, 2010-09-20 at 15:16 +0300, Heikki Linnakangas wrote: On 20/09/10 12:17, Simon Riggs wrote: err... what is the difference between a timeout and stonith? STONITH (Shoot The Other Node In The Head) means that the other node is somehow disabled so that it won't unexpectedly come back alive. A timeout means that the slave hasn't been seen for a while, but it might reconnect just after the timeout has expired. You've edited my reply to change the meaning of what was a rhetorical question, as well as completely ignoring the main point of my reply. Please respond to the main point: Following some thought and analysis, AFAICS there is no sensible use case that requires standby registration. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On 20/09/10 15:50, Simon Riggs wrote: On Mon, 2010-09-20 at 15:16 +0300, Heikki Linnakangas wrote: On 20/09/10 12:17, Simon Riggs wrote: err... what is the difference between a timeout and stonith? STONITH (Shoot The Other Node In The Head) means that the other node is somehow disabled so that it won't unexpectedly come back alive. A timeout means that the slave hasn't been seen for a while, but it might reconnect just after the timeout has expired. You've edited my reply to change the meaning of what was a rhetorical question, as well as completely ignoring the main point of my reply. Please respond to the main point: Following some thought and analysis, AFAICS there is no sensible use case that requires standby registration. Ok, I had completely missed your point then. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 8:50 AM, Simon Riggs si...@2ndquadrant.com wrote: Please respond to the main point: Following some thought and analysis, AFAICS there is no sensible use case that requires standby registration.

I disagree. You keep analyzing away the cases that require standby registration, but I don't believe that they're not real. Aidan Van Dyk's case upthread of wanting to make sure that the standby is up and replicating synchronously before the master starts processing transactions seems perfectly legitimate to me. Sure, it's paranoid, but so what? We're all about paranoia, at least as far as data loss is concerned. So the wait forever case is, in my opinion, sufficient to demonstrate that we need it, but it's not even my primary reason for wanting to have it.

The most important reason why I think we should have standby registration is for simplicity of configuration. Yes, it adds another configuration file, but that configuration file contains ALL of the information about which standbys are synchronous. Without standby registration, this information will inevitably be split between the master config and the various slave configs and you'll have to look at all the configurations to be certain you understand how it's going to end up working. As a particular manifestation of this, and as previously argued and +1'd upthread, the ability to change the set of standbys to which the master is replicating synchronously without changing the configuration on the master or any of the existing slaves seems dangerous.

Another reason why I think we should have standby registration is to eventually allow the streaming WAL backwards configuration which has previously been discussed. IOW, you could stream the WAL to the slave in advance of fsync-ing it on the master. After a power failure, the machines in the cluster can talk to each other and figure out which one has the furthest-advanced WAL pointer and stream from that machine to all the others.
This is an appealing configuration for people using sync rep because it would allow the fsyncs to be done in parallel rather than sequentially as is currently necessary - but if you're using it, you're certainly not going to want the master to enter normal running without waiting to hear from the slave. Just to be clear, that is a list of three independent reasons, any one of which I think is sufficient for wanting standby registration. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
Hi, I'm somewhat sorry to have to play this game, as I sure don't feel smarter by composing this email. Quite the contrary. Robert Haas robertmh...@gmail.com writes: So the wait forever case is, in my opinion, sufficient to demonstrate that we need it, but it's not even my primary reason for wanting to have it. You're talking about standby registration on the master. You can solve this case without it, because when a slave is not connected it's not giving any feedback (vote, weight, ack) to the master. All you have to do is have the quorum setup in a way that disconnecting your slave means you can't reach the quorum any more. Have it SIGHUP and you can even choose to fix the setup, rather than fix the standby. So no need for registration here, it's just another way to solve the problem. Not saying it's better or worse, just another. Now we could have a summary function on the master showing all the known slaves, their last time of activity, their known current setup, etc, all from the master, but read-only. Would that be useful enough? The most important reason why I think we should have standby registration is for simplicity of configuration. Yes, it adds another configuration file, but that configuration file contains ALL of the information about which standbys are synchronous. Without standby registration, this information will inevitably be split between the master config and the various slave configs and you'll have to look at all the configurations to be certain you understand how it's going to end up working. So, here, we have two quite different things to be concerned about. First is the configuration, and I say that managing a distributed setup will be easier for the DBA. Then there's how to obtain a nice view about the distributed system, which again we can achieve from the master without manually registering the standbys. After all, the information you want needs to be there. 
As a particular manifestation of this, and as previously argued and +1'd upthread, the ability to change the set of standbys to which the master is replicating synchronously without changing the configuration on the master or any of the existing slaves seems dangerous.

Well, you still need to open the HBA for the new standby to be able to connect, and to somehow take a base backup, right? We're not exactly transparent there, yet, are we?

Another reason why I think we should have standby registration is to eventually allow the streaming WAL backwards configuration which has previously been discussed. IOW, you could stream the WAL to the slave in advance of fsync-ing it on the master. After a power failure, the machines in the cluster can talk to each other and figure out which one has the furthest-advanced WAL pointer and stream from that machine to all the others. This is an appealing configuration for people using sync rep because it would allow the fsyncs to be done in parallel rather than sequentially as is currently necessary - but if you're using it, you're certainly not going to want the master to enter normal running without waiting to hear from the slave.

I love the idea. Now it seems to me that all you need here is the master sending one more piece of information with each WAL segment, the currently fsync'ed position, which pre-9.1 is implied as being the current LSN from the stream, right? Here I'm not sure I follow you in detail, but it seems to me registering the standbys is just another way of achieving the same thing. To be honest, I don't understand a bit how it helps implement your idea.

Regards, -- Dimitri Fontaine PostgreSQL DBA, Architecte -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 4:10 PM, Dimitri Fontaine dfonta...@hi-media.com wrote: Robert Haas robertmh...@gmail.com writes: So the wait forever case is, in my opinion, sufficient to demonstrate that we need it, but it's not even my primary reason for wanting to have it. You're talking about standby registration on the master. You can solve this case without it, because when a slave is not connected it's not giving any feedback (vote, weight, ack) to the master. All you have to do is have the quorum setup in a way that disconnecting your slave means you can't reach the quorum any more. Have it SIGHUP and you can even choose to fix the setup, rather than fix the standby. I suppose that could work. The most important reason why I think we should have standby registration is for simplicity of configuration. Yes, it adds another configuration file, but that configuration file contains ALL of the information about which standbys are synchronous. Without standby registration, this information will inevitably be split between the master config and the various slave configs and you'll have to look at all the configurations to be certain you understand how it's going to end up working. So, here, we have two quite different things to be concerned about. First is the configuration, and I say that managing a distributed setup will be easier for the DBA. Yeah, I disagree with that, but I suppose it's a question of opinion. Then there's how to obtain a nice view about the distributed system, which again we can achieve from the master without manually registering the standbys. After all, the information you want needs to be there. I think that without standby registration it will be tricky to display information like the last time that standby foo was connected. Yeah, you could set a standby name on the standby server and just have the master remember details for every standby name it's ever seen, but then how do you prune the list? 
Heikki mentioned another application for having a list of the current standbys only (rather than every standby that has ever existed) upthread: you can compute the exact amount of WAL you need to keep around.

As a particular manifestation of this, and as previously argued and +1'd upthread, the ability to change the set of standbys to which the master is replicating synchronously without changing the configuration on the master or any of the existing slaves seems dangerous. Well, you still need to open the HBA for the new standby to be able to connect, and to somehow take a base backup, right? We're not exactly transparent there, yet, are we?

Sure, but you might have that set relatively open on a trusted network.

Another reason why I think we should have standby registration is to eventually allow the streaming WAL backwards configuration which has previously been discussed. IOW, you could stream the WAL to the slave in advance of fsync-ing it on the master. After a power failure, the machines in the cluster can talk to each other and figure out which one has the furthest-advanced WAL pointer and stream from that machine to all the others. This is an appealing configuration for people using sync rep because it would allow the fsyncs to be done in parallel rather than sequentially as is currently necessary - but if you're using it, you're certainly not going to want the master to enter normal running without waiting to hear from the slave. I love the idea. Now it seems to me that all you need here is the master sending one more piece of information with each WAL segment, the currently fsync'ed position, which pre-9.1 is implied as being the current LSN from the stream, right?

I don't see how that would help you.

Here I'm not sure I follow you in detail, but it seems to me registering the standbys is just another way of achieving the same thing. To be honest, I don't understand a bit how it helps implement your idea.
Well, if you need to talk to all the other standbys and see who has the furthest-advanced xlog pointer, it seems like you have to have a list somewhere of who they all are. Maybe there's some way to get this to work without standby registration, but I don't really understand the resistance to the idea, and I fear it's going to do nothing good for our reputation for ease of use (or lack thereof). The idea of making this all work without standby registration strikes me as akin to the notion of having someone decide whether they're running a three-legged race by checking whether their leg is currently tied to someone else's leg. You can probably make that work by patching around the various failure cases, but why isn't it simpler to just tell the poor guy Hi, Joe. You're running a three-legged race with Jane today. Hans and Juanita will be following you across the field, too, but don't worry about whether they're keeping up? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Configuring synchronous replication
On 20 September 2010 22:14, Robert Haas robertmh...@gmail.com wrote: Well, if you need to talk to all the other standbys and see who has the furthest-advanced xlog pointer, it seems like you have to have a list somewhere of who they all are.

When they connect to the master to get the stream, don't they, in effect, already talk to the primary with the XLogRecPtr being relayed? Can the connection IP, port, XLogRecPtr and request time of the standby be stored from this communication to track the states of each standby? They would in effect be registering upon WAL stream request... and no doubt this is a horrifically naive view of how it works. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
On Mon, Sep 20, 2010 at 5:42 PM, Thom Brown t...@linux.com wrote: On 20 September 2010 22:14, Robert Haas robertmh...@gmail.com wrote: Well, if you need to talk to all the other standbys and see who has the furthest-advanced xlog pointer, it seems like you have to have a list somewhere of who they all are. When they connect to the master to get the stream, don't they, in effect, already talk to the primary with the XLogRecPtr being relayed? Can the connection IP, port, XLogRecPtr and request time of the standby be stored from this communication to track the states of each standby? They would in effect be registering upon WAL stream request... and no doubt this is a horrifically naive view of how it works.

Sure, but the point is that we might want DISCONNECTED slaves to affect master behavior in a variety of ways (master retains WAL for when they reconnect, master waits for them to connect before acking commits, master shuts down if they're not there, master tries to stream WAL backwards from them before entering normal running). I just work here, but it seems to me that such things will be easier if the master has an explicit notion of what's out there. Can we make it all work without that? Possibly, but I think it will be harder to understand. With standby registration, you can DECLARE the behavior you want. You can tell the master replicate synchronously to Bob. And that's it. Without standby registration, what's being proposed is basically that you can tell the master replicate synchronously to one server and you can tell Bob you are a server to which the master can replicate synchronously and you can tell the other servers you are not a server to which Bob can replicate synchronously. That works, but to me it seems less straightforward. And that's actually a relatively simple example. Suppose you want to tell the master keep enough WAL for Bob to catch up when he reconnects, but if he gets more than 1GB behind, forget about him.
I'm sure someone can devise a way of making that work without standby registration, too, but I'm not too sure off the top of my head what it will be. With standby registration, you can just write something like this in standbys.conf (syntax invented): [bob] wal_keep_segments=64 I feel like that's really nice and simple. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
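Extending Robert's invented standbys.conf syntax, a registration file covering several standbys might look like this (every section name and key here is hypothetical, continuing his example):

```ini
# Invented standbys.conf syntax -- illustrative only, nothing here is implemented.
[bob]
synchronous = yes
wal_keep_segments = 64   # retain WAL for bob, but give up past 64 segments

[jane]
synchronous = yes        # a second sync standby, for N+1 redundancy

[hans]
synchronous = no         # async; never holds up commits on the master
```

The declarative appeal is that the master alone states which standbys are synchronous and how much WAL to retain for each, rather than each standby asserting its own role.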
Re: [HACKERS] Configuring synchronous replication
On Sat, 2010-09-18 at 14:42 -0700, Josh Berkus wrote: * Per-transaction control. Some transactions are important, others are not. Low priority. I see this as a 9.2 feature. Nobody I know is asking for it yet, and I think we need to get the other stuff right first.

I understand completely why anybody who has never used sync replication would think per-transaction control is a small deal. I fully expect your clients to try sync rep and then 5 minutes later say Oh Crap, this sync rep is so slow it's unusable. Isn't there a way to tune it? I've designed a way to tune sync rep so it is usable and useful. And putting that feature into 9.1 costs very little, if anything. My patch to do this is actually smaller than any other attempt to implement this and I claim faster too. You don't need to use the per-transaction controls, but they'll be there if you need them. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configuring synchronous replication
I've designed a way to tune sync rep so it is usable and useful. And putting that feature into 9.1 costs very little, if anything. My patch to do this is actually smaller than any other attempt to implement this and I claim faster too. You don't need to use the per-transaction controls, but they'll be there if you need them. Well, if you already have the code, that's a different story ... -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers