Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas
On 26/05/10 20:33, Kevin Grittner wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Although, if the master crashes at that point, and quickly recovers, you could see the last transactions committed on the master before they're replicated to the standby. Versus having the

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Unless we have a transaction manager and do proper distributed transactions, how do you avoid edge conditions like that? Yeah, I guess you can't. You can guarantee that a commit is always safely flushed first in the master, or

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas
On 26/05/10 20:40, Simon Riggs wrote: On Wed, 2010-05-26 at 19:55 +0300, Heikki Linnakangas wrote: If you set quorum to 1, it also becomes critical infrastructure, because it's possible that a transaction has been replicated to the test server but not the real production standby, and a meteor

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 1:26 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2010-05-26 at 11:31 -0400, Robert Haas wrote: Your reply has again avoided the subject of how we would handle failure modes with per-standby settings. That is important. I don't think anyone is avoiding that,

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 14:30 -0400, Robert Haas wrote: On Wed, May 26, 2010 at 1:26 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2010-05-26 at 11:31 -0400, Robert Haas wrote: Your reply has again avoided the subject of how we would handle failure modes with per-standby settings.

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 3:13 PM, Simon Riggs si...@2ndquadrant.com wrote: I don't really understand this comment.  I have said, and I believe, that a system without quorum commit is simpler than one with quorum commit.  I'd debate the point with you but I find the point so self-evident that I

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Joshua D. Drake
On Wed, 2010-05-26 at 15:37 -0400, Robert Haas wrote: On Wed, May 26, 2010 at 3:13 PM, Simon Riggs si...@2ndquadrant.com wrote: I don't really understand this comment. I have said, and I believe, that a system without quorum commit is simpler than one with quorum commit. I'd debate the

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Dimitri Fontaine
Simon Riggs si...@2ndquadrant.com writes: On Wed, 2010-05-26 at 19:55 +0300, Heikki Linnakangas wrote: Now you want to set up a temporary replica of the master at a development server, for testing purposes. If you set quorum to 2, your development server becomes critical infrastructure, which

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas
On 26/05/10 23:31, Dimitri Fontaine wrote: d. choice of commit or rollback at timeout Rollback is not an option. There is no going back after the commit record has been flushed to disk or sent to a standby. The choice is to either commit anyway after the timeout, or wait forever. --
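The two timeout outcomes named in the message above (commit anyway, or wait forever) can be sketched as follows. This is a minimal illustration, not PostgreSQL code: the function name `wait_for_standby_ack`, the policy strings, and the use of a `threading.Event` to stand in for a standby acknowledgement are all assumptions made for the example.

```python
import threading
from typing import Optional


def wait_for_standby_ack(ack: threading.Event,
                         timeout_s: Optional[float],
                         on_timeout: str = "commit") -> str:
    """Wait for a (hypothetical) standby acknowledgement of a commit record.

    The commit record is assumed to be flushed locally already, so there
    is deliberately no "rollback" policy: the only choices at timeout are
    to report the commit anyway or to keep waiting.
    """
    if on_timeout not in ("commit", "wait"):
        raise ValueError("on_timeout must be 'commit' or 'wait'")

    # Event.wait returns True if the acknowledgement arrived in time.
    if ack.wait(timeout_s):
        return "acknowledged"

    if on_timeout == "commit":
        # Degraded mode: report success without the standby's confirmation.
        return "committed_without_ack"

    # "wait" policy: block with no deadline until the standby responds.
    ack.wait()
    return "acknowledged"
```

Passing `on_timeout="rollback"` raises `ValueError`, which mirrors the point that rollback is not available once the commit record has been written.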

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Jan Wieck
On 5/26/2010 12:55 PM, Heikki Linnakangas wrote: On 26/05/10 18:31, Robert Haas wrote: And frankly, I don't think it's possible for quorum commit to reduce the number of parameters. Even if we have that feature available, not everyone will want to use it. And the people who don't will

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas
On 26/05/10 23:31, Dimitri Fontaine wrote: So if you want simplicity to admin, effective data availability and precise control over the global setup, I say go for: a. transaction level control of the replication level b. cascading support c. quorum with timeout d. choice of commit or

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Wed, 2010-05-26 at 17:31 -0400, Jan Wieck wrote: You can do this only with per standby options, by giving each standby a weight, or a number of votes. Your DEV server would have a weight of zero, while your production standby's have higher weights, depending on their importance for your
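The weighting idea quoted above can be sketched roughly as below. The function name `quorum_reached`, the standby names, and the specific weights are hypothetical and chosen only to illustrate how a zero-weight development server would never count toward the commit quorum.

```python
from typing import Dict, Set


def quorum_reached(weights: Dict[str, int], acked: Set[str], quorum: int) -> bool:
    """Return True once the acknowledging standbys hold enough votes.

    weights maps a standby name to its number of votes; a standby that
    is not listed contributes no votes.
    """
    votes = sum(weights.get(name, 0) for name in acked)
    return votes >= quorum


# Hypothetical setup: the DEV server carries no votes.
weights = {"dev": 0, "prod1": 1, "prod2": 1}
only_dev = quorum_reached(weights, {"dev"}, quorum=1)
dev_and_prod = quorum_reached(weights, {"dev", "prod1"}, quorum=1)
```

With these assumed weights, an acknowledgement from the development server alone does not satisfy a quorum of 1, while an acknowledgement from either production standby does.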

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Simon Riggs
On Thu, 2010-05-27 at 00:21 +0300, Heikki Linnakangas wrote: On 26/05/10 23:31, Dimitri Fontaine wrote: d. choice of commit or rollback at timeout Rollback is not an option. There is no going back after the commit record has been flushed to disk or sent to a standby. There's definitely

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Heikki Linnakangas
On 27/05/10 01:23, Simon Riggs wrote: On Thu, 2010-05-27 at 00:21 +0300, Heikki Linnakangas wrote: On 26/05/10 23:31, Dimitri Fontaine wrote: d. choice of commit or rollback at timeout Rollback is not an option. There is no going back after the commit record has been flushed to disk or

Re: [HACKERS] Synchronization levels in SR

2010-05-26 Thread Fujii Masao
On Wed, May 26, 2010 at 10:20 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2010-05-26 at 18:52 +0900, Fujii Masao wrote: I guess that dropping the support of #3 doesn't reduce complexity since the code of #3 is almost the same as that of #2. Like walreceiver sends the ACK after

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote: On Tue, May 25, 2010 at 10:29 AM, Josh Berkus j...@agliodbs.com wrote: I agree that #4 should be done last, but it will be needed, not in the least by your employer ;-) . I don't see any obvious way to make #4 compatible with any

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Mon, 2010-05-24 at 22:20 +0900, Fujii Masao wrote: Second, we need to discuss how to specify the synch level. There are three approaches: * Per standby Since the purpose, location and H/W resource often differ from one standby to another, specifying level per standby (i.e.,

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Mon, 2010-05-24 at 18:29 -0700, Josh Berkus wrote: If people agree that the above is our roadmap, implementing per-standby first makes sense, and then we can implement per-session GUC later. IMHO per-standby sounds simple, but is dangerously simplistic, explained on another part of the

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Robert Haas
On Tue, May 25, 2010 at 12:28 PM, Simon Riggs si...@2ndquadrant.com wrote: Synchronous replication implies that a commit should wait. This wait is experienced by the transaction, not by other parts of the system. If we define robustness at the standby level then robustness depends upon unseen

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Joshua D. Drake
On Tue, 2010-05-25 at 12:40 -0400, Robert Haas wrote: On Tue, May 25, 2010 at 12:28 PM, Simon Riggs si...@2ndquadrant.com wrote: Synchronous replication implies that a commit should wait. This wait is experienced by the transaction, not by other parts of the system. If we define robustness

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: Simon Riggs si...@2ndquadrant.com wrote: If we define robustness at the standby level then robustness depends upon unseen administrators, as well as the current up/down state of standbys. This is action-at-a-distance in its worst form. Maybe, but I

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 12:40 -0400, Robert Haas wrote: On Tue, May 25, 2010 at 12:28 PM, Simon Riggs si...@2ndquadrant.com wrote: Synchronous replication implies that a commit should wait. This wait is experienced by the transaction, not by other parts of the system. If we define robustness

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: Simon Riggs si...@2ndquadrant.com wrote: If we define robustness at the standby level then robustness depends upon unseen administrators, as well as the current up/down state of standbys.

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Robert Haas
On Tue, May 25, 2010 at 1:10 PM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: Simon Riggs si...@2ndquadrant.com wrote: If we define robustness at the standby level then robustness depends upon

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 13:31 -0400, Robert Haas wrote: On Tue, May 25, 2010 at 1:10 PM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: Simon Riggs si...@2ndquadrant.com wrote: If we define

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 19:08 +0200, Alastair Turner wrote: On Tue, May 25, 2010 at 6:28 PM, Simon Riggs si...@2ndquadrant.com wrote: ... The best parameter we can specify is the number of servers that we wish to wait for confirmation from. That is a definition that easily manages the

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 13:31 -0400, Robert Haas wrote: So I agree that we need to talk about whether or not we want to do this. I'll give my opinion. I am not sure how useful this really is. Consider a master with two standbys. The master commits a transaction and waits for one of the two

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Alastair Turner
On Tue, May 25, 2010 at 6:28 PM, Simon Riggs si...@2ndquadrant.com wrote: ... The best parameter we can specify is the number of servers that we wish to wait for confirmation from. That is a definition that easily manages the complexity of having various servers up/down at any one time. It

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Yeb Havinga
Simon Riggs wrote: How we handle degraded mode is important, yes. Whatever parameters we choose the problem will remain the same. Should we just ignore degraded mode and respond as if nothing bad had happened? Most people would say not. If we specify server1 = synch and server2 = async we then

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Dimitri Fontaine
Hi, Simon Riggs si...@2ndquadrant.com writes: On Tue, 2010-05-25 at 19:08 +0200, Alastair Turner wrote: On Tue, May 25, 2010 at 6:28 PM, Simon Riggs si...@2ndquadrant.com wrote: The best parameter we can specify is the number of servers that we wish to wait for confirmation from. This

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Simon Riggs
On Tue, 2010-05-25 at 21:19 +0200, Yeb Havinga wrote: Simon Riggs wrote: How we handle degraded mode is important, yes. Whatever parameters we choose the problem will remain the same. Should we just ignore degraded mode and respond as if nothing bad had happened? Most people would say

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Florian Pflug
On May 25, 2010, at 22:16 , Simon Riggs wrote: All of these issues show why I want to specify the synchronisation mode as a USERSET. That will allow us to specify more easily which parts of our application are important when the cluster is degraded and which data is so critical it must reach

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Fujii Masao
On Wed, May 26, 2010 at 2:10 AM, Simon Riggs si...@2ndquadrant.com wrote: My suggestion is simply to have a single parameter (name unimportant) number_of_synch_servers_we_wait_for = N which is much easier to understand because it is phrased in terms of the guarantee given to the transaction,

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Robert Haas
On Tue, May 25, 2010 at 11:36 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 26, 2010 at 2:10 AM, Simon Riggs si...@2ndquadrant.com wrote: My suggestion is simply to have a single parameter (name unimportant) number_of_synch_servers_we_wait_for = N which is much easier to

Re: [HACKERS] Synchronization levels in SR

2010-05-25 Thread Fujii Masao
On Wed, May 26, 2010 at 1:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote: On Tue, May 25, 2010 at 10:29 AM, Josh Berkus j...@agliodbs.com wrote: I agree that #4 should be done last, but it will be needed, not in the least by your employer

[HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
Hi, I'm now designing the synchronous replication feature based on SR for 9.1, while discussing that in another thread. http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php In the first design phase, I'd like to clarify which synch levels should be supported in 9.1 and how it should be

Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Heikki Linnakangas
On 24/05/10 16:20, Fujii Masao wrote: The log-shipping replication has some synch levels as follows. The transaction commit on the master #1 doesn't wait for replication (already supported in 9.0) #2 waits for WAL to be received by the standby #3 waits for WAL to be received and

Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Josh Berkus
#4 is useful for some cases, but might often make the transaction commit on the master get stuck since read-only query can easily block recovery by the lock conflict. So #4 seems not to be worth working on until that HS problem has been addressed. Thought? I agree that #4 should be done

Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
On Tue, May 25, 2010 at 1:18 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I see a lot of value in #4; it makes it possible to distribute read-only load to the standby using something like pgbouncer, completely transparently to the application. Agreed. In the lesser

Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
On Tue, May 25, 2010 at 10:29 AM, Josh Berkus j...@agliodbs.com wrote: I agree that #4 should be done last, but it will be needed, not in the least by your employer ;-) .  I don't see any obvious way to make #4 compatible with any significant query load on the slave, but in general I'd think

Re: [HACKERS] Synchronization levels in SR

2010-05-24 Thread Fujii Masao
On Mon, May 24, 2010 at 10:20 PM, Fujii Masao masao.fu...@gmail.com wrote: In the first design phase, I'd like to clarify which synch levels should be supported in 9.1 and how it should be specified by users. There is another question about synch level: When should the master wait for
