Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-25 Thread Bruce Momjian
On Fri, Jul 13, 2012 at 08:08:59PM -0430, Jose Ildefonso Camargo Tolosa wrote: > On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian wrote: > > On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote: > >> How you decide what to do with the servers on failures isn't that > >> important here, re

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-16 Thread Daniel Farina
On Mon, Jul 16, 2012 at 10:58 PM, Heikki Linnakangas wrote: > BTW, one little detail that I don't think has been mentioned in this thread > before: Even though the master currently knows whether a standby is > connected or not, and you could write a patch to act based on that, there > are other fa

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-16 Thread Heikki Linnakangas
On 16.07.2012 22:01, Robert Haas wrote: On Sat, Jul 14, 2012 at 7:54 PM, Josh Berkus wrote: So, here's the core issue with degraded mode. I'm not mentioning this to block any patch anyone has, but rather out of a desire to see someone address this core problem with some clever idea I've not th

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-16 Thread Robert Haas
On Sat, Jul 14, 2012 at 7:54 PM, Josh Berkus wrote: > So, here's the core issue with degraded mode. I'm not mentioning this > to block any patch anyone has, but rather out of a desire to see someone > address this core problem with some clever idea I've not thought of. > The problem in a nutshell

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-14 Thread Josh Berkus
So, here's the core issue with degraded mode. I'm not mentioning this to block any patch anyone has, but rather out of a desire to see someone address this core problem with some clever idea I've not thought of. The problem in a nutshell is: indeterminacy. Assume someone implements degraded mode

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-14 Thread Jose Ildefonso Camargo Tolosa
On Sat, Jul 14, 2012 at 12:42 AM, Amit kapila wrote: >> From: Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] >> Sent: Saturday, July 14, 2012 9:36 AM >>On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila wrote: >> From: pgsql-hackers-ow...@postgresql.org >> [pgsql-hackers-ow...@postgresql

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Amit kapila
> From: Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] > Sent: Saturday, July 14, 2012 9:36 AM >On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila wrote: > From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] > on behalf of Jose Ildefonso Camargo Tolosa [ildefonso

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila wrote: > From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] > on behalf of Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] > Sent: Saturday, July 14, 2012 6:08 AM > On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Amit kapila
From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] on behalf of Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] Sent: Saturday, July 14, 2012 6:08 AM On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian wrote: > On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus We

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian wrote: > On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote: >> How you decide what to do with the servers on failures isn't that >> important here, really. You can probably run e.g. Pacemaker on 3+ >> machines and have it check for quoru

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
Hi Hampus, On Fri, Jul 13, 2012 at 2:42 AM, Hampus Wessman wrote: > Hi all, > > Here are some (slightly too long) thoughts about this. Nah, not that long. > > Shaun Thomas wrote 2012-07-12 22:40: > >> On 07/12/2012 12:02 PM, Bruce Momjian wrote: >> >>> Well, the problem also exists if we add it as

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 13, 2012 at 12:25 AM, Amit Kapila wrote: > >> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] >> On Behalf Of Jose Ildefonso Camargo Tolosa >>>On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk wrote: >> On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Bruce Momjian
On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote: > How you decide what to do with the servers on failures isn't that > important here, really. You can probably run e.g. Pacemaker on 3+ > machines and have it check for quorums to accomplish this. That's a > good approach at least. You

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Hampus Wessman
Hi all, Here are some (slightly too long) thoughts about this. Shaun Thomas wrote 2012-07-12 22:40: On 07/12/2012 12:02 PM, Bruce Momjian wrote: Well, the problem also exists if we add it as an internal database feature --- how long do we wait to consider the standby dead, how do we inform admin

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Amit Kapila
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] > On Behalf Of Jose Ildefonso Camargo Tolosa >>On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk wrote: > On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas wrote: > > As currently is, the point of: freezing the mas

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 4:10 PM, Shaun Thomas wrote: > On 07/12/2012 12:02 PM, Bruce Momjian wrote: > >> Well, the problem also exists if we add it as an internal database >> feature --- how long do we wait to consider the standby dead, how do >> we inform administrators, etc. > > > True. Though if t

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:29 PM, Aidan Van Dyk wrote: > On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa > >> Yeah, you need that with PostgreSQL, but not with DRBD, for example >> (sorry, but DRBD is one of the flagships of HA things in the Linux >> world). Also, I'm not convinced a

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa > Yeah, you need that with PostgreSQL, but not with DRBD, for example > (sorry, but DRBD is one of the flagships of HA things in the Linux > world). Also, I'm not convinced about the "2nd standby" thing... I > mean, just read this on t

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 12:17 PM, Bruce Momjian wrote: > On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote: >> > From: pgsql-hackers-ow...@postgresql.org >> [mailto:pgsql-hackers-ow...@postgresql.org] >> > On Behalf Of Jose Ildefonso Camargo Tolosa >> >> > Please, stop arguing on all of

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk wrote: > On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas > wrote: > >> So far as transaction durability is concerned... we have a continuous >> background rsync over dark fiber for archived transaction logs, DRBD for >> block-level sync, filesystem sn

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:35 AM, Dimitri Fontaine wrote: > Hi, > > Jose Ildefonso Camargo Tolosa writes: >> environments. And no, it doesn't make synchronous replication >> meaningless, because it will work synchronously if it has someone to >> sync to, and work async (or standalone) if it doesn

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas
On 07/12/2012 12:02 PM, Bruce Momjian wrote: Well, the problem also exists if we add it as an internal database feature --- how long do we wait to consider the standby dead, how do we inform administrators, etc. True. Though if there is no secondary connected, either because it's not there yet,

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 08:21:08AM -0500, Shaun Thomas wrote: > >But, putting that aside, why not write a piece of middleware that > >does precisely this, or whatever you want? It can live on the same > >machine as Postgres and ack synchronous commit when nobody is home, > >and notify (e.g. page) y

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote: > > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] > > On Behalf Of Jose Ildefonso Camargo Tolosa > > > Please, stop arguing on all of this: I don't think that adding an > > option will hurt anybo

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas wrote: > So far as transaction durability is concerned... we have a continuous > background rsync over dark fiber for archived transaction logs, DRBD for > block-level sync, filesystem snapshots for our backups, a redundant async DR > cluster, an offs

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas
On 07/12/2012 12:31 AM, Daniel Farina wrote: But RAID-1 as nominally seen is a fundamentally different problem, with much tinier differences in latency, bandwidth, and connectivity. Perhaps useful for study, but to suggest the problem is *that* similar I think is wrong. Well, yes and no. One o

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Dimitri Fontaine
Hi, Jose Ildefonso Camargo Tolosa writes: > environments. And no, it doesn't make synchronous replication > meaningless, because it will work synchronously if it has someone to > sync to, and work async (or standalone) if it doesn't: that's perfect > for an HA environment. You seem to want Service

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Amit Kapila
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] > On Behalf Of Jose Ildefonso Camargo Tolosa > Please, stop arguing on all of this: I don't think that adding an > option will hurt anybody (especially because the work was already done > by someone), we are not

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Daniel Farina
On Wed, Jul 11, 2012 at 6:41 AM, Shaun Thomas wrote: >> Regardless of what DRBD does, I think the problem with the >> async/sync duality as-is is there is no nice way to manage exposure >> to transaction loss under various situations and requirements. > > > Which would be handy. With synchronous c

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Daniel Farina
On Wed, Jul 11, 2012 at 3:03 AM, Dimitri Fontaine wrote: > Daniel Farina writes: >> Notable caveat: one can't very easily measure or bound the amount of >> transaction loss in any graceful way as-is. We only have "unlimited >> lag" and "2-safe or bust". > > ¡per-transaction! > > You can chang

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Jose Ildefonso Camargo Tolosa
On Wed, Jul 11, 2012 at 11:48 PM, Josh Berkus wrote: > >> Please, stop arguing on all of this: I don't think that adding an >> option will hurt anybody (especially because the work was already done >> by someone), we are not asking to change how the things work, we just >> want an option to decide

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Josh Berkus
> Please, stop arguing on all of this: I don't think that adding an > option will hurt anybody (especially because the work was already done > by someone), we are not asking to change how the things work, we just > want an option to decide whether we want it to freeze on standby > disconnection, o

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Jose Ildefonso Camargo Tolosa
Greetings, On Wed, Jul 11, 2012 at 9:11 AM, Shaun Thomas wrote: > On 07/10/2012 06:02 PM, Daniel Farina wrote: > >> For example, what if DRBD can only complete one page per second for >> some reason? Does it simply have the primary wait at this glacial >> pace, or drop synchronous replication

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Robert Haas
On Tue, Jul 10, 2012 at 12:57 PM, Josh Berkus wrote: > Per your exchange with Heikki, that's not actually how SyncRep works in > 9.1. So it's not giving you what you want anyway. > > This is why we felt that the "sync rep if you can" mode was useless and > didn't accept it into 9.1. The *only* d

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Josh Berkus
On 7/11/12 6:41 AM, Shaun Thomas wrote: > Which would be handy. With synchronous commits, it's given that the > protocol is bi-directional. Then again, PG can detect when clients > disconnect the instant they do so, and having such an event implicitly > disable synchronous_standby_names until recon

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Dimitri Fontaine
Shaun Thomas writes: >> Regardless of what DRBD does, I think the problem with the >> async/sync duality as-is is there is no nice way to manage exposure >> to transaction loss under various situations and requirements. Yeah. > Which would be handy. With synchronous commits, it's given that the

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Shaun Thomas
On 07/10/2012 06:02 PM, Daniel Farina wrote: For example, what if DRBD can only complete one page per second for some reason? Does it simply have the primary wait at this glacial pace, or drop synchronous replication and go degraded? Or does it do something more clever than just a timeout?

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Dimitri Fontaine
Daniel Farina writes: > Notable caveat: one can't very easily measure or bound the amount of > transaction loss in any graceful way as-is. We only have "unlimited > lag" and "2-safe or bust". ¡per-transaction! You can change your mind mid-transaction and ask for 2-safe or bust. That's the de

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Daniel Farina
On Tue, Jul 10, 2012 at 2:42 PM, Dimitri Fontaine wrote: > > What you explain you want reads to me "Async replication + Archiving". Notable caveat: one can't very easily measure or bound the amount of transaction loss in any graceful way as-is. We only have "unlimited lag" and "2-safe or bust".

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Dimitri Fontaine
Shaun Thomas writes: > When you re-connect a secondary device, it catches up as fast as possible by > replaying waiting transactions, and then re-attaching to the cluster. Until > it's fully caught-up, it doesn't exist. DRBD acknowledges the secondary is > there and attempting to catch up, but doe

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Josh Berkus
Shaun, > Too many mental gymnastics. I get that async is "faster" than sync, but > the inconsistent transactional state makes it *look* slower. If a > customer makes an order, but just happens to check that order state on > the secondary before it can catch up, that's a net loss. Like I said, > th

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Daniel Farina
On Tue, Jul 10, 2012 at 6:28 AM, Shaun Thomas wrote: > On 07/10/2012 01:11 AM, Daniel Farina wrote: > >> So if I get this straight, what you are saying is "be asynchronous >> replication unless someone is around, in which case be synchronous" >> is the mode you want. > > > Er, no. I think I see wh

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Shaun Thomas
On 07/10/2012 09:40 AM, Heikki Linnakangas wrote: You are mistaken. It only guarantees that it's been sync'd to disk in the standby, but if there are open snapshots or the system is simply busy, it might take minutes or more until the effects of that transaction become visible. Well, crap. It

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Heikki Linnakangas
On 10.07.2012 17:31, Shaun Thomas wrote: On 07/09/2012 05:15 PM, Josh Berkus wrote: So I'm unclear on why sync rep would be faster than async rep given that they use exactly the same mechanism. Explain? Too many mental gymnastics. I get that async is "faster" than sync, but the inconsistent tr

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Shaun Thomas
On 07/09/2012 05:15 PM, Josh Berkus wrote: "Total-consistency" replication is what I think you want, that is, to guarantee that at any given time a read query on the master will return the same results as a read query on the standby. Heck, *most* people would like to have that. You would also

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Aidan Van Dyk
On Tue, Jul 10, 2012 at 9:28 AM, Shaun Thomas wrote: > Async is simply too slow for our OLTP system except for the disaster > recovery node, which isn't expected to carry on within seconds of the > primary's failure. I briefly considered sync mode when it appeared as a > feature, but I see it's s

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Shaun Thomas
On 07/10/2012 01:11 AM, Daniel Farina wrote: So if I get this straight, what you are saying is "be asynchronous replication unless someone is around, in which case be synchronous" is the mode you want. Er, no. I think I see where you might have gotten that, but no. This is a pretty tricky de

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Magnus Hagander
On Tue, Jul 10, 2012 at 8:42 AM, Amit Kapila wrote: >> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Daniel Farina >> Sent: Tuesday, July 10, 2012 11:42 AM >>>On Mon, Jul 9, 2012 at 1:30 PM, Shaun Thomas > wrote: >>> >>> 1. Slave wants to be s

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-09 Thread Amit Kapila
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Daniel Farina > Sent: Tuesday, July 10, 2012 11:42 AM >>On Mon, Jul 9, 2012 at 1:30 PM, Shaun Thomas wrote: >> >> 1. Slave wants to be synchronous with master. Master wants replication on at least o

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-09 Thread Daniel Farina
On Mon, Jul 9, 2012 at 1:30 PM, Shaun Thomas wrote: > > 1. Slave wants to be synchronous with master. Master wants replication on at > least one slave. They have this, and are happy. > 2. For whatever reason, slave crashes or becomes unavailable. > 3. Master notices no more slaves are available,

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-09 Thread Josh Berkus
Shaun, > PostgreSQL's implementation means the master will block until > someone/something notices and tells it to stop waiting, or the slave > comes back. For pretty much any high-availability environment, this is > not viable. Based on that alone, I can't imagine a scenario where > synchronous r

[HACKERS] Synchronous Standalone Master Redoux

2012-07-09 Thread Shaun Thomas
Hey everyone, Upon doing some usability tests with PostgreSQL 9.1 recently, I ran across this discussion: http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php And after reading the entire thing, I found it odd that the overriding pushback was because nobody could think of a use