On 07/12/2012 12:31 AM, Daniel Farina wrote:

> But RAID-1 as nominally seen is a fundamentally different problem,
> with much tinier differences in latency, bandwidth, and connectivity.
> Perhaps useful for study, but to suggest the problem is *that* similar
> I think is wrong.

Well, yes and no. One of the reasons I brought up DRBD was because it's basically RAID-1 over a network interface. It's not without overhead, but a few basic pgbench tests show it's still 10-15% faster than a synchronous PG setup for two servers in the same rack. Greg Smith's tests show that beyond a certain point, a synchronous PG setup effectively becomes untenable simply due to network latency in the protocol implementation. In reality, it probably wouldn't be usable beyond two servers in different datacenters in the same city.
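To give a sense of what that comparison looks like, here's a rough Python sketch along the same lines. Our actual numbers came from pgbench, so treat this as an illustration only; the connection strings, table name, and iteration count are made up, and the DRBD-backed instance just looks like an ordinary local Postgres to the client:

import time
import psycopg2

def median_commit_latency(dsn, iterations=1000):
    """Time trivial INSERT + COMMIT cycles and return the median latency in ms."""
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS sync_bench (id serial PRIMARY KEY, v integer)")
    conn.commit()
    samples = []
    for i in range(iterations):
        start = time.perf_counter()
        cur.execute("INSERT INTO sync_bench (v) VALUES (%s)", (i,))
        conn.commit()   # with synchronous replication, this blocks until the standby acks
        samples.append((time.perf_counter() - start) * 1000.0)
    conn.close()
    return sorted(samples)[len(samples) // 2]

if __name__ == "__main__":
    drbd = median_commit_latency("host=drbd-primary dbname=bench")      # local commit; DRBD mirrors the blocks
    sync = median_commit_latency("host=syncrep-primary dbname=bench")   # commit waits on the standby's ack
    print("DRBD-backed: %.2fms  sync rep: %.2fms  overhead: %.1f%%"
          % (drbd, sync, (sync - drbd) / drbd * 100.0))

Run it against both instances back to back and the gap between the two medians is the synchronous-commit overhead.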

RAID-1 was the model for DRBD, but I brought it up only because it's pretty much the definition of a synchronous commit that degrades gracefully. I'd even suggest graceful degradation matters more in a network context than it does for RAID-1, because you're far more likely to get sync interruptions from network issues than from a failed disk.
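To be concrete about what "degrades gracefully" means here, this is the behaviour I'm after, boiled down to a toy Python sketch. It isn't how Postgres or DRBD implements anything; flush_local() and wait_for_standby_ack() are stand-ins for whatever actually persists the commit and hears back from the peer:

import threading

class GracefulSyncCommit:
    """Toy model of degrade-on-timeout commit semantics, not real internals."""

    def __init__(self, ack_timeout=2.0, on_degrade=lambda: None):
        self.ack_timeout = ack_timeout   # how long a commit may wait on the peer
        self.on_degrade = on_degrade     # page/alert hook
        self.degraded = False

    def commit(self, flush_local, wait_for_standby_ack):
        flush_local()                    # local durability always comes first
        if self.degraded:
            return                       # already running standalone, like a degraded mirror
        acked = threading.Event()
        threading.Thread(target=wait_for_standby_ack, args=(acked,), daemon=True).start()
        if not acked.wait(self.ack_timeout):
            self.degraded = True         # stop holding up future commits
            self.on_degrade()            # ...and make lots of noise about it

The important part is the last three lines: the commit still succeeds, the pair is simply flagged as out of sync until the peer comes back and catches up.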

> But, putting that aside, why not write a piece of middleware that
> does precisely this, or whatever you want? It can live on the same
> machine as Postgres and ack synchronous commit when nobody is home,
> and notify (e.g. page) you in the most precise way you want if nobody
> is home "for a while".

You're right that there are lots of ways to sort of get this ability; they're just not mature or capable enough to really matter. Tailing the log to watch for secondary disconnect is too slow. Monit or Nagios style checks are too slow and unreliable. A custom-built middle layer (a master-slave plugin for Pacemaker, for example) is too slow. All of these rely on some kind of check interval: set it too high and, at our rate of roughly 10,000 transactions per second, we eat 10,000 x n missed transactions for every n seconds of interval; set it too low and we increase the likelihood of false positives and unnecessary detachments.
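To illustrate why, here's roughly what every one of those bolt-on approaches boils down to. The DSN, config path, and alerting are placeholders, and it assumes the standby's disappearance is visible in pg_stat_replication; the whole thing hinges on that check interval:

import re
import time
import psycopg2

DSN = "host=primary dbname=postgres"        # placeholder
CONF = "/etc/postgresql/postgresql.conf"    # placeholder
CHECK_INTERVAL = 1.0                        # every second of interval is ~10,000 commits
                                            # left hanging at our transaction rate

def standby_streaming(conn):
    cur = conn.cursor()
    cur.execute("SELECT count(*) FROM pg_stat_replication WHERE state = 'streaming'")
    return cur.fetchone()[0] > 0

def demote_to_async(conn):
    """Blank out synchronous_standby_names and reload so commits stop waiting."""
    with open(CONF) as f:
        conf = f.read()
    conf = re.sub(r"^\s*synchronous_standby_names\s*=.*$",
                  "synchronous_standby_names = ''", conf, flags=re.M)
    with open(CONF, "w") as f:
        f.write(conf)
    cur = conn.cursor()
    cur.execute("SELECT pg_reload_conf()")
    # alert somebody here -- by this point we're already a full check
    # interval behind whatever the standby's disappearance did to commits

def main():
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    while standby_streaming(conn):
        time.sleep(CHECK_INTERVAL)
    demote_to_async(conn)

if __name__ == "__main__":
    main()

Every second that loop sleeps is another pile of commits stuck waiting on a standby that's already gone, which is exactly the window we can't afford.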

If it's possible through a PG 9.x extension, that'd probably be the way to *safely* handle it as a bolt-on solution. If the original author of the patch can convert it to such a beast, we'd install it approximately five seconds after it finished compiling.

So far as transaction durability is concerned... we have a continuous background rsync over dark fiber for archived transaction logs, DRBD for block-level sync, filesystem snapshots for our backups, a redundant async DR cluster, an offsite backup location, and a tape archival service stretching back for seven years. And none of that will cause the master to stop processing transactions unless the master itself dies and triggers a failover.

Using PG sync in its current incarnation would introduce an extra failure scenario that wasn't there before. I'm pretty sure we're not the only ones avoiding it for exactly that reason. Our queue discards messages it can't fulfil within ten seconds and then throws an error for each one; at our volume, a single ten-second stall is on the order of 100,000 errored messages. We need to decouple the secondary as quickly as possible if it becomes unresponsive, and there's really no way to do that without support in the database itself.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com




