On 07/12/2012 12:31 AM, Daniel Farina wrote:

> But RAID-1 as nominally seen is a fundamentally different problem,
> with much tinier differences in latency, bandwidth, and connectivity.
> Perhaps useful for study, but to suggest the problem is *that* similar
> I think is wrong.

Well, yes and no. One of the reasons I brought up DRBD was because it's basically RAID-1 over a network interface. It's not without overhead, but a few basic pgbench tests show it's still 10-15% faster than a synchronous PG setup for two servers in the same rack. Greg Smith's tests show that beyond a certain point, a synchronous PG setup effectively becomes untenable simply due to network latency in the protocol implementation. In reality, it probably wouldn't be usable beyond two servers in different datacenters in the same city.
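To give a sense of what that comparison looks like, here's a rough Python sketch along the same lines. Our actual numbers came from pgbench, so treat this as an illustration only; the connection strings, table name, and iteration count are made up, and the DRBD-backed instance just looks like an ordinary local Postgres to the client:

import time
import psycopg2

def median_commit_latency(dsn, iterations=1000):
    """Time trivial INSERT + COMMIT cycles and return the median latency in ms."""
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS sync_bench (id serial PRIMARY KEY, v integer)")
    conn.commit()
    samples = []
    for i in range(iterations):
        start = time.perf_counter()
        cur.execute("INSERT INTO sync_bench (v) VALUES (%s)", (i,))
        conn.commit()   # with synchronous replication, this blocks until the standby acks
        samples.append((time.perf_counter() - start) * 1000.0)
    conn.close()
    return sorted(samples)[len(samples) // 2]

if __name__ == "__main__":
    drbd = median_commit_latency("host=drbd-primary dbname=bench")      # local commit; DRBD mirrors the blocks
    sync = median_commit_latency("host=syncrep-primary dbname=bench")   # commit waits on the standby's ack
    print("DRBD-backed: %.2fms  sync rep: %.2fms  overhead: %.1f%%"
          % (drbd, sync, (sync - drbd) / drbd * 100.0))

Run it against both instances back to back and the gap between the two medians is the synchronous-commit overhead.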

RAID-1 was the model for DRBD, but I brought it up only because it's pretty much the definition of a synchronous commit that degrades gracefully. I'd even suggest graceful degradation matters more in a network context than it does for RAID-1, because you're far more likely to get sync interruptions from network issues than from a failed disk.
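To be concrete about what "degrades gracefully" means here, this is the behaviour I'm after, boiled down to a toy Python sketch. It isn't how Postgres or DRBD implements anything; flush_local() and wait_for_standby_ack() are stand-ins for whatever actually persists the commit and hears back from the peer:

import threading

class GracefulSyncCommit:
    """Toy model of degrade-on-timeout commit semantics, not real internals."""

    def __init__(self, ack_timeout=2.0, on_degrade=lambda: None):
        self.ack_timeout = ack_timeout   # how long a commit may wait on the peer
        self.on_degrade = on_degrade     # page/alert hook
        self.degraded = False

    def commit(self, flush_local, wait_for_standby_ack):
        flush_local()                    # local durability always comes first
        if self.degraded:
            return                       # already running standalone, like a degraded mirror
        acked = threading.Event()
        threading.Thread(target=wait_for_standby_ack, args=(acked,), daemon=True).start()
        if not acked.wait(self.ack_timeout):
            self.degraded = True         # stop holding up future commits
            self.on_degrade()            # ...and make lots of noise about it

The important part is the last three lines: the commit still succeeds, the pair is simply flagged as out of sync until the peer comes back and catches up.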

> But, putting that aside, why not write a piece of middleware that
> does precisely this, or whatever you want? It can live on the same
> machine as Postgres and ack synchronous commit when nobody is home,
> and notify (e.g. page) you in the most precise way you want if nobody
> is home "for a while".

You're right that there are lots of ways to sort of get this ability; they're just not mature or capable enough to really matter. Tailing the log to watch for secondary disconnect is too slow. Monit or Nagios style checks are too slow and unreliable. A custom-built middle layer (a master-slave plugin for Pacemaker, for example) is too slow. All of these rely on some kind of check interval: set it too high and, at our rate of roughly 10,000 transactions per second, we eat 10,000 x n missed transactions for every n seconds of interval; set it too low and we increase the likelihood of false positives and unnecessary detachments.
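To illustrate why, here's roughly what every one of those bolt-on approaches boils down to. The DSN, config path, and alerting are placeholders, and it assumes the standby's disappearance is visible in pg_stat_replication; the whole thing hinges on that check interval:

import re
import time
import psycopg2

DSN = "host=primary dbname=postgres"        # placeholder
CONF = "/etc/postgresql/postgresql.conf"    # placeholder
CHECK_INTERVAL = 1.0                        # every second of interval is ~10,000 commits
                                            # left hanging at our transaction rate

def standby_streaming(conn):
    cur = conn.cursor()
    cur.execute("SELECT count(*) FROM pg_stat_replication WHERE state = 'streaming'")
    return cur.fetchone()[0] > 0

def demote_to_async(conn):
    """Blank out synchronous_standby_names and reload so commits stop waiting."""
    with open(CONF) as f:
        conf = f.read()
    conf = re.sub(r"^\s*synchronous_standby_names\s*=.*$",
                  "synchronous_standby_names = ''", conf, flags=re.M)
    with open(CONF, "w") as f:
        f.write(conf)
    cur = conn.cursor()
    cur.execute("SELECT pg_reload_conf()")
    # alert somebody here -- by this point we're already a full check
    # interval behind whatever the standby's disappearance did to commits

def main():
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    while standby_streaming(conn):
        time.sleep(CHECK_INTERVAL)
    demote_to_async(conn)

if __name__ == "__main__":
    main()

Every second that loop sleeps is another pile of commits stuck waiting on a standby that's already gone, which is exactly the window we can't afford.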

If it's possible through a PG 9.x extension, that'd probably be the way to *safely* handle it as a bolt-on solution. If the original author of the patch can convert it to such a beast, we'd install it approximately five seconds after it finished compiling.

So far as transaction durability is concerned... we have a continuous background rsync over dark fiber for archived transaction logs, DRBD for block-level sync, filesystem snapshots for our backups, a redundant async DR cluster, an offsite backup location, and a tape archival service stretching back for seven years. And none of that will cause the master to stop processing transactions unless the master itself dies and triggers a failover.

Using PG sync in its current incarnation would introduce an extra failure scenario that wasn't there before. I'm pretty sure we're not the only ones avoiding it for exactly that reason. Our queue discards messages it can't fulfil within ten seconds and then throws an error for each one; at our volume, a single ten-second stall is on the order of 100,000 errored messages. We need to decouple the secondary as quickly as possible if it becomes unresponsive, and there's really no way to do that without support in the database itself.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com




