Re: [GENERAL] Bigtime scaling of Postgresql (cluster and stuff I suppose)

Markus Schiltknecht Tue, 28 Aug 2007 05:53:36 -0700

Hi,

Bill Moran wrote:

First off, "clustering" is a word that is too vague to be useful, so
I'll stop using it.  There's multi-master replication, where every
database is read-write, then there's master-slave replication, where
only one server is read-write and the rest are read-only.  You can
add failover capabilities to master-slave replication.  Then there's
synchronous replication, where all servers are guaranteed to get
updates at the same time.  And asynchronous replication, where other
servers may take a while to get updates.  These descriptions aren't
really specific to PostgreSQL -- every database replication system
has to make design decisions about which approaches to support.


Good explanation!

Synchronous replication is only
really used when two servers are right next to each other with a
high-speed link (probably gigabit) between them.

Why is that so? There's certainly very valuable data which would gain from an inter-continental database system. For money transfers, for example, I'd rather wait half a second for a round trip around the world, to make sure the RDBMS does not 'lose' my money.

PostgreSQL-R is in development, and targeted to allow multi-master,
asynchronous replication without rewriting your application.  As
far as I know, it works, but it's still beta.

Sorry, this is nitpicking, but for some reason (see current naming discussion on -advocacy :-) ), it's "Postgres-R".

Additionally, Postgres-R is considered to be a *synchronous* replication system, because once you get your commit confirmation, your transaction is guaranteed to be deliverable and *committable* on all running nodes (i.e. it's durable and consistent). Or, to put it another way: asynchronous systems have to deal with conflicting, but already committed transactions - Postgres-R does not.

Certainly, this is slightly less restrictive than saying that a transaction needs to be *committed* on all nodes before confirming the commit to the client. But as long as a database session is tied to a node, this optimization does not alter any transactional semantics. And despite that limitation, which mostly holds in reality anyway, I still consider this to be synchronous replication.
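
To make the distinction concrete, here is a toy Python sketch. It is not Postgres-R code: the Node class and both commit functions are made-up names, and it leaves out the group communication and conflict detection that turn "delivered" into an actual guarantee. It only shows *when* the client gets its confirmation in each model.

```python
class Node:
    """Toy replica: committed data plus a queue of delivered, not yet applied writes."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # committed key -> value
        self.pending = []   # write sets delivered here but not applied yet

    def deliver(self, writes):
        # Assumption: delivery implies the write set can no longer fail here.
        self.pending.append(dict(writes))

    def apply_pending(self):
        for writes in self.pending:
            self.data.update(writes)
        self.pending.clear()


def commit_deliver_first(origin, nodes, writes):
    """Confirm only after every other node holds the write set (queued, not applied)."""
    for node in nodes:
        if node is not origin:
            node.deliver(writes)
    origin.data.update(writes)
    return "commit confirmed"


def commit_async(origin, writes):
    """Confirm right after the local commit; other nodes hear about it later."""
    origin.data.update(writes)
    return "commit confirmed"


# Deliver-first: b has not applied the write yet, but it already holds it.
a, b = Node("a"), Node("b")
commit_deliver_first(a, [a, b], {"balance": 100})
assert b.pending == [{"balance": 100}] and "balance" not in b.data
b.apply_pending()
assert b.data == a.data

# Async: both commits are confirmed, and the nodes now disagree.
c, d = Node("c"), Node("d")
commit_async(c, {"balance": 1})
commit_async(d, {"balance": 2})
assert c.data != d.data   # conflicting, but already committed
```

In the first case the confirmation is withheld until no node can lose the transaction; in the second, the conflict only shows up after both clients have been told "committed".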

[ To get a strictly synchronous system with Postgres-R, you'd have to delay read-only transactions on a node which hasn't applied all remote transactions yet. In most cases, that's unwanted. Instead, a consistent snapshot is enough, just as if the transaction started *before* the remote ones which still need to be applied. ]
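
The two read behaviours from that bracketed note can be sketched like this. Again a toy model with hypothetical names (LaggingNode, read_snapshot, read_strict), not anything from the Postgres-R source; the "delay" of the strict variant is modelled simply as applying the backlog before reading.

```python
class LaggingNode:
    """Toy node that has received remote write sets it has not applied yet."""

    def __init__(self):
        self.data = {}      # state the node has applied so far
        self.pending = []   # remote write sets still waiting to be applied

    def read_snapshot(self, key):
        """Read applied state only, as if we started before the pending writes."""
        return self.data.get(key)

    def read_strict(self, key):
        """Catch up on all pending remote writes first, then read."""
        for writes in self.pending:
            self.data.update(writes)
        self.pending.clear()
        return self.data.get(key)


node = LaggingNode()
node.data = {"balance": 100}
node.pending.append({"balance": 80})   # committed elsewhere, not applied here

assert node.read_snapshot("balance") == 100   # consistent, slightly old view
assert node.read_strict("balance") == 80      # waits for the backlog, sees latest
```

The snapshot read never blocks and is still consistent; it just behaves like a transaction that began a moment earlier.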

BTW: does anyone know of a link that describes these high-level concepts?
If not, I think I'll write this up formally and post it.

Hm.. sometime before 8.3 was released, we had lots of discussions on -docs about the "high availability and replication" section of the PostgreSQL documentation. I'd have liked to add these fundamental concepts, but Bruce - rightly - wanted to keep focused on existing solutions. And unfortunately, most existing solutions are async, single-master. So explaining all these wonderful theoretical concepts only to state that there are no real solutions would have been silly.


Regards

Markus

