Re: [HACKERS] Synchronous replication - patch status inquiry

Heikki Linnakangas Wed, 01 Sep 2010 03:24:27 -0700

On 01/09/10 10:53, Fujii Masao wrote:

Before discussing about that, we should determine whether registering
standbys in master is really required. It affects configuration a lot.
Heikki thinks that it's required, but I'm still unclear about why and
how.


Why do standbys need to be registered in master? What information
should be registered?

That requirement falls out from the handling of disconnected standbys. If a standby is not connected, what does the master do with commits? If the answer is anything else than acknowledge them to the client immediately, as if the standby never existed, the master needs to know what standby servers exist. Otherwise it can't know if all the standbys are connected or not.
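
To make that concrete, here is a minimal sketch in Python (purely for illustration; nothing like this exists in the patch). The function names and the idea of a "registered" list are hypothetical. The point is only that the master needs something to compare the set of connected standbys against:

```python
def missing_standbys(registered, connected):
    """Return the registered standbys that are not currently connected."""
    return sorted(set(registered) - set(connected))


def can_ack_commit(registered, connected):
    """Hypothetical policy: acknowledge only if every registered standby is connected.

    Without `registered`, the master sees only `connected` and has nothing
    to compare it against, so it cannot notice that a standby is missing.
    """
    return not missing_standbys(registered, connected)
```

With an empty registry, can_ack_commit() is always true, which is the "as if the standby never existed" behavior.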

What does synchronous replication mean, when is a transaction
acknowledged as committed?


I proposed four synchronization levels:

1. async
   doesn't make transaction commit wait for replication, i.e.,
   asynchronous replication. This mode has been already supported in
   9.0.

2. recv
   makes transaction commit wait until the standby has received WAL
   records.

3. fsync
   makes transaction commit wait until the standby has received and
   flushed WAL records to disk

4. replay
   makes transaction commit wait until the standby has replayed WAL
   records after receiving and flushing them to disk

OTOH, Simon proposed the quorum commit feature. I think that both
are required for our various use cases. Thoughts?

I'd like to keep this as simple as possible, yet flexible so that with enough scripting and extensions, you can get all sorts of behavior. I think quorum commit falls into the "extension" category; if your setup is complex enough, it's going to be impossible to represent that in our config files no matter what. But if you write a little proxy, you can implement arbitrary rules there.

I think recv/fsync/replay should be specified in the standby. It has no direct effect on the master, the master would just relay the setting to the standby when it connects, or the standby would send multiple XLogRecPtrs and let the master decide when the WAL is persistent enough. And what if you write a proxy that has some other meaning of "persistent enough"? Like when it has been written to the OS buffers but not yet fsync'd, or when it has been fsync'd to at least one standby and received by at least three others. recv/fsync/replay is not going to represent that behavior well.

"sync vs async" on the other hand should be specified in the master, because it has a direct impact on the behavior of commits in the master.
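
Going back to the variant where the standby sends multiple XLogRecPtrs: a rough sketch of how the deciding side could use them, again in Python for illustration only. StandbyProgress, persistent_enough() and proxy_rule() are made-up names, WAL positions are simplified to plain integers, and I'm assuming a standby never reports a flushed position ahead of its received position:

```python
from dataclasses import dataclass


@dataclass
class StandbyProgress:
    # WAL positions as plain integers; real XLogRecPtrs are more involved.
    received: int
    flushed: int
    replayed: int


def persistent_enough(progress, commit_lsn, level):
    """Check one standby's progress against a recv/fsync/replay level."""
    position = {
        "recv": progress.received,
        "fsync": progress.flushed,
        "replay": progress.replayed,
    }[level]
    return position >= commit_lsn


def proxy_rule(progresses, commit_lsn):
    """Example custom rule: fsync'd on at least one standby and
    received by at least three others."""
    flushed = sum(1 for p in progresses if p.flushed >= commit_lsn)
    received = sum(1 for p in progresses if p.received >= commit_lsn)
    # A standby that has flushed has also received, so set one flushed
    # standby aside and require three more that have received.
    return flushed >= 1 and received - 1 >= 3
```

The second function is the kind of rule that doesn't fit a three-valued setting, which is why it would live in a proxy rather than in the config.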


I propose a configuration file standbys.conf, in the master:

# STANDBY NAME    SYNCHRONOUS   TIMEOUT
importantreplica  yes           100ms
tempcopy          no            10s

Or perhaps this should be stored in a system catalog.
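
For what it's worth, the file format above is trivial to read. A sketch of a parser in Python, for illustration only: the StandbyEntry type, the function names, and the choice to accept only "ms" and "s" as timeout units are my assumptions, not part of the proposal.

```python
import re
from dataclasses import dataclass


@dataclass
class StandbyEntry:
    name: str
    synchronous: bool
    timeout_ms: int


# Assumed units; the proposal only shows "ms" and "s".
_UNITS_MS = {"ms": 1, "s": 1000}


def parse_timeout(text):
    """Convert a timeout such as '100ms' or '10s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s)", text)
    if match is None:
        raise ValueError(f"bad timeout: {text!r}")
    return int(match.group(1)) * _UNITS_MS[match.group(2)]


def parse_standbys_conf(text):
    """Parse lines of 'NAME SYNCHRONOUS TIMEOUT', skipping '#' comments."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, sync, timeout = line.split()
        if sync not in ("yes", "no"):
            raise ValueError(f"bad SYNCHRONOUS value: {sync!r}")
        entries.append(StandbyEntry(name, sync == "yes", parse_timeout(timeout)))
    return entries
```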

What to do if a standby server dies and never
acknowledges a commit?


The master's reaction to that situation should be configurable. So
I'd propose new configuration parameter specifying the reaction.
Valid values are:

- standalone
   When the master has waited for the ACK much longer than the timeout
   (or detected the failure of the standby), it closes the connection
   to the standby and restarts transactions.

- down
   When that situation occurs, the master shuts down immediately.
   Though this is unsafe for the system requiring high availability,
   as far as I recall, some people wanted this mode in the previous
   discussion.


Yeah, though of course you might want to set that per-standby too.
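
As a sketch of the two reactions described above (Python, illustration only): the function name and the returned action strings are hypothetical, and "much longer than the timeout" is simplified to "longer than the timeout".

```python
def on_ack_wait(reaction, waited_ms, timeout_ms):
    """Decide what the master does while waiting for a standby's ACK."""
    if waited_ms <= timeout_ms:
        return "keep_waiting"
    if reaction == "standalone":
        # Close the connection to the standby and carry on without it.
        return "drop_standby_and_continue"
    if reaction == "down":
        # Shut the master down rather than commit without the standby.
        return "shutdown_master"
    raise ValueError(f"unknown reaction: {reaction!r}")
```

Making it per-standby would just mean passing each standby's own reaction and timeout instead of a global setting.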

Let's step back a bit and ask what would be the simplest thing that you could call "synchronous replication" in good conscience, and also be useful at least to some people. Let's leave out the "down" mode, because that requires registration. We'll probably have to do registration at some point, but let's take as small steps as possible.

Without the "down" mode in the master, frankly I don't see the point of the "recv" and "fsync" levels in the standby. Either way, when the master acknowledges a commit to the client, you don't know if it has made it to the standby yet because the replication connection might be down for some reason.

That leaves us the 'replay' mode, which *is* useful, because it gives you the guarantee that when the master acknowledges a commit, it will appear committed in all hot standby servers that are currently connected. With that guarantee you can build a reliable cluster with something like pgpool-II where all writes go to one node, and reads are distributed to multiple nodes.
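
The rule for 'replay' mode without registration boils down to something like this (a Python sketch for illustration; the function name and the dict of replayed positions are hypothetical, and WAL positions are simplified to integers):

```python
def can_ack_in_replay_mode(commit_lsn, replayed_by_standby):
    """Acknowledge once every *currently connected* standby has replayed
    the commit record.

    replayed_by_standby maps standby name -> replayed WAL position and
    contains connected standbys only. With no standbys connected the
    commit is acknowledged immediately, since there is no registration
    to say that anyone is missing.
    """
    return all(pos >= commit_lsn for pos in replayed_by_standby.values())
```

A standby that disconnects simply drops out of the dict, so it stops holding up commits; it also stops being covered by the guarantee.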

I'm not sure what we should aim for in the first phase. But if you want as little code as possible yet have something useful, I think 'replay' mode with no standby registration is the way to go.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
