On 01/08/2014 01:34 PM, Kevin Grittner wrote:

> I'm torn on whether we should cave to popular demand on this; but
> if we do, we sure need to be very clear in the documentation about
> what a successful return from a commit request means.  Sooner or
> later, Murphy's Law being what it is, if we do this someone will
> lose the primary and blame us because the synchronous replica is
> missing gobs of transactions that were successfully committed.

I am trying to follow this thread, and perhaps I am just being dense, but it seems to me that:

If you are running synchronous replication, then as long as the target (subscriber) is up, synchronous replication operates as it should. That is, the origin waits for a notification from the subscriber that the write has succeeded before continuing.

However, if the subscriber is down, the origin should NEVER wait. That is just silly behavior and makes synchronous replication pretty much useless. Machines go down; that is the nature of things. Yes, we should log, and log loudly, if the subscriber is down:

ERROR: target xyz is non-communicative: switching to async replication.

We should then retain WAL on the origin, up to wal_keep_segments.

When the subscriber comes back up, it will then replicate in async mode until the two are back in sync and then switch (perhaps by hand) to sync mode. This of course assumes that we have a valid database on the subscriber and we have not overrun wal_keep_segments.
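
To make the behavior I am describing concrete, here is a rough sketch of it as a toy state machine. This models the proposal only; it is not how PostgreSQL behaves today. Every name in it (Origin, Mode, keep_segments, subscriber_valid and so on) is made up for illustration, and it assumes one WAL segment per commit to keep things short.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SYNC = "sync"
    ASYNC = "async"


@dataclass
class Origin:
    """Toy model of the proposed fallback (hypothetical, not PostgreSQL code)."""
    keep_segments: int                 # stands in for wal_keep_segments
    mode: Mode = Mode.SYNC
    subscriber_up: bool = True
    subscriber_valid: bool = True      # False once retained WAL has been overrun
    next_segment: int = 0
    retained: list = field(default_factory=list)  # segments the subscriber still needs
    log: list = field(default_factory=list)

    def commit(self) -> str:
        """Commit one transaction; assume one WAL segment per commit."""
        seg = self.next_segment
        self.next_segment += 1
        if self.mode is Mode.SYNC and self.subscriber_up:
            # Normal case: the origin waits for the subscriber's acknowledgement.
            return "committed on origin and subscriber"
        if self.mode is Mode.SYNC:
            # Subscriber is down: log loudly and stop waiting.
            self.log.append(
                "ERROR: target xyz is non-communicative: "
                "switching to async replication."
            )
            self.mode = Mode.ASYNC
        # Async: keep the segment around for the subscriber to fetch later.
        self.retained.append(seg)
        if len(self.retained) > self.keep_segments:
            # Overran the retention limit: the subscriber can no longer catch up.
            self.retained.pop(0)
            self.subscriber_valid = False
        return "committed on origin only"

    def subscriber_returns(self) -> None:
        self.subscriber_up = True

    def catch_up(self) -> bool:
        """Replay retained WAL in async mode; True if the two are back in sync."""
        if not self.subscriber_up or not self.subscriber_valid:
            return False
        self.retained.clear()
        return True

    def switch_to_sync(self) -> None:
        """Switch back to sync (perhaps by hand), only once caught up."""
        if not self.subscriber_up or not self.subscriber_valid or self.retained:
            raise RuntimeError("subscriber is not caught up")
        self.mode = Mode.SYNC
```

The point of the sketch is that the "committed on origin only" result is exactly the case the documentation would have to spell out: a commit that returned successfully while the subscriber did not have the transaction. If keep_segments is overrun while the subscriber is away, catch_up() returns False and the subscriber would need to be rebuilt before switch_to_sync() could succeed.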

Sincerely,

Joshua D. Drake



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
