Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
> * Support multiple standbys with various synchronization levels.
>
> * What happens if a synchronous standby isn't connected at the moment?
> Return immediately vs. wait forever.
>
> * Per-transaction control. Some transactions are important, others are not.
>
> * Quorum commit. Wait until n standbys acknowledge. n=1 and n=all servers
> can be seen as important special cases of this.
>
> * async, recv, fsync and replay levels of synchronization.
>
> So what should the user interface be like? Given the 1st and 2nd
> requirement, we need standby registration. If some standbys are important
> and others are not, the master needs to distinguish between them to be able
> to determine that a transaction is safely delivered to the important
> standbys.

Well, the 1st point can be handled in a distributed fashion, where the
sync level is set up at the slave. Ditto for the second point: you can get
the exact same behavior control attached to the quorum facility.

What I think your description is missing is the implicit feature that you
want to be able to set up the "ignore-or-wait" failure behavior per
standby. I'm not sure we need that, or more precisely that we need to have
that level of detail in the master's setup. Maybe what we need instead is
a more detailed quorum facility, but as you're talking about something
similar later in the mail, let's follow you.

> For per-transaction control, ISTM it would be enough to have a simple
> user-settable GUC like synchronous_commit. Let's call it
> "synchronous_replication_commit" for now. For non-critical transactions, you
> can turn it off. That's very simple for developers to understand and use. I
> don't think we need more fine-grained control than that at transaction
> level, in all the use cases I can think of you have a stream of important
> transactions, mixed with non-important ones like log messages that you want
> to finish fast in a best-effort fashion. I'm actually tempted to tie that to
> the existing synchronous_commit GUC, the use case seems exactly the
> same.

Well, that would be an oversimplification. In my applications I set the
"sessions" transactions to synchronous_commit = off, but the business
transactions to synchronous_commit = on. Now, among the latter, I have
backoffice editing and money transactions. I'm not willing to be forced to
endure the same performance penalty for both when I know their distributed
durability needs aren't the same.

> OTOH, if we do want fine-grained per-transaction control, a simple boolean
> or even an enum GUC doesn't really cut it.
> For truly fine-grained control
> you want to be able to specify exceptions like "wait until this is replayed
> in slave named 'reporting'" or 'don't wait for acknowledgment from slave
> named 'uk-server'". With standby registration, we can invent a syntax for
> specifying overriding rules in the transaction. Something like SET
> replication_exceptions = 'reporting=replay, uk-server=async'.

Then you want to be able to have more than one reporting server and need
only one of them at the "replay" level, but you don't need to know which
one it is. Or, on the contrary, you have a failover server and you want to
be sure this one is at the replay level whatever happens.

Then you want topology flexibility: you need to be able to replace a
reporting server with another, ditto for the failover one.

Did I tell you my current thinking on how to tackle that yet? :) Using a
distributed setup, where each slave has a weight (several votes per
transaction) and a level offering, would allow that, I think.

Now, something similar to your idea that I can see a need for is being
able to have a multi-part quorum target: where you currently say that you
want 2 votes for sync, you would be able to say you want 2 votes for recv,
2 for fsync and 1 for replay. Remember that any slave is set up to offer
only one level of synchronicity but can offer multiple votes.

What would this look like in the setup? Best would be to register the
different service levels your application needs. Time to bikeshed a little?

  sync_rep_services = {critical: recv=2, fsync=2, replay=1;
                       important: fsync=3;
                       reporting: recv=2, apply=1}

Well, you get the idea; it could maybe get stored in a catalog somewhere
with nice SQL commands etc. The goal is then to be able to handle a much
simpler GUC in the application, sync_rep_service = important for example.
The reserved label would be off, the default value.
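
To make the multi-part quorum idea concrete, here is a rough sketch in Python of how the master could decide that a named service level has been reached. It is only an illustration, not proposed code: the names Standby, SERVICES and quorum_reached are made up for the example, the dictionary is a hand-parsed form of the bikeshed syntax above, and it assumes the strict reading that a slave's votes count only toward the single level it offers.

```python
from dataclasses import dataclass

# Synchronization levels a standby can offer (async standbys never vote).
LEVELS = ("recv", "fsync", "replay")


@dataclass(frozen=True)
class Standby:
    """Hypothetical description of one standby as seen by the master."""
    name: str
    level: str   # the single level this standby offers
    weight: int  # number of votes it casts per transaction


# Hypothetical parsed form of the sync_rep_services setting:
# service label -> votes required at each level.
SERVICES = {
    "critical": {"recv": 2, "fsync": 2, "replay": 1},
    "important": {"fsync": 3},
}


def votes_by_level(acked):
    """Sum the votes of the standbys that acknowledged, per offered level."""
    totals = dict.fromkeys(LEVELS, 0)
    for standby in acked:
        totals[standby.level] += standby.weight
    return totals


def quorum_reached(service, acked):
    """Return True once every per-level vote target of the service is met.

    Assumption: votes count only toward the level the standby offers.
    The reserved label "off" never waits for any standby.
    """
    if service == "off":
        return True
    totals = votes_by_level(acked)
    targets = SERVICES[service]
    return all(totals[level] >= need for level, need in targets.items())


if __name__ == "__main__":
    acked = [
        Standby("failover", "replay", 1),
        Standby("local-a", "fsync", 2),
        Standby("remote-b", "recv", 1),
        Standby("remote-c", "recv", 1),
    ]
    print(quorum_reached("critical", acked))
    print(quorum_reached("important", acked))
```

With this reading, replacing a reporting or failover server only means that another standby offering the same level and weight starts acknowledging; the service definitions on the master do not change. If a higher level were meant to satisfy lower ones as well (replay implying fsync and recv), votes_by_level would have to credit each vote to the lower levels too.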

> For the control between async/recv/fsync/replay, I like to think in terms of
> a) asynchronous vs synchronous
> b) if it's synchronous, how synchronous is it? recv, fsync or replay?

Same here.

> I think it makes most sense to set sync vs. async in the master, and the
> level of synchronicity in the slave.

Yeah, exactly. If you add a weight to each slave and then a quorum commit,
you don't change the implementation complexity and you offer a lot of setup
flexibility. If the slave sync level and weight are SIGHUP, then it even
becomes rather easy to switch roles online, to add new servers, or to
organise a maintenance window. The quorum to reach is a per-transaction
GUC on the master, too, right?

Regards,
--
dim

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers