Re: [HACKERS] SynchRep; wait-forever and shutdown
On Thu, Dec 9, 2010 at 11:54 PM, Fujii Masao wrote:
> In previous discussion, some people wanted the "wait-forever" option which
> blocks all the transactions on the master until sync'd standby has appeared,
> in order to reduce the risk of data loss in synchronous replication.
>
> What I'm not clear on is: how does smart or fast shutdown advance while all
> the transactions are being blocked?
>
> 1. Shutdown should wait for all the transactions to end by appearance of
>    sync'd standby?
>    * Problem is that shutdown would take very long.
>
> 2. Shutdown should commit all the blocking transactions?
>    * Problem is that a client thinks that those transactions have
>      successfully been committed even though they have not been replicated
>      to the standby.
>
> 3. Shutdown should abort all the blocking transactions?
>    * Problem is that a client thinks that those transactions have been
>      aborted even though those WAL records have been written on the master.
>      But this is a very common problem for DBMS, so we don't need to worry
>      about this in the context of replication.
>
> ISTM smart and fast shutdown fits in with #1 and #3, respectively. Thoughts?

I might be missing something, but I don't see why this case requires any
special handling. As far as I can see, #2 and #3 are nonsense: the client
isn't waiting on the commit per se, but rather the acknowledgment of the
commit.

In a smart shutdown, we wait for all clients to disconnect. If they never
disconnect, we never shut down. It's a lame behavior and we might want to
change it some day - at least by adding a timeout - but I don't see any
reason to change it because of synchronous replication per se.

In a fast shutdown, we boot all clients off immediately. If they were
waiting for an acknowledgment, they don't get it. The application has to
handle this case, just as it does today if it sends a COMMIT command and the
connection is disconnected before it receives a response.
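To make the fast-shutdown case concrete, here is a minimal client-side sketch of "the application has to handle this case". It is an illustration only: `FakeConnection`, `CommitOutcome`, and `commit_and_classify` are hypothetical names invented for this example and do not correspond to any real driver API. The point is that a dropped connection during COMMIT leaves the outcome unknown to the client, rather than committed or aborted.

```python
from enum import Enum


class CommitOutcome(Enum):
    COMMITTED = "committed"  # the server acknowledged the COMMIT
    UNKNOWN = "unknown"      # connection lost before any acknowledgment


class FakeConnection:
    """Hypothetical stand-in for a database connection (not a real driver)."""

    def __init__(self, drop_before_ack):
        self.drop_before_ack = drop_before_ack

    def commit(self):
        # Simulates a fast shutdown booting the client while it is still
        # waiting for the acknowledgment of its COMMIT.
        if self.drop_before_ack:
            raise ConnectionError("server closed the connection unexpectedly")


def commit_and_classify(conn):
    """Return what the client can actually know after sending COMMIT."""
    try:
        conn.commit()
    except ConnectionError:
        # No acknowledgment arrived. The commit record may or may not be in
        # the master's WAL, so the client must not assume either result.
        return CommitOutcome.UNKNOWN
    return CommitOutcome.COMMITTED
```

An application that gets `UNKNOWN` would typically reconnect and check whether its changes are visible before deciding to retry.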
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SynchRep; wait-forever and shutdown
> 3. Shutdown should abort all the blocking transactions?
>    * Problem is that a client thinks that those transactions have been
>      aborted even though those WAL records have been written on the master.
>      But this is a very common problem for DBMS, so we don't need to worry
>      about this in the context of replication.

Hmmm. The WAL records are written as committed ... this is why people get
into 2PC if they want full synchronous. Short of using 2PC, there is simply
no way we can guarantee that the master and the standby won't get out of
sync. And even 2PC isn't perfect.

I think the best we can do is have the master abort the sessions and shut
down for a -fast. Yes, the clients are confused about what's been committed,
but frequently that's the case with a -fast anyway.

However, we need to give the user more information. I'd say that we need to
have a specific error message associated with a synchronization failure
around shutdown time. This error should be both returned to the clients and
logged. That way the DBA can decide what to do about the error, if anything.

So, I'd say this is the way to go:

Shutdown Smart:
  Wait for all pending standby transactions to clear.
  After 60 seconds, emit an error message on the shutdown console:
    NOTICE: pending replication transactions still waiting
  ... that way the DBA knows to move on to -fast

Shutdown Fast:
  Wait for 1 second for all pending standby transactions to clear.
  If they don't clear, emit an error to both the shutdown console and the
  client consoles:
    WARNING: some transactions not replicated
  Send a commit message on the client consoles
  Shutdown.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
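The proposed smart/fast policy can be sketched as a small simulation. This is only a model of the proposal as worded above, not PostgreSQL code: `simulate_shutdown` and its `seconds_until_clear` parameter are hypothetical, and the sketch assumes the notice or warning fires only when pending transactions are still uncleared once the stated timeout has passed.

```python
SMART_NOTICE_AFTER_S = 60  # from the proposal: notice after 60 seconds
FAST_GRACE_S = 1           # from the proposal: wait 1 second before warning

NOTICE = "NOTICE: pending replication transactions still waiting"
WARNING = "WARNING: some transactions not replicated"


def simulate_shutdown(mode, seconds_until_clear):
    """Model the proposed behavior; return (messages, shutdown_completed).

    seconds_until_clear is how long the pending standby transactions take
    to clear, or None if no standby ever appears (hypothetical parameter).
    """
    never_clears = seconds_until_clear is None
    messages = []
    if mode == "smart":
        if never_clears or seconds_until_clear > SMART_NOTICE_AFTER_S:
            messages.append(NOTICE)
        # Smart shutdown keeps waiting; it finishes only if they clear.
        return messages, not never_clears
    if mode == "fast":
        if never_clears or seconds_until_clear > FAST_GRACE_S:
            messages.append(WARNING)
        # Fast shutdown proceeds whether or not replication caught up.
        return messages, True
    raise ValueError(f"unknown shutdown mode: {mode!r}")
```

For example, `simulate_shutdown("smart", None)` models the wait-forever case: the DBA sees the notice and the shutdown does not complete, which is the cue to move on to -fast.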
[HACKERS] SynchRep; wait-forever and shutdown
Hi,

In previous discussion, some people wanted the "wait-forever" option which
blocks all the transactions on the master until sync'd standby has appeared,
in order to reduce the risk of data loss in synchronous replication.

What I'm not clear on is: how does smart or fast shutdown advance while all
the transactions are being blocked?

1. Shutdown should wait for all the transactions to end by appearance of
   sync'd standby?
   * Problem is that shutdown would take very long.

2. Shutdown should commit all the blocking transactions?
   * Problem is that a client thinks that those transactions have
     successfully been committed even though they have not been replicated
     to the standby.

3. Shutdown should abort all the blocking transactions?
   * Problem is that a client thinks that those transactions have been
     aborted even though those WAL records have been written on the master.
     But this is a very common problem for DBMS, so we don't need to worry
     about this in the context of replication.

ISTM smart and fast shutdown fits in with #1 and #3, respectively. Thoughts?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
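The three options can be compared side by side with a small model of what the client believes versus where the commit record actually is. This is a sketch under the assumptions stated in the message above (the commit record is already in the master's WAL and no standby has received it); `Outcome`, `OPTIONS`, and `mismatches` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    client_believes: str      # what the client is told at shutdown
    in_master_wal: bool       # commit record written on the master
    on_standby: bool          # commit record replicated to the standby
    shutdown_completes: bool  # can shutdown finish while no standby exists?


# One entry per numbered option in the message above.
OPTIONS = {
    1: Outcome("still waiting", True, False, False),
    2: Outcome("committed", True, False, True),
    3: Outcome("aborted", True, False, True),
}


def mismatches(outcome):
    """List the problems each option leaves behind."""
    problems = []
    if outcome.client_believes == "committed" and not outcome.on_standby:
        problems.append("client sees commit, standby lacks it")
    if outcome.client_believes == "aborted" and outcome.in_master_wal:
        problems.append("client sees abort, master WAL has the commit")
    if not outcome.shutdown_completes:
        problems.append("shutdown blocks until a standby appears")
    return problems
```

Each option yields exactly one problem in this model, which matches the one-problem-per-option listing in the message.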