On Mon, Jan 5, 2015 at 11:55 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Jan 2, 2015 at 3:45 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > In short, you can't force 2PC technology on people who aren't using it > > already; while for those who are using it already, this isn't nearly > > good enough as-is. > > I was involved in some internal discussions related to this patch, so > I have some opinions on it. The long-term, high-level goal here is to > facilitate sharding. If we've got a bunch of PostgreSQL servers > interlinked via postgres_fdw, it should be possible to perform > transactions on the cluster in such a way that transactions are just > as atomic, consistent, isolated, and durable as they would be with > just one server. As far as I know, there is no way to achieve this > goal through the use of an external transaction manager, because even > if that external transaction manager guarantees, for every > transaction, that the transaction either commits on all nodes or rolls > back on all nodes, there's no way for it to guarantee that other > transactions won't see some intermediate state where the commit has > been completed on some nodes but not others. To get that, you need > some of integration that reaches down to the way snapshots are taken. > > I think, though, that it might be worthwhile to first solve the > simpler problem of figuring out how to ensure that a transaction > commits everywhere or rolls back everywhere, even if intermediate > states might still be transiently visible. Agreed. > I don't think this patch, > as currently designed, is equal to that challenge, because > XACT_EVENT_PRE_COMMIT fires before the transaction is certain to > commit - PreCommit_CheckForSerializationFailure or PreCommit_Notify > could still error out. We could have a hook that fires after that, > but that doesn't solve the problem if a user of that hook can itself > throw an error. Even if part of the API contract is that it's not > allowed to do so, the actual attempt to commit the change on the > remote side can fail due to - e.g. - a network interruption, and > that's go to be dealt with somehow. > > Tom mentioned -- in particular it treats the local transaction asymmetrically from the remote ones, which doesn't seem like a great idea --- ie, the local transaction could still abort after committing all the remote ones, leaving you no better off in terms of cross-server consistency. -- You have given a specific example of this case. So, let me dry run through CommitTransaction() after applying my patch. 1899 CallXactCallbacks(XACT_EVENT_PRE_COMMIT); While processing this event in postgres_fdw's callback pgfdw_xact_callback() sends a PREPARE TRANSACTION to all the foreign servers involved. These servers return with their success or failures. Even if one of them fails, the local transaction is aborted along-with all the prepared transactions. Only if all the foreign servers succeed we proceed further. 1925 PreCommit_CheckForSerializationFailure(); 1926 1932 PreCommit_Notify(); 1933 If any of these function (as you mentioned above), throws errors, the local transaction will be aborted as well as the remote prepared transactions. Note, that we haven't yet committed the local transaction (which will be done below) and also not the remote transactions which are in PREPAREd state there. Since all the transactions local as well as remote are aborted in case of error, the data is still consistent. If these steps succeed, we will proceed ahead. 1934 /* Prevent cancel/die interrupt while cleaning up */ 1935 HOLD_INTERRUPTS(); 1936 1937 /* Commit updates to the relation map --- do this as late as possible */ 1938 AtEOXact_RelationMap(true); 1939 1940 /* 1941 * set the current transaction state information appropriately during 1942 * commit processing 1943 */ 1944 s->state = TRANS_COMMIT; 1945 1946 /* 1947 * Here is where we really truly commit. 1948 */ 1949 latestXid = RecordTransactionCommit(); 1950 1951 TRACE_POSTGRESQL_TRANSACTION_COMMIT(MyProc->lxid); 1952 1953 /* 1954 * Let others know about no transaction in progress by me. Note that this 1955 * must be done _before_ releasing locks we hold and _after_ 1956 * RecordTransactionCommit. 1957 */ 1958 ProcArrayEndTransaction(MyProc, latestXid); 1959 Local transaction committed. Remote transactions still in PREPAREd state. Any server (including local) crash or link failure happens here, we leave the remote transactions dangling in PREPAREd state and manual cleanup will be required. 1975 1976 CallXactCallbacks(XACT_EVENT_COMMIT); The postgresql callback pgfdw_xact_callback() commits the PREPAREd transactions by sending COMMIT TRANSACTION to remote server (my patch). So, I don't see why would my patch cause inconsistencies. It can cause dangling PREPAREd transactions and I have already acknowledged that fact. Am I missing something? > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company