On Tue, 29 Sep 2020 at 11:37, tsunakawa.ta...@fujitsu.com
<tsunakawa.ta...@fujitsu.com> wrote:
>
> From: Masahiko Sawada <masahiko.saw...@2ndquadrant.com>
> > No. Please imagine a case where a user executes PREPARE TRANSACTION on
> > the transaction that modified data on foreign servers. The backend
> > process prepares both the local transaction and foreign transactions.
> > But another client can execute COMMIT PREPARED on the prepared
> > transaction. In this case, another backend newly connects foreign
> > servers and commits prepared foreign transactions. Therefore, the new
> > connection cache entry can be created during COMMIT PREPARED which
> > could lead to an error but since the local prepared transaction is
> > already committed the backend must not fail with an error.
> >
> > In the latter case, I assumed that the backend continues to retry
> > foreign transaction resolution until the user requests cancellation.
> > Please imagine the case where the server-A connects a foreign server
> > (say, server-B) and server-B connects another foreign server (say,
> > server-C). The transaction initiated on server-A modified the data on
> > both local and server-B which further modified the data on server-C
> > and executed COMMIT.  The backend process on server-A (say, backend-A)
> > sends PREPARE TRANSACTION to server-B then the backend process  on
> > server-B (say, backend-B) connected by backend-A prepares the local
> > transaction and further sends PREPARE TRANSACTION to server-C. Let’s
> > suppose a temporary connection failure happens between server-A and
> > server-B before backend-A sends COMMIT PREPARED (i.e., the 2nd phase
> > of 2PC). When backend-A attempts to send COMMIT PREPARED to
> > server-B, it realizes that the connection to server-B was lost, but
> > since the user hasn't requested cancellation yet, backend-A retries
> > connecting to server-B and succeeds. Now that backend-A has
> > established a new connection to server-B, there is another backend
> > process on server-B (say, backend-B'). Since backend-B' doesn't
> > have a connection to server-C yet, it creates a new connection cache
> > entry, which could lead to an error. IOW, on server-B different
> > processes performed PREPARE TRANSACTION and COMMIT PREPARED, and the
> > latter process created the connection cache entry.
>
> Thank you, I understood the situation.  I don't think it's a good design to 
> fail to address practical performance during normal operation out of fear of 
> the rare error case.
>
> The transaction manager (TM) or the FDW implementor can naturally do things 
> like the following:
>
> * Use palloc_extended(MCXT_ALLOC_NO_OOM) and hash_search(HASH_ENTER_NULL) to 
> return control to the caller.
>
> * Use PG_TRY(), as its overhead is negligible relative to connection 
> establishment.
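
To illustrate the quoted proposal: both APIs can report allocation failure to the caller instead of raising an error. A rough sketch (this depends on PostgreSQL internals and is not standalone; `ConnCacheEntry`, `ConnectionHash`, `key`, and `BUFSIZE` are hypothetical names for the connection-cache structures under discussion):

```c
/* Sketch only; assumes PostgreSQL internals (postgres.h, utils/hsearch.h,
 * utils/memutils.h).  ConnCacheEntry, ConnectionHash, key and BUFSIZE are
 * hypothetical names. */
ConnCacheEntry *entry;
bool            found;
char           *buf;

/* HASH_ENTER_NULL returns NULL on out-of-memory instead of elog(ERROR). */
entry = hash_search(ConnectionHash, &key, HASH_ENTER_NULL, &found);
if (entry == NULL)
    return false;               /* report the failure to the caller */

/* MCXT_ALLOC_NO_OOM makes palloc_extended() return NULL on OOM. */
buf = palloc_extended(BUFSIZE, MCXT_ALLOC_NO_OOM);
if (buf == NULL)
    return false;
```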

I suppose you mean that the FDW implementor uses PG_TRY() to catch an
error but does not PG_RE_THROW() it. I'm concerned about whether it's
safe to return control to the caller and continue trying to resolve
foreign transactions without either re-throwing the error or aborting
the transaction.
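
To make the concern concrete, the pattern in question would look roughly like the sketch below (it uses PostgreSQL's real error-handling macros but depends on the server tree, so it is a fragment, not a standalone program; `ResolveOneForeignXact()` is a hypothetical name for the resolution step):

```c
/* Sketch only; assumes PostgreSQL internals (postgres.h, utils/elog.h). */
PG_TRY();
{
    ResolveOneForeignXact(entry);   /* may elog(ERROR), e.g. on OOM */
}
PG_CATCH();
{
    /*
     * Swallow the error instead of PG_RE_THROW().  FlushErrorState()
     * clears the error data, but the memory-context and resource-owner
     * cleanup that a transaction abort would normally perform has not
     * happened, which is exactly the safety question raised above.
     */
    FlushErrorState();
}
PG_END_TRY();
```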

IMHO, a design along the lines of "high performance, but doesn't work
correctly in a rare failure case" is rather a bad one, especially for a
transaction management feature.

>
> * If the commit fails, the TM asks the resolver to take care of committing 
> the remote transaction, and returns success to the user.
>
>
> > Regarding parallel and asynchronous execution, I basically agree on
> > supporting asynchronous execution, which the XA specification also
> > defines, although I think it's better not to include it in the first
> > version for simplicity.
> >
> > Overall, my suggestion for the first version is to support synchronous
> > execution of prepare, commit, and rollback, have one resolver process
> > per database, and have the resolver handle the 2nd phase of 2PC. As
> > the next step we can add APIs for asynchronous execution, have
> > multiple resolvers per database, and so on.
>
> We don't have to rush to commit a patch that is likely to exhibit 
> impractical performance, as we still have much time left for PG 14.  The 
> design needs more thought toward the ideal goal, and refinement.  By making 
> the effort to sort through the ideal design, we may be able to avoid rework 
> and API inconsistency.  As for the API, we haven't validated yet that the 
> FDW implementor can use XA, have we?

Yes, we still need to check whether FDW implementors other than
postgres_fdw are able to support these APIs. I agree that we need more
discussion on the design. My suggestion is to start with a small,
simple feature as the first step and not try to include everything in
the first version.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

