Is the following sequence possible ? In pgfdw_inval_callback(): entry->invalidated = true; + have_invalid_connections = true;
At which time the loop in pgfdw_xact_callback() is already running (but past the above entry). Then: + /* We are done closing all the invalidated connections so reset. */ + have_invalid_connections = false; At which time, there is still at least one invalid connection but the global flag is off. On Mon, Dec 14, 2020 at 6:39 PM Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote: > Hi, > > As discussed in [1], in postgres_fdw the cached connections to remote > servers can stay until the lifetime of the local session without > getting a chance to disconnect (connection leak), if the underlying > user mapping or foreign server is dropped in another session. Here are > few scenarios how this can happen: > > Use case 1: > 1) Run a foreign query in session 1 with server 1 and user mapping 1 > 2) Drop user mapping 1 in another session 2, an invalidation message > gets generated which will have to be processed by all the sessions > 3) Run the foreign query again in session 1, at the start of txn the > cached entry gets invalidated via pgfdw_inval_callback() (as part of > invalidation message processing). Whatever may be the type of foreign > query (select, update, explain, delete, insert, analyze etc.), upon > next call to GetUserMapping() from postgres_fdw.c, the cache lookup > fails(with ERROR: user mapping not found for "XXXX") since the user > mapping 1 has been dropped in session 2 and the query will also fail > before reaching GetConnection() where the connections associated with > the invalidated entries would have got disconnected. > > So, the connection associated with invalidated entry would remain > until the local session exits. > > Use case 2: > 1) Run a foreign query in session 1 with server 1 and user mapping 1 > 2) Try to drop foreign server 1, then we would not be allowed because > of dependency. Use CASCADE so that dependent objects i.e. user mapping > 1 and foreign tables get dropped along with foreign server 1. > 3) Run the foreign query again in session 1, at the start of txn, the > cached entry gets invalidated via pgfdw_inval_callback() and the query > fails because there is no foreign table. > > Note that the remote connection remains open in session 1 until the > local session exits. > > To solve the above connection leak problem, it looks like the right > place to close all the invalid connections is pgfdw_xact_callback(), > once registered, which gets called at the end of every txn in the > current session(by then all the sub txns also would have been > finished). Note that if there are too many invalidated entries, then > the following txn has to close all of them, but that's okay than > having leaked connections and it's a one time job for the following > one txn. > > Attaching a patch for the same. > > Thoughts? > > [1] - > https://www.postgresql.org/message-id/flat/CALj2ACUOQYs%2BssxkxRvZ6Ja5%2Bsfc6a-s_0e-Nv2A591hEyOgiw%40mail.gmail.com > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com >