On Thu, 14 Mar 2024 at 11:33, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
> Hmm, isn't this basically saying that we're giving up on reliably
> canceling queries altogether?  I mean, maybe we'd like to instead fix
> the bug about canceling queries in extended query protocol ...
> Isn't that something you're worried about?

In any case I think it's worth having (non-flaky) test coverage of our
libpq cancellation sending code. So I think it makes sense to commit
the patch I proposed, even if the backend code to handle that code is
arguably buggy.

Regarding the question if the backend code is actually buggy or not:
the way cancel requests are defined to work is a bit awkward. They
cancel whatever operation is running on the session when they arrive.
So if the session is just in the middle of a Bind and Execute message
there is nothing to cancel. While surprising and probably not what
someone would want, I don't think this behaviour is too horrible in
practice in this case. Most of the time people cancel queries while
the Execute message is being processed. The new test really only runs
into this problem because it sends a cancel request, immediately after
sending the query.

I definitely think it's worth rethinking the way we do query
cancellations though. I think what we would probably want is a way to
cancel a specific query/message on a session. Instead of cancelling
whatever is running at the moment when the cancel request is processed
by Postgres. Because this "cancel whatever is running" behaviour is
fraught with issues, this Bind/Execute issue being only one of them.
One really annoying race condition of a cancel request cancelling
another query than intended can happen with this flow (that I spend
lots of time on addressing in PgBouncer):
1. You send query A on session 1
2. You send a cancel request for session 1 (intending to cancel query A)
3. Query A completes by itself
4. You now send query B
5. The cancel request is now processed
6. Query B is now cancelled

But solving that race condition would involve changing the postgres
protocol. Which I'm trying to make possible with the first few commits
in[1]. And while those first few commits might still land in PG17, I
don't think a large protocol change like adding query identifiers to
cancel requests is feasible for PG17 anymore.


Reply via email to