The pg_db_cancel function (dbdimp.c) contains the following code (line
5458 onwards):

    /* This almost always works. If not, free our structure and complain loudly 
*/
    TRACE_PQGETCANCEL;
    if (! PQcancel(cancel,errbuf,sizeof(errbuf))) {
        TRACE_PQFREECANCEL;
        PQfreeCancel(cancel);
        if (TRACEWARN_slow) { TRC(DBILOGFP, "%sPQcancel failed: %s\n", 
THEADER_slow, errbuf); }
        _fatal_sqlstate(aTHX_ imp_dbh);
        pg_error(aTHX_ h, PGRES_FATAL_ERROR, "PQcancel failed");
        if (TEND_slow) TRC(DBILOGFP, "%sEnd pg_db_cancel (error: cancel 
failed)\n", THEADER_slow);
        return DBDPG_FALSE;
    }
    TRACE_PQFREECANCEL;
    PQfreeCancel(cancel);

    /* Whatever else happens, we should no longer be inside of an async query */
    imp_dbh->async_status = -1;
    if (NULL != imp_dbh->async_sth)
        imp_dbh->async_sth->async_status = -1;

    /* Read in the result - assume only one */
    TRACE_PQGETRESULT;
    result = PQgetResult(imp_dbh->conn);

The PQcancel documentation states that

        Successful dispatch is no guarantee that the request will have
        any effect, however. If the cancellation is effective, the
        current command will terminate early and return an error
        result. If the cancellation fails (say, because the server was
        already done processing the command), then there will be no
        visible result at all.
        
        https://www.postgresql.org/docs/15/libpq-cancel.html

Further, the PQgetResult documentation says:

        PQgetResult must be called repeatedly until it returns a null
        pointer, indicating that the command is done.
        
        https://www.postgresql.org/docs/15/libpq-async.html

It follows that the lone PQgetResult inb pg_db_cancel will block the
client until the current command has terminated, possibly because of the
cancel request. As PQgetResult is called only once, the null pointer
supposed to indicate that the command really terminated now will remain
queued. The next call to PQgetResult will thus return this null pointer
regardless of any other commands which might have been sent to the
server in the meantime.

It's possible to see this when enabling DBI tracing (DBI->trace(15))
while executing the following sequence of code from the 08async.t test
file (sth is 'select 123', line 303 onwards).

-----
$sth->execute();
$sth->finish(); ## Ideally, this would clear out the async, but it cannot at 
the moment
eval {
    $dbh->do('SELECT 345');
};
like ($@, qr{previous async}, $t);

$dbh->pg_cancel;

$t=q{Directly after pg_cancel(), pg_async_status is -1};
is ($dbh->{pg_async_status}, -1, $t);

$t=q{Method execute() works when prepare has PG_ASYNC flag};
$sth->execute();
-----

The ->execute will send the command. The ->do will then fail because an
async command was in progress. The ->pg_cancel will send the cancel
request, wait for the result of the execute and set the dbh async state
to 'query cancelled' (-1). The following ->execute will invoke
handle_old_sync because the dbh async state is non-zero and the
PQgetResult invoked by handle_old_async will then dequeue the
still-queued null pointer.

This is a very roundabout and non-obvious way of handling this situation
(it took me essentially the whole day to figure this out in detail) and
it's also arguably not the right thing to do when using async queries as
the database driver shouldn't just block the client using it silently
when said client has requested to avoid being blocked.

In my opinion, at the very least, pg_db_cancel should be changed to
conform to the documented PQgetResult protocol so that it not only
blocks until the running query (if any) has terminated but also gets rid
of the null pointer indicating this.

IMHO, a better idea would be to get
rid of the PQgetResult call in pg_db_cancel altogether and document that
a client must still determine the result of the running command (which
may be "running command was cancelled") in a suitable way after invoking
pg_cancel. Problem: Might break existing code which assumes that
pg_cancel is a synchronous operation.

?

[This is a request for opinions of other peple about this topic.]

Reply via email to