I would like to add truly asynchronous query processing to libpq, enabling command pipelining. The idea is to allow applications to auto-tune to the bandwidth-delay product and to reduce the number of context switches when running against a local server.

Here's a sketch of what the interface would look like. I believe it can be implemented in principle within the existing libpq structure, although the interaction with single-row mode looks complicated.

If the application is not interested in intermediate query results, it would use something like this:

  PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_DISCARD);
  PGresult *res = NULL;
  while (more_data()) {
     int ret = PQsendQueryParams(conn, "INSERT ...", ...);
     if (ret == 0) {
       // handle low-level error
     }
     // We can issue a second command.  There is no need to fetch
     // the result immediately.
     ret = PQsendQueryParams(conn, "INSERT ...", ...);
     if (ret == 0) {
       // handle low-level error
     }
     res = PQgetResultNoWait(conn);
     if (res != NULL) {
       // Only returns non-NULL in case of error.  Does not wait for
       // data to arrive from the server.  If there is an error,
       // drains all pending errors and reports only the first error.
       break;
     }
  }
  if (res == NULL) {
    // NULL means that all commands completed without error.
    // Again, drains all pending errors.
    // (Necessarily syncs with the server.)
    res = PQgetResult(conn);
  }
  PQsetsendAsyncMode(conn, oldMode);
  if (res != NULL) {
    // Handle error result.
    ...
    PQclear(res);
    throw_error();
  }

If there is no need to exit from the loop early (say, because errors are expected to be extremely rare), the PQgetResultNoWait call can be left out.

Internally, libpq will check in PQsendQueryParams (and similar functions) whether data is available for reading. If yes, it is consumed, discarded unless it is an error, and stored for later retrieval with PQgetResultNoWait or PQgetResult. (After the first error, all responses are discarded—this is necessary to avoid deadlocks.) PQgetResultNoWait returns immediately if there is no stored error object. If there is one, it syncs with the server (draining all pending results). PQgetResult just drains all pending results, reporting the first error.
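To make the intended PQASYNC_DISCARD bookkeeping concrete, here is a minimal self-contained model of that logic. All names and types here are invented for illustration; this is not libpq code, just the discard-mode rule: drop non-error responses, keep the first error, and discard everything after it.

```c
/* Hypothetical model of PQASYNC_DISCARD bookkeeping (not libpq code). */

typedef struct {
    int first_error;            /* 0 = no error seen yet; otherwise the
                                   code of the first error response */
} DiscardState;

/* Consume one response that arrived from the server while sending.
   Non-error responses are discarded; only the first error is kept,
   and all responses after it are discarded as well. */
static void consume_response(DiscardState *st, int is_error, int code)
{
    if (st->first_error != 0)
        return;                 /* after the first error, discard all */
    if (is_error)
        st->first_error = code;
    /* ordinary command-complete responses are simply dropped */
}

/* Model of PQgetResultNoWait: report the stored error code, or 0 if
   no error has been seen so far.  Does not wait for more data. */
static int get_result_nowait(const DiscardState *st)
{
    return st->first_error;
}
```

A later error never overwrites the stored one, which matches the "reports only the first error" behavior sketched in the code above.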

The general expectation here is that commands are sent within a transaction block, so only the first error is interesting. (It makes sense for fatal errors to overtake regular errors.)

If the programmer needs results, not just the errors, a different mode is needed:

  PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
  bool more_data;
  do {
     more_data = ...;
     if (more_data) {
       int ret = PQsendQueryParams(conn,
         "INSERT ... RETURNING ...", ...);
       if (ret == 0) {
         // handle low-level error
       }
     }
     // Consume all pending results.
     while (1) {
       PGresult *res;
       if (more_data) {
         res = PQgetResultNoWait(conn);
       } else {
         res = PQgetResult(conn);
       }
       if (res == NULL) {
         // No result so far.
         break;
       }

       // Result data is available.  Check if it is an error.
       switch (PQresultStatus(res)) {
       case PGRES_TUPLES_OK:
          // Process asynchronous result data.
          ...
          break;

       // Error handling.
       case PGRES_BAD_RESPONSE:
       case PGRES_NONFATAL_ERROR:
       case PGRES_FATAL_ERROR:
          // Store error somewhere.
          ...
          // Keep draining results, eventually
          // synchronizing with the server.
          more_data = false;
          break;
       }
       PQclear(res);
     }
  } while (more_data);
  PQsetsendAsyncMode(conn, oldMode);

(This suggests that PQgetResultNoWait should have a different name, plus a flag argument that tells it whether it is blocking or not.)

In PQASYNC_RESULT mode, libpq has to build up a list of pending results each time it notices available data during a PQsendQueryParams operation. The number of responses that need to be stored this way is not unbounded; it is related to what fits into the bandwidth-delay product of the link. This, and the fairly elaborate processing logic, is why I think it makes sense to have PQASYNC_DISCARD as well.
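A back-of-the-envelope calculation shows why the pending-result list stays bounded. The link parameters and response size below are made-up example numbers, not measurements:

```c
/* Roughly how many responses can be in flight at once: the
   bandwidth-delay product of the link divided by the typical
   size of one response. */
static long responses_in_flight(double bandwidth_bits_per_s,
                                double rtt_seconds,
                                long response_bytes)
{
    double bdp_bytes = bandwidth_bits_per_s * rtt_seconds / 8.0;
    return (long)(bdp_bytes / (double)response_bytes);
}
```

For example, a 1 Gbit/s local link with a 0.1 ms round trip and ~100-byte responses gives a bandwidth-delay product of about 12.5 KB, i.e. on the order of a hundred pending responses, not an arbitrarily large number.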

Instead of buffering the results, we could buffer the encoded command messages in PQASYNC_RESULT mode. This means that PQsendQueryParams would not block when it cannot send the (complete) command message, but would store it in the connection object so that a subsequent PQgetResultNoWait or PQgetResult call would send it. This might work better with single-tuple result mode. Either way, we cannot avoid buffering multiple queries or multiple responses if we want to utilize the link bandwidth, or we'd risk deadlocks.
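The command-buffering alternative amounts to a small FIFO of encoded messages that the result-fetching path drains. A minimal sketch, with invented names and an arbitrary fixed capacity standing in for the real flow-control limit:

```c
#include <stddef.h>

#define CMD_QUEUE_MAX 8         /* arbitrary capacity for illustration */

/* FIFO of encoded command messages that could not be written to the
   socket immediately.  Purely a model; libpq has no such type. */
typedef struct {
    const char *msgs[CMD_QUEUE_MAX];
    int head, tail, count;
} CmdQueue;

/* Called from the sending path when the socket would block.
   Returns 1 on success, 0 if the queue is full and the caller
   must drain results first. */
static int cmd_enqueue(CmdQueue *q, const char *msg)
{
    if (q->count == CMD_QUEUE_MAX)
        return 0;
    q->msgs[q->tail] = msg;
    q->tail = (q->tail + 1) % CMD_QUEUE_MAX;
    q->count++;
    return 1;
}

/* Called from the result-fetching path (the model's PQgetResult /
   PQgetResultNoWait): send as many queued messages as the simulated
   socket will accept, returning how many were sent. */
static int cmd_flush(CmdQueue *q, int socket_can_accept)
{
    int sent = 0;
    while (q->count > 0 && socket_can_accept-- > 0) {
        q->msgs[q->head] = NULL;        /* stands in for a send() */
        q->head = (q->head + 1) % CMD_QUEUE_MAX;
        q->count--;
        sent++;
    }
    return sent;
}
```

The point of the sketch is the division of labor: the send path only queues, and progress on the queue is made from the result-fetching calls, which is what keeps the two directions of the link from deadlocking against each other.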

There'd also be the default PQASYNC_DEFAULT mode, where it is an error to call PQsendQueryParams multiple times without a PQgetResult call in-between.

In PQASYNC_DISCARD mode, we could implicitly synchronize with the server when a synchronous API is used, reporting a pending error against the synchronous command. (I think this is what the delayed error reporting in X11 does.) Typically, that would be the "COMMIT" at the end of the transaction. With a result-buffering implementation of PQASYNC_RESULT, we could do that as well, but this might be too much magic.
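The X11-style delayed reporting boils down to: a synchronous call first surfaces any error left behind by earlier asynchronous commands, charging it to itself. A tiny model, with all names hypothetical:

```c
/* Model of delayed error reporting (not libpq code): the connection
   carries at most one pending error from earlier async commands. */
typedef struct {
    int pending_error;          /* 0 = none */
} Conn;

/* Model of a synchronous exec, e.g. the final "COMMIT": if an earlier
   asynchronous command failed, its error is reported here, against
   this call, and is consumed once reported. */
static int sync_exec(Conn *conn, const char *command)
{
    (void)command;              /* command text irrelevant to the model */
    if (conn->pending_error != 0) {
        int err = conn->pending_error;
        conn->pending_error = 0;
        return err;
    }
    return 0;                   /* no pending error; command succeeds */
}
```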

I thought I'd ask for comments before starting coding because this looks a bit more complicated than I expected. :)
--
Florian Weimer / Red Hat Product Security Team

