On Mon, Mar 14, 2016 at 3:06 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Agreed, we need to deal with this one way or the other. My proposal > is: > > 1. In HEAD, do it as Peter E. suggests, ie clear error queue before calls. > > 2. In back branches, clear error queue before *and* after calls. This > will waste a few nanoseconds but will avoid any risk of breaking > existing third-party code. > > I think it's reasonable to expect extensions to update to the newer > behavior in a major release, but we're taking risks if we expect > that to happen in minor releases.
I am concerned that users will never be able to get this right, since I think it requires every Ruby or PHP app using some thin OpenSSL wrapper to clear the per-queue thread. It's a big mess, but it's our mess to some degree. I wonder if it would be just as good if we ensured that ERR_get_error() was definitely called in any circumstance where it looked like we might have an error: ERR_get_error() would be *reliably* called, as in my patch, but unlike my patch only when SSL_get_error() indicated a problem (not all the time). Heikki believed that clearing the error queue by calling ERR_clear_error() before calling an I/O function like SSL_read() was necessary, as we all do; no controversy there. But Heikki also believed, even prior to hearing about this bug, that it was important and necessary for ERR_get_error() to be reached when there might be an error added to the queue following a Postgres/libpq call to an I/O function like SSL_read() followed by SSL_get_error() indicating trouble. He thought, as I do, that it would be a good idea to not rely on that happening from a distance (i.e. not relying on reaching SSLerrmessage()). Peter E. seems to believe that there is absolutely no reason to rely on ERR_get_error() getting called at all, and that the existing SSLerrmessage() only exists for the purposes of producing a human-readable error message. Anyway, thinking about it some more, perhaps the best solution is to do the ERR_get_error() call iff SSL_get_error() seems unhappy, perhaps even placing the two into a utility function. That's probably almost the same as the existing behavior, as far as clearing up the queue after we may have added to it goes. I don't know if that's any less safe then my patch's pessimistic approach. It seems like it might be just as safe. Under this compromise, I think we'd probably clear the error queue after SSL_get_error() returned a value that is not SSL_ERROR_NONE, though (including SSL_ERROR_WANT_READ, etc). What do you think about that? > In any case, we need a patch that touches the backend-side code as well. I agree that the backend-side code should be covered. I quickly changed my mind about that, and am happy to produce a revision along those lines. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers