On 02/12/2015 01:33 AM, Andres Freund wrote:
On 2015-02-11 14:54:03 +0200, Heikki Linnakangas wrote:
Thoughts? Can you reproduce any errors with this?

I'm on battery right now, so I can't really test much.

Did you test having an actual standby instead of pg_receivexlog? I saw
some slightly different errors there.

Does this patch alone work for you or did you test it together with the
earlier one to fix the renegotiation handling server side when
nonblocking? Because that frequently seemed to error out for me, at
least over actual network. A latch loop + checking for the states seemed
to fix that for me.

This patch alone worked for me.

+        * NB: This relies on the calling code to call pqsecure_read(), 
completing
+        * the renegotiation handshake, if pqsecure_write() returns 0. Otherwise
+        * we'll never make progress.
+        */

I think this should a) refer to the fact that pqSendSome does that in
blocking scenarios b) expand the comment around the pqReadData to
reference that fact.

How are we going to deal with callers using libpq in nonblocking mode. A
client doing a large COPY FROM STDIN to the server using a nonblocking
connection (not a stupid thing to do) could hit this IIUC.

Hmm, that's problematic even without SSL. If you do a large COPY FROM STDIN, and the server sends a lot of NOTICEs, the NOTICEs will be accumulated in the TCP send and receive buffers until they fill up. After that, the server will block on the send, and will stop reading the tuple data the client sends, and you get a deadlock. In blocking mode, pqSendSome calls pqReadData to avoid that.

I think pqSendSome should call pqReadData() after getting EWOULDBLOCK, even in non-blocking mode. It won't completely prevent the problem: it's possible that the receive buffer fills up, but the application doesn't call pqSendSome() until the socket becomes writeable, which won't happen until the client drains the incoming data is drained and unblocks the server. But it greatly reduces the risk. In practice that will solve the problem for SSL renegotiations.

We should also document that the application should always wait for the socket readable condition, in addition to writeable, and call pqConsumeInput(). That fixes the issue for good.

Could we perhaps call SSL_peek() when SSL_write() hits WANTS_READ? In
combination with your fix I think that should fix the possibility of
such a deadlock? Especially on the serverside where the socket most of
the time is is in, although emulated, blocking mode that seems easier -
we can just retry afterwards.

Hmm. Not sure what the semantics of SSL_peek() are. At least we'd need to call it with a large enough buffer that it would read all the incoming data from the OS buffer. I think it would be safer to just call SSL_read(). Both the server and libpq have an input buffer that you can easily read into at any time.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to