> From: openssl-users <openssl-users-boun...@openssl.org> On Behalf Of Brice André
> Sent: Friday, 13 November, 2020 05:06
> ... it seems that in some rare execution cases, the server performs a
> SSL_read, the client disconnects in the meantime, and the server never
> detects the disconnection and remains stuck in the SSL_read operation.
> ...
> #0 0x00007f836575d210 in __read_nocancel () from
> /lib/x86_64-linux-gnu/libpthread.so.0
> #1 0x00007f8365c8ccec in ?? () from
> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
> #2 0x00007f8365c8772b in BIO_read () from
> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1

So OpenSSL is in a blocking read of the socket descriptor.

> tcp 0 0 5.196.111.132:5413 85.27.92.8:25856
> ESTABLISHED 19218/./MabeeServer
> tcp 0 0 5.196.111.132:5412 85.27.92.8:26305
> ESTABLISHED 19218/./MabeeServer

> From this log, I can see that I have two established connections with remote
> client machine on IP 109.133.193.70. Note that it's normal to have two
> connexions because my client-server protocol relies on two distinct TCP
> connexions.

So the client has not, in fact, disconnected.

When a system closes one end of a TCP connection, the stack will send a TCP
packet with either the FIN or the RST flag set. (Which one you get depends on
whether the stack on the closing side was holding data for the conversation
which the application hadn't read.)

The sockets are still in ESTABLISHED state; therefore, no FIN or RST has been
received by the local stack.

There are various possibilities:

- The client system has not in fact closed its end of the conversation.
  Sometimes this happens for reasons that aren't immediately apparent; for
  example, if the client forked and allowed the descriptor for the
  conversation socket to be inherited by the child, and the child still has
  it open.

- The client system shut down suddenly (crashed) and so couldn't send the
  FIN/RST.

- There was a failure in network connectivity between the two systems, and
  consequently the FIN/RST couldn't be received by the local system.

- The connection is in a state where the peer can't send the FIN/RST, for
  example because the local side's receive window is zero. That shouldn't be
  the case, since OpenSSL is (apparently) blocked in a receive on the
  connection, but as I don't have the complete picture I can't rule it out.

> This let me think that the connexion on which the SSL_read is listening is
> definitively dead (no more TCP keepalive)

"definitely dead" doesn't have any meaning in TCP. That's not one of the TCP
states, or part of the other TCP or IP metadata associated with the local port
(which is what matters).

Do you have keepalives enabled?

> and that, for a reason I do not understand, the SSL_read keeps blocked into
> it.

The reason is simple: The connection is still established, but there's no data
to receive. The question isn't why SSL_read is blocking; it's why you think the
connection is gone, but the stack thinks otherwise.

> Note that the normal behavior of my application is : client connects, server
> daemon forks a new instance,

Does the server parent process close its copy of the conversation socket?

--
Michael Wojcik
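For reference, the two questions above (whether keepalives are enabled, and whether the parent closes its copy of the conversation socket) could be addressed along the following lines. This is only a minimal sketch, assuming a Linux system (as the stack trace suggests) and plain POSIX sockets underneath the SSL object. The names enable_keepalive, hand_off_to_child and serve are hypothetical and do not come from the original application; the timing values are arbitrary examples.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: enable TCP keepalive probes on a socket.
 * TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;

    /* Turn keepalive on for this socket. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    /* Idle time before the first probe is sent. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof idle_secs) < 0)
        return -1;
    /* Interval between successive unanswered probes. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof interval_secs) < 0)
        return -1;
    /* Number of unanswered probes before the connection is dropped. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count, sizeof probe_count) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: hand an accepted connection to a forked child.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t hand_off_to_child(int listen_fd, int conn_fd, void (*serve)(int))
{
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: it does not accept new connections, so drop the listener. */
        close(listen_fd);
        serve(conn_fd);
        close(conn_fd);
        _exit(0);
    }

    /* Parent (or fork failure): close our copy of the conversation socket.
     * While any process still holds the descriptor open, closing it in
     * another process does not send a FIN to the peer. */
    close(conn_fd);
    return pid;
}
```

With values such as enable_keepalive(fd, 60, 10, 5), a peer that has vanished without sending FIN/RST should be detected after roughly 60 + 10 * 5 = 110 seconds of silence. At that point the blocked read() fails with ETIMEDOUT, so SSL_read returns an error instead of blocking indefinitely. Keepalives do nothing for the first possibility listed above (the peer still has the socket open), since the peer's stack will answer the probes.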