Hi, Can I get this patch in to djabberd? Incrementing a counter on empty SSL reads is not the correct way to determine ssl_eof (http://code.sixapart.com/trac/djabberd/changeset/758)
Instead we should check the error code to see if it is one of the SSL errors for which we are supposed to retry( http://www.openssl.org/docs/ssl/SSL_get_error.html) thanks, Jacob On Thu, Feb 28, 2008 at 12:40 PM, Jacob Burkhart <[EMAIL PROTECTED]> wrote: > looks like the number was original 3, and then bumped up to 10 because of > observed disconnects "in practice" > http://code.sixapart.com/trac/djabberd/changeset/758 > > > On Thu, Feb 28, 2008 at 11:33 AM, Jacob Burkhart <[EMAIL PROTECTED]> > wrote: > >> Hi, >> So we've been experiencing some bizarre behavior with clients randomly >> disconnecting when trying to send messages over SSL. The disconnects vary >> based on size of message and speed of network connections. >> >> The problems seems to stem from: >> >> http://code.sixapart.com/trac/djabberd/browser/trunk/DJabberd/lib/DJabberd/Connection.pm#L494 >> >> In http://code.sixapart.com/trac/djabberd/changeset/756 bradfitz added >> code to handle non-graceful SSL disconnects. However, the number of times >> to retry read before giving up (10) proved to be 1 to 2 too few times in our >> particular situation. >> >> But why did he pick 10? >> >> In our tests, the number of times that Net::SSLeay::read can return zero >> bytes in a row varies based on the speed of a client's connection and the >> size of bytes attempting to be sent. >> >> During stream initiation for instance, we experience about 2 reads of zero >> in a row from clients connected on reasonably fast connections, and 0 reads >> of zero in a row from a client running on the same machine as the server. >> >> When sending messages larger than 8k, the number of zero reads is >> somewhere under 10 for clients connection over local ethernet, but is in the >> 11 to 12 range for clients connecting over WiFi. >> >> So clearly, counting the number of zero bytes reads that happen is a row >> is not the most reliable way to determine if the client has improperly >> disconnected >> >> It turns out that if we look at the Djabberd code for SSL write, we can >> see that there is a different answer for handling write failures. >> >> >> http://code.sixapart.com/trac/djabberd/browser/trunk/DJabberd/lib/DJabberd/Stanza/StartTLS.pm#L90 >> >> Here, get_error is called to determine if the problem might have been >> caused by SSL_ERROR_WANT_READ or perhaps SSL_ERROR_WANT_WRITE. If it is, >> then this is not a fatal condition, and we don't want to close the >> connection. instead, we simply return zero and expect that the writer will >> come around and try again at some point. >> >> So why not do the same thing for read failures? >> >> Please consider my patch (attached) as a possible solution for this >> problem. >> >> thanks, >> Jacob >> >> >> >> >> >> >> >
ssl_eof.diff
Description: Binary data