patch posted: http://codereview.appspot.com/2341
On Mon, Jun 16, 2008 at 11:59 PM, Brad Fitzpatrick <[EMAIL PROTECTED]> wrote: > || instead of 'or' (local style), > spaces after # in comments, > wrap long lines, > keep the => lined up in the constants at top, > use logging system instead of literal "warn"? > > If you could also upload this patch once updated to codereview.appspot.com, > that'd be great. (there's a command-line tool you can download from there > which does it all... no need to use the web-app) > > Brad > > > On Mon, Jun 16, 2008 at 8:19 AM, Jacob Burkhart <[EMAIL PROTECTED]> > wrote: > >> Hi, >> Can I get this patch in to djabberd? >> >> Incrementing a counter on empty SSL reads is not the correct way to >> determine ssl_eof >> (http://code.sixapart.com/trac/djabberd/changeset/758) >> >> Instead we should check the error code to see if it is one of the SSL >> errors for which we are supposed to retry( >> http://www.openssl.org/docs/ssl/SSL_get_error.html) >> >> thanks, >> Jacob >> >> >> >> On Thu, Feb 28, 2008 at 12:40 PM, Jacob Burkhart <[EMAIL PROTECTED]> >> wrote: >> >>> looks like the number was original 3, and then bumped up to 10 because of >>> observed disconnects "in practice" >>> http://code.sixapart.com/trac/djabberd/changeset/758 >>> >>> >>> On Thu, Feb 28, 2008 at 11:33 AM, Jacob Burkhart <[EMAIL PROTECTED]> >>> wrote: >>> >>>> Hi, >>>> So we've been experiencing some bizarre behavior with clients randomly >>>> disconnecting when trying to send messages over SSL. The disconnects vary >>>> based on size of message and speed of network connections. >>>> >>>> The problems seems to stem from: >>>> >>>> http://code.sixapart.com/trac/djabberd/browser/trunk/DJabberd/lib/DJabberd/Connection.pm#L494 >>>> >>>> In http://code.sixapart.com/trac/djabberd/changeset/756 bradfitz added >>>> code to handle non-graceful SSL disconnects. However, the number of times >>>> to retry read before giving up (10) proved to be 1 to 2 too few times in >>>> our >>>> particular situation. >>>> >>>> But why did he pick 10? >>>> >>>> In our tests, the number of times that Net::SSLeay::read can return zero >>>> bytes in a row varies based on the speed of a client's connection and the >>>> size of bytes attempting to be sent. >>>> >>>> During stream initiation for instance, we experience about 2 reads of >>>> zero in a row from clients connected on reasonably fast connections, and 0 >>>> reads of zero in a row from a client running on the same machine as the >>>> server. >>>> >>>> When sending messages larger than 8k, the number of zero reads is >>>> somewhere under 10 for clients connection over local ethernet, but is in >>>> the >>>> 11 to 12 range for clients connecting over WiFi. >>>> >>>> So clearly, counting the number of zero bytes reads that happen is a row >>>> is not the most reliable way to determine if the client has improperly >>>> disconnected >>>> >>>> It turns out that if we look at the Djabberd code for SSL write, we can >>>> see that there is a different answer for handling write failures. >>>> >>>> >>>> http://code.sixapart.com/trac/djabberd/browser/trunk/DJabberd/lib/DJabberd/Stanza/StartTLS.pm#L90 >>>> >>>> Here, get_error is called to determine if the problem might have been >>>> caused by SSL_ERROR_WANT_READ or perhaps SSL_ERROR_WANT_WRITE. If it is, >>>> then this is not a fatal condition, and we don't want to close the >>>> connection. instead, we simply return zero and expect that the writer will >>>> come around and try again at some point. >>>> >>>> So why not do the same thing for read failures? >>>> >>>> Please consider my patch (attached) as a possible solution for this >>>> problem. >>>> >>>> thanks, >>>> Jacob >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>> >> >