Darryl Miles wrote:
> Nanno Langstraat wrote:
>> So I can add one more voice to the choir: the current SSL_shutdown() API appears to give trouble to every non-blocking developer (I remember I lost serious time noticing + tracking down this 100% CPU bug), and afterwards things still don't really work right.
>
> I can't immediately think of a reason why you'd get 100% CPU, except with a badly constructed event loop (even with non-blocking I/O in use) much as I'd like to see your support :).

Them's fighting words :-) But not inconceivable, the event loop juggles several different kinds of file descriptors.

So I've taken that event loop from the application and whittled it down to a simple test case.

It turns out that the problem does *not* directly involve SSL_shutdown(), but it *is* attributable to OpenSSL, and specifically OpenSSL's non-blocking shutdown semantics.

--

Details: the application's event loop is built analogously to a regular TCP event loop. That is, it tracks the upstream and downstream directions independently: it handles the possibility that the downstream direction gets shut down [SSL_read() returned 0] while the upstream direction stays open a while longer [we might still make a few SSL_write() calls]. Just like TCP.

This goes pear-shaped as follows:

   * The SSL connection is made and used
   * The remote side closes its file descriptor (e.g. process killed,
     TCP shutdown(RD))
   * Local SSL_read() returns 0. The app event loop sets a flag and
     makes sure it never calls SSL_read() again.
   * The app event loop prepares for poll() by calling SSL_want_read()
     and SSL_want_write().
   * SSL_want_read() returns 'true'. This is erroneous.
   * poll() returns immediately.
   * Repeat the last 3 steps indefinitely. Uses 100% CPU.
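
The spin in the last three steps can be reproduced without OpenSSL at all. The following is only a minimal sketch, not the test case from the application: it assumes a Unix socketpair as a stand-in for the TCP connection, and the function name count_spurious_wakeups() is made up for illustration. It closes the peer end, then keeps asking poll() for read readiness without ever reading, which is what the event loop ends up doing once SSL_want_read() keeps returning true.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: counts how many of `rounds` poll() calls return
 * at once with the descriptor flagged after the peer has gone away.
 * Returns -1 if the socketpair cannot be created. */
int count_spurious_wakeups(int rounds)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    close(sv[1]);               /* the remote side closes its descriptor */

    int wakeups = 0;
    for (int i = 0; i < rounds; i++) {
        /* Read interest is requested every round, as if SSL_want_read()
         * had returned true, but the loop never reads again. */
        struct pollfd pfd = { .fd = sv[0], .events = POLLIN, .revents = 0 };
        if (poll(&pfd, 1, 1000) > 0 && (pfd.revents & (POLLIN | POLLHUP)))
            wakeups++;          /* woke up without waiting for the timeout */
    }

    close(sv[0]);
    return wakeups;
}
```

Every round wakes up immediately because the end-of-file condition stays pending until somebody reads it or stops polling for it.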


--

I see three ways to slice this:

   * Say "Silly OpenSSL API user! You should have known/guessed that
     you can't use SSL_want_read() / SSL_want_write() anymore after
     SSL_read() has returned 0."

     This does not seem reasonable, because as far as I can see this
     rule is not mentioned in the API documentation for SSL_read() or
     SSL_want_read(). Right?

   * The OpenSSL documentation can be updated to mention this.

   * The OpenSSL code can be updated to handle this nicely: make sure
     SSL_want_read() and SSL_want_write() return false, and wait for
     the application to call SSL_write() sooner or later, which will
     return an error as normal. At that moment the non-blocking
     application can be expected to "get the idea" and clean up the
     connection smoothly.
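
Until something like the third option exists, an application can approximate it on its own side. This is a hedged sketch under assumptions, not OpenSSL behaviour: poll_events_for() is a hypothetical helper, and its want_read / want_write arguments are meant to be the values the application got from SSL_want_read() and SSL_want_write(). The read_closed argument is the flag the event loop set when SSL_read() returned 0.

```c
#include <poll.h>

/* Hypothetical helper: builds the poll() event mask for one connection.
 * Read interest is dropped once the read side is known to be closed,
 * regardless of what SSL_want_read() reported. */
short poll_events_for(int want_read, int want_write, int read_closed)
{
    short events = 0;

    if (want_read && !read_closed)
        events |= POLLIN;       /* only wait for data we may still read */
    if (want_write)
        events |= POLLOUT;      /* the write side is tracked independently */

    return events;
}
```

Typical use would be pfd.events = poll_events_for(SSL_want_read(ssl), SSL_want_write(ssl), read_closed). Note that poll() reports POLLHUP and POLLERR even when no events are requested, so a loop that still spins with an empty mask may need to leave the descriptor out of the poll set altogether.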


In case of interest, I can clean up the test case code further and email it.

   Regards,
   Nanno


