Darryl Miles wrote:
> Nanno Langstraat wrote:
>> So I can add one more voice to the choir: the current SSL_shutdown()
>> API appears to give trouble to every non-blocking developer (I
>> remember I lost serious time noticing + tracking down this 100% CPU
>> bug), and afterwards things still don't really work right.
>
> I can't immediately think of a reason why you'd get 100% CPU, except
> with a badly constructed event loop (even with non-blocking I/O in
> use) much as I'd like to see your support :).

Them's fighting words :-) But not inconceivable, the event loop juggles
several different kinds of file descriptors.
So I've taken that event loop from the application and whittled it down
to a simple test case.
It turns out that the problem does *not* directly involve
SSL_shutdown(), but it *is* attributable to OpenSSL, and specifically
OpenSSL's non-blocking shutdown semantics.
--
Details: the application's event loop is built analogously to a regular
TCP event loop. Meaning that the event loop tracks the upstream
direction and the downstream independently: the event loop handles the
possibility that the downstream direction gets shut down [SSL_read()
returned 0] while the upstream direction stays open a while longer [we
might still make a few SSL_write() calls]. Just like TCP.
This goes pear-shaped as follows:
* The SSL connection is made and used
* The remote side closes its file descriptor (e.g. process killed,
  TCP shutdown(RD))
* Local SSL_read() returns 0. The app event loop sets a flag and
  makes sure it never calls SSL_read() again.
* The app event loop prepares for poll() by calling SSL_want_read()
  and SSL_want_write().
* SSL_want_read() returns 'true'. This is erroneous.
* poll() returns immediately.
* Repeat the last 3 steps indefinitely. Uses 100% CPU.
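
For illustration, here is a minimal sketch of an application-side guard against the loop above. It assumes the event loop keeps its own per-direction flags. The names conn_state and poll_events_for are hypothetical and not part of OpenSSL; the two boolean arguments are meant to be filled from SSL_want_read() and SSL_want_write(). Note the caveat that masking POLLIN this way would also hide a read that an SSL_write() might legitimately be waiting on, so treat it as a workaround rather than a fix.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical per-connection state kept by the application's event loop. */
struct conn_state {
    bool read_closed;   /* set once SSL_read() has returned 0 */
    bool write_closed;  /* set once the app has given up on SSL_write() */
};

/*
 * Build the poll() event mask for one connection.
 * want_read / want_write are assumed to come from SSL_want_read(ssl)
 * and SSL_want_write(ssl). A direction the application has already
 * marked closed is never polled, whatever OpenSSL reports.
 */
short poll_events_for(const struct conn_state *st,
                      bool want_read, bool want_write)
{
    short events = 0;

    if (want_read && !st->read_closed)
        events |= POLLIN;
    if (want_write && !st->write_closed)
        events |= POLLOUT;

    return events;
}
```

With read_closed set, a stale 'true' from SSL_want_read() no longer puts POLLIN into the mask, so poll() is not woken again and again by the end-of-stream condition on the socket.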
--
I see three ways to slice this:
* Say "Silly OpenSSL API user! You should have known/guessed that
  you can't use SSL_want_read() / SSL_want_write() anymore after
  SSL_read() has returned 0."
  This does not seem reasonable, because as far as I can see this
  rule is not mentioned in the API documentation for SSL_read() or
  SSL_want_read(). Right?
* The OpenSSL documentation can be updated to mention this.
* The OpenSSL code can be updated to handle this nicely: make sure
  SSL_want_read() and SSL_want_write() return false, and wait for
  the application to call SSL_write() sooner or later, which will
  return an error as normal. At that moment the non-blocking
  application can be expected to "get the idea" and clean up the
  connection smoothly.
In case of interest, I can clean up the test case code further and email it.
Regards,
Nanno