Regarding the max number of sessions problem: I think I've figured out what was going 
wrong, but now I need some insight.

I increased the session cache to (40 * 1024), as you suggested, without any change in 
behavior. It turns out it was a bug in my code (whew!).

After doing some more debugging I discovered that the infinite loop was actually in my 
program. Let me describe:

After I call SSL_read(), if I receive an error of SSL_ERROR_WANT_READ or 
SSL_ERROR_WANT_WRITE, I then call select() on the socket to wait for a small amount of 
time (0.1 seconds) for something to arrive. If select() times out, I put this user on 
the queue and move on to the next, figuring that data will eventually arrive. If 
select() tells me that data is available, I loop around to call SSL_read() again.
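
The wait step described above can be sketched roughly as follows. This is only a 
sketch under assumptions: wait_for_fd() and its parameters are hypothetical names, not 
the original program's code and not part of OpenSSL. It rebuilds the fd_set on every 
call (select() overwrites it), waits for readability after SSL_ERROR_WANT_READ or for 
writability after SSL_ERROR_WANT_WRITE, and refuses descriptors that do not fit in an 
fd_set.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: wait until fd is readable (want_write == 0) or
 * writable (want_write != 0), for at most timeout_usec microseconds.
 * Returns 1 if ready, 0 on timeout, -1 on error or unusable descriptor.
 */
int wait_for_fd(int fd, int want_write, long timeout_usec)
{
    fd_set set;
    struct timeval tv;
    int rc;

    /* The FD_* macros are only defined for 0 <= fd < FD_SETSIZE. */
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    /* select() modifies both the set and (on some systems) the timeout,
     * so both are initialized from scratch on every call. */
    FD_ZERO(&set);
    FD_SET(fd, &set);
    tv.tv_sec = timeout_usec / 1000000L;
    tv.tv_usec = timeout_usec % 1000000L;

    rc = select(fd + 1,
                want_write ? NULL : &set,   /* read set  */
                want_write ? &set : NULL,   /* write set */
                NULL, &tv);
    if (rc <= 0)
        return rc;                          /* 0 = timeout, -1 = error */

    return FD_ISSET(fd, &set) ? 1 : 0;
}
```

With a 0.1 second timeout this would be called as wait_for_fd(fd, 0, 100000) after 
SSL_ERROR_WANT_READ. A return of 0 means the user goes back on the queue.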

What I found was that select() would return with a result of 1, telling me that my 
socket had data pending. I would call SSL_read(), and would again receive 
SSL_ERROR_WANT_READ. In my loop, I would call select() again, etc. Each time through, 
select() would tell me that there was data pending on the socket, but SSL_read() kept 
returning SSL_ERROR_WANT_READ. (I do handle the SSL_ERROR_WANT_WRITE condition as 
well, but that never happened in this situation).

At any rate, my program would get stuck in the loop. I added a loop count to make sure 
that I would break out after a fixed number of attempts. This allowed my program to 
muddle through. Higher-level (application-level) timeouts would eventually allow me 
to close the SSL session, since that session seemed to be broken.

Here's an interesting thing too: To help handle the call to select() better, I checked 
to see if my socket was in the descriptor set using FD_ISSET(). Surprisingly, even 
though select() told me my socket had data pending, the result of FD_ISSET indicated 
that my socket was not in the set! On the front end of select, there was only one 
socket in the set. Why would select() return with a result of 1 but have no sockets in 
the result set?

The other thing I noticed was that (according to the man page for select()) the 
results of the FD_ macros are undefined if the descriptor value is greater than or 
equal to FD_SETSIZE, which is 1024 on my system. I find this odd since the hard limit 
on the number of files any given process can have open is kern.maxfilesperproc = 10240. Is 
this a limitation of the POSIX API or could the man page for select() be wrong? Does 
anyone have any insight into the proper use of select() if the descriptor values are 
larger than FD_SETSIZE? Or maybe some other function that replaces select() for 
programs with LOTS of descriptors?
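
For what it is worth, POSIX poll() takes an array of descriptors rather than a 
fixed-size bitmask, so it is not bound by FD_SETSIZE. A minimal sketch of the same 
wait using poll() follows; wait_readable_poll() is a hypothetical name, and whether 
poll() scales well enough for a given descriptor count is something to measure rather 
than assume.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable. Returns 1 if ready (or hung up / in error, so the
 * caller's next read reports it), 0 on timeout, -1 on failure.
 */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* interested in readability only */
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;         /* 0 = timeout, -1 = error */

    if (pfd.revents & POLLNVAL)
        return -1;         /* fd is not an open descriptor */

    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : 0;
}
```

The 0.1 second wait would be wait_readable_poll(fd, 100). Passing POLLOUT instead of 
POLLIN in events would cover the SSL_ERROR_WANT_WRITE case.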

Thanks,
Joe

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
