Hi,

On Tue, 9 Nov 1999, Dr Stephen Henson wrote:

> Well I don't know the reason myself, but from looking at the code I'd
> guess that the functions that return an object with an upped reference
> count tend to be those where it's likely the application will want to
> keep the object around. Thus something like X509_get_version() will
> return an internal pointer because you are more likely to just check
> the value and do nothing else, whereas with other examples you might
> want to keep the object afterwards.

I'd reached a similar guess ... perhaps this could be made at least mildly
sane by restricting this behaviour to "internal" functions (ones beginning
with a lower-case name, e.g. ssl_get_whatever, rather than any
SSL_get_zombied_pointer), but it would still be a bit ugly.

Another way to perhaps legitimise ignoring reference counting would be as
follows: allow it only inside a function operating on an object that
already holds a reference to the thing being "get"'d. E.g. inside a
function operating on an SSL* object, an internal function may examine the
peer certificate without incrementing the reference count - the SSL object
has already obtained (upped) a reference on the X509 object, so the
function could temporarily "get" and use an X509* pointer without upping
the reference count and all would be OK.

However, this really shouldn't be allowed to happen across function
boundaries, and should be restricted to those small internal operations
where it's provably OK not to grab another reference and where doing so
would otherwise drag down concurrency through excess locking. Anything
less strict than that, and the reference counting can't be trusted by the
programmer, so all code will have to carefully hack around whatever
OpenSSL does (or doesn't do) in each particular case.

> Of course having said that this isn't consistent and I changed the
> X509_get_pubkey() behaviour myself to up the public key reference count.
> Tracing and fixing all the memory leaks this caused was not an easy task
> :-(

I've had to do exactly the same thing with what I believe is the worst
culprit I've found yet: SSL_get_session(). It returns an SSL_SESSION*
pointer that is, from the moment you receive it, unsafe. Even if you
immediately make a manual call to up its reference count, you can still
get caught out by a context-switch race. If you don't even try this, and
instead just hold onto the pointer for any length of time, then you really
invite trouble. I was using this in a client utility where I didn't want
a cache, just a record of the last successfully used session ... so upon
SSL_is_init_finished(), I would call SSL_get_session() and store whatever
was retrieved. The next time I started a handshake, I'd check that the
session was still within reasonable timeout limits before applying it to a
new client SSL. The problem is that if the session has expired and been
destroyed in the meantime, the code that stored the SSL_SESSION* pointer
will never know until it segfaults trying to do something with it.

I'm using the following patch, anyone have any objections to making this
permanent?

----------- clip here -------------
--- openssl-0-9-5-dev/ssl/ssl_sess.c    Thu Nov  4 21:21:02 1999
+++ openssl-0-9-5-geoff/ssl/ssl_sess.c  Wed Nov 10 15:39:34 1999
@@ -69,7 +69,16 @@
 
 SSL_SESSION *SSL_get_session(SSL *ssl)
        {
-       return(ssl->session);
+       SSL_SESSION *sess;
+       /* Need to lock this all up rather than just use CRYPTO_add so that
+        * somebody doesn't free ssl->session between when we check it's
+        * non-null and when we up the reference count. */
+       CRYPTO_r_lock(CRYPTO_LOCK_SSL_SESSION);
+       sess = ssl->session;
+       if(sess)
+               sess->references++;
+       CRYPTO_r_unlock(CRYPTO_LOCK_SSL_SESSION);
+       return(sess);
        }
 
 int SSL_SESSION_get_ex_new_index(long argl, char *argp, int (*new_func)(),
----------- clip here -------------

> Changing the logic is problematical not just because of the effort
> involved. Many programs will expect the old behaviour. In the example of
> a multithreaded server its memory requirements will grow steadily if
> they don't free things up: this is one case where a multithreaded server
> is at a disadvantage compared to a fork() + exit() kind.

Which is probably why secure web-servers haven't blown the lid off OpenSSL
yet: (a) the session caching is generally overridden by a more carefully
managed shared-memory cache, and (b) the child processes can be allowed to
die off after n requests, burying any memory leaks that might have
amassed during that time.

I think the best way to keep the memory under control would be to get the
reference counting consistent and understandable, then fix the kludged
code that used it (and is thus temporarily broken); then we'd have some
hope of keeping OpenSSL's memory management stable and usable without
fork()s and exit()s (or ugly code).

Thoughts?

Cheers,
ME


----------------------------------------------------------------------
Geoff Thorpe                                    Email: [EMAIL PROTECTED]
Cryptographic Software Engineer, C2Net Europe    http://www.int.c2.net
----------------------------------------------------------------------
May I just take this opportunity to say that of all the people I have
EVER emailed, you are definitely one of them.



______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
