Wow. This may explain a once-in-a-million anomaly going on here.
Hmmmmm... Checked this and yes, same story in 0.9.9 HEAD.

Note that this is a particular wicked insect, because, as it is the
stack/stack.c internal_find() function who exhibits the _side effect_
of sorting a yet unsorted stack, _anyone_ who employs a sk_xyz_find()
finds themselves a piece of non-threadsafe code. (You got /my/
attention now!)


There are two ways about this:

a) either forego on the side effect and resort to a linear (slow)
search/scan when the stack is unsorted (which can happen, for
instance, after an element has been inserted (sk_*_insert())), so that
sk_find() will be threadsafe (assuming no parallel insert()/push()
calls) or

b) provide a threadsafe environment by surrounding a least each sort
with a lock. (I can see why lock-encapsulating the find() would be
bothersome, as it would make threads wait for each other, where they
wouldn't really have to in 99% of cases, so the brute-force approach
of lock-covering the find() shouldn't be contemplated.


Which leads to a code inspection:
This finds us one working example of a solution of type (b):

crypto/asn1/x_crl.c, function def_crl_lookup(): here locking surrounds
the sort() operation in the flow

if (!..._is_sorted())
{
  lock();
  sort();
  unlock();
}
find();

and thus _is_ thread safe (my initial kneejerk reaction was: it is
not, but then I looked again...): since the find() won't ever be
reached if the stack is unsorted, the lock surrounding the sort() is
good enough -- assuming no-one will sk_*_insert() / sk_*_push() items
into the stack in another thread. Even a non-atomic read in
_is_sorted() is no problem, as it can only conclude the stack is /not/
sorted a little too often, walking the code straight into the
lock+sort, resulting in the stack getting sorted more often than
necessary in this fringe, but the sort() is repeatable and when your
RTL provides a high quality qsort() which checks for partially/fully
sorted input (and several implementations do so), this is a minor
overhead at the start, while allowing full parallelism of _find()
lateron.

Also, note that sk_*_insert()/push() are thread unsafe (hm, no .pod
for this stuff yet; I should write one then...) and should not be used
on a thread-shared stack while multiple threads may access it, as
_insert() will surely corrupt any parallel running find() - as the
latter doesn't come with locks. Worse, insert/push() will make the
stack object not-thread-safe until it is followed by a completed
sk_sort().


A scan of the OpenSSL source code reveals these cases at least:

crypto/x509v3/v3_purp.c:

see global xptable become unsorted by the insert() code in
X509_PURPOSE_add() - which is not followed by a sort(), why should it?
I wouldn't expect it'd need that... - leading to the requirement that
X509_PURPOSE_get_by_id() - which contains an sk_*_find() - MUST be
called at least once in a single-thread situation, before multiple
threads can call X509_PURPOSE_get_by_id() in parallel without
endangering themselves: no lock around the find(), thus no lock around
the sort() within that find(). Eek!

crypto/x509v3/pcy_cache.c

The policy cache code has locks around the sk_*_push() - several calls
up - in policy_cache_set(), no sort() in there, so the lock-free
policy_cache_find_data() is definitely _not_ thread safe as it has a
sk_*_find() in there, which, assuming two threads calling
policy_cache_find_data() with the same X509 object, will run two
sort() side effects in parallel. I do not know if is  acceptable
(sharing X509 instances among threads), so this is one for the High
Lords.

A similar story for X509 policy nodes, etc., (grep for sk_.*_find() to
find locations of use) but more interesting is code which accesses
global variables of type xyz_stack: those babies will have a much
higher risk of being invoked from several threads:

crypto/x509v3/v3_lib.c

Another case: global stack 'ext_list' means X509V3_EXT_add()
(sk_push()) MUST be followed by at least one call to
X509V3_EXT_get_nid() (which contains sk_..._find()) before
X509V3_EXT_get_nid() can be used in a thread-safe manner.


crypto/objects/obj_xref.c

global 'sig_app' + 'sigx_app': code is okay as OBJ_add_sigid(), itself
of course not thread safe, does a sort() on both stacks, so that
OBJ_find_sigid_algs() _is_ thread safe.


crypto/engine/eng_table.c

This one is tricky. the push-without-sort code in
engine_table_register() comes with locks surrounding it, so do all
functions calling engine_table_doall(), which triggers the callback
which does a find-with-hidden-sort, but any bugger outside of this
code, should make bloody sure [s]he surrounds her/his calls to
engine_table_doall() with CRYPTO_LOCK_ENGINE-locking code or the
callback-doing-a-find() will represent a thread-unsafe situation.
I didn't check this one further (I don't use engines ATM - getting lazy...)


The next occurrence is crypto/asn1/asn1_mime.c but my guess is nobody
with any sense is sharing MIME_HEADERs across threads, right?


crypto/asn1/a_strnid.c

Here is one that's important for me: global stack 'stable' is modified
in ASN1_STRING_TABLE_add(), push-no-sort, and find()-with-hidden-sort
happens in ASN1_STRING_TABLE_get() - which _will_ be called from
multiple threads, at least over here. No locks, which, by the time I'm
writing this line, I consider rather a kludge as any time you forget
to 'prep' for find() by adding an rather non-intuitive sort() at 'the
right spot', you're toast. Again.

Let's finish up the list:

crypto/asn1/ameth_lib.c

global stack 'app_methods'. EVP_PKEY_asn1_add0() does the 'right'
thing (koff koff koff): push, then sort at the end. Which means
EVP_PKEY_asn1_find() - which (indirectly) calls the related find()
_is_ thread safe.

crypto/x509/x509_vpm.c:

global: 'param_table'. push, no sort in
X509_VERIFY_PARAM_add0_table(), so X509_VERIFY_PARAM_lookup() - which
has the find() is screwed: got to be called at least ONCE BEFORE going
multi-threaded with this call to make the sort()-inside-find() happen
safely.


crypto/x509/by_dir.c: here the push()/find() code is completely
surrounded by CRYPTO_LOCK_X509_STORE locks everywhere, so we're
reasonably safe here (anyone messing around with the hashes stack on
their own should wear a hardtop).


crypto/x509/x509_trs.c:

global stack 'trtable'; the story is getting old by now:
X509_TRUST_add() has push but no sort, so X509_TRUST_get_by_id() is
definitely NOT thread safe on first invocation.


Which leaves the already reported case, though it happens in more
places with the same dangers AFAICS:

ssl/s3_srvr.c: ssl3_get_client_hello()
ssl/s2_srvr.c: get_client_hello()
ssl/s3_clnt.c: ssl3_get_server_hello()
ssl/s2_clnt.c: get_server_hello()

with unknown risks in:
ssl/ssl_ciph.c: ssl_cipher_get_evp()
ssl/s3_lib.c: ssl3_choose_cipher()



May I suggest a approach to the fix similar to the code in
crypto/asn1/x_crl.c, yet NOT outside the internal_find() code (which
means every man and his dog has to 'remember' this cute side effect of
sk_find() all the time) but move that type of code *inside* the
internal_find() call, i.e. put locking code around that sort() in
there a x_crl.c:

<snip>
        if (!sk_X509_REVOKED_is_sorted(crl->crl->revoked))
                {
                CRYPTO_w_lock(CRYPTO_LOCK_X509_CRL);
                sk_X509_REVOKED_sort(crl->crl->revoked);
                CRYPTO_w_unlock(CRYPTO_LOCK_X509_CRL);
                }
        idx = sk_X509_REVOKED_find(crl->crl->revoked, &rtmp);
</snip>

so that internal_find() in crypto/stack/stack.c essentially performs a
conditional, LOCKED sort(). It would probably be best to put the
locking code in sk_sort() itself, it mimics the essential bit of
x_crl.c best:

void sk_sort(_STACK *st)
        {
        if (st && !st->sorted)
                {
                CRYPTO_w_lock(CRYPTO_STACK_SORT_LOCK);
                qsort(st->data,st->num,sizeof(st->data[0]), (int (*)(const void 
*,
const void *))(st->comp));
                st->sorted=1;
                CRYPTO_w_unlock(CRYPTO_STACK_SORT_LOCK);
                }
        }

where CRYPTO_STACK_SORT_LOCK is a lock for this purpose alone, so we
don't clash with other, possibly surrounding locks and create
ourselves a serious case of deadlock.
Meanwhile, this approach is incomplete until sk_insert() is graced
with mandatory sk_sort() call at the end. If you don;t do that, yes,
sk_insert() will be faster (much faster for large stacks), but will
also keep this thread safety issue alive for ever.

Alternatively, one might consider redoing sk_insert() in such a way
that it inserts the item in the stack in such a way as to keep the
stack sorted from start to end: it's an elemental operation for an
insertion sort anyhow, so we could write it like that. ;-)


Opinions, anyone?
(Yes, visiting all the places listed above and adding a sort() after
each each bloody push() I consider an extreme form of kludge. Yech.
Besides, I'll need to do the same to my insignificant custom code, so
you can imagine my opinion if that particular 'solution' is picked.
Besides, appending each push with a sort() is a very convoluted and
fault-inviting alternative way of having sk_insert() act as an
insert-sort elemental operation. Minus the speed gains, compared to
when we do it *inside* sk_insert().)

Yours,

Ger









On Fri, Nov 28, 2008 at 2:33 PM, Peter Edwards via RT <[EMAIL PROTECTED]> wrote:
> Hi,
> I previously reported this issue with the subject "ssl3_get_server_hello
> not threadsafe(?)" to -dev. Rephrased a bit and reposted to -bugs. This
> is definitely relevant for 0.9.8i, but seems to be so for current
> snapshots too.
>
> I've a simple program (used as a stress tool) that creates a number of
> threads, and uses a shared SSL_CTX to create an SSL object for each
> thread. My understanding is that this fits with the intended
> thread-safety model of the library: the SSL_CTX structure is intended to
> be shareable between different threads without additional locking by the
> application. Sporadically, I get "SSL_R_WRONG_CIPHER_RETURNED" errors as
> the threads negotiated with the server. Stepping through to reproduce
> the problem, I got to ssl3_get_server_hello(), where (at ssl/s3_clnt.c
> line 802 in 0.9.8i) it does this:
>
>  >        sk=ssl_get_ciphers_by_id(s);
>  >        i=sk_SSL_CIPHER_find(sk,c);
>
> The stack returned by ssl_get_ciphers_by_id happens to be the one from
> the SSL_CTX structure: there's no specific set of ciphers for the SSL
> object itself. This stack is shared between all threads accessing the
> SSL_CTX structure.
>
> The implementation of sk_SSL_CIPHER_find is a macro that ends up
> invoking "sk_find", in turn invoking "internal_find". This calls
> sk_sort() before doing a binary search of the stack. This means that
> sk_sort can be invoked from different threads simultaneously without any
> lock protection. A quick analysis of the instance of the SSL_CTX
> structure at the point that I get my "wrong cipher returned" error shows
> that the "sorted" field of the stack is indeed set, but when running
> through the list of ciphers, I see that the ID fields are mostly, but
> not completely, sorted - hence the intermittent failure of
> sk_SSL_CIPHER_find.
>
> I think the best way to avoid this problem is to ensure that the cipher
> stack is sorted before there's any possibility of different threads
> using the SSL_CTX structure: My reading is that the ciphers_by_id will
> end up being sorted most of the time anyway, and the overhead of sorting
> the small number of items in it justifies sorting it earlier rather than
> later. The attached patch fixes the problem locally for me, but I'm not
> familiar enough with OpenSSL to be sure its the best thing to do.
>
>
>
> diff -u -r1.1.1.2 ssl_ciph.c
> --- ssl/ssl_ciph.c      18 Nov 2008 12:53:20 -0000      1.1.1.2
> +++ ssl/ssl_ciph.c      28 Nov 2008 11:41:41 -0000
> @@ -1088,6 +1088,14 @@
>        *cipher_list_by_id = tmp_cipher_list;
>        
> (void)sk_SSL_CIPHER_set_cmp_func(*cipher_list_by_id,ssl_cipher_ptr_id_cmp);
>
> +        /*
> +         * XXX: Without this, the ostensibly readonly operation
> +         * sk_SSL_CIPHER_find will invoke the sort. A single SSL_CTX object
> +         * shared between multiple threads could therefore be updated
> +         * unsafely
> +         */
> +        sk_SSL_CIPHER_sort(*cipher_list_by_id);
> +
>        return(cipherstack);
>        }
>
>
>



-- 
Met vriendelijke groeten / Best regards,

Ger Hobbelt

--------------------------------------------------
web:    http://www.hobbelt.com/
        http://www.hebbut.net/
mail:   [EMAIL PROTECTED]
mobile: +31-6-11 120 978
--------------------------------------------------
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [EMAIL PROTECTED]

Reply via email to