Hi Bodo (and anyone else interested),

Just a thought I was having about locking and things. Rather than us
worrying so much about how to do per-object locking (as opposed to our
current per-class locking), I wonder if it's worth considering how to
minimize the number and complexity of operations that happen under this
global locking. Anyway, "per-object" locking can be manufactured out of
per-class locking by simply using the class locks to sychronise access to
integers in the objects that themselves measure on/off mutexing, semaphore
counters, or whatever. I won't say it's a great idea, but it's possible.
My idea though is that the same principle could at least gain us some
better mileage with the per-class locking if we use those locks only to
flick switches and the like, but otherwise try and put all the meaty
structural operations *outside* such locking.

This seems to be one such case;

On Mon, 18 Dec 2000 [EMAIL PROTECTED] wrote:

>   Obtain lock CRYPTO_LOCK_RSA before creating BN_MONT_CTX
>   structures and setting rsa->_method_mod_{n,p,q}.

Rather than creating the BN_MONT_CTX structures inside a lock, surely it's
a better idea to compete via a lock for *who* will assign the per-computed
data to the structures. Doing the actual computations, allocations, etc
should not need to happen inside a lock. I think ... perhaps you or anyone
else can put me straight if I'm missing something salient.

Eg.

>   Index: rsa_eay.c
>   ===================================================================
>   RCS file: /e/openssl/cvs/openssl/crypto/rsa/rsa_eay.c,v
>   retrieving revision 1.14
>   retrieving revision 1.14.2.1
>   diff -u -r1.14 -r1.14.2.1
>   --- rsa_eay.c       2000/06/01 22:18:44     1.14
>   +++ rsa_eay.c       2000/12/18 16:36:07     1.14.2.1
>   @@ -138,9 +138,24 @@
>       
>       if ((rsa->_method_mod_n == NULL) && (rsa->flags & RSA_FLAG_CACHE_PUBLIC))
>               {
>   -           if ((rsa->_method_mod_n=BN_MONT_CTX_new()) != NULL)
>   -                   if (!BN_MONT_CTX_set(rsa->_method_mod_n,rsa->n,ctx))
>   -                       goto err;
>   +           CRYPTO_w_lock(CRYPTO_LOCK_RSA);
>   +           if (rsa->_method_mod_n == NULL)
>   +                   {
>   +                   BN_MONT_CTX* bn_mont_ctx;
>   +                   if ((bn_mont_ctx=BN_MONT_CTX_new()) == NULL)
>   +                           {
>   +                           CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
>   +                           goto err;
>   +                           }
>   +                   if (!BN_MONT_CTX_set(bn_mont_ctx,rsa->n,ctx))
>   +                           {
>   +                           BN_MONT_CTX_free(bn_mont_ctx);
>   +                           CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
>   +                           goto err;
>   +                           }
>   +                   rsa->_method_mod_n = bn_mont_ctx;
>   +                   }
>   +           CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
>               }

[snip]

What if the code was structured as follows;

    if((rsa->_method_mod_n == NULL) && [etc])
        {
        BN_MONT_CTX *bn_mont_ctx;
        int bailout;
        if((bn_mont_ctx=BN_MONT_CTX_new()) == NULL)
            [do error stuff]
        if(!BN_MONT_CTX_set(....))
            [do error stuff]
        bailout = 0;
        /* Only now do we grab the lock to ensure threads don't race to
         * assign montgomery stuff to the RSA structure. */
        CRYPTO_w_lock(CRYPTO_LOCK_RSA);
        /* Now we check the _method_mod_n and stuff *inside* the lock */
        if((rsa->_method_mod_n == NULL) & [etc])
             rsa->_method_mod_n = bn_mont_ctx;
        else
             bailout = 1;
        CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
        if(bailout)
             /* Release the pre-calculated montgomery stuff, we had
              * threads race on *this* RSA structure so we wasted some
              * effort in this case. */
             BN_MONT_CTX_free(bn_mont_ctx);
        }

In this case, (as with your existing code), the initial test doesn't
require a lock - it will test again inside a lock if it gets that far.
However, the penalty in my variant is that if threads race then some may
end up creating montgomery stuff only to find out inside the lock that the
work has just been done by another thread. The worst that can happen is
that more than one thread does that same work in the event of a race (the
race is specific to *this* RSA object). However, that's a risk of wasted
effort local to just this particular RSA structure - the global lock is
only being used to seal off the final check & assign of the precalculated
data to the structure. Conversely, the risk in the existing code seems to
be that all such operations on *all* RSA objects block.

In multi-threading apps (where one assumes threading is being used to
maintain high concurrency), it seems (by thinking of classical SSL
examples) that there's far less likely to be a race to initialise the same
RSA structure than there is to be initialising montgomery stuff for
distinct RSA structures. Eg. ephemeral keys, temporary keys, per-session
keys, whatever. IMHO it'd be better to take a start-up hit in the form of
a risk of redundant calculation on an internal key (used across all
threads)  than to jeopardise concurrency in the long-term becase the
threads are doing calculations on their own RSA keys inside global locks.

Thoughts?

Cheers,
Geoff


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]

Reply via email to