The turn came to BN_mul() and the following code:

                        /* symmetric and > 4 */
                        /* 16 or larger */
                        j=BN_num_bits_word((BN_ULONG)al);
                        j=1<<(j-1);
                        k=j+j;
                        t = BN_CTX_get(ctx);
                        if (al == j) /* exact multiple */
                                {
                                bn_wexpand(t,k*2);
                                bn_wexpand(rr,k*2);
                                bn_mul_recursive(rr->d,a->d,b->d,al,t->d);
                                }
                        else
                                {
                                bn_wexpand(a,k);
                                bn_wexpand(b,k);
                                bn_wexpand(t,k*4);
                                bn_wexpand(rr,k*4);
                                for (i=a->top; i<k; i++)
                                        a->d[i]=0;
                                for (i=b->top; i<k; i++)
                                        b->d[i]=0;
                                bn_mul_part_recursive(rr->d,a->d,b->d,al-j,j,t->d);
                                }

The else part is what's interesting.  If I understand everything
correctly, the reason why a and b are expanded (to something that's
between al+1 and al+(al+1)/2 in size for a and b alike) is that
bn_mul_part_recursive() uses some routines that require the data sent
to them to be equally big.  The expansion is therefore done to
guarantee that a->d and b->d is at least 2*(al-j) in length.

However, I'm right now pondering how hard it would be to make a
variant of said routines (bn_cmp_words(), bn_add_words() and
bn_sub_words()) that can take two arrays of differing sizes.  At least
for bn_cmp_words it would be trivial, since the size would be the
deciding factor instead of the contents and is very quick to check,
and for the others I can't imaging at first glance that it would be
too hard to implement.  One might argue that it's yet another check
and yet another performance hit, but what one also needs to think is
that the expansion of a and b would then be unneeded, which is a few
checks and possibly a realloc check less, so when you sum it up, I
wonder if the performance would really be decreased.  Also, consider
that variants of bn_add_words() and bn_sub_words() that took arrays of
differing sizes would do less calculations, which I think is a gain as
well, compared to what would be an increase of plain memory copying.

Again, if someone could review my thinking, I'd be the happier.

And no, I haven't done any actual performance tests on this...  yet.

-- 
Richard Levitte   \ Spannv�gen 38, II \ [EMAIL PROTECTED]
Chairman@Stacken   \ S-168 35  BROMMA  \ T: +46-8-26 52 47
Redakteur@Stacken   \      SWEDEN       \ or +46-709-50 36 10
Procurator Odiosus Ex Infernis                -- [EMAIL PROTECTED]
Member of the OpenSSL development team: http://www.openssl.org/
Software Engineer, Celo Communications: http://www.celocom.com/

Unsolicited commercial email is subject to an archival fee of $400.
See <http://www.stacken.kth.se/~levitte/mail/> for more info.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]

Reply via email to