http://cvs.openssl.org/chngview?cn=22966 is based on the submission and
takes it a notch up by boosting RSA *sign* performance: RSA1024 sign became
~2x faster and RSA2048 sign ~1.7x faster. I chose not to generate code for
all input lengths, only for 512, 1024, 1536 and 2048 bits. The rationale is
that subroutines operating on shorter inputs (usable in ECp) didn't deliver
better performance than the VIS3 code, while a whole spectrum of subroutines
operating on longer inputs is effectively a waste of space. The fact that
the shorter subroutines fail to deliver better performance might have more
to do with the way ECp calculations are arranged. In other words, if ECp
is overhauled, the situation is likely to change.
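
To illustrate the length selection, here is a minimal sketch in C. The
function name is hypothetical and is not an identifier from the commit; it
only encodes the four lengths listed above and says nothing about how the
actual code dispatches.

```c
/*
 * Hypothetical helper, not taken from the OpenSSL sources.
 * Returns 1 if a dedicated subroutine was generated for this
 * modulus length (in bits), 0 if the VIS3 code path should be used.
 */
static int has_dedicated_subroutine(int bits)
{
    switch (bits) {
    case 512:
    case 1024:
    case 1536:
    case 2048:
        return 1;
    default:
        /* shorter (ECp-sized) and longer inputs are not covered */
        return 0;
    }
}
```

With this, a 256-bit ECp modulus or a 4096-bit RSA modulus would go to the
VIS3 code path.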

As for the 32-bit code and the earlier discussed concerns about its
sensitivity to interrupts (on Solaris and non-bleeding-edge Linux): at the
timer interrupt rate the penalties are hardly measurable even at the longest
key length, so the suggestion is to execute the code for all applicable
lengths. If an operation fails, it's retried once and then falls back to
the VIS3 code path. It's possible to arrange for some kind of "stickiness",
where a number of failures in a row limits attempts to execute the
subroutines in question. I mean a counter that counts failures and stops
further attempts while the counter is high. The fall-back can decrement it,
so that after several steps the code can try again.
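
The retry-and-fall-back idea with a "sticky" counter could be sketched
roughly as follows. This is only one possible reading of the suggestion, not
code from the commit: all names, the penalty of 3 and the limit of 6 are
assumptions, as is resetting the counter on success. The fast routine is
tried at most twice; each failed operation raises the counter; while the
counter is at or above the limit the fast routine is skipped and the
fall-back decrements the counter.

```c
#include <assert.h>
#include <stddef.h>

/* All names and constants are hypothetical, not OpenSSL identifiers. */
#define STICKY_PENALTY 3  /* added to the counter per failed operation */
#define STICKY_LIMIT   6  /* at or above this, skip the fast routine   */

typedef int (*mul_fn)(void *arg);  /* returns 1 on success, 0 on failure */

static int sticky_counter = 0;

static int mul_with_fallback(mul_fn fast, mul_fn fallback, void *arg)
{
    assert(fast != NULL && fallback != NULL);

    if (sticky_counter >= STICKY_LIMIT) {
        /* too many recent failures: don't even try, just cool down */
        sticky_counter--;
        return fallback(arg);
    }
    if (fast(arg) || fast(arg)) {  /* first attempt, then one retry */
        sticky_counter = 0;        /* assumption: success clears history */
        return 1;
    }
    sticky_counter += STICKY_PENALTY;
    return fallback(arg);          /* stands in for the VIS3 code path */
}

/* Stubs for exercising the logic: a fast routine that always fails. */
static int fast_calls = 0, fallback_calls = 0;
static int stub_fast_fails(void *arg) { (void)arg; fast_calls++; return 0; }
static int stub_fallback(void *arg)   { (void)arg; fallback_calls++; return 1; }
```

With a fast routine that fails every time, this settles into one
(double) attempt per four operations. A real implementation would also have
to decide whether the counter is per-thread or global.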
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [email protected]
Automated List Manager [email protected]