http://cvs.openssl.org/chngview?cn=22966 is based on the submission and takes it a notch up by boosting RSA *sign* performance: RSA1024 sign became ~2x faster and RSA2048 ~1.7x faster. I chose not to generate code for all input lengths, only for 512, 1024, 1536 and 2048 bits. The rationale is that subroutines operating on shorter inputs (usable in ECp) didn't deliver better performance than the VIS3 code, while the whole spectrum of subroutines operating on longer inputs is effectively a waste of space. The fact that the shorter subroutines fail to deliver better performance might have more to do with the way ECp calculations are arranged. In other words, if ECp is overhauled, the situation is likely to change.
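
To illustrate the length selection, here is a minimal sketch of the kind of check a caller could make. The function name is hypothetical and not taken from the commit; only the four bit lengths come from the text above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (name invented for illustration): reports whether
 * a dedicated subroutine was generated for a given input length in bits.
 * Only the four lengths named above qualify; every other length is
 * expected to be handled by some other code path.
 */
static int has_dedicated_subroutine(size_t bits)
{
    switch (bits) {
    case 512:
    case 1024:
    case 1536:
    case 2048:
        return 1;
    default:
        return 0;
    }
}
```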

As for the 32-bit code and the earlier discussed concerns about its sensitivity to interrupts (on Solaris and non-bleeding-edge Linux): at the timer interrupt rate the penalties are hardly measurable even at the longest key length, so the suggestion is to execute the code for all applicable lengths. If an operation fails, it's retried once and then falls back to the VIS3 code path. It's possible to arrange for some kind of "stickiness", where a number of failures in a row limits attempts to execute the subroutines in question. I mean a counter that counts failures and makes the code skip the attempt while the counter is high. The fall-back can decrement it, so that after several steps the code can try again.
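
The retry-then-fall-back scheme with a "sticky" counter could look roughly like the sketch below. All names, the threshold and the penalty are assumptions made up for illustration, and none of this is the committed code. The two demo_* functions are stand-ins for the real subroutine and the VIS3 path.

```c
#include <stddef.h>

/* Hypothetical callback type: returns 1 on success, 0 on failure. */
typedef int (*op_fn)(void *arg);

#define SKIP_THRESHOLD 8   /* assumed: skip the fast path at or above this */
#define FAIL_PENALTY   4   /* assumed: added when both attempts fail */

/* A plain global keeps the sketch short; real code would have to make
 * this per-thread or atomic. */
static int fail_score = 0;

static int run_with_fallback(op_fn fast, op_fn fallback, void *arg)
{
    if (fail_score < SKIP_THRESHOLD) {
        /* Try the fast path, retrying once if it fails. */
        for (int attempt = 0; attempt < 2; attempt++) {
            if (fast(arg))
                return 1;
        }
        fail_score += FAIL_PENALTY;   /* both attempts failed */
    } else {
        /* Counter is high: skip the attempt, but let it decay so the
         * fast path gets tried again after several fall-backs. */
        fail_score--;
    }
    return fallback(arg);
}

/* Stand-ins for demonstration only. */
struct counters { int fast_calls; int fallback_calls; int fast_ok; };

static int demo_fast(void *arg)
{
    struct counters *c = arg;
    c->fast_calls++;
    return c->fast_ok;
}

static int demo_fallback(void *arg)
{
    struct counters *c = arg;
    c->fallback_calls++;
    return 1;
}
```

With these assumed constants, a fast path that fails persistently ends up being attempted on roughly one call in five, while a single isolated failure costs only the one retry.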
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
