Re: [PATCH] sparc: Add support for montmul and montsqr opcodes.

Andy Polyakov Sat, 20 Oct 2012 11:59:18 -0700

Secondarily, since we can end up having to retry (deep window spill on
32-bit and register ECC errors on 32-bit and 64-bit)

I'm thinking about dropping the check after *every* montsqr, issuing
multiple montsqr back to back and only then checking for the retry
condition. One could do this only for inputs shorter than a specific
length. What do you think?


This gets to the issue of outputs aliasing an input.

And? It's just that we don't try to identify which particular montsqr
failed, but rather a short sequence of them, and in case of failure we
retry the sequence, not a single instruction. Why not detect a specific
instruction failure? In 32-bit mode, detection involves traversing back
through the register windows in order to detect whether the subroutine
has suffered from a window flush. But even in 64-bit mode, if we were
to detect a specific instruction failure, we would have to traverse the
windows back to the window holding the result in order to save the
correct one from the previous instruction. It appears to be an
expensive operation, at least for shorter keys (as mentioned, vis3-mont
delivered a better result on RSA1024 sign). And as failure is seldom,
it makes sense to share the costs. So the suggestion is to fire several
montsqr without looking at the result, accumulate FSR.fcc3, and only
then traverse the windows back in order to either save the result from
the sequence or discard it and reload the inputs to the sequence.
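
To make the control flow of the batched retry concrete, here is a minimal
sketch in portable C. It is a simulation under stated assumptions, not T4
code: toy_montsqr() is a hypothetical stand-in for the montsqr instruction,
the OR-ed `failed` flag stands in for the accumulated FSR.fcc3 condition,
and NWORDS, BATCH and the fault-injection counter are invented for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NWORDS 4  /* hypothetical operand size in 64-bit words */
#define BATCH  5  /* hypothetical number of back-to-back squarings */

/* Fault injection for the simulation: -1 means "never fault",
 * n >= 0 means "fault after n more successful calls". */
static int calls_until_fault = -1;
static unsigned total_calls;

/* Hypothetical stand-in for one montsqr: updates r in place and
 * returns non-zero if the operation failed and must be retried.
 * The arithmetic is a toy, not a real Montgomery squaring. */
static int toy_montsqr(uint64_t r[NWORDS])
{
    total_calls++;
    if (calls_until_fault == 0) {
        calls_until_fault = -1;
        memset(r, 0xAA, NWORDS * sizeof r[0]); /* result is garbage */
        return 1;
    }
    if (calls_until_fault > 0)
        calls_until_fault--;
    for (size_t i = 0; i < NWORDS; i++)
        r[i] = r[i] * r[i] + 1;
    return 0;
}

/* Perform `count` squarings of inout, checking for failure only once
 * per batch. Returns the number of batches that had to be retried. */
static unsigned sqr_batched(uint64_t inout[NWORDS], unsigned count)
{
    uint64_t regs[NWORDS]; /* plays the role of the register windows */
    unsigned retries = 0;

    assert(inout != NULL);
    while (count > 0) {
        unsigned n = count < BATCH ? count : BATCH;
        int failed;

        do {
            memcpy(regs, inout, sizeof regs); /* (re)load the inputs */
            failed = 0;
            for (unsigned i = 0; i < n; i++)
                failed |= toy_montsqr(regs);  /* no per-op check */
            if (failed)
                retries++;
        } while (failed);

        /* Commit only after a clean batch, so the saved inputs are
         * never overwritten by a result that may be discarded. */
        memcpy(inout, regs, sizeof regs);
        count -= n;
    }
    return retries;
}
```

In this model a single fault costs one extra batch of squarings rather
than a window traversal after every instruction, which is the cost
sharing described above; whether that trade-off pays off on real
hardware would have to be measured.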

A question in the context of a 32-bit application. My understanding is
that in order to detect whether a multi-window subroutine, such as the
one we have to use here, has suffered from a window flush (as a result
of a context switch or delivery of an asynchronous signal), it's
sufficient to detect whether the current window has been reloaded. I
mean it never flushes, say, just a couple of top windows, but all of
them. Is this understanding correct?

One annoying aspect of all of this is that we need to use
a temporary on-stack location for the result until we know
we don't have to do a retry.  Otherwise we might corrupt
one of the inputs.

Really, the thing to do is to put the whole RSA/DSA/etc. path
into a specially written T4 code block.  That way we won't have
to deal with details such as the fact that the words in the
openssl bignum layout are transposed to what the T4 engine
wants in the registers, etc.

The above suggestion implies that we do break the dependence on
bn_mul_mont and do something else. Naturally it allows for amortizing
various overheads such as transposing the words in input and output
vectors. So don't worry, it's all considered ;-)


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
