Re: [PATCH] sparc: Add support for montmul and montsqr opcodes.

Andy Polyakov Sun, 21 Oct 2012 04:06:51 -0700

It's just that we don't try to identify which particular montsqr
failed, but a short sequence of them. And in case of failure we
retry the sequence, not a single instruction. Why not detect specific
instruction failure? In 32-bit mode detection involves traversing
back the register windows in order to detect if the subroutine has
suffered from a window flush.


Right, and time is your worst enemy for this issue.  You want to
minimize, not increase, the amount of time that those top level
register windows are live in the chip and potentially flushed out.

Yes, it's a race, but there is a chance of not losing it. Here are your
results:


rsa 1024 bits 0.000786s 0.000025s   1272.6  40080.4
rsa 2048 bits 0.002111s 0.000049s    473.8  20273.7

And here are vis3-mont ones:

rsa 1024 bits 0.000671s 0.000033s   1490.4  30207.6
rsa 2048 bits 0.003375s 0.000119s    296.3   8404.4

Last column is public key operations per second, one before last is
private key ones. Private key operations are server-side ones and
therefore are most interesting/critical. RSA1024 delivers more than 1000
private key operations per second. Each private key operation is about
1000 512-bit squarings (thanks to Chinese remainder theorem, right?).
Timer frequency of 1000 ticks per second means that as little as 0.1% of
squarings will be hit by timer interrupt. If we restart sequence of 5
instead of single instruction, then penalties from restarted sequences
go up to 0.5%. Still totally reasonable price to pay. But note how much
faster your RSA2048 *public* key operations are. Is it plausible to
assume that RSA1024 can be improved by say 2x factor?
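The 0.1% and 0.5% figures follow from simple arithmetic, sketched below in
Python. The function name and parameters are illustrative only, and the
model assumes that each timer tick spoils at most one retry unit and that
timer ticks are the only source of window flushes.

```python
def restart_fraction(private_ops_per_sec, squarings_per_op,
                     ticks_per_sec, seq_len=1):
    """Fraction of squaring work that has to be redone.

    Assumption: every timer tick lands inside exactly one retry unit of
    seq_len squarings, and that whole unit is executed again.
    """
    squarings_per_sec = private_ops_per_sec * squarings_per_op
    redone_per_sec = ticks_per_sec * seq_len
    return redone_per_sec / squarings_per_sec


# Round figures from the message: about 1000 RSA1024 private key
# operations per second, about 1000 squarings each, 1000 ticks per second.
single = restart_fraction(1000, 1000, 1000, seq_len=1)
sequence = restart_fraction(1000, 1000, 1000, seq_len=5)
print(f"single-instruction retry: {single:.1%}")
print(f"sequence-of-5 retry:      {sequence:.1%}")
```

Plugging in a higher tick rate or a longer sequence shows how quickly the
penalty grows, which is the "swing room" trade-off for shorter keys.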

Of course more often interrupts would ruin it, but there is swing room
[for shorter key lengths!]. Of course fall-back is required. But it's
required in either case, for sequence or single instruction retry, it's
just that retry detection will be slower for sequence...

Therefore, for systems that don't have support for a biased 64-bit
stack in 32-bit processes, you should check after every operation.

But as we seem to agree that code with sequence retry is worth
implementing for 64-bit [and biased stack] account, it would be trivial
to check the above theory in 32-bit process context ;-)

Question in context of 32-bit application. My understanding is that in
order to detect if a multi-window subroutine such as the one we have to
use here has suffered from a window flush (as result of context switch or
delivery of asynchronous signal) it's sufficient to detect if the current
window is reloaded.


What usually happens is something as simple as a device interrupt, or
the per-cpu timer interrupt, comes in.  That's enough to blow the top
register window and cause a restart.

Question was if it *always* blows away the top window. So that it can be
used as canary for early exit even from between instructions in the
sequence. Another question is following. Imagine I traversed register
windows down to one with result. Imagine that so far all windows were
found intact. Does it guarantee that even bottom window is intact? The
one holding M? The question is if I can save the result without
examining the bottom window. On the other hand I can copy part of the
result residing in integer registers to floating point register bank
(yes, zapping M), get down to bottom window and then decide if result is
valid or not.

Which means two things:

1) We have to limit the retries and fallback to software if necessary.


Yes.
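
A bounded retry with software fallback could look roughly like the
following Python sketch. All names here are hypothetical: `hw_attempt`
stands in for the montsqr sequence and is assumed to return None when a
window flush invalidated the result, and `sw_fallback` stands in for the
existing software code path.

```python
def bounded_retry(hw_attempt, sw_fallback, max_retries=3):
    """Try the hardware path at most max_retries times, then fall back.

    hw_attempt() returns the result, or None if it was invalidated.
    sw_fallback() always returns a valid result.
    """
    for _ in range(max_retries):
        result = hw_attempt()
        if result is not None:
            return result
    return sw_fallback()


def make_flaky(outcomes):
    """Build a fake hardware attempt that replays the given outcomes."""
    it = iter(outcomes)
    return lambda: next(it)


a, m = 7, 13
software = lambda: (a * a) % m

# Second hardware attempt succeeds, software path is never used.
recovered = bounded_retry(make_flaky([None, (a * a) % m]), software)
# Every hardware attempt is invalidated, so the software path is used.
fell_back = bounded_retry(make_flaky([None, None, None]), software)
print(recovered, fell_back)
```

The retry limit is an arbitrary choice in this sketch; the right value
depends on how expensive detection is relative to the software path.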

2) The only reasonable thing to do longer term is the biased stack
   idea we've designed the other day.  I'm almost done with an
   implementation for Linux and I'll let you know when it's running
   on the T4 test system.

Yes, but then there is Solaris and there are users who do not
necessarily run bleeding edge. I mean yes, but both should be supported
in either case. "Both" means "32-bit processes and 32-bit processes with
misaligned stack".

BTW, we could create even a JIT compiler for this.

While it would be totally cool, I'd prefer to adhere to static code. At
the very least auto-generated code would be impossible to FIPS-validate.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
