I agree that it's not optimal, but at the same time there's no real
reason to feel bad about it.
But on the other hand I've done all the work to implement the macros
to do the PIC sequence properly.  You really don't have to implement
anything.

BTW, two other points need restating:

1) My macros handle the non-PIC case optimally.

2) Your estimate of the RAS corruption cost considers only the
   most immediate effect, on the return from the assembler
   routine in question.

   The true RAS miss cost must be multiplied across the next N
   functions up in the call chain, where N is the size of the
   RAS, since all of those will miss as well.
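
To make point 2 concrete, here is a toy model of a return address
stack, written in Python. It is only a sketch under stated assumptions:
the RAS is treated as a fixed-size stack that drops its oldest entry on
overflow, and the corruption is modelled as a single unmatched call
that pushes one stray entry. The function name and parameters are
hypothetical, and real predictors may behave differently.

```python
def count_return_misses(call_depth, ras_size, stray_call):
    """Count mispredicted returns while unwinding call_depth nested calls.

    Toy model only: the RAS is a stack of at most ras_size predictions,
    and stray_call adds one push that no return ever matches.
    """
    ras = []       # predicted return addresses, most recent last
    expected = []  # real return addresses, innermost last

    def push(addr):
        ras.append(addr)
        if len(ras) > ras_size:
            ras.pop(0)  # oldest prediction falls off the bottom

    # Each nested call pushes its return address (modelled as its level).
    for level in range(call_depth):
        push(level)
        expected.append(level)

    if stray_call:
        push("stray")  # assumption: one unmatched call in the innermost routine

    # Unwind: every return pops one prediction and compares it to reality.
    misses = 0
    while expected:
        actual = expected.pop()
        predicted = ras.pop() if ras else None
        if predicted != actual:
            misses += 1
    return misses


if __name__ == "__main__":
    for stray in (False, True):
        print(stray, count_return_misses(call_depth=4, ras_size=4, stray_call=stray))
```

In this model the stray entry shifts every later prediction by one, so
each return that still has an entry in the RAS mispredicts, not just
the first one.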

N is 4 on UltraSPARC. For comparison, in the AES case the call depth
from EVP_encrypt to the assembly code is 4, so the penalties don't
spill over onto the caller. [Apparently we are talking about an
obsolete platform, as on T4 I measure no performance difference between
the sequences shown in the last message.]

All I'm saying is that it doesn't have to be classified as "absolutely
critical to fix." In this context I'd prefer not to touch
aes-sparcv9.pl and to stick to the "aesni" approach as the only one,
i.e. keep the T4 code as a separate module referenced from EVP. That
lets us concentrate on what matters, namely optimizing the performance
of specific modes. By extension, it's the preferred approach for other
ciphers as well.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
