Hi,

I just found out that the openssl package of Slackware for ARM 14.0 doesn't use the assembler optimizations available for ARMv4 in OpenSSL. Since all the packages are built for the baseline architecture of ARMv5te, enabling the optimizations shouldn't affect compatibility with any of the platforms supported by Slackware for ARM 14.0.* It would, however, have a huge impact on the performance of OpenSSL and very likely of all the programs that use the OpenSSL libraries (like OpenSSH).

(* I am not 100% sure about that because I am no expert on the different ARM architectures, so I may be wrong here. Maybe someone on the mailing list with more expertise in this matter can confirm my assumption or correct me?)


I rebuilt the OpenSSL package and ran some tests on my Sheevaplug. As you can see, the results are pretty impressive (the original output of the "openssl speed" command is much longer; this is just an excerpt):


OpenSSL 1.0.1c, default package (no optimization)

The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               3309.68k    11805.99k    34086.83k    64614.06k    87405.91k
aes-128 cbc      10003.75k    11224.32k    11537.49k    11619.33k    11635.37k
aes-192 cbc       8812.01k     9592.04k     9819.65k     9879.89k     9890.47k
aes-256 cbc       7777.37k     8376.77k     8548.95k     8593.75k     8601.60k
sha256            2670.09k     6356.61k    11432.70k    14294.36k    15349.08k
sha512             412.84k     1650.65k     2386.60k     3272.36k     3670.02k

                 sign    verify    sign/s verify/s
rsa  512 bits 0.002360s 0.000218s    423.8   4589.0
rsa 1024 bits 0.012267s 0.000625s     81.5   1599.4
rsa 2048 bits 0.074701s 0.002067s     13.4    483.8
rsa 4096 bits 0.494286s 0.007278s      2.0    137.4


OpenSSL 1.0.1c, with ARMv4 assembler optimization enabled

The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               4078.73k    14215.30k    38787.16k    68390.57k    88432.50k
aes-128 cbc      15095.54k    16847.00k    17382.31k    17521.32k    17555.46k
aes-192 cbc      13213.38k    14536.19k    14932.91k    15036.76k    15059.63k
aes-256 cbc      11756.32k    12785.60k    13089.37k    13167.96k    13186.39k
sha256            5125.39k    12176.15k    21252.61k    26089.81k    28068.52k
sha512            1446.27k     5780.57k     8251.65k    11255.81k    12593.83k

                  sign    verify    sign/s verify/s
rsa  512 bits 0.001101s 0.000106s    907.9   9456.0
rsa 1024 bits 0.005549s 0.000313s    180.2   3190.6
rsa 2048 bits 0.035971s 0.001120s     27.8    892.8
rsa 4096 bits 0.257692s 0.004279s      3.9    233.7
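
If you want to compare runs of your own, the throughput rows printed by "openssl speed" are easy to pick apart. Here is a minimal Python sketch, assuming each row looks like the ones in the excerpts above (an algorithm name followed by figures ending in "k"). The function name parse_throughput_row is made up for this example and is not part of OpenSSL:

```python
import re

# One throughput figure such as "3309.68k" (thousands of bytes per second).
FIGURE = re.compile(r"(\d+(?:\.\d+)?)k\b")


def parse_throughput_row(line):
    """Split a row like 'aes-128 cbc 10003.75k ...' into (name, [figures])."""
    matches = list(FIGURE.finditer(line))
    if not matches:
        raise ValueError("no throughput figures found in: %r" % line)
    # Everything before the first figure is the algorithm name,
    # which may contain spaces (e.g. "aes-128 cbc").
    name = line[:matches[0].start()].strip()
    figures = [float(m.group(1)) for m in matches]
    return name, figures


# Example row copied from the unoptimized run above.
name, figures = parse_throughput_row(
    "sha512 412.84k 1650.65k 2386.60k 3272.36k 3670.02k"
)
print(name, figures)
```

The figures come back in the order of the block-size columns (16, 64, 256, 1024 and 8192 bytes). Header lines and the RSA rows contain no figure ending in "k", so the function raises ValueError for them.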


Short summary (performance increase, numbers rounded)

aes-128 cbc: +50% (16 bytes)
aes-192 cbc: +50% (16 bytes)
aes-256 cbc: +50% (16 bytes)
sha256: +90% (16 bytes)
sha512: +250% (16 bytes)

rsa 512 bits: +115% (sign) / +105% (verify)
rsa 1024 bits: +120% / +100%
rsa 2048 bits: +110% / +85%
rsa 4096 bits: +100% / +70%
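
For anyone who wants to redo the arithmetic, here is a small Python sketch that recomputes the increases from the figures in the two runs above. It assumes the increase is simply (after / before - 1) * 100, with the RSA values taken from the sign/s and verify/s columns. The names gain_percent, THROUGHPUT_16_BYTES and RSA_OPS_PER_SEC are only illustrative. The script prints unrounded values, so they differ a little from the rounded numbers in the summary:

```python
def gain_percent(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0


# (before, after) throughput at 16-byte blocks, in 1000s of bytes per second,
# copied from the two benchmark runs.
THROUGHPUT_16_BYTES = {
    "aes-128 cbc": (10003.75, 15095.54),
    "aes-192 cbc": (8812.01, 13213.38),
    "aes-256 cbc": (7777.37, 11756.32),
    "sha256": (2670.09, 5125.39),
    "sha512": (412.84, 1446.27),
}

# Key size -> ((sign/s before, after), (verify/s before, after)).
RSA_OPS_PER_SEC = {
    512: ((423.8, 907.9), (4589.0, 9456.0)),
    1024: ((81.5, 180.2), (1599.4, 3190.6)),
    2048: ((13.4, 27.8), (483.8, 892.8)),
    4096: ((2.0, 3.9), (137.4, 233.7)),
}

for name, (old, new) in THROUGHPUT_16_BYTES.items():
    print("%-12s %+.1f%% (16 bytes)" % (name + ":", gain_percent(old, new)))

for bits, (sign, verify) in RSA_OPS_PER_SEC.items():
    print("rsa %4d bits: %+.1f%% (sign) / %+.1f%% (verify)"
          % (bits, gain_percent(*sign), gain_percent(*verify)))
```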


If you want to test this yourself, just pass the target "linux-armv4" when you run the Configure script from OpenSSL, or apply the patch [1] to the debian-targets.patch file before running the SlackBuild script. Warning: because OpenSSL is removed during the build process, you won't be able to log in with SSH during the build or until you reinstall the openssl package. So don't forget to temporarily enable telnet or something similar, especially if you only have remote access to the machine.

I found out about the assembler optimization from the Raspberry Pi forum [2].

Cheers,
Michael



[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676533
[2] http://www.raspberrypi.org/phpBB3/viewtopic.php?f=66&t=8433


_______________________________________________
ARMedslack mailing list
ARMedslack@lists.armedslack.org
http://lists.armedslack.org/mailman/listinfo/armedslack
