Hi,
I just found out that the openssl package in Slackware for ARM 14.0
doesn't use the assembler optimizations that OpenSSL provides for ARMv4.
Since all packages are built for the ARMv5te baseline architecture,
enabling these optimizations shouldn't break compatibility with any of
the platforms supported by Slackware for ARM 14.0.* It would, however,
have a huge impact on the performance of OpenSSL and very likely of all
programs that use the OpenSSL libraries (such as OpenSSH).
(* I am not 100% sure about this because I am no expert on the
different ARM architectures, so I may be wrong here. Can someone on
the mailing list with more expertise confirm or correct my
assumption?)
I rebuilt the OpenSSL package and ran some tests on my Sheevaplug. As
you can see, the results are pretty impressive (the original output of
the "openssl speed" command is much longer; this is just an excerpt):
OpenSSL 1.0.1c, default package (no optimization)
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               3309.68k    11805.99k    34086.83k    64614.06k    87405.91k
aes-128 cbc      10003.75k    11224.32k    11537.49k    11619.33k    11635.37k
aes-192 cbc       8812.01k     9592.04k     9819.65k     9879.89k     9890.47k
aes-256 cbc       7777.37k     8376.77k     8548.95k     8593.75k     8601.60k
sha256            2670.09k     6356.61k    11432.70k    14294.36k    15349.08k
sha512             412.84k     1650.65k     2386.60k     3272.36k     3670.02k
                     sign      verify    sign/s  verify/s
rsa 512 bits    0.002360s   0.000218s     423.8    4589.0
rsa 1024 bits   0.012267s   0.000625s      81.5    1599.4
rsa 2048 bits   0.074701s   0.002067s      13.4     483.8
rsa 4096 bits   0.494286s   0.007278s       2.0     137.4
OpenSSL 1.0.1c, with ARMv4 assembler optimization enabled
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               4078.73k    14215.30k    38787.16k    68390.57k    88432.50k
aes-128 cbc      15095.54k    16847.00k    17382.31k    17521.32k    17555.46k
aes-192 cbc      13213.38k    14536.19k    14932.91k    15036.76k    15059.63k
aes-256 cbc      11756.32k    12785.60k    13089.37k    13167.96k    13186.39k
sha256            5125.39k    12176.15k    21252.61k    26089.81k    28068.52k
sha512            1446.27k     5780.57k     8251.65k    11255.81k    12593.83k
                     sign      verify    sign/s  verify/s
rsa 512 bits    0.001101s   0.000106s     907.9    9456.0
rsa 1024 bits   0.005549s   0.000313s     180.2    3190.6
rsa 2048 bits   0.035971s   0.001120s      27.8     892.8
rsa 4096 bits   0.257692s   0.004279s       3.9     233.7
Short summary (performance increase, numbers rounded)
aes-128 cbc: +50% (16 bytes)
aes-192 cbc: +50% (16 bytes)
aes-256 cbc: +50% (16 bytes)
sha256: +90% (16 bytes)
sha512: +250% (16 bytes)
rsa 512 bits: +115% (sign) / +105% (verify)
rsa 1024 bits: +120% / +100%
rsa 2048 bits: +110% / +85%
rsa 4096 bits: +100% / +70%
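For anyone who wants to double-check the rounding in the summary, here is
a minimal Python sketch that recomputes the increases from the figures
quoted in the two excerpts. The names percent_gain and summarize are just
made up for this illustration, and the values are copied by hand from the
16-byte column and the sign/s and verify/s columns.

```python
# Figures copied by hand from the two "openssl speed" excerpts in this mail.
# Throughput values are in 1000s of bytes per second for 16-byte blocks.
# RSA values are (sign/s, verify/s) pairs.


def percent_gain(before, after):
    """Return the relative increase from before to after, in percent."""
    return (after - before) / before * 100.0


# name -> (default package, ARMv4 assembler build)
throughput_16 = {
    "aes-128 cbc": (10003.75, 15095.54),
    "aes-192 cbc": (8812.01, 13213.38),
    "aes-256 cbc": (7777.37, 11756.32),
    "sha256": (2670.09, 5125.39),
    "sha512": (412.84, 1446.27),
}

# name -> ((sign/s, verify/s) default, (sign/s, verify/s) ARMv4 assembler)
rsa = {
    "rsa 512 bits": ((423.8, 4589.0), (907.9, 9456.0)),
    "rsa 1024 bits": ((81.5, 1599.4), (180.2, 3190.6)),
    "rsa 2048 bits": ((13.4, 483.8), (27.8, 892.8)),
    "rsa 4096 bits": ((2.0, 137.4), (3.9, 233.7)),
}


def summarize():
    """Build one summary line per algorithm, rounded to whole percent."""
    lines = []
    for name, (before, after) in throughput_16.items():
        lines.append(f"{name}: +{percent_gain(before, after):.0f}% (16 bytes)")
    for name, (before, after) in rsa.items():
        sign = percent_gain(before[0], after[0])
        verify = percent_gain(before[1], after[1])
        lines.append(f"{name}: +{sign:.0f}% (sign) / +{verify:.0f}% (verify)")
    return lines


if __name__ == "__main__":
    print("\n".join(summarize()))
```

The exact values differ slightly from the hand-rounded summary (for
example, aes-128 cbc comes out at about +51%), but they agree to within a
few percentage points.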
If you want to test this yourself, just pass the target "linux-armv4"
to OpenSSL's Configure script, or apply the patch [1] to the
debian-targets.patch file before running the SlackBuild script.
Warning: Because OpenSSL is removed during the build process, you won't
be able to log in via SSH during the build or afterwards until you
reinstall the openssl package. So don't forget to temporarily enable
telnet or something similar, especially if you only have remote access
to the machine.
I found out about the assembler optimization from the Raspberry Pi
forum [2].
Cheers,
Michael
[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676533
[2] http://www.raspberrypi.org/phpBB3/viewtopic.php?f=66&t=8433
_______________________________________________
ARMedslack mailing list
ARMedslack@lists.armedslack.org
http://lists.armedslack.org/mailman/listinfo/armedslack