On Sun, Feb 11, 2001 at 01:02:43PM -0800, Alfred Perlstein wrote:
> * Kris Kennaway <[EMAIL PROTECTED]> [010211 12:52] wrote:
> > On Sun, Feb 11, 2001 at 12:47:07PM -0800, Alfred Perlstein wrote:
> >
> > > Looks awesome, someone complained that Linux was able to maintain
> > > an order of magnitude more SSL connections than FreeBSD, since you
> > > say this gives us a 3-5x speed up, I'd really like to see it committed
> > > and ported to -stable ASAP.
> >
> > Yep! Just want to give a few days for people to comment on the
> > MACHINE_CPU thing.
> >
> > > Is it possible to have multiple ASM cores and use the appropriate
> > > routines? Or must it all be chosen at compile time?
> >
> > It's done at compile-time.
>
> bah, lame. :(
>
> How is the worst asm code vs the best C code again?

OpenSSL includes 386 and 586 asm for the following: bf, bn (the bignum
library), cast, des, md5, rc4, rc5, ripemd, sha1. There is 686 asm for
bf only (the 686 DES asm is broken).

In fact there's not a lot of difference between (what are claimed to
be) the i386 versions and the i586 versions (they're generated from
the same source by a preprocessor, and in fact are identical for
some/most files) - this probably means they are not very optimal.
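
As a rough check of the "not a lot of difference" remark, here is a minimal Python sketch that compares the 8192-byte column of the [i386] and [i586] tables at the end of this message. The figures are copied from those tables; the function name `differing` and the 5% threshold are illustrative choices, not anything OpenSSL provides.

```python
# 8192-byte throughput figures copied from the [i386] and [i586] tables
# in this message (the "k" suffix is dropped; units are the same on both sides).
I386_8192 = {
    "md5": 38153.23, "hmac(md5)": 37320.02, "sha1": 14685.97,
    "rmd160": 13941.05, "rc4": 26613.25, "des cbc": 5379.56,
    "des ede3": 1928.79, "rc5-32/12 cbc": 16569.98,
    "blowfish cbc": 6569.06, "cast cbc": 6571.16,
}
I586_8192 = {
    "md5": 38189.04, "hmac(md5)": 37456.10, "sha1": 18224.95,
    "rmd160": 13842.36, "rc4": 26504.73, "des cbc": 5358.96,
    "des ede3": 1907.37, "rc5-32/12 cbc": 16316.64,
    "blowfish cbc": 8332.21, "cast cbc": 8196.42,
}


def differing(threshold=0.05):
    """Return the algorithms whose i586/i386 ratio is off 1.0 by more than threshold.

    The threshold is an arbitrary cut-off for a single unaveraged run.
    """
    return [
        name
        for name in I386_8192
        if abs(I586_8192[name] / I386_8192[name] - 1.0) > threshold
    ]


if __name__ == "__main__":
    for name in I386_8192:
        ratio = I586_8192[name] / I386_8192[name]
        print(f"{name:<14} i586/i386 = {ratio:.3f}")
    print("more than 5% apart:", differing())
```

On these figures only sha1, blowfish cbc and cast cbc come out more than 5% apart (each roughly 25% faster in the i586 build); the rest are within a couple of percent, which is within what one would expect from a single run.
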
I was also wrong about the speed improvements (they're not quite so
high, only around 2x) - perhaps my baseline benchmark was sharing the
CPU with something else giving it a 2x slowdown. So I'm not sure
where the 3-5x speed up comes from - either it's another rumour (you
didn't hear it from Peter again, did you? :) or the cause is
elsewhere. What we build now should be exactly in line with what
openssl does itself.

These measurements were done on my PPro 233, and no attempt at sample
averaging was performed :-)
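
The "around 2x" figure can be checked against the tables that follow. This is a minimal Python sketch, assuming the [C code] and [i586] tables in this message as the baseline and the asm build respectively; the numbers are copied from those tables, and the function names are made up for illustration. The throughput values are the 8192-byte column (as far as I know, `openssl speed` reports these in thousands of bytes per second), and the RSA values are the per-signature times in seconds.

```python
# 8192-byte throughput copied from the [C code] and [i586] tables in this
# message, restricted to the algorithms that have an asm version.
C_8192 = {
    "md5": 22739.51, "hmac(md5)": 22120.47, "sha1": 11986.67,
    "rmd160": 8178.40, "rc4": 18527.01, "des cbc": 2961.85,
    "des ede3": 761.84, "rc5-32/12 cbc": 9189.25,
    "blowfish cbc": 4582.91, "cast cbc": 4461.36,
}
I586_8192 = {
    "md5": 38189.04, "hmac(md5)": 37456.10, "sha1": 18224.95,
    "rmd160": 13842.36, "rc4": 26504.73, "des cbc": 5358.96,
    "des ede3": 1907.37, "rc5-32/12 cbc": 16316.64,
    "blowfish cbc": 8332.21, "cast cbc": 8196.42,
}

# RSA signing time in seconds per operation, keyed by modulus size in bits,
# from the same two tables.
C_RSA_SIGN = {512: 0.0106, 1024: 0.0620, 2048: 0.3963, 4096: 2.6106}
I586_RSA_SIGN = {512: 0.0057, 1024: 0.0296, 2048: 0.1749, 4096: 1.1860}


def throughput_speedups():
    """asm throughput divided by C throughput (higher means asm is faster)."""
    return {name: I586_8192[name] / C_8192[name] for name in C_8192}


def rsa_sign_speedups():
    """C signing time divided by asm signing time (higher means asm is faster)."""
    return {bits: C_RSA_SIGN[bits] / I586_RSA_SIGN[bits] for bits in C_RSA_SIGN}


if __name__ == "__main__":
    for name, ratio in throughput_speedups().items():
        print(f"{name:<14} {ratio:.2f}x")
    for bits, ratio in rsa_sign_speedups().items():
        print(f"rsa {bits:>4} sign  {ratio:.2f}x")
```

On these single-run figures the symmetric and digest speedups range from about 1.4x (rc4) to about 2.5x (des ede3), and RSA signing comes out at roughly 1.9x to 2.3x, which is consistent with "around 2x" rather than 3-5x.
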
Kris
[C code]
type                 8 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md2                  159.52k     437.09k     590.38k     647.48k     653.41k
mdc2                 405.46k     440.38k     439.93k     442.00k     442.93k
md4                 2415.06k   12806.00k   24615.33k   32313.88k   35873.96k
md5                 1888.65k    9092.61k   16840.50k   20897.62k   22739.51k
hmac(md5)            741.81k    4722.98k   11755.58k   18427.53k   22120.47k
sha1                1319.27k    3052.54k    6990.83k   10423.14k   11986.67k
rmd160               846.12k    3629.76k    6249.11k    7644.14k    8178.40k
rc4                13176.13k   17308.64k   18127.45k   18709.00k   18527.01k
des cbc             2589.75k    2911.96k    2918.99k    2930.14k    2961.85k
des ede3             719.78k     751.80k     758.33k     758.61k     761.84k
idea cbc                0.00        0.00        0.00        0.00        0.00
rc2 cbc             1476.49k    1540.60k    1551.96k    1547.98k    1571.13k
rc5-32/12 cbc       6533.14k    8820.63k    9144.01k    9159.59k    9189.25k
blowfish cbc        3921.72k    4490.54k    4551.53k    4567.12k    4582.91k
cast cbc            3725.39k    4496.47k    4425.20k    4432.26k    4461.36k

                      sign    verify    sign/s  verify/s
rsa  512 bits      0.0106s   0.0011s      94.5     951.4
rsa 1024 bits      0.0620s   0.0034s      16.1     296.7
rsa 2048 bits      0.3963s   0.0112s       2.5      89.4
rsa 4096 bits      2.6106s   0.0389s       0.4      25.7

                      sign    verify    sign/s  verify/s
dsa  512 bits      0.0109s   0.0134s      91.5      74.6
dsa 1024 bits      0.0342s   0.0406s      29.3      24.6

[i386]
type                 8 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5                 2525.24k   13682.82k   26954.24k   34031.00k   38153.23k
hmac(md5)            952.04k    6381.75k   17338.03k   29527.15k   37320.02k
sha1                1621.91k    6960.45k   11626.82k   13810.67k   14685.97k
rmd160              1238.63k    5838.79k   10350.12k   12930.47k   13941.05k
rc4                18170.79k   24351.64k   25941.40k   26300.99k   26613.25k
des cbc             4743.99k    5342.60k    5377.98k    5406.58k    5379.56k
des ede3            1809.64k    1903.68k    1908.81k    1921.80k    1928.79k
rc5-32/12 cbc      11934.06k   15701.79k   16004.71k   16014.24k   16569.98k
blowfish cbc        5885.08k    6493.90k    6553.44k    6575.91k    6569.06k
cast cbc            5889.94k    6558.54k    6578.21k    6627.23k    6571.16k

                      sign    verify    sign/s  verify/s
rsa  512 bits      0.0057s   0.0005s     174.2    1822.0
rsa 1024 bits      0.0299s   0.0016s      33.4     641.4
rsa 2048 bits      0.1757s   0.0052s       5.7     193.5
rsa 4096 bits      1.1865s   0.0179s       0.8      55.8

                      sign    verify    sign/s  verify/s
dsa  512 bits      0.0057s   0.0068s     176.4     146.8
dsa 1024 bits      0.0157s   0.0185s      63.8      54.1
dsa 2048 bits      0.0503s   0.0621s      19.9      16.1

[i586]
type                 8 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md5                 2588.19k   13504.14k   26623.31k   35248.59k   38189.04k
hmac(md5)            946.51k    6358.38k   17134.34k   29501.74k   37456.10k
sha1                1616.66k    7562.68k   13581.01k   16957.68k   18224.95k
rmd160              1265.14k    5918.43k   10375.20k   12866.01k   13842.36k
rc4                18405.78k   24232.61k   25611.99k   25964.78k   26504.73k
des cbc             4800.96k    5321.68k    5351.02k    5421.37k    5358.96k
des ede3            1829.59k    1903.18k    1915.56k    1914.43k    1907.37k
rc5-32/12 cbc      11815.49k   15352.52k   15709.59k   16072.66k   16316.64k
blowfish cbc        7099.84k    8148.80k    8255.26k    8310.07k    8332.21k
cast cbc            6991.88k    8031.58k    8116.97k    8118.18k    8196.42k

                      sign    verify    sign/s  verify/s
rsa  512 bits      0.0057s   0.0006s     175.6    1814.9
rsa 1024 bits      0.0296s   0.0016s      33.8     643.1
rsa 2048 bits      0.1749s   0.0052s       5.7     193.8
rsa 4096 bits      1.1860s   0.0179s       0.8      55.7

                      sign    verify    sign/s  verify/s
dsa  512 bits      0.0057s   0.0069s     176.8     145.9
dsa 1024 bits      0.0157s   0.0188s      63.9      53.3
dsa 2048 bits      0.0503s   0.0624s      19.9      16.0

[i686]
type                 8 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
blowfish cbc        7449.80k    8656.35k    8900.01k    8913.19k    8932.08k