"Henry E. Thorpe" wrote:
>
> Carlos;
>
> You can certainly use the PKCS #11 interface-- just don't expect to get
> the full benefit of your hardware accelerator. This is what I found
> with the Sun PCI Crypto card (basically an OEM Rainbow card):
>
> This is the product that I've been looking at:
>
> http://www.sun.com/connectivity/suncryptoaccel1/
>
> Sun P/N: X1133A
>
> What I wanted to see was a rough-order comparison to what the card can
> do for SSL session setup. The test I used was an "SSL soak" tool that
> negotiates an SSL connection, without session reuse, and then closes
> it. No requests, just an SSL open and close.
>
> This is a worst-case scenario that is almost all RSA signing
> operations (I used only RSA key pairs and certificates).
>
> Here is what I recorded ("SSLs" are new SSL session negotiations,
> "hosts" are number of simultaneous instances of SSLsoak, which is
> single-threaded ):
>
> Netra (2x296MHz CPUs) base, nes 4.1:
>
> ~27-32 SSLs/sec, 20 hosts
>
> SunOS wnpwp02 5.6 Generic_105181-23 sun4u 01/24/01
>
> 14:06:49    %usr    %sys    %wio   %idle
> 14:06:50      99       1       0       0
> 14:06:51      96       4       0       0
> 14:06:52      99       1       0       0
> 14:06:53      99       1       0       0
> 14:06:54      94       6       0       0
>
> Average       97       3       0       0
>
> Netra (2x296MHz CPUs) w/ Crypto card, nes 4.1:
>
> ~72-73 SSLs/sec, 20 hosts
>
> SunOS wnpwp01 5.6 Generic_105181-23 sun4u 01/24/01
>
> 12:55:29    %usr    %sys    %wio   %idle
> 12:55:30      62      12       0      26
> 12:55:31      67       7       0      26
> 12:55:32      71       7       0      22
> 12:55:33      67       6       0      27
> 12:55:34      71      10       0      19
>
> Average       68       8       0      24
>
> 420R (2x450MHz CPUs) base, nes 4.1sp5:
>
> ~46-47 SSLs/sec, 20 hosts
>
> SunOS wnpwp06 5.6 Generic_105181-23 sun4u 01/24/01
>
> 12:56:45    %usr    %sys    %wio   %idle
> 12:56:46      97       3       0       0
> 12:56:47      98       2       0       0
> 12:56:48      97       3       0       0
> 12:56:49      99       1       0       0
> 12:56:50      94       6       0       0
>
> Average       97       3       0       0
>
> Please note that I used the 420 to drive all the tests, except that
> against the 420. However, using a Netra to load the 420 gave the same
> results as using Reddog, which is a 6-processor Ultra-Enterprise server.
>
> As an interesting aside, my HP LDPro desktop machine (2x200MHz Pentium
> II) running Apache/mod_ssl under Linux 2.2.16 kernel, can support
> about 40 SSLs/sec with 20 hosts. I don't know whether this is because
> of the FPU performance of the Pentium II vs. USparc-II, Linux
> vs. Solaris, Apache vs. NES, or OpenSSL vs. Netscape Security
> Services. Just interesting.
>
> I then compiled Apache 1.3 on the same box, and tried, again:
>
> Apache 1.3.17, OpenSSL engine 0.9.6, mod_ssl 2.8.0:
>
> Netra (2x296MHz CPUs)
>
> ~175-200 SSLs/sec, 20 hosts
>
> SunOS wnpwp01 5.6 Generic_105181-23 sun4u 02/06/01
>
> 13:20:36    %usr    %sys    %wio   %idle
> 13:20:37      47      19       0      34
> 13:20:38      50      17       0      33
> 13:20:39      27      12       0      61
> 13:20:40      30      20       6      44
> 13:20:41      47      18       1      34
> 13:20:42      48      22       0      31
> 13:20:43      43      25       0      32
> 13:20:44      30       8       0      62
> 13:20:45      48      20       0      32
> 13:20:46      46      21       0      34
>
> Average       42      18       1      40
>
> And no, I didn't misplace a decimal point. With
> mod_ssl/openssl-engine, Apache rules.
>
> The Rainbow card should have a theoretical maximum of 200 RSA
> signings/second. Apache/mod_ssl worked the card to the max; while NES
> showed disappointing improvement. What I was looking to see was the
> effect on CPU load, this being one of our production system limitations.
>
> So, why can't NES fully exercise the crypto hardware like Apache? My
> guess is the Java PKCS #11 interface. Apache's mod_ssl interface is a
> lot closer to the hardware than NES's PKCS #11 interface.

(Note that NES is now called iPlanet Web Server, or iWS.)

Actually, the lack of throughput you describe is very
easily explained, and has nothing to do with Java PKCS #11
interfaces (NSS is pure C code, with no Java involved).
The problem is simply that the current Rainbow PKCS #11
driver is not thread-safe, which forces NSS to use a lock
to serialize access to the Rainbow card by threads in the
same process (iWS is multithreaded). You therefore need
to run multiple processes to get the full performance.
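
To illustrate the serialization effect, here is a toy model in Python. It is not NSS or Rainbow code; the function and variable names are made up for this example, and a short sleep stands in for a signing operation on the card. When every thread must take one shared lock around the call, at most one operation is ever in flight in that process, no matter how many threads there are.

```python
import threading
import time


def peak_concurrency(num_threads, serialize, ops_per_thread=5, op_seconds=0.01):
    """Run simulated card operations and return the peak number in flight.

    serialize=True models a non-thread-safe module guarded by one lock;
    serialize=False models a module that threads may call concurrently.
    """
    driver_lock = threading.Lock()  # hypothetical stand-in for the lock around the module
    stats_lock = threading.Lock()   # protects the two counters below
    in_flight = 0
    peak = 0

    def card_operation():
        nonlocal in_flight, peak
        with stats_lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(op_seconds)      # pretend the card is busy with one operation
        with stats_lock:
            in_flight -= 1

    def worker():
        for _ in range(ops_per_thread):
            if serialize:
                with driver_lock:   # only one thread at a time gets past here
                    card_operation()
            else:
                card_operation()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    print("peak in flight, serialized:  ", peak_concurrency(8, serialize=True))
    print("peak in flight, unserialized:", peak_concurrency(8, serialize=False))
```

The serialized run always reports a peak of 1. Each additional server process has its own lock, which is why adding processes raises the number of operations that can be outstanding on the card at once.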

The lock problem with Rainbow is easily worked around in
iWS by setting MaxProcs to at least 2. This is usually
sufficient to get the full 200 ops/sec that the card is
capable of. If there is more than one card and many CPUs
in the system, setting MaxProcs to a higher value can help.
Sun was able to reach 1400 ops/s using 8 Rainbow cards with
MaxProcs 40 on a 12-CPU UltraSPARC box. The theoretical
limit of the cards was 1600 ops/s (8 * 200). This was with
iWS 4.1SP2 and NSS 2.8.3. But achieving the right performance
does require you to set MaxProcs to a value greater than 1,
which is not the default setting. This should, however, have
been noted in the documentation that comes with the card,
alongside the instructions for setting up iWS with it.
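
For a sense of scale, the 8-card figures above work out as follows. This is only arithmetic on the numbers quoted in this message; the constant and function names are chosen for the example.

```python
CARD_MAX_OPS = 200   # per-card maximum, ops/sec (from the figures above)
NUM_CARDS = 8        # cards in the 12-CPU test system
MEASURED_OPS = 1400  # ops/sec reached with MaxProcs 40


def utilization(measured_ops, num_cards, per_card_max):
    """Fraction of the cards' combined maximum that was actually reached."""
    return measured_ops / (num_cards * per_card_max)


ceiling = NUM_CARDS * CARD_MAX_OPS
print(ceiling)                                                       # 1600
print(f"{utilization(MEASURED_OPS, NUM_CARDS, CARD_MAX_OPS):.1%}")   # 87.5%
```

So the multi-process configuration reached 87.5% of what the eight cards could deliver in aggregate.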

You bring up two very good points here.

1. One has to be very careful when constructing the PKCS #11
   module. Very simple things can completely destroy your
   hardware performance.
2. It's also important to properly configure your server.
   iWS is capable of performance equal to Apache on the
   Rainbow card when properly configured.

Also, NSS 3.2 is about 3x faster than NSS 2.x (used in iWS
4.1) on UltraSPARC, and this will show up as a performance
improvement (in the software crypto token) in iWS 6.0.