Re: NSS non-blocking mode and long computations

2009-10-27 Thread Nelson B Bolyard
On 2009-10-22 12:09 PDT, Ambroz Bizjak wrote:
> On Oct 22, 7:22 pm, Nelson B Bolyard  wrote:
>> What kind of system?  What CPU? What clock speed?  What memory speed?
>>
>> Are you doing client authentication with a client certificate?
>> Are you using Diffie-Hellman Ephemeral cipher suites?
>> 100ms is indeed a long time if you're not.

> My program is acting as a server which requires client authentication. 
> I've generated all certificates with certutil without requesting any
> specific ciphers, so they are RSA 1024 bit. In the server I also don't
> override any defaults.

Does that mean you're using only export cipher suites?
What cipher suite(s) are your connections actually using?

> On average, the complete handshake on the local network takes about 400 ms.
> I've noticed this is considerably greater than on other systems I have.
> With the server program running on an AMD Sempron 2800 (1.8 GHz), the
> complete handshake takes only about 50 ms (though it is 64-bit while my
> Atom system is not). I find this surprising; perhaps there is some
> performance regression with Atom processors? I know the board is
> relatively low-performance, but is it really that slow?

Yes, CPU multiplication performance alone could account for much of it.

Make sure you're not doing unnecessary stuff.  Disable all export cipher
suites.  Disable all export suite support.  Use SSL_NO_STEP_DOWN.

>> A reactor?  What's that?  (nuclear? :)
> It's the part of the program that blocks on all resources and calls the
> associated handlers when an operation can be performed without blocking;
> no nuclear reactions involved :) see
> http://en.wikipedia.org/wiki/Reactor_pattern

Oh, a new name for the oldest server software design pattern of all. :-/

> I went through the handshake with a debugger and found the following:
> - when a new client is accepted, SSL_ConfigSecureServer takes about
> 200 ms.
> - the first SSL_ForceHandshake is fast; I assume it receives some data
> from the client and requests a certificate
> - the second SSL_ForceHandshake takes about 200 ms. This is probably
> because it's verifying the client's certificate.

That's a public key operation, which shouldn't be too slow.  The private
key operations are what usually hurt, and if you're doing export cipher
suites with a 1024 bit key in your cert, then you're doing TWO private
key operations for each full handshake.
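
To illustrate why the exponent size matters, here is a minimal sketch (not NSS code) of square-and-multiply modular exponentiation that counts the modular multiplications it performs. The names modexp, modexp_result and the TOY_* constants are hypothetical, and the key is the textbook toy key n = 61 * 53. Real RSA uses multi-precision integers, and implementations typically apply CRT to the private operation, so only the relative operation counts are meaningful here.

```c
#include <assert.h>
#include <stdint.h>

/* Result of a modular exponentiation, plus the number of modular
 * multiplications (squarings included) that were needed. */
typedef struct {
    uint64_t value;
    unsigned mults;
} modexp_result;

/* Right-to-left square-and-multiply. The modulus must fit in 32 bits so
 * that every product fits in 64 bits (an assumption of this sketch). */
static modexp_result modexp(uint64_t base, uint64_t exp, uint64_t mod)
{
    assert(mod > 1 && mod <= UINT32_MAX);
    modexp_result r = { 1, 0 };
    base %= mod;
    while (exp != 0) {
        if (exp & 1) {                       /* multiply step */
            r.value = (r.value * base) % mod;
            r.mults++;
        }
        exp >>= 1;
        if (exp != 0) {                      /* squaring step */
            base = (base * base) % mod;
            r.mults++;
        }
    }
    return r;
}

/* Toy RSA key, far too small for real use:
 * n = 61 * 53, public exponent e, private exponent d. */
enum { TOY_N = 3233, TOY_E = 17, TOY_D = 2753 };
```

Calling modexp(65, TOY_E, TOY_N) and then feeding the result to modexp(..., TOY_D, TOY_N) recovers 65, and the second call reports more multiplications than the first. With a 1024-bit modulus the gap is far larger, because a private exponent is about as long as the modulus while a common public exponent such as 65537 is only 17 bits long.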

> I think I could get rid of the SSL_ConfigSecureServer delay by first
> performing it on a dummy SSL file descriptor and passing that as a model
> to SSL_ImportFD for every accepted client.

Yes, that's exactly what you should be doing.

> But what's with the SSL_ForceHandshake delay?

Lots of slow multiplication, I suspect.
-- 
dev-tech-crypto mailing list
dev-tech-crypto@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-crypto


Re: NSS non-blocking mode and long computations

2009-10-22 Thread Ambroz Bizjak
On Oct 22, 10:32 pm, Wan-Teh Chang  wrote:
> I'm wondering if your server is spending some of the 100 ms in
> checking the revocation status of the client certificate.  Did
> you enable OCSP checking?
No, I haven't configured any OCSP server.

I went through the handshake with a debugger and found the following:
- when a new client is accepted, SSL_ConfigSecureServer takes about
200 ms.
- the first SSL_ForceHandshake is fast; I assume it receives some data
from the client and requests a certificate
- the second SSL_ForceHandshake takes about 200 ms. This is probably
because it's verifying the client's certificate.

I think I could get rid of the SSL_ConfigSecureServer delay by first
performing it on a dummy SSL file descriptor and passing that as a model
to SSL_ImportFD for every accepted client. But what's with the
SSL_ForceHandshake delay?


Re: NSS non-blocking mode and long computations

2009-10-22 Thread Wan-Teh Chang
On Thu, Oct 22, 2009 at 12:09 PM, Ambroz Bizjak  wrote:
>
> My program is acting as a server which requires client authentication.

I'm wondering if your server is spending some of the 100 ms in
checking the revocation status of the client certificate.  Did
you enable OCSP checking?

Wan-Teh


Re: NSS non-blocking mode and long computations

2009-10-22 Thread Ambroz Bizjak
On Oct 22, 7:22 pm, Nelson B Bolyard  wrote:
> What kind of system?  What CPU? What clock speed?  What memory speed?
>
> Are you doing client authentication with a client certificate?
> Are you using Diffie-Hellman Ephemeral cipher suites?
> 100ms is indeed a long time if you're not.

The system is a mini-ITX board, D945GSEJT: Intel Atom N270 CPU at 1.6 GHz,
1 GB of DDR2 533 MHz memory. The operating system is Gentoo Linux, x86
architecture (the CPU doesn't support 64-bit), kernel version 2.6.31,
compiler optimization flags "-march=nocona -mssse3".

My program is acting as a server which requires client authentication.
I've generated all certificates with certutil without requesting any
specific ciphers, so they are RSA 1024 bit. In the server I also don't
override any defaults.
On average, the complete handshake on the local network takes about 400 ms.
I've noticed this is considerably greater than on other systems I have.
With the server program running on an AMD Sempron 2800 (1.8 GHz), the
complete handshake takes only about 50 ms (though it is 64-bit while my
Atom system is not). I find this surprising; perhaps there is some
performance regression with Atom processors? I know the board is
relatively low-performance, but is it really that slow?

> Could your system actually be doing the socket IO on that thread?
It would be hard to do I/O in a different thread with the existing design
of my software. It is possible, but it would certainly introduce additional
overhead. That is the last resort.

> Does it use the CPU to do the actual network IO?
What do you mean? My I/O code is quite efficient; it can't be taking so
long. After the handshake is done, my server constantly carries around
5 Mbit/s of traffic over SSL, and "uptime" indicates zero CPU load (though
with "top" the load jumps to 80% at regular intervals, but this is probably
a measurement artifact to do with the timing of the I/O and load sampling).
While a client is connecting, the server program's CPU usage rises suddenly
to almost 100%.

> What is the speed of your network link?
The network is not an issue; I'm using a 100 Mbit LAN.

> A reactor?  What's that?  (nuclear? :)
It's the part of the program that blocks on all resources and calls the
associated handlers when an operation can be performed without blocking;
no nuclear reactions involved :) see
http://en.wikipedia.org/wiki/Reactor_pattern
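
For readers unfamiliar with the pattern, the description above can be sketched roughly as follows. This is a minimal illustration built on POSIX poll(), not the poster's actual code; the names reactor, reactor_add, reactor_run_once and count_bytes are hypothetical, and it only watches for readability.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_HANDLERS 16

/* Callback invoked when its file descriptor can be read without blocking. */
typedef void (*handler_fn)(int fd, void *user);

typedef struct {
    struct pollfd fds[MAX_HANDLERS];
    handler_fn handlers[MAX_HANDLERS];
    void *users[MAX_HANDLERS];
    nfds_t count;
} reactor;

static void reactor_init(reactor *r)
{
    r->count = 0;
}

/* Register a handler for readability on fd. Returns 0, or -1 if full. */
static int reactor_add(reactor *r, int fd, handler_fn fn, void *user)
{
    assert(fn != NULL);
    if (r->count >= MAX_HANDLERS)
        return -1;
    r->fds[r->count].fd = fd;
    r->fds[r->count].events = POLLIN;
    r->fds[r->count].revents = 0;
    r->handlers[r->count] = fn;
    r->users[r->count] = user;
    r->count++;
    return 0;
}

/* Block on all registered descriptors (up to timeout_ms), then call the
 * handler of each one that is ready. Returns the number of handlers
 * called, 0 on timeout, or -1 if poll() fails. */
static int reactor_run_once(reactor *r, int timeout_ms)
{
    int ready = poll(r->fds, r->count, timeout_ms);
    if (ready <= 0)
        return ready;
    int called = 0;
    for (nfds_t i = 0; i < r->count; i++) {
        if (r->fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            r->handlers[i](r->fds[i].fd, r->users[i]);
            called++;
        }
    }
    return called;
}

/* Example handler: adds the number of bytes read to a size_t counter. */
static void count_bytes(int fd, void *user)
{
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        *(size_t *)user += (size_t)n;
}
```

A program would call reactor_run_once in a loop. Because every handler runs on the same thread, any handler that computes for a long time delays all the others, which is the latency problem discussed in this thread.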


Re: NSS non-blocking mode and long computations

2009-10-22 Thread Ian G

On 22/10/2009 19:22, Nelson B Bolyard wrote:
>> As my program is single-threaded (built on a reactor),
>
> A reactor?  What's that?

http://en.wikipedia.org/wiki/Reactor_pattern

> (nuclear? :)

more like a substation :)

iang


Re: NSS non-blocking mode and long computations

2009-10-22 Thread Nelson B Bolyard
On 2009-10-22 05:50 PDT, Ambroz Bizjak wrote:
> Hi,
> I'm using NSS in non-blocking mode. To perform a handshake on an SSL
> socket, I use SSL_ForceHandshake (if it returns PR_WOULD_BLOCK_ERROR I
> retry when the SSL socket becomes readable). It works, but I've
> noticed that SSL_ForceHandshake sometimes takes a long time to return
> (around 100 ms). I suppose this is because of all the computations
> involved. 

What kind of system?  What CPU? What clock speed?  What memory speed?

Are you doing client authentication with a client certificate?
Are you using Diffie-Hellman Ephemeral cipher suites?
100ms is indeed a long time if you're not.

Could your system actually be doing the socket IO on that thread?
Does it use the CPU to do the actual network IO?
What is the speed of your network link?

> As my program is single-threaded (built on a reactor), 

A reactor?  What's that?  (nuclear? :)

> it cannot respond to anything else while in a long SSL_ForceHandshake
> call, which causes latency problems with other I/O my program does.
> Is it possible to forbid SSL_ForceHandshake from doing any excessive
> computation, and to allow me to perform computations in a different
> thread, then call SSL_ForceHandshake again from the main thread when
> the computation is complete?

No, not with NSS as it exists today.

> It would theoretically be possible to call SSL_ForceHandshake in a
> different thread altogether, but this would be hard and non-optimal in
> my case.
> 
> Thank you for your help,
> Ambroz Bizjak
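
The general shape of running a long call on another thread and waking a single-threaded reactor when it finishes might look like the sketch below. It is an illustration under stated assumptions, not an NSS recipe: the work function is a stand-in (slow_sum) rather than SSL_ForceHandshake, all names (offload_job, offload_start, offload_poll, offload_finish) are hypothetical, and the thread above does not establish how an NSS socket should be shared between threads.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

typedef uint64_t (*work_fn)(uint64_t);

typedef struct {
    work_fn work;        /* long-running call to execute off-thread */
    uint64_t input;
    uint64_t output;
    int pipefd[2];       /* [0] is watched by the reactor, [1] by the worker */
    pthread_t tid;
} offload_job;

/* Worker thread: do the slow work, then wake the reactor via the pipe. */
static void *offload_thread(void *arg)
{
    offload_job *job = arg;
    job->output = job->work(job->input);
    char done = 1;
    ssize_t n = write(job->pipefd[1], &done, 1);
    (void)n;
    return NULL;
}

/* Start the job. Returns 0 on success, -1 on failure. */
static int offload_start(offload_job *job, work_fn work, uint64_t input)
{
    assert(work != NULL);
    job->work = work;
    job->input = input;
    job->output = 0;
    if (pipe(job->pipefd) != 0)
        return -1;
    if (pthread_create(&job->tid, NULL, offload_thread, job) != 0) {
        close(job->pipefd[0]);
        close(job->pipefd[1]);
        return -1;
    }
    return 0;
}

/* Stand-in for the reactor: 1 once the job has signalled completion,
 * 0 on timeout, -1 on error. A real reactor would watch pipefd[0]
 * alongside its other descriptors. */
static int offload_poll(offload_job *job, int timeout_ms)
{
    struct pollfd pfd = { .fd = job->pipefd[0], .events = POLLIN, .revents = 0 };
    return poll(&pfd, 1, timeout_ms);
}

/* Call from the main thread after the pipe became readable. Joining the
 * worker makes its write to job->output visible before it is returned. */
static uint64_t offload_finish(offload_job *job)
{
    char done;
    ssize_t n = read(job->pipefd[0], &done, 1);
    (void)n;
    pthread_join(job->tid, NULL);
    close(job->pipefd[0]);
    close(job->pipefd[1]);
    return job->output;
}

/* Stand-in for an expensive computation: sum of 1..n. */
static uint64_t slow_sum(uint64_t n)
{
    uint64_t sum = 0;
    for (uint64_t i = 1; i <= n; i++)
        sum += i;
    return sum;
}
```

The main thread stays responsive between offload_start and offload_finish because it only waits on a file descriptor, which fits the reactor design. The cost is one thread and one pipe per outstanding job, which is the kind of overhead the original poster wanted to avoid.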
