Re: Some wird OpenSSL perfomance slowdown
On Mon, Mar 05, 2007 at 10:46:19AM -0800, Rick Jones wrote:

> Sergey S. Levin wrote:
> >Hello Rick,
> >
> >>SW crypto ain't cheap. It can consume lots of CPU cycles. If the
> >>system was nearly CPU saturated with a "plain" transfer, then the
> >>overhead of the crypto can very definitely take the throughput down
> >>considerably.
> >>
> >1. If I use FileZilla and an SSL connection, it works at 100% of speed.
> >2. The processor load is just 5%, so this should not be the crypto
> >   problem.
>
> I'd next wonder what TCP window size(s) were being used in each case,
> and if SSL was making full use of the TCP window it had available. That
> means a bit of tcpdump tracing, including the connection establishment
> so you can see any window scale values being exchanged.

Even default window-scaling (in LAN environments) will run fast enough;
the key thing to avoid with bulk data transfer is lock-step half-duplex
operation.

--
Viktor.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                               [EMAIL PROTECTED]
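To put rough numbers on the lock-step concern above, here is a minimal sketch of what happens when a sender pushes one block and then waits before sending the next. The function name, its parameters, and the 1 ms wait used in the note below are illustrative assumptions, not measurements from this thread.

```cpp
#include <cassert>
#include <cmath>

// Rough model of a lock-step transfer: send one block, wait, repeat.
// All names here are hypothetical and only serve the illustration.
//   block_bytes : application bytes sent per round
//   link_bps    : raw link speed in bits per second
//   wait_seconds: idle time per round (for example one round trip)
// Returns the achievable throughput in bits per second.
double lockstep_throughput_bps(double block_bytes, double link_bps,
                               double wait_seconds) {
    assert(block_bytes > 0.0 && link_bps > 0.0 && wait_seconds >= 0.0);
    const double block_bits = block_bytes * 8.0;
    const double wire_time = block_bits / link_bps;  // time to put one block on the wire
    return block_bits / (wire_time + wait_seconds);  // one block per (wire + wait) interval
}
```

Under this model a 4096-byte block on a 100 Mbps link with a 1 ms wait per block comes out at roughly 25 Mbps, while a wait of zero recovers the full link speed. That is why streaming matters more than the cipher here.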
Re: Some wird OpenSSL perfomance slowdown
Sergey S. Levin wrote:
> Hello Rick,
>
>> SW crypto ain't cheap. It can consume lots of CPU cycles. If the
>> system was nearly CPU saturated with a "plain" transfer, then the
>> overhead of the crypto can very definitely take the throughput down
>> considerably.
>
> 1. If I use FileZilla and an SSL connection, it works at 100% of speed.
> 2. The processor load is just 5%, so this should not be the crypto
>    problem.

I'd next wonder what TCP window size(s) were being used in each case,
and if SSL was making full use of the TCP window it had available. That
means a bit of tcpdump tracing, including the connection establishment
so you can see any window scale values being exchanged.

rick jones
Re: Some wird OpenSSL perfomance slowdown
Darryl Miles wrote:
> Sergey S. Levin wrote:
>> 1. If I use FileZilla and an SSL connection, it works at 100% of speed.
>
> I don't know what FileZilla is, but which SSL implementation is used,
> and which key exchange protocol and symmetric cipher did it choose?

FileZilla also uses OpenSSL.

Ciao,
Richard

--
Dr. Richard W. Könning
Fujitsu Siemens Computers GmbH
Re: Some wird OpenSSL perfomance slowdown
Sergey S. Levin wrote:
> 1. If I use FileZilla and an SSL connection, it works at 100% of speed.

I don't know what FileZilla is, but which SSL implementation is used,
and which key exchange protocol and symmetric cipher did it choose?

> 2. The processor load is just 5%, so this should not be the crypto
>    problem.

Hey, you are only transferring 30Mb. Increase this to something that
makes the test take 10 minutes or more, then come back and tell us what
the processor load is at each end.

This might also be a good test to highlight whether the problem is with
the application and not something more fundamental in the TCP/IP and
Ethernet layers. If you can't get your app to give you 100% load on at
least one end, then maybe you need to get tcpdump out. If you can, then
it highlights which end is having a performance problem at application
level.

Maybe the others are right in questioning:

    BIO_ctrl(out, BIO_CTRL_FLUSH, 0, NULL)

The next BIO_write() will automatically flush the preceding application
data. If you want to flush, do one at the end of the for(;;) loop. As I
can't see any code that calls SSL_shutdown(ssl), I would not even be
sure the last plaintext bytes actually reached the other end before the
application terminated. I'd recommend you insert after the for(;;) loop:

    BIO_ctrl(out, BIO_CTRL_FLUSH, 0, NULL);
    SSL_shutdown(ssl);
    SSL_shutdown(ssl);

The two SSL_shutdown() calls cause the shutdown notify to be emitted to
the far end, and the 2nd one will enforce a flush and wait.

> http://www.bw-team.com/openssl.PNG

I have taken a look at your graph, but I still go with my suggestion to
provide a table of timing information as requested before.

Also, as others have suggested, find out which cipher is being used and
run the benchmarks on the systems at each end:

    openssl speed aes-256-cbc aes-128-cbc des-ede3 rc4

The API calls to dump this out are documented in "man SSL_get_cipher",
but you need to call them after the 1st BIO_write() call is made in the
app, which is the point in time after the initial handshake has
completed.

> Does this mean that the OpenSSL lib performs the handshake on each
> BIO_write?

No, but at setup the initial handshake is mandatory and it's a 5-way
affair. The cost in time is at least "2 * round-trip-time" plus the CPU
wall clock cost of computing PKC in the key exchange. It is the key
exchange that can be slow (especially under my original guess that you
were only transferring 3Mb of application data; in that hypothetical
scenario I would expect the PKC cost to be higher than the bulk
encryption cost for 3Mb of data).

But to be clear about your performance claims: you are claiming other
SSL client implementations are faster, and if I understand correctly you
are using the same SSL server implementation to test against, on exactly
the same hardware setup at both ends (i.e. nothing was changed on the
client side except the client application being used)?

Darryl
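The handshake estimate above ("2 * round-trip-time" plus the PKC time) can be turned into a small calculation showing how its share of the total run shrinks as the transfer grows. This is only a sketch: the function names are hypothetical, and the 1 ms round trip and 50 ms PKC time in the note below are assumed figures, not values reported in this thread.

```cpp
#include <cassert>
#include <cmath>

// Lower bound on handshake time, following the "2 * round-trip-time
// plus PKC" estimate. Both inputs are assumptions supplied by the caller.
double min_handshake_seconds(double rtt_seconds, double pkc_seconds) {
    assert(rtt_seconds >= 0.0 && pkc_seconds >= 0.0);
    return 2.0 * rtt_seconds + pkc_seconds;
}

// Fraction of the whole run (handshake + bulk transfer) that is spent
// in the handshake, for a payload sent at a steady throughput.
double handshake_share(double payload_bytes, double throughput_bps,
                       double handshake_seconds) {
    assert(payload_bytes > 0.0 && throughput_bps > 0.0 && handshake_seconds >= 0.0);
    const double transfer_seconds = payload_bytes * 8.0 / throughput_bps;
    return handshake_seconds / (handshake_seconds + transfer_seconds);
}
```

With those assumed figures and a 100 Mbps transfer rate, the handshake accounts for roughly 18% of a 3 MB run but well under 1% of a 300 MB run, which is the reason for asking for a longer test.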
RE: Some wird OpenSSL perfomance slowdown
> cout << "Set BIO block size (ex: 4096): ";
> cin >> nBioBlockSize;

What value are you using for nBioBlockSize?

> else
> {
>     BIO_ctrl(out, BIO_CTRL_FLUSH, 0, NULL);
> }

Why is this here?

DS
Re: Some wird OpenSSL perfomance slowdown
Sergey S. Levin wrote:
>> But which cpu types/frequencies are involved on both sides of the
>> connection and which cipher suite do you use?
>
> Server - Celeron 2GHz, Client - Intel PIV 2GHz.
> As to the second question - I'm not changing the default values in the
> source code. I had taken saccept.c and sconnect.c as the base.
> 1. Which command changes it?
> 2. Which cipher suite should I use to increase the performance?

As Vi(c|k)tor already said, with the above-mentioned CPUs there should
be no speed problem created by the symmetric encryption.

Something else that strikes me: is the
BIO_ctrl(out, BIO_CTRL_FLUSH, 0, NULL) call really necessary? Maybe the
flushing has a negative influence on the LAN performance?

Ciao,
Richard

--
Dr. Richard W. Könning
Fujitsu Siemens Computers GmbH
Re: Some wird OpenSSL perfomance slowdown
On Fri, Mar 02, 2007 at 07:47:29PM +0200, Sergey S. Levin wrote:

> Hello Richard,
>
> >But which cpu types/frequencies are involved on both sides of the
> >connection and which cipher suite do you use?
>
> Server - Celeron 2GHz, Client - Intel PIV 2GHz.
> As to the second question - I'm not changing the default values in the
> source code. I had taken saccept.c and sconnect.c as the base.
> 1. Which command changes it?
> 2. Which cipher suite should I use to increase the performance?

All the available cipher-suites should be able to give reasonable
performance. Use:

    openssl speed aes-256-cbc aes-128-cbc des-ede3 rc4

to estimate the expected throughput. On a 1.0GHz G4 laptop (not very
fast by today's standards) I get (0.9.8d):

------------------------------------------------------------------------
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
rc4            115558.17k   136281.31k   141916.65k   142890.26k   141116.23k
aes-128 cbc     46802.45k    51413.37k    52360.24k    52556.33k    52390.01k
aes-256 cbc     38766.81k    41876.09k    42495.54k    42638.51k    42541.89k
des ede3        10826.44k    11154.70k    11244.89k    11266.88k    11256.52k
------------------------------------------------------------------------

Even 3DES at ~11MB/s will still fill a 100Mbps ethernet link. Is the
client to server application protocol streaming, or RPC-like half-duplex
lock-step send/ack/repeat?

AES-128 is a good choice. RC4 is faster, but should be avoided for
security reasons. On a more "competitive" Opteron:

------------------------------------------------------------------------
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
rc4            352435.01k   364963.95k   412739.58k   425921.54k   430820.01k
aes-128 cbc     61725.30k   107617.51k   137287.34k   148495.02k   149626.88k
aes-256 cbc     52085.21k    84101.80k   101958.40k   107398.14k   108276.39k
des ede3        17907.50k    17924.14k    18002.94k    17805.65k    17995.09k
------------------------------------------------------------------------

So here AES-128 and AES-256 can in principle reach ~1Gbps. If your
problem is protocol latency (rather than CPU for encryption), switching
ciphers won't help.

--
Viktor.
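Since "openssl speed" reports thousands of bytes per second while link speeds are quoted in megabits per second, a small conversion helper makes the comparison above easier to check. This is a minimal sketch; the function name is hypothetical, and the sample figures in the note are taken from the 8192-byte columns of the tables above.

```cpp
#include <cassert>
#include <cmath>

// Converts an "openssl speed" figure (printed in 1000s of bytes per
// second, e.g. 11256.52 for "11256.52k") to megabits per second.
// The function name is illustrative, not part of any OpenSSL API.
double openssl_speed_k_to_mbps(double thousands_of_bytes_per_second) {
    assert(thousands_of_bytes_per_second >= 0.0);
    const double bytes_per_second = thousands_of_bytes_per_second * 1000.0;
    return bytes_per_second * 8.0 / 1.0e6;  // bytes/s -> bits/s -> Mbit/s
}
```

Applied to the G4 row for des ede3 (11256.52k) this gives about 90 Mbit/s, close to what a 100 Mbps link can carry, and the Opteron aes-128 cbc figure (149626.88k) works out to a little over 1 Gbit/s.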
Re: Some wird OpenSSL perfomance slowdown
Hello Richard,

> But which cpu types/frequencies are involved on both sides of the
> connection and which cipher suite do you use?

Server - Celeron 2GHz, Client - Intel PIV 2GHz.
As to the second question - I'm not changing the default values in the
source code. I had taken saccept.c and sconnect.c as the base.
1. Which command changes it?
2. Which cipher suite should I use to increase the performance?

Thanks,
Serge
Re: Some wird OpenSSL perfomance slowdown
Sergey S. Levin wrote:
>> I don't see any timing code in the middle to separate the timings for
>> the SSL cryptographic setup phase from the application data transfer
>> phase. I think you are doing a piggybacked connection setup, so your
>> first application data write is performing the SSL connection setup
>> implicitly.
>
> Does this mean that the OpenSSL lib performs the handshake on each
> BIO_write?

No.

But which cpu types/frequencies are involved on both sides of the
connection and which cipher suite do you use?

Ciao,
Richard

--
Dr. Richard W. Könning
Fujitsu Siemens Computers GmbH
Re: Some wird OpenSSL perfomance slowdown
Hello Rick,

> SW crypto ain't cheap. It can consume lots of CPU cycles. If the
> system was nearly CPU saturated with a "plain" transfer, then the
> overhead of the crypto can very definitely take the throughput down
> considerably.

1. If I use FileZilla and an SSL connection, it works at 100% of speed.
2. The processor load is just 5%, so this should not be the crypto
   problem.

Thank you,
Serge
Re: Some wird OpenSSL perfomance slowdown
SW crypto ain't cheap. It can consume lots of CPU cycles. If the system
was nearly CPU saturated with a "plain" transfer, then the overhead of
the crypto can very definitely take the throughput down considerably.

rick jones
one of these days I need to make an SSL version of netperf :)
Re: Some wird OpenSSL perfomance slowdown
Hello Darryl,

Thank you for the reply.

> From glancing at your code it looks like your bulk data transfer is
> something like 300 lots of nBioBlockSize, and I presume nBioBlockSize
> is <= 10k, so that's only 3Mb of data.

The nBioBlockSize is 4096 bytes. The transfer is 300 * buf_size, where
buf_size is 1 MB. So I'm transferring 300 MB. The problem is that the
network speed is 100 Mbps, but the transfer speed of the code provided
is just 40-60 Mbps on my computer and 4.8 Mbps on the customer's.

I understand that OpenSSL is a great lib. I just want to find out where
I'm wrong in the code, because I have developed a big app and am
experiencing speed problems.

> I don't see any timing code in the middle to separate the timings for
> the SSL cryptographic setup phase from the application data transfer
> phase. I think you are doing a piggybacked connection setup, so your
> first application data write is performing the SSL connection setup
> implicitly.

Yes, you are right. The overall app has no additional code. Here is the
link to the transfer speed graph picture:

http://www.bw-team.com/openssl.PNG

> I don't see any timing code in the middle to separate the timings for
> the SSL cryptographic setup phase from the application data transfer
> phase. I think you are doing a piggybacked connection setup, so your
> first application data write is performing the SSL connection setup
> implicitly.

Does this mean that the OpenSSL lib performs the handshake on each
BIO_write?

Thank you again,
Serge Levin
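The figures quoted above (300 buffers of 1 MB, written in 4096-byte blocks, on a 100 Mbps network) can be checked with a few lines of arithmetic. The sketch below assumes "1 MB" means 1024*1024 bytes, as buf_size is declared in the original client code; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Time in seconds to move `bytes` at a steady `throughput_bps` (bits/s).
double seconds_to_send(double bytes, double throughput_bps) {
    assert(bytes >= 0.0 && throughput_bps > 0.0);
    return bytes * 8.0 / throughput_bps;
}

// Number of write calls needed to push `total_bytes` in `block_bytes`
// pieces, rounding up for a final partial block.
long long blocks_needed(long long total_bytes, long long block_bytes) {
    assert(total_bytes >= 0 && block_bytes > 0);
    return (total_bytes + block_bytes - 1) / block_bytes;
}
```

For 300 * 1024 * 1024 bytes this gives about 25 seconds at the full 100 Mbps and 76800 writes of 4096 bytes. At the reported 4.8 Mbps each 4096-byte block takes roughly 6.8 ms, far longer than the 0.33 ms the wire needs, which suggests time is being lost between blocks rather than in sending them.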
Re: Some wird OpenSSL perfomance slowdown
Sergey S. Levin wrote:
> Why is the data transfer speed of the OpenSSL client and server nearly
> 10 times slower than when using regular sockets? The code of the
> standard sample client and server is used.

Are you also measuring the time it takes to set up the SSL connection,
or are you only measuring throughput, as your comment implies?

From glancing at your code it looks like your bulk data transfer is
something like 300 lots of nBioBlockSize, and I presume nBioBlockSize is
<= 10k, so that's only 3Mb of data.

I don't see any timing code in the middle to separate the timings for
the SSL cryptographic setup phase from the application data transfer
phase. I think you are doing a piggybacked connection setup, so your
first application data write is performing the SSL connection setup
implicitly.

So may I suggest you insert code to emit high-precision timing for:

 * application startup
 * just before entering the for(;;) loop
 * after the first BIO_write() completes
 * at application termination

Then report your findings, along with an outline of the hardware and
network connection at both ends, back to the list.

You might also want to increase the number of iterations so that 4Gb of
application data is transferred, and see if the overall throughput is
nearer the speed you were expecting. (This reduces the cost of the SSL
connection setup in the overall figure you are comparing against.)

Darryl
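The four timing points listed above could be recorded with something like the following. It is a minimal sketch using std::chrono::steady_clock; the PhaseTimer class and its method names are hypothetical and not part of the original client or of OpenSSL.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Records labelled checkpoints on a monotonic clock so that the SSL
// setup phase and the bulk transfer phase can be timed separately.
class PhaseTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Store a checkpoint with the current time.
    void mark(std::string label) {
        marks_.emplace_back(std::move(label), Clock::now());
    }

    std::size_t size() const { return marks_.size(); }

    const std::string& label(std::size_t i) const { return marks_.at(i).first; }

    // Seconds elapsed between checkpoint i-1 and checkpoint i (i >= 1).
    double seconds_since_previous(std::size_t i) const {
        assert(i >= 1 && i < marks_.size());
        const auto delta = marks_[i].second - marks_[i - 1].second;
        return std::chrono::duration<double>(delta).count();
    }

private:
    std::vector<std::pair<std::string, Clock::time_point>> marks_;
};
```

Calling mark() at application startup, just before the for(;;) loop, after the first BIO_write() returns, and at termination yields three intervals. The second interval approximates the handshake cost, and the third covers the bulk transfer.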
Some wird OpenSSL perfomance slowdown
Hello all,

Why is the data transfer speed of the OpenSSL client and server nearly
10 times slower than when using regular sockets? The code of the
standard sample client and server is used.

The code for the client is:

    char host[MAX_PATH];
    BIO *out;
    char buf[1024*10],*p;
    SSL_CTX *ssl_ctx=NULL;
    SSL *ssl;
    BIO *ssl_bio;
    int i,len,off,ret=1;
    int chunk_size;
    int buf_size = 1024*1024;
    int nBioBlockSize;

    cout << "Please, enter the server host and port as following host:port (ex: ASUS:443):\n";
    cin >> host;
    cout << "Set BIO block size (ex: 4096): ";
    cin >> nBioBlockSize;

    p = new char[buf_size];
    srand( (unsigned)time( NULL ) );
    for (int j=0; j}