Hi,

I have a BMC5825 card from Silicom that is supposed to do over
10,000 RSA operations per second.

In practice Proto Balance can do about 1900 fresh SSL connections
per second on an Intel Core 2 Duo at 2.2 GHz, but I think more work
can vastly improve this.

(Without the card I get about 700 per second - thus the card
improves the performance by about 170%, i.e. to roughly 2.7 times
the software-only rate: 1900 / 700 = 2.7)

I compiled with -O1 -g -pg and the gprof output is below.

I replaced OPENSSL_cleanse() {...} with { memset(); } already
- IT WAS THE TOP FUNCTION IN MY FIRST GPROF RUN!!!!!
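
One caveat with a plain memset(): a compiler is allowed to drop a
memset() on a buffer that is never read again, which is presumably
what OPENSSL_cleanse() is guarding against. A minimal sketch of a
cheap wipe that should survive optimisation is to call memset()
through a volatile function pointer. The names demo_memset_ptr and
demo_cleanse are made up for illustration; this is not the code that
is in OpenSSL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only.
 * The pointer itself is volatile, so the compiler has to load it at
 * run time and cannot prove that the call is a removable dead store. */
static void *(*const volatile demo_memset_ptr)(void *, int, size_t) = memset;

/* Zero len bytes at ptr; still a single memset() call underneath. */
void demo_cleanse(void *ptr, size_t len)
{
    assert(ptr != NULL || len == 0);
    if (len > 0)
        demo_memset_ptr(ptr, 0, len);
}
```

This should cost about the same as the bare memset(), plus one
indirect call.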

My test does not use sessions. It downloads a minimal web
page, "<HTML></HTMl>", with 200 clients concurrently.

The malloc at the top is surprisingly expensive: it is called
mostly from EVP_DigestInit_ex(). Refactoring to eliminate
this malloc would be worthwhile I think.
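
As a rough sketch of one way such a refactoring could look (assuming
the digest state is small and of bounded size), the state could live
inside the context itself, with the heap used only as a fallback.
Everything here (demo_md_ctx, DEMO_MAX_STATE, demo_heap_allocs) is
hypothetical and is not OpenSSL's EVP_MD_CTX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound on the per-digest state size. */
#define DEMO_MAX_STATE 128

/* Hypothetical digest context with room for the state built in. */
typedef struct {
    size_t state_size;                        /* bytes of state in use */
    unsigned char inline_state[DEMO_MAX_STATE];
    unsigned char *state;                     /* inline_state or heap */
} demo_md_ctx;

/* Counts heap fallbacks, so the saving can be measured. */
unsigned long demo_heap_allocs = 0;

/* Returns 1 on success, 0 on allocation failure. */
int demo_md_init(demo_md_ctx *ctx, size_t state_size)
{
    assert(ctx != NULL);
    if (state_size <= DEMO_MAX_STATE) {
        ctx->state = ctx->inline_state;       /* common case: no malloc */
    } else {
        ctx->state = malloc(state_size);      /* rare case: heap fallback */
        if (ctx->state == NULL)
            return 0;
        demo_heap_allocs++;
    }
    ctx->state_size = state_size;
    memset(ctx->state, 0, state_size);
    return 1;
}

/* Must only be called on a context that demo_md_init() has touched. */
void demo_md_cleanup(demo_md_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state != ctx->inline_state)
        free(ctx->state);
    ctx->state = NULL;
    ctx->state_size = 0;
}
```

The trade-off is a larger context struct, and a context can no longer
be copied with a plain struct assignment, because the state pointer
would still point into the original.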

The card supports hardware SHA1 and MD5 - but they are not used
because OpenSSL divides each md operation into init(),
update() and final() stages, whereas the card wants a one-shot call.
So the crypto card API does not fit the software API.

:-(

OpenSSL *really* needs to be fixed to properly support
hardware md's.
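
To illustrate the mismatch, here is a minimal sketch of the only
obvious bridge between the two APIs: buffer everything passed to
update() and hand the whole message to the one-shot call in final().
All names are hypothetical, and demo_fake_hw is a toy stand-in for
the card (not a real digest, and not Silicom's API).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-shot interface: whole message in, digest out. */
typedef int (*demo_oneshot_fn)(const unsigned char *msg, size_t len,
                               unsigned char *out);

typedef struct {
    unsigned char *buf;   /* bytes accumulated so far */
    size_t len, cap;
    demo_oneshot_fn hw;   /* one-shot call used in final() */
} demo_buf_ctx;

int demo_buf_init(demo_buf_ctx *c, demo_oneshot_fn hw)
{
    assert(c != NULL && hw != NULL);
    c->buf = NULL; c->len = 0; c->cap = 0; c->hw = hw;
    return 1;
}

/* Append n bytes to the buffer, growing it geometrically as needed. */
int demo_buf_update(demo_buf_ctx *c, const void *data, size_t n)
{
    if (n == 0)
        return 1;
    if (n > c->cap - c->len) {
        size_t ncap = c->cap ? c->cap : 256;
        while (ncap - c->len < n) {
            if (ncap > SIZE_MAX / 2)
                return 0;                     /* would overflow */
            ncap *= 2;
        }
        unsigned char *p = realloc(c->buf, ncap);
        if (p == NULL)
            return 0;
        c->buf = p; c->cap = ncap;
    }
    memcpy(c->buf + c->len, data, n);
    c->len += n;
    return 1;
}

/* Run the one-shot call over the buffered message, then free it. */
int demo_buf_final(demo_buf_ctx *c, unsigned char *out)
{
    static const unsigned char empty[1] = {0};
    int ok = c->hw(c->len ? c->buf : empty, c->len, out);
    free(c->buf);
    c->buf = NULL; c->len = 0; c->cap = 0;
    return ok;
}

/* Toy stand-in for the card: 4-byte polynomial checksum, NOT a digest. */
int demo_fake_hw(const unsigned char *msg, size_t len, unsigned char *out)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + msg[i];
    out[0] = (unsigned char)(h >> 24); out[1] = (unsigned char)(h >> 16);
    out[2] = (unsigned char)(h >> 8);  out[3] = (unsigned char)h;
    return 1;
}
```

The extra copy and allocation in update() are exactly the kind of
overhead the profile below complains about, so this only shows why
the APIs do not fit. It is not a recommendation.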

I see Silicom's BMC586x/BMC5861/BMC5862 OpenSSL patch
plugs in code everywhere to directly call their card's
SSL signing function - a sorry solution indeed.

By eliminating the top 6 functions listed below, at least another
30% of CPU time can be saved.

Kind regards

-paul


------=========------


Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 15.22      0.28     0.28   861099     0.00     0.00  malloc      <<======== !!!!!!
  8.15      0.43     0.15                             md5_block_asm_host_order
  6.52      0.55     0.12   461101     0.00     0.00  sha1_block_host_order
  4.89      0.64     0.09   234451     0.00     0.00  sha1_block_data_order
  3.80      0.71     0.07   340275     0.00     0.00  sslconnection_thread_bas
  2.72      0.76     0.05   725772     0.00     0.00  SHA1_Update
  2.72      0.81     0.05   673096     0.00     0.00  asn1_i2d_ex_primitive
  2.17      0.85     0.04        1     0.04     1.51  _thread_os_thread
  2.17      0.89     0.04                             RC4
  1.63      0.92     0.03   818060     0.00     0.00  asn1_ex_i2c
  1.63      0.95     0.03   438186     0.00     0.00  HMAC_Init_ex
  1.63      0.98     0.03   355077     0.00     0.00  SHA1_Final
  1.63      1.01     0.03    82992     0.00     0.00  ssl3_read_bytes
  1.09      1.03     0.02  1361354     0.00     0.00  EVP_MD_CTX_cleanup
