We applied the patch and got some improvement, but there is still some
trouble.  Here are the timing values for a stripped-down version of
the code (a sketch of the test harness follows the numbers):

Before (all calls start at the same time):
Called PingCore in 0.076844 seconds (0) pid 20829 
Called PingCore in 0.130758 seconds (0) pid 20830
Called PingCore in 0.161245 seconds (0) pid 20834
Called PingCore in 0.160294 seconds (0) pid 20832
<snip>...
Called PingCore in 0.185752 seconds (0) pid 20839
Called PingCore in 0.362372 seconds (0) pid 20843
Called PingCore in 0.379132 seconds (0) pid 20846

After (all calls start at the same time):
<snip>...
Called PingCore in 0.225353 seconds (0) pid 7063 
Called PingCore in 0.238172 seconds (0) pid 7062
Called PingCore in 0.254867 seconds (0) pid 7059
Called PingCore in 0.262224 seconds (0) pid 7069
Called PingCore in 0.273392 seconds (0) pid 7067
Called PingCore in 0.284404 seconds (0) pid 7061

Omitting the call entirely, we get:
<snip>...
Called PingCore in 0.114693 seconds (0) pid 6503 
Called PingCore in 0.109111 seconds (0) pid 6504
Called PingCore in 0.118763 seconds (0) pid 6512
Called PingCore in 0.111967 seconds (0) pid 6513
Called PingCore in 0.111785 seconds (0) pid 6505


And in the full system we still see:
Called PingCore in 0.451584 seconds (0) pid 9519 
Called PingCore in 0.583949 seconds (0) pid 9517
Called PingCore in 0.703487 seconds (0) pid 9531
Called PingCore in 0.842056 seconds (0) pid 9516
Called PingCore in 0.919599 seconds (0) pid 9530
<snip>...
Called PingCore in 1.757831 seconds (0) pid 9528
Called PingCore in 1.779220 seconds (0) pid 9535
Called PingCore in 1.848178 seconds (0) pid 9526

So we still have another bottleneck to track down.  I'll report back
on this once it becomes apparent.
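
For reference, the stripped-down test has roughly the following shape.
This is a minimal sketch, not the real code: PingCore is our internal
RPC call, and the stand-in body below performs only the server-side
step we care about (DH_generate_key on a fresh per-connection DH
context) so that the sketch compiles on its own.  In the real test the
contention happens inside the threaded server that PingCore talks to.

/*
 * Minimal sketch of the timing harness; not the real test code.
 */
#include <stdio.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>
#include <openssl/dh.h>

static DH *params;                 /* shared group, generated once */

static int PingCore(void)          /* stand-in for the real RPC */
{
    DH *dh = DHparams_dup(params); /* separate DH context per call */
    int ok = dh != NULL && DH_generate_key(dh);
    DH_free(dh);
    return ok ? 0 : 1;
}

int main(void)
{
    params = DH_new();
    if (params == NULL ||
        !DH_generate_parameters_ex(params, 1024, DH_GENERATOR_2, NULL))
        return 1;

    for (int i = 0; i < 16; i++) { /* all calls start at the same time */
        if (fork() == 0) {
            struct timeval t0, t1;
            gettimeofday(&t0, NULL);
            int rc = PingCore();
            gettimeofday(&t1, NULL);
            printf("Called PingCore in %f seconds (%d) pid %d\n",
                   (t1.tv_sec - t0.tv_sec) +
                   (t1.tv_usec - t0.tv_usec) / 1e6,
                   rc, (int)getpid());
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;
    DH_free(params);
    return 0;
}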

One suggestion would be to add an API that initializes the Montgomery
context without taking the lock, and let the client worry about the
locking semantics.  Since we always use a separate DH context for each
connection, this would be a huge help.  Another option would be to
initialize the Montgomery context at DH_new() time; since I see no
standard way to tell the library not to use this method, it has
effectively been established as *the* way to calculate these values.
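
To make the first suggestion concrete, something like the following is
what I have in mind.  DH_precompute_mont() is a hypothetical name (no
such function exists today), and the sketch pokes at the public
struct dh_st from the 1.0.x headers:

/*
 * Hypothetical sketch only: DH_precompute_mont() does not exist in
 * OpenSSL.  The caller initializes the Montgomery context up front,
 * while it still has exclusive access to the DH, so DH_generate_key()
 * never needs CRYPTO_LOCK_DH to set it lazily.  Assumes the public
 * struct dh_st from the 1.0.x headers (method_mont_p member).
 */
#include <openssl/bn.h>
#include <openssl/dh.h>

int DH_precompute_mont(DH *dh)     /* hypothetical name */
{
    BN_CTX *ctx;
    BN_MONT_CTX *mont;

    if (dh->method_mont_p != NULL)
        return 1;                  /* already initialized */

    if ((ctx = BN_CTX_new()) == NULL)
        return 0;
    mont = BN_MONT_CTX_new();
    if (mont == NULL || !BN_MONT_CTX_set(mont, dh->p, ctx)) {
        BN_MONT_CTX_free(mont);
        BN_CTX_free(ctx);
        return 0;
    }
    /* No lock taken: the caller guarantees no other thread can see
     * this DH yet (we create one DH context per connection). */
    dh->method_mont_p = mont;
    BN_CTX_free(ctx);
    return 1;
}

A connection-setup path would then call DH_precompute_mont() right
after creating its per-connection DH, before handing it to any worker
threads.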

Anyway, I can say that the patch did help.
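
For anyone skimming the thread: the serialisation described in the
quoted messages below comes from the lazy Montgomery setup inside DH
key generation (crypto/dh/dh_key.c in the 1.0.x tree).  From memory,
so treat it as a sketch rather than an exact quote, the relevant path
looks like:

/* DH_new() always sets DH_FLAG_CACHE_MONT_P, so every key generation
 * funnels through the global CRYPTO_LOCK_DH lock just to (maybe) set
 * the cached Montgomery context: */
if (dh->flags & DH_FLAG_CACHE_MONT_P) {
    mont = BN_MONT_CTX_set_locked(&dh->method_mont_p,
                                  CRYPTO_LOCK_DH, dh->p, ctx);
    if (mont == NULL)
        goto err;
}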

On Mon, 2014-05-05 at 13:16 -0400, Geoffrey Thorpe wrote: 
> Hi Daniel,
> 
> Great. I already posted a patch to openssl-dev; it's at:
> http://marc.info/?l=openssl-dev&m=139923556631116&w=2
> 
> If you want to apply it ("git apply" or "git am") directly, the raw
> email patch can be downloaded using:
> http://marc.info/?l=openssl-dev&m=139923556631116&q=raw
> 
> Thanks in advance for any feedback on this.
> 
> Cheers,
> Geoff
> 
> On Mon, May 5, 2014 at 1:07 PM, Daniel Sands <dnsa...@sandia.gov>
> wrote:
>         I'm using the RHEL6 standard distro.  I can certainly test
>         a patch for you.
>         
>         On Sat, 2014-05-03 at 00:01 -0400, Geoffrey Thorpe wrote:
>         > I hadn't noticed this serialisation before, thanks. I'll
>         > try to send a patch over the weekend some time, in case
>         > you're able to test? If so, what version (or git branch)
>         > are you using?
>         >
>         > Cheers,
>         > Geoff
>         >
>         > On Fri, May 2, 2014 at 6:54 PM, Daniel Sands
>         > <dnsa...@sandia.gov> wrote:
>         >         Examining performance bottlenecks on a busy
>         >         server, we discovered that connections are being
>         >         forced to serialize due to calls to
>         >         DH_generate_key.  I looked through the source,
>         >         and if I understand it correctly, the code locks
>         >         a common DH mutex when it uses Montgomery
>         >         Multiplication, and due to the way it sets a
>         >         flag, it always does.
>         >
>         >         But I have not yet found any reason why this
>         >         requires critical section protection.  I do not
>         >         see any global variables that are being
>         >         manipulated by the call, nor any resource
>         >         contention.  Is it possible that the mutex lock
>         >         is a holdover from earlier code?  Could this
>         >         locking behavior be removed?


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
