Howard Chu writes:

 > A bit of a minor mystery. Not a problem, just a curiosity. If
 > someone knew off the top of their head a reason for it, that'd be
 > cool, but otherwise no sweat.

It's possible, although unlikely, that the optimized code has worse
cache behaviour.  No way to know better without doing some profiling.

Andrew.


 > 
 > -------- Original Message --------
 > Subject: Re: commit: ldap/servers/slapd connection.c daemon.c proto-slap.h syncrepl.c
 > Date: Tue, 27 Nov 2007 05:17:04 -0800
 > From: Howard Chu <[EMAIL PROTECTED]>
 > To: [EMAIL PROTECTED]
 > References: <[EMAIL PROTECTED]> 
 > <[EMAIL PROTECTED]>  <[EMAIL PROTECTED]> 
 > <[EMAIL PROTECTED]>
 > 
 > Howard Chu wrote:
 > > Howard Chu wrote:
 > >> Howard Chu wrote:
 > >>> For reference, the peak throughput with back-null on the previous code
 > >>> was only 7,800 auths/sec (with 8 client threads). With this patch it's
 > >>> 11,140 auths/sec.
 > 
 > Those numbers are for Windows Server 2003 x86_64 on a Celestica A8440 with 4 
 > Opteron 875s, using OpenLDAP compiled with gcc 4.3.0. The following numbers 
 > are for Linux 2.6.23.1 x86_64, on the same machine, compiled first with gcc 
 > 4.1.2 and then later with gcc 4.2.2. There's no disk I/O in these tests.
 > 
 > >>> In both cases the throughput declines as more client threads are
 > >>> used. (Compare to 35,553 auths/sec for the same machine running Linux,
 > >>> and no drop in throughput all the way up to hundreds/thousands of
 > >>> connections.)
 > 
 > > Re-running on Linux with a non-optimized build, peaked at 40,101 auths/sec.
 > > (I guess HEAD has sped up a bit more in the past week or so...)
 > 
 > OK, this is odd. The code compiled without optimization peaks at 40K auths/sec
 > at around 124-132 client threads. The code compiled with -O2 peaks at 37K
 > auths/sec at around 128 client threads.
 > 
 > The -O2 build is faster from about 4 to 24 client threads. From 28 on up, the
 > nonoptimized code is faster at every load level. I was originally using gcc
 > 4.1.2 but I'm seeing the same result now using gcc 4.2.2. Also, slapd is only
 > configured with 8 worker threads in all of these tests. Strange that whatever
 > optimizations the compiler has applied speed things up under lighter load
 > but work against it under heavier load.
 > -- 
 >    -- Howard Chu
 >    Chief Architect, Symas Corp.  http://www.symas.com
 >    Director, Highland Sun        http://highlandsun.com/hyc/
 >    Chief Architect, OpenLDAP     http://www.openldap.org/project/

-- 
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903
