Howard Chu writes:
> A bit of a minor mystery. Not a problem, just a curiosity. If
> someone knew off the top of their head a reason for it, that'd be
> cool, but otherwise no sweat.

It's possible, although unlikely, that the optimized code has worse
cache behaviour. No way to know better without doing some profiling.

Andrew.

>
> -------- Original Message --------
> Subject: Re: commit: ldap/servers/slapd connection.c daemon.c proto-slap.h syncrepl.c
> Date: Tue, 27 Nov 2007 05:17:04 -0800
> From: Howard Chu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
>
> Howard Chu wrote:
> > Howard Chu wrote:
> >> Howard Chu wrote:
> >>> For reference, the peak throughput with back-null on the previous code was
> >>> only 7,800 auths/sec (with 8 client threads). With this patch it's 11,140
> >>> auths/sec.
>
> Those numbers are for Windows Server 2003 x86_64 on a Celestica A8440 with 4
> Opteron 875s, using OpenLDAP compiled with gcc 4.3.0. The following numbers
> are for Linux 2.6.23.1 x86_64, on the same machine, compiled first with gcc
> 4.1.2 and then later with gcc 4.2.2. There's no disk I/O in these tests.
>
> >>> In both cases the throughput declines as more client threads are
> >>> used. (Compare to 35,553 auths/sec for the same machine running Linux, and no
> >>> drop in throughput all the way up to hundreds/thousands of connections.)
>
> > Re-running on Linux with a non-optimized build, peaked at 40,101 auths/sec. (I
> > guess HEAD has sped up a bit more in the past week or so...)
>
> OK, this is odd. The code compiled without optimization peaks at 40K auths/sec
> at around 124-132 client threads. The code compiled with -O2 peaks at 37K sec
> at around 128 client threads.
>
> The -O2 build is faster from about 4 to 24 client threads. From 28 on up, the
> nonoptimized code is faster at every load level. I was originally using gcc
> 4.1.2 but I'm seeing the same result now using gcc 4.2.2. Also, slapd is only
> configured with 8 worker threads in all of these tests. Strange that whatever
> optimizations the compiler has generated speeds things up for lighter load,
> but works against it under heavier load.
> --
> -- Howard Chu
> Chief Architect, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/

--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903