> -----Original Message-----
> From: Nadav Har'El [mailto:[EMAIL PROTECTED]
> Sent: Sunday, November 09, 2003 10:38 PM
> To: Oleg Kobets
> Cc: Zvi Har'El; My Own Private List
> Subject: Re: Redhat 9 slowness - continued
> 
> 
> On Sun, Nov 09, 2003, Oleg Kobets wrote about "Re: Redhat 9 slowness - continued":
> > Oh, well.
> > But the question remains, why is it slower ?
> 
> Continuing the Redhat 9 saga:
> 
> I previously thought that the slowdown had something to do with the dynamic
> linking slowdown. I no longer think so - I think the /lib/tls/* libraries
> are slower in doing some operation than /lib/i686/* libraries are.
> 
> I now ran hspell on a very big corpus of about 3 million Hebrew words,
> on my Pentium 1500 running Redhat 9. With the default /lib/tls libraries,
> hspell took 8 CPU seconds (user time). With LD_LIBRARY_PATH=/lib/i686,
> the time is down to 4.5 seconds!!
> 
> The 3.5 second difference of course cannot be attributed to slow dynamic
> linking - it's the /lib/tls that suck. My guess is that some common C
> function that hspell uses, perhaps even the stdio, strlen(), or who knows
> what, is much slower in the tls version.
> I wonder what is that slow-poke function that I should avoid... I can't
> even profile this problem, because there is only one variant of the
> profiling libraries (glibc-profile doesn't appear to contain "tls" and
> "i686" variants).
> 
> Another possibility is that the tls version wasn't optimized for 686 while
> the i686 libraries were - though I doubt this can explain the almost
> half-speed performance I'm seeing.

I read a benchmark of C vs. C++ vs. C# just a few weeks ago. It also
compared different compilers and libraries (all under Windows).

Just to give a few examples:
- gcc/glibc's atoi() is 50% slower than the VC6 runtime with Intel's
  massively optimizing compiler.
- String switching (i.e. a C-style switch on strings) is twice as fast
  with VC6/Intel as with gcc/glibc.
- String tokenization is 20% slower with gcc/glibc than with Intel.


AFAIK, TLS libraries, aside from requiring more actual code for stack
allocation, hold several stacks or one segmented stack (for several
threads), so they tend to use more space and hence make poorer use of the
L1/L2 caches.

Overall, I wouldn't be surprised if a TLS implementation, on top of the
differences in optimization, amounts to slower performance.

Shachar.


