Jürgen Keil wrote:
>> The compiler team has now confirmed this is a bug in the inlined
>> versions of the mem*() functions used with -xbuiltin and is working
>> on a fix. Once this is in a patch, we can ask the Nevada CBE team
>> to include it in the standard Nevada CBE - but I don't know how long
>> either stage of that will take.
>>
>> They've also provided a workaround flag that disables individual
>> inlined functions (-Wc,-Qinline-memcpy=0) - is this causing serious
>> problems that we need to apply the workaround until the CBE is patched?
>
> Inlining these functions has the negative side effect that you
> can't interpose on these function calls any more.
>
> And the highly optimized platform specific implementations from
> libc_psr.so.1 are bypassed...
>
> So, does avoiding the function call and inlining these functions have
> a significant performance advantage?
It's been a long time since we checked. -xbuiltin was added under CR 4931763,
whose evaluation (which you can't see on bugs.opensolaris.org, though it has
no confidential info) says:
> x11perf testing shows this gives a very small (about 0.1%) improvement on
> Xmarks on m64. Since we're down about 0.5% now, this will bring us about
> 1/5th of the way back, so we'll go ahead and do it and continue to look at
> other possibilities.
>
> Test results were:
>
> without -xbuiltin:
> Xmark = 16.7015
> Xmark = 16.6301
> Xmark = 16.6637
> Xmark = 16.6948
> Xmark = 16.6441
> average = 16.6668
> variance (sigma^2) = 0.000771974
> std dev = 0.0277844
>
>
> with -xbuiltin:
> Xmark = 16.7423
> Xmark = 16.6975
> Xmark = 16.6724
> Xmark = 16.6325
> Xmark = 16.6718
> average = 16.6833
> variance (sigma^2) = 0.00130287
> std dev = 0.0360953
Those runs would have been on an Ultra 10 with built-in m64 graphics, which is
what our performance benchmark machines were at the time, built with Studio 8
on a Solaris 10 pre-release (the change was checked into s10_45).
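
For anyone who wants to recheck the figures quoted above, here is a minimal
Python sketch that recomputes the averages, variances and standard deviations
from the ten Xmark values in the evaluation. The variable and function names
are illustrative only, and it assumes the quoted variance is the population
variance (divide by N rather than N-1), which is what the quoted numbers
appear to match.

```python
from math import sqrt
from statistics import fmean, pvariance

# Xmark values copied from the CR evaluation quoted above.
without_builtin = [16.7015, 16.6301, 16.6637, 16.6948, 16.6441]
with_builtin = [16.7423, 16.6975, 16.6724, 16.6325, 16.6718]


def summarize(runs):
    """Return (mean, population variance, population std dev) for the runs."""
    mean = fmean(runs)
    # Assumption: population variance (divide by N), not sample variance.
    variance = pvariance(runs, mu=mean)
    return mean, variance, sqrt(variance)


mean_without, var_without, sd_without = summarize(without_builtin)
mean_with, var_with, sd_with = summarize(with_builtin)

# Relative improvement of the -xbuiltin average over the baseline average.
gain_percent = (mean_with - mean_without) / mean_without * 100.0

print(f"without -xbuiltin: average={mean_without:.4f} "
      f"variance={var_without:.6g} stddev={sd_without:.6g}")
print(f"with    -xbuiltin: average={mean_with:.4f} "
      f"variance={var_with:.6g} stddev={sd_with:.6g}")
print(f"improvement: {gain_percent:.2f}%")
```

Computed this way, the gap between the two averages comes out smaller than
the standard deviation of either set of five runs, which is worth keeping in
mind when weighing the roughly 0.1% gain against the cost of the workaround.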
--
-Alan Coopersmith- alan.coopersmith at sun.com
Sun Microsystems, Inc. - X Window System Engineering