On Sun December 4 2005 6:37 am, Kristian Poul Herkild wrote:
> Robert Crawford wrote:
> > On Sun December 4 2005 4:11 am, Kristian Poul Herkild wrote:
> >
> >
> >
> > -mfpmath=sse is not a good idea, the consensus is it actually lowers
> > performance.  -msse -mmmx -m3dnow are redundant (implied by
> > -march=athlon-xp), and should be removed from your cflags line, but
> > SHOULD be placed in your USE= line, without the - sign, like this:
> >
> > USE="mmx 3dnow sse"
> >
> > If you use gcc-3.4.4, these flags should work fine (I've used them for a
> > long time- no problems).
> >
> > CFLAGS="-march=athlon-xp -O3 -pipe -fomit-frame-pointer -fweb -ftracer
> > -fprefetch-loop-arrays -ffast-math -falign-functions=64 -fno-ident"
> >
> > CXXFLAGS="${CFLAGS} -fvisibility-inlines-hidden"
>
> Hmm... according to this thread
> http://forums.gentoo.org/viewtopic.php?t=43648 and the GCC manual -march
> does not imply -mmmx -msse -m3dnow, nor does it imply -mfpmath=sse. I know
> of no consensus of -mfpmath=sse lowering performance. Actually, I only
> know of the opposite from the LFS-community as well as Gentoo Wiki.
>
> I don't want to start a flamewar on this, so if you have other and more
> correct information than me, then please share it :)
>
> -Kristian Poul Herkild

No flame war- if my conclusions or understanding are incorrect, I'd love to 
know, and make corrections!

I think that almost 3 year old thread refers to -mcpu (now deprecated in favor 
of -mtune), not -march=athlon-xp (the actual architecture). -mcpu=<cpu type> 
or -mtune tunes the code for, say, an athlon-xp, but restricts itself to 
instructions available across the whole i386 family. Thus the resulting binary 
still runs on different, older cpus. 

On the other hand, -march=athlon-xp generates code that is only guaranteed to 
work on an athlon-xp cpu, and so is more "tuned" to that cpu (less bloat). At 
least that's the theory- why compile in code that is only there for other 
cpus? My understanding of man gcc is that -march=athlon-xp does enable mmx, 
3dnow and sse support.

In other words, from a "freshmeat" article:
-------------------------------------------------------------------------------------------------
"-march implies -mcpu, so when you use -march, there's no need to use -mcpu. 

 -mcpu generates code tuned for the specified CPU, but it does not alter the 
ABI and the set of available instructions, so you can still run the resulting 
binary on other CPUs. 

 When you use -march, you generate code for the specified machine type, and 
the available instructions will be used, which means that you probably cannot 
run the binary on other machine types."
----------------------------------------------------------------------------------------------------

For example, from this thread http://forums.gentoo.org/viewtopic.php?p=275851, 
page 3, bottom:

"If I compile with -march=athlon-xp, sse, 3dnow, and mmx are enabled (through 
the -D__athlon_sse__ -D__tune_athlon__ -D__tune_athlon_sse__ -D__SSE__ 
-D__MMX__ -D__3dNOW__ -D__3dNOW_A__ macros). When I add, for example -mmmx, 
-mno-mmx appears after -mmmx in the "options enabled" list in the output of 
gcc -Q -v -march=athlon-xp -mmmx. However, -D__MMX__ doesn't go away, so MMX 
is still used. In short -mmmx, -msse, and -m3dnow are unnecessary, but they 
don't hurt."
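
A rough way to check the quoted claim yourself, as a sketch only: ask gcc to 
dump its predefined macros and keep just the instruction-set ones. -dM and -E 
are standard gcc options; the two function names are made up for illustration, 
and -m32 is my assumption for 64-bit toolchains, which reject athlon-xp as a 
32-bit-only cpu.

```shell
# Keep only the instruction-set macros from a macro dump read on stdin.
filter_isa_macros() {
    grep -E '^#define __(MMX|SSE|3dNOW|3dNOW_A)__( |$)'
}

# Dump the macros gcc predefines for the given flags, then filter them.
# -m32 is an assumption for 64-bit toolchains; drop it on a native 32-bit gcc.
dump_isa_macros() {
    gcc -m32 "$@" -dM -E - </dev/null 2>/dev/null | filter_isa_macros
}
```

Run dump_isa_macros -march=athlon-xp and then 
dump_isa_macros -march=athlon-xp -mmmx. If the quoted post is right, the two 
lists should match.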

Over the years, I've read similar statements by experienced people in hundreds 
of posts on many forums and groups- sorry I can't point you to them off the 
top of my head. If you can wade through the huge cflags central Gentoo forum 
threads (an ordeal in itself), you will probably reach the same conclusions I 
have.

Also, as I understand it from some recent posts, compiling in mmx 3dnow sse 
support is pointless bloat in any programs that don't use it, thus putting 
them in USE= makes much more sense than cflags.

As for -mfpmath=sse, every benchmark article and several more recent forum 
posts I've seen indicate no real performance gain and, in many cases, 
degraded performance, at least with AMD cpus. That's contrary to what man gcc 
generally says, but people who have actually run tests tend to disagree. Keep 
in mind that the version of gcc used and the cpu type (AMD or Intel) also 
influence the results. However, if anyone knows of more recent info on this 
flag, please post a link.
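
If you'd rather measure it on your own box, here is a minimal sketch that 
prints the two build commands to compare, one per -mfpmath setting. The 
function names and bench.c are placeholders (use any floating-point-heavy 
program of your own), -O2 is just a neutral baseline, and -m32 is an 
assumption for 64-bit toolchains, which also need 32-bit libraries to link. 
Keep in mind the athlon-xp has SSE but not SSE2, so per man gcc only 
single-precision math moves to SSE; doubles still go through the 387.

```shell
# Print (not run) the gcc command for one -mfpmath setting.
# $1 = 387 or sse, $2 = source file (bench.c in the usage note is a placeholder).
fpmath_cmd() {
    printf 'gcc -m32 -march=athlon-xp -O2 -mfpmath=%s -o bench_%s %s\n' "$1" "$1" "$2"
}

# Emit one build command per floating-point unit choice.
fpmath_cmds() {
    for fp in 387 sse; do
        fpmath_cmd "$fp" "$1"
    done
}
```

Something like fpmath_cmds bench.c | sh builds both binaries; then time 
./bench_387 and ./bench_sse a few times each and compare.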

Robert Crawford

-- 
gentoo-user@gentoo.org mailing list
