Sami Näätänen <[EMAIL PROTECTED]> posted
[EMAIL PROTECTED], excerpted below, on  Tue, 09 Dec
2008 14:23:30 +0200:

> My system is an Intel quad core core2 with a 2.4 GHz clock speed coupled
> with a 4GB of memory. No overclocking etc. Want this to be stable. :)
> 
> I'm just curious what people use as their stable CFLAGS in amd64 Gentoo?
> (Sorry if this has been up lately, but I just switched to 64bit env
> so...)
> 
> 
> Here is mine and some explanation of why (And I use ~arch system with
> gcc 4.3)

Well, you say you want stable, but then say you use ~arch, so I see 
you're not too much of a stick-in-the-mud. =:^)

Here's mine, for a dual Opteron 290:

CFLAGS="-march=opteron-sse3 -pipe -O2 -frename-registers -fweb
-fmerge-all-constants -fgcse-sm -fgcse-las -fgcse-after-reload
-ftree-vectorize -fdirectives-only -freorder-blocks-and-partition -combine"

CXXFLAGS="-march=opteron-sse3 -pipe -O2 -frename-registers -fweb
-fmerge-all-constants -fgcse-sm -fgcse-las -fgcse-after-reload
-ftree-vectorize -fdirectives-only"

You can look them up in the gcc manpage, or look back a year or so when I 
explained most of them, although that was a couple of gcc versions ago and 
they weren't quite the same.

But my basic strategy is this:  Because memory is so much slower than 
cache on a modern processor, in general it should pay to optimize for 
size even if it costs a few CPU cycles once in a while.  Thus, until 
fairly recently I used -Os, but with gcc-4.3, decided to switch to -O2 
since gcc is getting smarter about such optimizations with -O2 now, and 
the few additional size optimizations with -Os now tend to be at the 
expense of cache (think -freorder-blocks-and-partition).  In any case, I 
certainly don't want -O3 or too much loop unrolling and inlining, at the 
expense of cache.

-frename-registers and -fweb are useful for taking advantage of the 
additional registers x86_64 has.  -fdirectives-only is there because it 
works better with ccache, which I use.  You know about -ftree-vectorize 
and -combine is discussed elsewhere on-thread.  -fmerge-all-constants 
isn't strictly C standard, but I've had absolutely zero issues with it, 
and it's going to help with cache.  -freorder-blocks-and-partition won't 
work on most C++ code, thus (along with -combine) the reason I split 
CFLAGS and CXXFLAGS, but it tells gcc to keep hot code together so it 
stays in cache better.  The various -fgcse-* options make gcc more 
aggressive about global common subexpression elimination (gcse) under 
various conditions.  This shouldn't add to size and may in fact reduce 
size by reducing instruction count (or moving instructions out of loops, 
size neutral), but it can increase compile time, which is why gcc enables 
-fgcse-after-reload by default only at -O3 and leaves -fgcse-sm and 
-fgcse-las off at every level.
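
To see which of the gcse options a given -O level already turns on for the gcc actually installed, a query along these lines may help.  It is only a sketch: it assumes gcc 4.3 or later (where -Q --help=optimizers is available), and gcse_defaults is a helper name made up here, not something gcc or portage provides.

```shell
# gcse_defaults is a hypothetical helper name, not part of gcc or portage.
# Assumes gcc >= 4.3, where -Q --help=optimizers reports each flag's state.
gcse_defaults() {
    # $1 is the optimization level to query, e.g. -O2 or -O3.
    gcc -Q --help=optimizers "$1" | grep -e '-fgcse'
}

# Usage: run once per level and compare the enabled/disabled columns.
# gcse_defaults -O2
# gcse_defaults -O3
```

Any flag reported as disabled at the level in use is one that has to be named explicitly in CFLAGS to take effect.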

-combine is the one that causes the most problems, which I handle per 
trouble-package using /etc/portage/env/* files, as mentioned in the other 
thread.  -freorder-blocks-and-partition can cause problems in some cases 
as well, but if you don't have either of those in CXXFLAGS, you'll avoid 
a lot of the problem right there.  Those are the only C(XX)FLAGS I have 
had issues with lately.  The others have worked just fine.
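
As a rough illustration of the per-package approach, an env file for a package that chokes on one of those two flags might look something like this.  It is a sketch under assumptions: the <category>/<package> path is a placeholder, drop_flag is a helper name invented here, and it relies on portage sourcing the file with bash after the global CFLAGS are set.

```shell
# Hypothetical contents of /etc/portage/env/<category>/<package>.
# drop_flag is an illustrative helper, not a portage function.

# drop_flag FLAG WORDS... : print WORDS with every exact match of FLAG removed.
drop_flag() {
    local flag=$1 word out=""
    shift
    for word in "$@"; do
        [ "$word" = "$flag" ] || out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
}

# CFLAGS is left unquoted on purpose so it splits into one word per flag;
# ${CFLAGS-} expands to nothing if the variable happens to be unset.
CFLAGS=$(drop_flag -combine ${CFLAGS-})
CFLAGS=$(drop_flag -freorder-blocks-and-partition ${CFLAGS-})
```

Matching whole words rather than substrings keeps the helper from mangling a longer flag that merely contains the one being removed.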

With quad-core you will likely be interested in upping your MAKEOPTS job 
count as well.  Just be aware that it too can cause issues at times.  
Again, however, it's easily worked around per-package as you come across 
them using the env/* files to set MAKEOPTS=-j1 or whatever.
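
A minimal sketch of what that could look like follows.  The -j5 value is an assumption (the common cores-plus-one rule of thumb for four cores), not a tested recommendation, and the per-package path is a placeholder.

```shell
# make.conf fragment.
# -j5 is an assumed starting point for four cores (cores + 1); tune to taste.
MAKEOPTS="-j5"

# Hypothetical per-package override, placed in
# /etc/portage/env/<category>/<package>, for a build that breaks
# with parallel make:
# MAKEOPTS="-j1"
```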

Since you mentioned running ~arch, and assuming your PM is still portage, 
you may also want to take a look at emerge's --jobs and --load-average 
options, for parallel emerges, if you haven't already.  If you 
use them you'll probably find --keep-going useful as well, so it doesn't 
stop just because one of the parallel merges failed.
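
One way to make those options the default is to set them in make.conf, roughly as below.  The job count and load limit are illustrative assumptions for a quad-core box, and this presumes a portage version recent enough to have the options at all.

```shell
# make.conf fragment; the numbers are illustrative assumptions, not tuned values.
EMERGE_DEFAULT_OPTS="--jobs=3 --load-average=4.0 --keep-going"

# Equivalent one-off invocation (the update target is only an example):
# emerge --jobs=3 --load-average=4.0 --keep-going --update --deep world
```

The load-average cap is what keeps parallel emerges, each running its own MAKEOPTS jobs, from piling up far more processes than the machine has cores.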

Finally, if you haven't already, consider pointing PORTAGE_TMPDIR at a 
tmpfs.  With 4 gig memory it should speed things up dramatically, and the 
worst-case is that it uses swap, sending to disk what would be 100% 
guaranteed to go to disk if you had PORTAGE_TMPDIR on disk.
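
A sketch of one possible setup is below.  The mount point and the size are assumptions chosen for illustration; large packages may need more room than this, in which case the tmpfs simply spills into swap.

```shell
# /etc/fstab line, shown as a comment because it is not shell syntax.
# Mount point and size are assumptions:
# tmpfs   /tmp   tmpfs   size=4G,mode=1777   0 0

# make.conf fragment: portage then builds under $PORTAGE_TMPDIR/portage.
PORTAGE_TMPDIR="/tmp"
```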

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

