On 11/04/2010 12:27, Mick wrote:
On Sunday 11 April 2010 11:43:26 zeera...@gmail.com wrote:
On Sun, Apr 11, 2010 at 03:20:50AM +0100, Kerin Millar wrote:
On 10/04/2010 23:06, luis jure wrote:
hello list,

after many years without a hardware upgrade, i'll be receiving my new
computer next week: intel i7 920 cpu, 6 GB ram, asus p6t mobo.

i'm pretty excited, i imagine that at first i'll be shocked at the
difference with the ancient machine i'm using now.

now my question: searching a bit for the best compilation flags for
this processor, i found this at gentoo-wiki:

CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -msse4 -mcx16 -msahf -O2 -pipe"
CXXFLAGS="${CFLAGS}"
(http://en.gentoo-wiki.com/wiki/Safe_Cflags/Intel)

on the other hand, a thread at http://forums.gentoo.org says that the
wiki page is outdated, and that -march=native should do the job without
any further tweaks like -msse4 etc.

That is correct; -march=native will indeed do the job. The CFLAGS
example you cite is clearly an interpretation of the flags that the
native target would result in anyway.

With respect to my Intel Xeon E3113, -march=native appears to equate to:

-march=core2 -mtune=core2 -msahf -msse4.1 --param l1-cache-size=32
--param l1-cache-line-size=64

In short, use "native" and let the compiler take care of the details.
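
To see what "native" resolves to on a particular machine, the options that the gcc driver hands to cc1 can be filtered out of its -### output. The following is only a rough sketch: the function name extract_target_flags is made up for illustration, and it assumes that no option on the cc1 line contains spaces.

```shell
#!/bin/sh
# Hypothetical helper: read the output of "gcc -###" on stdin and print the
# target-related options passed to cc1 (-m... flags and --param pairs),
# one per line. Assumes that no argument contains embedded spaces.
extract_target_flags() {
    grep -E '/cc1"? ' | tr ' ' '\n' | tr -d '"' | awk '
        prev == "--param" { print "--param " $0; prev = ""; next }
        $0 == "--param"   { prev = $0; next }
        /^-m/             { print }
    '
}

# Usage with a real compiler (requires gcc on the PATH):
#   gcc -### -march=native -E /usr/include/stdlib.h 2>&1 | extract_target_flags
```

The list it prints is the compiler's own interpretation of "native" on that machine, so it can be compared directly against a hand-written set of CFLAGS.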

Cheers,

--Kerin

There's a thread in Installing Gentoo where a dev (can't remember which)
  says that native isn't the best option, and that the best option is to
  specify your arch. See these threads:
  http://forums.gentoo.org/viewtopic-t-821639.html
http://forums.gentoo.org/viewtopic-t-821370.html

OK, but:

$ gcc -### -march=native -E /usr/include/stdlib.h 2>&1 | grep "/usr/libexec/gcc/.*cc1"
  "/usr/libexec/gcc/x86_64-pc-linux-gnu/4.3.4/cc1" "-E" "-quiet"
"/usr/include/stdlib.h" "-D_FORTIFY_SOURCE=2" "-march=core2" "-mcx16" "-msahf"
"--param" "l1-cache-size=32" "--param" "l1-cache-line-size=64" "-mtune=core2"

the above shows that it uses smaller cache sizes than what my cpu has
according to lshw:

Hmm. Well, as far as I'm aware, Nehalem - like my Wolfdale-based processor - has 64KB of L1 cache per core, with 32KB serving as an instruction cache and 32KB serving as a data cache. My tentative supposition would be that gcc is taking into account the size of the instruction cache. As for the cache line size, that's measured in bytes, and is indeed 64B on the majority of (if not all) x86 processors, Nehalem included.

However, the result you're getting from lshw does seem somewhat contradictory. gcc uses a cpuid instruction to determine the appropriate values, but you might also like to check using sysfs:

$ paste <(cat /sys/devices/system/cpu/cpu0/cache/index?/type) <(cat /sys/devices/system/cpu/cpu0/cache/index?/size) | sed -re 's/\W+/: /'

On my system that results in the following:

Data: 32K
Instruction: 32K
Unified: 6144K
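
A slightly longer variant of the same check also prints the cache level and line size, which are the two values that gcc's --param options refer to. This is a sketch only: the function name print_cache_info and its optional directory argument are invented for illustration. The files it reads (level, type, size and coherency_line_size) are the standard sysfs cache attributes, though not every kernel or virtual machine exposes them.

```shell
#!/bin/sh
# Hypothetical helper: print level, type, size and line size for each cache
# that the kernel reports for one CPU. The optional argument overrides the
# sysfs directory, which defaults to the caches of cpu0.
print_cache_info() {
    cache_dir="${1:-/sys/devices/system/cpu/cpu0/cache}"
    for idx in "$cache_dir"/index*; do
        # Skip quietly when the glob matched nothing or the entry is unreadable.
        [ -r "$idx/type" ] || continue
        printf 'L%s %s: %s (line size %s bytes)\n' \
            "$(cat "$idx/level")" \
            "$(cat "$idx/type")" \
            "$(cat "$idx/size")" \
            "$(cat "$idx/coherency_line_size")"
    done
}
```

Calling print_cache_info with no argument reads the values for cpu0. The size and line size reported for the L1 caches are the figures to compare against the l1-cache-size and l1-cache-line-size parameters in the gcc output.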

Perhaps I should stick with -march=core2 and additionally add "--param"
with the L0, L1, L2 cache sizes?

I would suggest leaving it alone, at least until you have raised it with a gcc developer or someone with a formal understanding of CPU architecture.

Cheers,

--Kerin

