On 11/04/2010 12:27, Mick wrote:
On Sunday 11 April 2010 11:43:26 zeera...@gmail.com wrote:
On Sun, Apr 11, 2010 at 03:20:50AM +0100, Kerin Millar wrote:
On 10/04/2010 23:06, luis jure wrote:
hello list,
after many years without a hardware upgrade, i'll be receiving my new
computer next week: intel i7 920 cpu, 6 GB ram, asus p6t mobo.
i'm pretty excited, i imagine that at first i'll be shocked at the
difference with the ancient machine i'm using now.
now my question: searching a bit for the best compilation flags for
this processor, i found this at gentoo-wiki:
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -msse4 -mcx16 -msahf -O2 -pipe"
CXXFLAGS="${CFLAGS}"
(http://en.gentoo-wiki.com/wiki/Safe_Cflags/Intel)
on the other hand, a thread at http://forums.gentoo.org says that the
wiki page is outdated, and that -march=native should do the job without
any further tweaks like -msse4 etc.
That is correct; -march=native will indeed do the job. The CFLAGS
example you cite is clearly an interpretation of the flags that the
native target would result in anyway.
With respect to my Intel Xeon E3113, -march=native appears to equate to:
-march=core2 -mtune=core2 -msahf -msse4.1 --param l1-cache-size=32
--param l1-cache-line-size=64
In short, use "native" and let the compiler take care of the details.
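For anyone who wants to check what "native" resolves to on their own box, here is a minimal Python sketch. It assumes the gcc driver prints the cc1 command line on stderr when run with -###, as in the grep shown elsewhere in this thread. The names cc1_target_flags, native_flags and SAMPLE are made up for illustration, and SAMPLE is simply the cc1 line posted in this thread joined back onto one line.

```python
import shlex
import subprocess

# cc1 line as posted in this thread (gcc 4.3.4), joined onto a single line.
SAMPLE = (
    '"/usr/libexec/gcc/x86_64-pc-linux-gnu/4.3.4/cc1" "-E" "-quiet" '
    '"/usr/include/stdlib.h" "-D_FORTIFY_SOURCE=2" "-march=core2" "-mcx16" '
    '"-msahf" "--param" "l1-cache-size=32" "--param" "l1-cache-line-size=64" '
    '"-mtune=core2"'
)


def cc1_target_flags(driver_output):
    """Return the -m* and --param options found on the first cc1 line."""
    for line in driver_output.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:
            # Skip lines with unbalanced quotes; they cannot be the cc1 line.
            continue
        if not tokens or not tokens[0].endswith("cc1"):
            continue
        flags = []
        rest = iter(tokens[1:])
        for tok in rest:
            if tok == "--param":
                # --param takes its value as the following argument.
                flags.append("--param " + next(rest, ""))
            elif tok.startswith("-m"):
                flags.append(tok)
        return flags
    return []


def native_flags(gcc="gcc"):
    """Run the driver in dry-run mode (-###) and parse what it prints."""
    cmd = [gcc, "-###", "-march=native", "-E", "/usr/include/stdlib.h"]
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.PIPE, text=True)
    return cc1_target_flags(proc.stderr)
```

Calling cc1_target_flags(SAMPLE) pulls out just the target-related options from the posted line; native_flags() does the same against the local compiler, so the result will naturally differ per CPU and gcc version.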
Cheers,
--Kerin
There's a thread in Installing Gentoo where a dev (can't remember which)
says that native isn't the best option, and that the best option is
indeed to specify your arch. See these threads:
http://forums.gentoo.org/viewtopic-t-821639.html
http://forums.gentoo.org/viewtopic-t-821370.html
OK, but:
$ gcc -### -march=native -E /usr/include/stdlib.h 2>&1 | grep "/usr/libexec/gcc/.*cc1"
"/usr/libexec/gcc/x86_64-pc-linux-gnu/4.3.4/cc1" "-E" "-quiet"
"/usr/include/stdlib.h" "-D_FORTIFY_SOURCE=2" "-march=core2" "-mcx16" "-msahf"
"--param" "l1-cache-size=32" "--param" "l1-cache-line-size=64" "-mtune=core2"
the above shows that it uses smaller cache sizes than what my cpu has
according to lshw:
Hmm. Well, as far as I'm aware, Nehalem - like my Wolfdale-based
processor - has 64KB of L1 cache per core, with 32KB serving as an
instruction cache and 32KB serving as a data cache. My tentative
supposition would be that gcc is taking into account the size of the
instruction cache. As for the cache line size, that's measured in bytes,
and is indeed 64B on the majority of (if not all) x86 processors,
Nehalem included.
However, the result you're getting from lshw does seem somewhat
contradictory. gcc uses a cpuid instruction to determine the appropriate
values, but you might also like to check using sysfs:
$ paste <(cat /sys/devices/system/cpu/cpu0/cache/index?/type) \
    <(cat /sys/devices/system/cpu/cpu0/cache/index?/size) | sed -re 's/\W+/: /'
On my system that results in the following:
Data: 32K
Instruction: 32K
Unified: 6144K
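The same sysfs lookup can be sketched in Python, in case the values need to be fed into a script rather than read by eye. This assumes the usual cache/index*/ layout under the cpu0 directory, with one-line "level", "type" and "size" files in each; the function name read_cache_info and its cpu_dir parameter are hypothetical, chosen only for this example.

```python
import glob
import os


def read_cache_info(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return a list of (level, type, size) strings, one per cache index."""
    caches = []
    pattern = os.path.join(cpu_dir, "cache", "index*")
    for index_dir in sorted(glob.glob(pattern)):
        entry = []
        for name in ("level", "type", "size"):
            # Each attribute is a small text file holding a single value.
            with open(os.path.join(index_dir, name)) as handle:
                entry.append(handle.read().strip())
        caches.append(tuple(entry))
    return caches


def format_cache_info(caches):
    """Render entries in the same 'Type: size' style as the paste/sed output."""
    return ["%s: %s" % (ctype, size) for _level, ctype, size in caches]
```

Printing "\n".join(format_cache_info(read_cache_info())) should give the same kind of listing as the one-liner, with the cache level available as well if you want to tell L1 from L2.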
Perhaps I should stick with -march=core2 and additionally add "--param"
options for the L0, L1, L2 cache sizes?
I would suggest leaving it alone, at least until you've raised it with
a gcc developer or someone with a formal understanding of CPU architecture.
Cheers,
--Kerin