Re: [Cooker] compiler flags

2000-08-25 Thread Guillaume Cottenceau

Karl Mitchell [EMAIL PROTECTED] writes:

 It's possibly also worth noting that Athlon/Duron chips optimise best on -O1 at
 present under gcc-2.95. I don't know why. -Karl

Better with -O1 than with -O2 or higher!??


-- 
Guillaume Cottenceau -- Distribution Developer for MandrakeSoft
http://www.mandrakesoft.com/~gc/




Re: [Cooker] compiler flags

2000-08-25 Thread Guillaume Cottenceau

Antony Suter [EMAIL PROTECTED] writes:

  Guillaume Cottenceau wrote:
  
   till, you should use:
  
   export CXXFLAGS="-O3 -fomit-frame-pointer -fno-exceptions -fno-rtti -pipe -s -mpentium -mcpu=pentium -march=pentium -ffast-math -fexpensive-optimizations"
 
 I think that this is in error. I understand from gcc 2.95 documentation that
 only one of the flags -m, -mcpu and -march need be specified.
 
 My understanding is that:-
 "-mcpu=pentium" specifies to optimise for the pentium but allow to run on
 other cpus.
 "-march=pentium" specifies to optimise fully for the pentium, dropping
 compatibility for other cpus.

Actually, the basic flags for mdk rpms are:

-march=pentium -mcpu=pentiumpro

This is supposed to:

. use Pentium opcodes
. schedule for PentiumPro

titi wanted this because it should optimize well for P-2 and P-3, since
their architecture is close to the PPro but far from the Pentium.


However, note that I had to use -march=pentium for some C++ code such as
gtkmm and clanlib, because a gcc bug prevents successful linking with
that parameter.
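
For illustration, the opcode/scheduling split described above could be written like this in a build script. This is only a sketch: `BASE_FLAGS` and `ARCH_FLAGS` are made-up variable names, and the `-O2 -pipe` base is an assumed placeholder rather than the actual Mandrake default.

```shell
# Sketch only: BASE_FLAGS is an assumed placeholder, not the real mdk default.
BASE_FLAGS="-O2 -pipe"

# -march selects the instruction set (Pentium opcodes);
# -mcpu selects the scheduling model (PentiumPro), as described above.
ARCH_FLAGS="-march=pentium -mcpu=pentiumpro"

CFLAGS="$BASE_FLAGS $ARCH_FLAGS"
CXXFLAGS="$CFLAGS"
export CFLAGS CXXFLAGS

echo "CFLAGS=$CFLAGS"
```

The `-mcpu` option is placed after `-march` on purpose, on the assumption that a later option takes precedence over whatever scheduling default `-march` implies.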


-- 
Guillaume Cottenceau -- Distribution Developer for MandrakeSoft
http://www.mandrakesoft.com/~gc/




Re: [Cooker] compiler flags

2000-08-25 Thread Karl Mitchell

Guillaume Cottenceau wrote:

 Karl Mitchell [EMAIL PROTECTED] writes:

  It's possibly also worth noting that Athlon/Duron chips optimise best on -O1 at
  present under gcc-2.95. I don't know why. -Karl

 Better with -O1 than with -O2 or higher!??

Faster with -O1 than -O2 or -O3 under almost all criteria for speed. I can't
remember the source though and the figures I've got (below) aren't a good
comparison. If you replace -O1 with -O3 on test B (below) you'll see what I mean.

By the way, I would urge against using gcc -ffast-math, as it can cause actual
errors in floating point routines. I am a numerical modeller and I found for one of
my floating point iterative routines the cumulative error was a few percent in one
case. This could have been more to do with the version of egcs I was using, but I
doubt it. Admittedly this doesn't matter for all programs, but perhaps something
like 'octave' would suffer?

Cheers,

-Karl

A) CFLAGS = -s -static -Wall
B) CFLAGS = -s -static -Wall -O1
C) CFLAGS = -s -static -O3 -fomit-frame-pointer -Wall -mpentiumpro
 -march=pentiumpro -fforce-addr -fforce-mem -malign-loops=2
 -malign-functions=4 -malign-jumps=2 -funroll-loops
 -fexpensive-optimizations -malign-double -fschedule-insns2
 -mwide-multiply

Based on K6-233 Index

    Mem Index   Integer Index   FP Index
A   1.349       1.014           1.824
B   3.605       2.284           5.003
C   3.609       3.245           7.175

--
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-Karl Mitchell=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|   Dept. of Geomatic Engineering   |Secretary, ISPRS WG IV/5 |
| University College London |   "Extraterrestrial Mapping"|
|   Gower Street, LONDON WC1E 6BT   |  +44 (0)20 7504-2744|
 [EMAIL PROTECTED]=-=-==-=-=-=-=-=-=-=-=-

   "Science is like sex: sometimes something useful comes out, but
  that is not the reason we are doing it" -- Richard Feynman







Re: [Cooker] compiler flags

2000-08-25 Thread Guillaume Cottenceau

Karl Mitchell [EMAIL PROTECTED] writes:


[...]

  Better with -O1 than with -O2 or higher!??
 
 Faster with -O1 than -O2 or -O3 under almost all criteria for speed. I can't
 remember the source though and the figures I've got (below) aren't a good
 comparison. If you replace -O1 with -O3 on test B (below) you'll see what I mean.
 
 By the way, I would urge against using gcc -ffast-math, as it can cause actual
 errors in floating point routines. I am a numerical modeller and I found for one of
 my floating point iterative routines the cumulative error was a few percent in one
 case. This could have been more to do with the version of egcs I was using, but I
 doubt it. Admittedly this doesn't matter for all programs, but perhaps something
 like 'octave' would suffer?

I don't know precisely why this was chosen. I think it is because these
are the default parameters, and packagers are supposed to override them
when needed.


[...]

 A) CFLAGS = -s -static -Wall
 B) CFLAGS = -s -static -Wall -O1
 C) CFLAGS = -s -static -O3 -fomit-frame-pointer -Wall -mpentiumpro
  -march=pentiumpro -fforce-addr -fforce-mem -malign-loops=2
  -malign-functions=4 -malign-jumps=2 -funroll-loops
  -fexpensive-optimizations -malign-double -fschedule-insns2
  -mwide-multiply
 
 Based on K6-233 Index
 
     Mem Index   Integer Index   FP Index
 A   1.349       1.014           1.824
 B   3.605       2.284           5.003
 C   3.609       3.245           7.175

I don't understand.

You did not explain the numbers.

I suppose that B is faster than A because there is no optimization for A.
Given this, I interpret your numbers as a "speed index" (as opposed to
"time consumed"), i.e. the greater the faster.

So, for all three measures, C is faster than B.

Why do you say -O1 is faster than -O2, then? Is it something to do with
the other parameters? (And so why did you put other parameters for C?!)
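
To make that reading of the table concrete, here is a small sketch that divides the C indices by the B indices. It assumes, as above, that a higher index means faster; the values are copied from the quoted table and nothing else is measured.

```shell
# Index values copied from the quoted table (higher is assumed to mean faster).
ratios=$(awk '
BEGIN {
    split("3.605 2.284 5.003", b, " ")   # set B: -O1
    split("3.609 3.245 7.175", c, " ")   # set C: -O3 plus the extra flags
    split("Mem Integer FP", name, " ")
    for (i = 1; i <= 3; i++)
        printf "%-8s C/B = %.2f\n", name[i], c[i] / b[i]
}')

echo "$ratios"
```

Under that assumption C is never slower than B: about level on the memory index and roughly 40% ahead on the integer and FP indices. Since C changes many flags at once, the ratios say nothing about -O1 versus -O3 in isolation.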



-- 
Guillaume Cottenceau -- Distribution Developer for MandrakeSoft
http://www.mandrakesoft.com/~gc/




Re: [Cooker] compiler flags

2000-08-25 Thread Giuseppe Ghibo'

Guillaume Cottenceau wrote:
 
 Karl Mitchell [EMAIL PROTECTED] writes:
 
 [...]
 
   Better with -O1 than with -O2 or higher!??
 
  Faster with -O1 than -O2 or -O3 under almost all criteria for speed. I can't
  remember the source though and the figures I've got (below) aren't a good
  comparison. If you replace -O1 with -O3 on test B (below) you'll see what I mean.
 
  By the way, I would urge against using gcc -ffast-math, as it can cause actual
  errors in floating point routines. I am a numerical modeller and I found for one of
  my floating point iterative routines the cumulative error was a few percent in one
  case. This could have been more to do with the version of egcs I was using, but I
  doubt it. Admittedly this doesn't matter for all programs, but perhaps something
  like 'octave' would suffer?
 
 I don't know precisely why this was chosen. I think it is because these
 are the default parameters, and packagers are supposed to override them
 when needed.
 
 [...]
 
  A) CFLAGS = -s -static -Wall
  B) CFLAGS = -s -static -Wall -O1
  C) CFLAGS = -s -static -O3 -fomit-frame-pointer -Wall -mpentiumpro
   -march=pentiumpro -fforce-addr -fforce-mem -malign-loops=2
   -malign-functions=4 -malign-jumps=2 -funroll-loops
   -fexpensive-optimizations -malign-double -fschedule-insns2
   -mwide-multiply
 
  Based on K6-233 Index
 
      Mem Index   Integer Index   FP Index
  A   1.349       1.014           1.824
  B   3.605       2.284           5.003
  C   3.609       3.245           7.175
 
 I don't understand.
 
 You did not explain the numbers.
 
 I suppose that B is faster than A because there is no optimization for A.
 Given this, I interpret your numbers as a "speed index" (as opposed to
 "time consumed"), i.e. the greater the faster.
 
 So, for all three measures, C is faster than B.
 
 Why do you say -O1 is faster than -O2, then? Is it something to do with
 the other parameters? (And so why did you put other parameters for C?!)
 
 --
 Guillaume Cottenceau -- Distribution Developer for MandrakeSoft
 http://www.mandrakesoft.com/~gc/

He is probably referring to an article that appeared on cpureview, showing
that on K7 -O1 sometimes produces better results than -O2 or higher.
Those indices look like the BYTE benchmark indices.

I definitely think we should do a better benchmark for K7 and Duron,
maybe using the ssbench benchmark:

http://nastol.astro.lu.se/~stefans/bench.html 

Volunteers?

I remember that we've already chosen to avoid -ffast-math for mathematical
packages (gnuplot, octave, etc.). Maybe we should make that explicit in
the spec files, something like

CFLAGS="$RPM_OPT_FLAGS -fno-fast-math"?
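
As a sketch of how that line would behave in a spec file's build section: the `RPM_OPT_FLAGS` value below is an assumed example (rpm normally supplies it), and the override relies on gcc letting a later `-fno-...` option cancel an earlier `-f...` one.

```shell
# Assumed example value; in a real spec file rpm supplies RPM_OPT_FLAGS.
RPM_OPT_FLAGS="-O3 -fomit-frame-pointer -pipe -ffast-math"

# Appending -fno-fast-math after the defaults is meant to switch fast-math
# back off for this package without touching the distribution-wide flags.
CFLAGS="$RPM_OPT_FLAGS -fno-fast-math"
export CFLAGS

echo "$CFLAGS"
```

This keeps the rest of the default flags intact, so only packages that care about floating-point accuracy need the extra option.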

Bye.
Giuseppe.




Re: [Cooker] compiler flags

2000-08-24 Thread Francis Galiegue

On Fri, 25 Aug 2000, Antony Suter wrote:

 
 I think that this is in error. I understand from gcc 2.95 documentation that
 only one of the flags -m, -mcpu and -march need be specified.
 
 My understanding is that:-
 "-mcpu=pentium" specifies to optimise for the pentium but allow to run on
 other cpus.
 "-march=pentium" specifies to optimise fully for the pentium, dropping
 compatibility for other cpus.
 

That's it. The cpu= option provides optimal instruction ordering for
Pentiums (note that this ordering differs for PPro and above!) whereas the
arch= option introduces arch-specific instructions.

 I think that it is non optimal to specify both -mcpu and -march at the same
 time.

Well, it is - for Pentiums. Ppros run fine with it too, at least faster than
without any of these. I don't know what the -mpentium is for.

Oh, one last thing: using fast-math is tricky, especially for apps expecting
full IEEE compliance. If at all possible, try NOT to use it.

-- 
Francis Galiegue, [EMAIL PROTECTED]
"Programming is a race between programmers, who try and make more and more
idiot-proof software, and universe, which produces more and more remarkable
idiots. Until now, universe leads the race"  -- R. Cook





Re: [Cooker] compiler flags

2000-08-24 Thread Karl Mitchell

It's possibly also worth noting that Athlon/Duron chips optimise best on -O1 at
present under gcc-2.95. I don't know why. -Karl








Re: [Cooker] compiler flags

2000-08-24 Thread Antony Suter

Francis Galiegue wrote:
 
 On Fri, 25 Aug 2000, Antony Suter wrote:
 
 
  I think that this is in error. I understand from gcc 2.95 documentation that
  only one of the flags -m, -mcpu and -march need be specified.
 
  My understanding is that:-
  "-mcpu=pentium" specifies to optimise for the pentium but allow to run on
  other cpus.
  "-march=pentium" specifies to optimise fully for the pentium, dropping
  compatibility for other cpus.
 
 
 That's it. The cpu= option provides optimal instruction ordering for
 Pentiums (note that this ordering differs for PPro and above!) whereas the
 arch= option introduces arch-specific instructions.
 
  I think that it is non optimal to specify both -mcpu and -march at the same
  time.
 
 Well, it is - for Pentiums. Ppros run fine with it too, at least faster than
 without any of these. I don't know what the -mpentium is for.

I am still not sure what is optimal (when I'm compiling only for my own
machine).
Is it best to specify both -mcpu=pentiumpro and -march=pentiumpro for my
Pentium III machine?

 Oh, one last thing: using fast-math is tricky, especially for apps expecting
 full IEEE compliance. If at all possible, try NOT to use it.

It's a standard option for cross-compiling the kernel in mdk 7.1 (I'm not sure
about cooker).

All the Mandrake kernels I've seen, in source code form, specify all three of
-m, -mcpu and -march. The kernels also specify them multiple times in
different places, along with multiple instances of other flags. So the
kernel ends up with long, redundant option strings on the gcc command line.
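
One way to tidy such a string, purely as a sketch: the helper below (`dedup_flags` is a made-up name, not something from the kernel build) drops exact duplicate options and keeps the last occurrence of each, so that any "last option wins" behaviour is preserved. It assumes every option is a single word; options that take a separate argument would need more care.

```shell
# Hypothetical helper: remove exact duplicate options, keeping the LAST
# occurrence of each so the relative order of the surviving flags is unchanged.
dedup_flags() {
    printf '%s\n' "$@" | awk '
        { flag[NR] = $0; last[$0] = NR }
        END {
            out = ""
            for (i = 1; i <= NR; i++)
                if (last[flag[i]] == i)
                    out = out (out == "" ? "" : " ") flag[i]
            print out
        }'
}

dedup_flags -O2 -mpentium -O2 -pipe -mpentium
# prints: -O2 -pipe -mpentium
```

Keeping the last occurrence rather than the first is a deliberate choice: with something like `-O1 -O2 -O1`, the result still ends in `-O1`, which matches what the original command line would have selected.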

--
- Antony Suter  ([EMAIL PROTECTED])  "Examiner"  openpgp:71ADFC87
- "And how do you store the nuclear equivalent of the universal solvent?"