Re: Why not 03 ?

2014-06-02 Thread Henrique de Moraes Holschuh
On Mon, 02 Jun 2014, Xavier Roche wrote:
> On Mon, Jun 02, 2014 at 10:36:01AM -0300, Henrique de Moraes Holschuh wrote:
> > As long as you have a way to regression-test.  And I don't mean performance
> > regressions, either.  Although issues with -O3 are rare, they're not unheard
> > of.
> 
> Looking at the `man gcc' page, I fail to see, outside compiler bugs, what 
> could cause issues at O3 vs. O2.

It is a moving target.

For GCC 4.9:
-O3 turns on all optimizations specified by -O2 and also turns on the:
-finline-functions, -funswitch-loops, -fpredictive-commoning,
-fgcse-after-reload, -ftree-loop-vectorize, -ftree-slp-vectorize,
-fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options.

For GCC 4.7:
-O3 turns on all optimizations specified by -O2 and also turns on the:
-finline-functions, -funswitch-loops, -fpredictive-commoning,
-fgcse-after-reload, -ftree-vectorize and -fipa-cp-clone options.

And compiler bugs _are_ an issue.  Which is why testing done at one
optimization level is not strictly valid for other optimization levels.  You
can ignore this, but it may eventually bite you (especially on less common
arches and optimization modes).

> I have the feeling that most "dangerous" (ie. breaking dirty code, or code 
> using non-specified C behavior) features are already on O2:
>   * -fstrict-aliasing (code aliasing the same pointer with a different type)
>   * -fstrict-overflow (signed arithmetic overflow being undefined)

Optimizations related to memory reuse are _always_ a bit dangerous to enable
blindly on anything that has complex signal handlers, needs to be able to
"secure clobber" memory, implements multithreading synchronization directly,
etc.
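
As an illustration of the "secure clobber" case, here is a minimal sketch.
The names secure_clear and use_key are hypothetical, made up for this
example.  The assumption is that a plain memset() on a buffer that is never
read again may be dropped as a dead store, while stores through a
volatile-qualified pointer are kept in practice by GCC and Clang.  Where
available, explicit_bzero() or memset_s() are the more robust choice.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero a buffer through a volatile pointer so the
 * stores are treated as observable and are not removed as dead stores. */
void secure_clear(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}

/* Hypothetical caller: fills a local "key", uses it, then wipes it.
 * Returns a checksum of the key bytes so the use is visible. */
unsigned use_key(void)
{
    unsigned char key[16];
    unsigned sum = 0;
    size_t i;

    memset(key, 0xAA, sizeof key);      /* stand-in for real key material */
    for (i = 0; i < sizeof key; i++)
        sum += key[i];

    /* A plain memset(key, 0, sizeof key) here could legally be optimized
     * away, because key is never read again before it goes out of scope. */
    secure_clear(key, sizeof key);
    return sum;
}
```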

> Outside architecture issues, such as "will produce bytecode unsupported by
> old processors", what typical optimizations can harm us at O3 ?

All of the optimizations in -O3 are known to be harmful[1] in certain
situations, otherwise they'd be in the -O2 set in the first place.

Fortunately, only compiler bugs and code with undefined behaviour will cause
incorrect program behaviour [unrelated to performance] when you change among
-Os, -Og, -O0, -O1, -O2 and -O3.


[1] supposedly they are only possibly harmful to performance.  When they
cause miscompilation, it is a bug.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20140602174106.ga2...@khazad-dum.debian.net



Re: Why not 03 ?

2014-06-02 Thread Xavier Roche
On Mon, Jun 02, 2014 at 10:36:01AM -0300, Henrique de Moraes Holschuh wrote:
> As long as you have a way to regression-test.  And I don't mean performance
> regressions, either.  Although issues with -O3 are rare, they're not unheard
> of.

Looking at the `man gcc' page, I fail to see, outside compiler bugs, what could 
cause issues at O3 vs. O2.

I have the feeling that most "dangerous" (ie. breaking dirty code, or code 
using non-specified C behavior) features are already on O2:
  * -fstrict-aliasing (code aliasing the same pointer with a different type)
  * -fstrict-overflow (signed arithmetic overflow being undefined)
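
To make the two items above concrete, here is a minimal sketch of code that
stays well defined under both options.  The function names are hypothetical,
and float_bits assumes that float is a 32-bit IEEE 754 type.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Type punning through memcpy is well defined.  The tempting
 * *(uint32_t *)&f form breaks the aliasing rules that
 * -fstrict-aliasing lets the compiler rely on.
 * Assumption: float is a 32-bit IEEE 754 type. */
uint32_t float_bits(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    return u;
}

/* Overflow test that never overflows itself.  A check written as
 * "a + b < a" (for positive b) is undefined for signed int, so the
 * compiler is allowed to assume it is always false. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

Both functions behave the same at every optimization level, which is the
property that code relying on undefined behaviour does not have.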

Outside architecture issues, such as "will produce bytecode unsupported by old 
processors", what typical optimizations can harm us at O3 ?


Archive: https://lists.debian.org/20140602165606.GA14725@proliant.localnet



Re: Why not 03 ?

2014-06-02 Thread Henrique de Moraes Holschuh
On Mon, 02 Jun 2014, Thomas Goirand wrote:
> On 06/02/2014 05:07 AM, Julien Cristau wrote:
> > For a lot of scientific packages, the upstream authors don't know what
> > they're doing. So I'm not sure that's much of an argument.
> 
> [citation needed]
> 
> Also, it's easy to just play with the -O option and see what's faster.

As long as you have a way to regression-test.  And I don't mean performance
regressions, either.  Although issues with -O3 are rare, they're not unheard
of.  And all bets are off when the C code has undefined behavior.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Archive: https://lists.debian.org/20140602133601.ga25...@khazad-dum.debian.net



Re: Why not 03 ?

2014-06-02 Thread Salvo Tomaselli

> What do we lose if we follow upstream's compiler options ?  As noted, the
> program may fail to build on other architectures than amd64.  I do not think
> that the unavailability of such non-core packages on other architectures is
> a problem (no user base), 
No, the problem is that it would compile just fine and then refuse to run, 
leading to (justified) bug reports.


-- 
Salvo Tomaselli

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason and intellect has intended us to forgo their use."
-- Galileo Galilei

http://ltworf.github.io/ltworf/


Archive: https://lists.debian.org/1652010.TyemNprM1Y@hal9000



Re: Why not 03 ?

2014-06-02 Thread Thomas Goirand
On 06/02/2014 05:07 AM, Julien Cristau wrote:
> For a lot of scientific packages, the upstream authors don't know what
> they're doing. So I'm not sure that's much of an argument.

[citation needed]

Also, it's easy to just play with the -O option and see what's faster.
So IMO, it's a package maintainer decision, and I can't see where
exactly we have an issue.

Thomas


Archive: https://lists.debian.org/538c2173.5070...@debian.org



Re: Why not 03 ?

2014-06-01 Thread Charles Plessy
On Sun, Jun 01, 2014 at 11:07:32PM +0200, Julien Cristau wrote:
> On Fri, May 30, 2014 at 07:21:08 +0900, Charles Plessy wrote:
> 
> > Perhaps we can stop overriding this option ?  For a lot of scientific
> > packages, -O3 is chosen by the upstream author, and I always feel bad
> > that if we make the programs slower by overriding it to -O2, it will
> > reflect poorly on Debian as a distribution for scientific works.
> > 
> For a lot of scientific packages, the upstream authors don't know what
> they're doing.  So I'm not sure that's much of an argument.

I think that such generalisations make Debian an unwelcoming place.  Also,
let's remember how this attitude backfires: Debian is also seen as a
distribution that breaks upstream software by carrying a higher than average
quantity of patches that the package maintainer doesn't fully understand.

What do we lose if we follow upstream's compiler options ?  As noted, the
program may fail to build on other architectures than amd64.  I do not think
that the unavailability of such non-core packages on other architectures is a
problem (no user base), and if they are a distraction to the porters, let's
restrict the build to amd64 more systematically: less work for everybody.

What we gain if we follow upstream's compiler options is that we will
distribute software that is closer to what the users run when they compile it
themselves.  This is the principle of least surprise, and I do not see a reason
to deviate from it systematically, hence DEB_CFLAGS_MAINT_APPEND should be used
to override Upstream's defaults if needed, rather than the reverse.

Cheers,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan


Archive: https://lists.debian.org/20140601231427.ga4...@falafel.plessy.net



Re: Why not 03 ?

2014-06-01 Thread Julien Cristau
On Fri, May 30, 2014 at 07:21:08 +0900, Charles Plessy wrote:

> Perhaps we can stop overriding this option ?  For a lot of scientific
> packages, -O3 is chosen by the upstream author, and I always feel bad
> that if we make the programs slower by overriding it to -O2, it will
> reflect poorly on Debian as a distribution for scientific works.
> 
For a lot of scientific packages, the upstream authors don't know what
they're doing.  So I'm not sure that's much of an argument.

Cheers,
Julien




Re: Why not 03 ?

2014-06-01 Thread Bernhard R. Link
* Julian Taylor  [140601 14:29]:
> I would not go into detail about O2 or O3 in the policy.
> The meaning of these flags is very compiler specific. E.g. clang will
> enable vectorization already at O2 and adds almost no extra passes with O3.
>
> I think it would be better to simply state:
> If the upstream optimization options differ from the ones of the default
> debian toolchain it is recommended to override the debian defaults to
> match the ones upstream uses during packaging.
> Upstream usually has chosen particular options for a reason, they know
> their software best.

I think one of the examples here was scientific software. Assuming
"upstream knows what they do" is very unlikely to be true there.

I'd rather argue for a "unless you know what you are doing, use -O2", which
is almost the current state. (I'd rather argue that currently too much
software uses something different from -O2 for no good and too often bad
reasons).

Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383  C3DA AFFC 96D1 151D FFDC


Archive: https://lists.debian.org/20140601124250.gb2...@client.brlink.eu



Re: Why not 03 ?

2014-06-01 Thread Julian Taylor
On 01.06.2014 05:39, Steve Langasek wrote:
> On Sun, Jun 01, 2014 at 12:37:18AM +0200, Tollef Fog Heen wrote:
>> ]] Steve Langasek 
> 
>>> FWIW, the recent port of Ubuntu to ppc64el uses -O3 as the default, because
>>> IBM has broad experience in resolving performance issues for their own
>>> hardware and have found that -O3 gives an overall better experience for
>>> their customers.  It will be difficult for Debian to gather the same kind of
>>> information across all its architectures, but we shouldn't conclude, just
>>> because it's difficult to know the right answer, that -O2 is definitely the
>>> right answer.
> 
>> It sounds like we want to stop recommending any particular level in
>> Policy and just let the architecture toolchain default to the
>> recommended value for that architecture, and only override when there's
>> a need.
> 
> It seems that I believed the policy language on this to be much stronger
> than it actually is.  Looking at policy, I see:
> 
>  By default, when a package is being built, any binaries created should
>  include debugging information, as well as being compiled with
>  optimization.
> 
> It then presents CFLAGS = -O2 [...] as an example, but apparently this is
> only an example.
> 
> Still, I think we're better off improving the policy language to explain
> when we think -O3 should be used instead of -O2, and when it should not,
> rather than having a free-for-all in the archive.  Even to make this change
> on a per-architecture basis warrants more extensive profiling than porters
> are probably prepared to do; I certainly don't want maintainers to override
> it "when there's a need" without the project providing some guidance on what
> constitutes sufficient "need".
> 

I would not go into detail about O2 or O3 in the policy.
The meaning of these flags is very compiler specific. E.g. clang will
enable vectorization already at O2 and adds almost no extra passes with O3.

I think it would be better to simply state:
If the upstream optimization options differ from the ones of the default
debian toolchain it is recommended to override the debian defaults to
match the ones upstream uses during packaging.
Upstream usually has chosen particular options for a reason, they know
their software best.


Archive: https://lists.debian.org/538b1ca8.9000...@googlemail.com



Re: Why not 03 ?

2014-05-31 Thread Steve Langasek
On Sun, Jun 01, 2014 at 12:37:18AM +0200, Tollef Fog Heen wrote:
> ]] Steve Langasek 

> > FWIW, the recent port of Ubuntu to ppc64el uses -O3 as the default, because
> > IBM has broad experience in resolving performance issues for their own
> > hardware and have found that -O3 gives an overall better experience for
> > their customers.  It will be difficult for Debian to gather the same kind of
> > information across all its architectures, but we shouldn't conclude, just
> > because it's difficult to know the right answer, that -O2 is definitely the
> > right answer.

> It sounds like we want to stop recommending any particular level in
> Policy and just let the architecture toolchain default to the
> recommended value for that architecture, and only override when there's
> a need.

It seems that I believed the policy language on this to be much stronger
than it actually is.  Looking at policy, I see:

 By default, when a package is being built, any binaries created should
 include debugging information, as well as being compiled with
 optimization.

It then presents CFLAGS = -O2 [...] as an example, but apparently this is
only an example.

Still, I think we're better off improving the policy language to explain
when we think -O3 should be used instead of -O2, and when it should not,
rather than having a free-for-all in the archive.  Even to make this change
on a per-architecture basis warrants more extensive profiling than porters
are probably prepared to do; I certainly don't want maintainers to override
it "when there's a need" without the project providing some guidance on what
constitutes sufficient "need".

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: Why not 03 ?

2014-05-31 Thread Henrique de Moraes Holschuh
On Sun, 01 Jun 2014, Tollef Fog Heen wrote:
> > FWIW, the recent port of Ubuntu to ppc64el uses -O3 as the default, because
> > IBM has broad experience in resolving performance issues for their own
> > hardware and have found that -O3 gives an overall better experience for
> > their customers.  It will be difficult for Debian to gather the same kind of
> > information across all its architectures, but we shouldn't conclude, just
> > because it's difficult to know the right answer, that -O2 is definitely the
> > right answer.
> 
> It sounds like we want to stop recommending any particular level in
> Policy and just let the architecture toolchain default to the
> recommended value for that architecture, and only override when there's
> a need.

No.  People mess with this rather cluelessly.  As long as it is not "MUST
use -Ofoo", we really should say something in policy.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Archive: https://lists.debian.org/20140531230024.ga8...@khazad-dum.debian.net



Re: Why not 03 ?

2014-05-31 Thread Tollef Fog Heen
]] Steve Langasek 

> FWIW, the recent port of Ubuntu to ppc64el uses -O3 as the default, because
> IBM has broad experience in resolving performance issues for their own
> hardware and have found that -O3 gives an overall better experience for
> their customers.  It will be difficult for Debian to gather the same kind of
> information across all its architectures, but we shouldn't conclude, just
> because it's difficult to know the right answer, that -O2 is definitely the
> right answer.

It sounds like we want to stop recommending any particular level in
Policy and just let the architecture toolchain default to the
recommended value for that architecture, and only override when there's
a need.

-- 
Tollef Fog Heen
UNIX is user friendly, it's just picky about who its friends are


Archive: https://lists.debian.org/m2k391lgm9@rahvafeir.err.no



Re: Why not 03 ?

2014-05-31 Thread Moritz Mühlenhoff
Charles Plessy  wrote:
> Perhaps we can stop overriding this option ?  For a lot of scientific
> packages, -O3 is chosen by the upstream author, and I always feel bad
> that if we make the programs slower by overriding it to -O2, it will
> reflect poorly on Debian as a distribution for scientific works.

You already can by adding this to your rules file:

export DEB_CFLAGS_MAINT_APPEND  = -O3

Cheers,
Moritz


Archive: https://lists.debian.org/slrnloj3c0.2jo@inutil.org



Re: Why not 03 ?

2014-05-30 Thread James Cloos
> "SL" == Steve Langasek  writes:

SL> The current default of -O2 is based on the fact that adding -O3 may give
SL> worse results than -O2.

On x86_64 I've yet to find anything which is slow enough to notice where
moving to O3 helped.

The memory pressure from the larger code segments overwhelms any benefit
from reduced prediction pressure.

They may exist.  But I didn't find any.

I only tested (or could test) on my K10.

-JimC
--
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6


Archive: https://lists.debian.org/m3wqd2eo4w@carbon.jhcloos.org



Re: Why not 03 ?

2014-05-30 Thread James Cloos
> "CP" == Charles Plessy  writes:

CP> Perhaps we can stop overriding this option ?  For a lot of scientific
CP> packages, -O3 is chosen by the upstream author, and I always feel bad
CP> that if we make the programs slower by overriding it to -O2, it will
CP> reflect poorly on Debian as a distribution for scientific works.

On my gentoo box, *everything* was slower with O3.

I ended up with this list, which enables the parts of O3 which do not
unroll too much, and therefore do not bloat the text sections like O3:

  -O2 -fgcse-after-reload -ftree-partial-pre -ftree-vectorize
  -fpredictive-commoning -fvect-cost-model -frename-registers
  -floop-interchange -floop-strip-mine -floop-block

The last three (-floop-*) apply only when the graphite extensions are
compiled into gcc (as I do).  They are no-ops otherwise.

Empirically, on x86_64 at least, that is optimal.

-JimC
--
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6


Archive: https://lists.debian.org/m338fqg2wz@carbon.jhcloos.org



Re: Why not 03 ?

2014-05-30 Thread Julian Taylor
On 30.05.2014 09:40, Xavier Roche wrote:
> On Fri, May 30, 2014 at 11:10:29AM +1000, Russell Stuart wrote:
>> In particular -O3 turns on auto-vectorisation.  It can provide a big
>> speed up to programs that can take advantage of it
> [...]
>> As others have pointed out, -O3 turns on optimisations that help on some
>> architectures and hinder on others.  Vectorisation sort of falls into
>> that category: hinder becomes "fail with a SIGILL".
> 
> On x86-64, AFAICS, you have at least SSE2 and 16 XMM registers, whatever the 
> processor is. Yes, you can not enable AVX(2), but you still can do 
> interesting vector optimizations with the most common x86-64 processor.
> 
> (*) http://en.wikipedia.org/wiki/X86-64
> (*) http://en.wikipedia.org/wiki/Advanced_Vector_Extensions
> 
> 

To be able to make use of the autovectorizer in non-trivial loops you
usually need more options than just O3.
The C standard is very strict with regard to floating point semantics,
e.g. operations are not associative, there may be signaling NaNs, errno may
need to be set, memory can alias etc.
This normally prevents autovectorizers from working without adding extra
flags telling the compiler about special circumstances of a loop.
You can do this via GCC's function attributes, e.g. adding
-funsafe-math-optimizations to a function where the compiler may go
crazy (OpenMP 4.0 also introduces a SIMD pragma for this purpose).
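
The associativity point can be seen with a minimal sketch (the function
names are hypothetical).  A vectorized reduction adds the elements in a
different order than the scalar loop, and with floating point the order can
change the result.  That is why the compiler will not reorder such sums
unless told that it may, for instance with -fassociative-math or -ffast-math.

```c
/* Sum as written in a simple scalar loop: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Same operands, different grouping: a + (b + c).  A vectorizer that
 * accumulates partial sums in separate lanes regroups additions in a
 * comparable way. */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e30, b = -1e30 and c = 1.0, sum_left gives 1.0 while sum_right
gives 0.0, because 1.0 is lost when it is added to -1e30 first.  This
assumes the code is compiled without value-changing math options.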

So enabling O3 by default will most likely not gain us much in most
cases, as applications that profit as a whole from vectorization are not
that common, and if they do they are usually too complex to allow
autovectorization without patches.

Also it would only be effective on amd64, x32, arm64 and ppc64el as
those are the only platforms that have mandatory SIMD instructions.


Archive: https://lists.debian.org/538861e7.9020...@googlemail.com



Re: Why not 03 ?

2014-05-30 Thread Rebecca N. Palmer

> Bottom line: the vectorisation provided by -O3 can provide big speed ups to
> some scientific programs, but it is ineffective on Debian because by
> necessity it tells gcc to compile code for lowest common denominator CPU
> which doesn't have the necessary instructions.


Ineffective on i386, but amd64 always has at least SSE2.

You can turn on -O3 (or -ftree-vectorize if you just want the
vectorization) in a single package with DEB_CFLAGS_MAINT_APPEND and
DEB_CXXFLAGS_MAINT_APPEND:
https://wiki.debian.org/HardeningWalkthrough#My_package_builds_with_optimisation_flags_other_than_-O2.2C_e.g._-Os

However, given previous messages, please first check that your
package actually benefits from it.


There is or was also a "hwcaps" mechanism for having multiple versions 
of a binary for different CPUs, but I've never tried to use it.  For 
pocl (ITP #676504) the speed difference between -march=corei7-avx and 
plain amd64 is about 20%; I haven't measured it on i386, and other 
packages may be very different.



Archive: https://lists.debian.org/53883c5d.6020...@bham.ac.uk



Re: Why not 03 ?

2014-05-30 Thread Xavier Roche
On Fri, May 30, 2014 at 11:10:29AM +1000, Russell Stuart wrote:
> In particular -O3 turns on auto-vectorisation.  It can provide a big
> speed up to programs that can take advantage of it
[...]
> As others have pointed out, -O3 turns on optimisations that help on some
> architectures and hinder on others.  Vectorisation sort of falls into
> that category: hinder becomes "fail with a SIGILL".

On x86-64, AFAICS, you have at least SSE2 and 16 XMM registers, whatever the 
processor is. Yes, you can not enable AVX(2), but you still can do interesting 
vector optimizations with the most common x86-64 processor.

(*) http://en.wikipedia.org/wiki/X86-64
(*) http://en.wikipedia.org/wiki/Advanced_Vector_Extensions


Archive: https://lists.debian.org/20140530074007.GA7927@proliant.localnet



Re: Why not 03 ?

2014-05-29 Thread Steve Langasek
On Fri, May 30, 2014 at 07:36:24AM +0900, Mike Hommey wrote:
> On Thu, May 29, 2014 at 12:21:06PM -0700, Russ Allbery wrote:
> > Xavier Roche  writes:

> > > I have a rather silly question: most (all ?) packages are built by
> > > default with -O2 - something which is inherited from autotools' '-g -O2'
> > > default flags, I presume.

> > > Is -O3 considered too dangerous ? (AFAICS, potential issues are mainly
> > > present in O2) Or is it considered worthless because the performance
> > > gain would be really low ?

> > Historically, -O3 has usually been slower than -O2 for a lot of software
> > because the aggressive loop unrolling increases code size and interferes
> > with processor caching strategies.  I don't know if that's now been fixed
> > in GCC, but that's probably much of the historical reason.

> That's still true.

On which architectures, and for which software?  Maybe this is a problem
only on i386, or only for firefox?

The current default of -O2 is based on the fact that adding -O3 may give
worse results than -O2.  But that's not the same thing as -O2 *always*
giving *better* results than -O3, or giving better results often enough to
warrant it being the default.  There must be a tipping point where it makes
sense in terms of overall efficiency to switch to -O3 as the default
(perhaps on a per-arch basis), and let -O2 be the exception rather than the
rule.

I don't think we have a clear idea of where that tipping point is, because I
don't think anyone has measured this systematically since the -O2 default
was set in policy nearly two decades ago, before amd64 even existed.

FWIW, the recent port of Ubuntu to ppc64el uses -O3 as the default, because
IBM has broad experience in resolving performance issues for their own
hardware and have found that -O3 gives an overall better experience for
their customers.  It will be difficult for Debian to gather the same kind of
information across all its architectures, but we shouldn't conclude, just
because it's difficult to know the right answer, that -O2 is definitely the
right answer.

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: Why not 03 ?

2014-05-29 Thread Russell Stuart
On Fri, 2014-05-30 at 07:21 +0900, Charles Plessy wrote:
> For a lot of scientific packages, -O3 is chosen by the upstream 
> author, and I always feel bad that if we make the programs slower 
> by overriding it to -O2, it will reflect poorly on Debian as a 
> distribution for scientific works.

In particular -O3 turns on auto-vectorisation.  It can provide a big
speed up to programs that can take advantage of it - and yes many
scientific programs fall into that category.  Big as in 300% [0].  So
you are correct in saying not turning it on will make Debian look slow
compared to a system that takes advantage of it.

Unfortunately the instructions needed to get the speed up vary by CPU.
Not only is AMD different from Intel, Intel also turns them on and off
depending on the intended market.

This breaks Debian's "One binary rules them all model" unless the
upstream has gone to extraordinary lengths.  As in providing multiple
compiled versions of the same code path, and choosing the best one at
run time based on CPU model.  Projects that do that generally use hand
crafted assembler, usually inlined in C code.  Note that means they will
run fast without -O3.
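
The run-time selection described above can be sketched as follows.  This is
only an illustration under stated assumptions: the function names are
hypothetical, sum_wide is a stand-in with the same body as the generic
version, and the CPU check uses the __builtin_cpu_supports built-in that GCC
provides on x86.  A real project would build the fast variant with wider
SIMD instructions, for example via a target attribute or hand-written
assembler.

```c
#include <stddef.h>

typedef long (*sum_fn)(const int *, size_t);

/* Baseline version, safe on every CPU of the architecture. */
long sum_generic(const int *v, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Hypothetical stand-in for a variant compiled for AVX2.  It has the
 * same body here so that the sketch runs on any machine. */
long sum_wide(const int *v, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Pick an implementation once, based on what the running CPU supports. */
sum_fn select_sum(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_wide;
#endif
    return sum_generic;
}

long sum_ints(const int *v, size_t n)
{
    return select_sum()(v, n);
}
```

The binary itself still targets the generic CPU.  Only the code path behind
the check needs the newer instructions, so no SIGILL can occur on older
processors.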

As others have pointed out, -O3 turns on optimisations that help on some
architectures and hinder on others.  Vectorisation sort of falls into
that category: hinder becomes "fail with a SIGILL".  That doesn't happen
normally because of another fail safe: even with -O3 gcc only generates
instructions the target CPU can execute [1].  Debian tells gcc to
generate code for a generic CPU.

Bottom line: the vectorisation provided by -O3 can provide big speed ups to
some scientific programs, but it is ineffective on Debian because by
necessity it tells gcc to compile code for lowest common denominator CPU
which doesn't have the necessary instructions.


[0] http://felix.abecassis.me/2012/08/sse-vectorizing-conditional-code/
[1] See the -march option of gcc.  In particular, -march=native.




Re: Why not 03 ?

2014-05-29 Thread Mike Hommey
On Thu, May 29, 2014 at 12:21:06PM -0700, Russ Allbery wrote:
> Xavier Roche  writes:
> 
> > I have a rather silly question: most (all ?) packages are built by
> > default with -O2 - something which is inherited from autotools' '-g -O2'
> > default flags, I presume.
> 
> > Is -O3 considered too dangerous ? (AFAICS, potential issues are mainly
> > present in O2) Or is it considered worthless because the performance
> > gain would be really low ?
> 
> Historically, -O3 has usually been slower than -O2 for a lot of software
> because the aggressive loop unrolling increases code size and interferes
> with processor caching strategies.  I don't know if that's now been fixed
> in GCC, but that's probably much of the historical reason.

That's still true.

Mike


Archive: https://lists.debian.org/20140529223624.gb8...@glandium.org



Re: Why not 03 ?

2014-05-29 Thread Charles Plessy
On Thu, May 29, 2014 at 12:21:06PM -0700, Russ Allbery wrote:
> Xavier Roche  writes:
> 
> > I have a rather silly question: most (all ?) packages are built by
> > default with -O2 - something which is inherited from autotools' '-g -O2'
> > default flags, I presume.
> 
> > Is -O3 considered too dangerous ? (AFAICS, potential issues are mainly
> > present in O2) Or is it considered worthless because the performance
> > gain would be really low ?
> 
> Historically, -O3 has usually been slower than -O2 for a lot of software
> because the aggressive loop unrolling increases code size and interferes
> with processor caching strategies.  I don't know if that's now been fixed
> in GCC, but that's probably much of the historical reason.
> 
> My impression is that most people using GCC use -O2, so it's the
> best-tested path.

Perhaps we can stop overriding this option ?  For a lot of scientific
packages, -O3 is chosen by the upstream author, and I always feel bad
that if we make the programs slower by overriding it to -O2, it will
reflect poorly on Debian as a distribution for scientific works.

Have a nice day,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


Archive: https://lists.debian.org/20140529222108.ga23...@falafel.plessy.net



Re: Why not 03 ?

2014-05-29 Thread Russ Allbery
Xavier Roche  writes:

> I have a rather silly question: most (all ?) packages are built by
> default with -O2 - something which is inherited from autotools' '-g -O2'
> default flags, I presume.

> Is -O3 considered too dangerous ? (AFAICS, potential issues are mainly
> present in O2) Or is it considered worthless because the performance
> gain would be really low ?

Historically, -O3 has usually been slower than -O2 for a lot of software
because the aggressive loop unrolling increases code size and interferes
with processor caching strategies.  I don't know if that's now been fixed
in GCC, but that's probably much of the historical reason.
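
For readers unfamiliar with the transformation, here is a hand-written
sketch of what unrolling does to a loop.  It is not GCC's actual output,
and the function names are hypothetical.  Both functions compute the same
sum, but the second one needs roughly four times the loop body plus a
clean-up loop, which is where the extra code size comes from.

```c
#include <stddef.h>

/* Straightforward loop: one addition and one branch per element. */
long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Same loop unrolled by four: fewer branches per element, but a larger
 * body and an extra loop for the leftover elements. */
long sum_unrolled4(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)          /* remainder when n is not a multiple of 4 */
        s += v[i];
    return s;
}
```

Whether the unrolled form is faster depends on whether the saved branches
outweigh the extra instruction cache pressure, which is what has to be
measured per program and per architecture.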

My impression is that most people using GCC use -O2, so it's the
best-tested path.

-- 
Russ Allbery (r...@debian.org)   


Archive: https://lists.debian.org/87ha48gzml@windlord.stanford.edu



Why not 03 ?

2014-05-29 Thread Xavier Roche
Hi folks,

I have a rather silly question: most (all ?) packages are built by default with 
-O2 - something which is inherited from autotools' '-g -O2' default flags, I 
presume.

Is -O3 considered too dangerous ? (AFAICS, potential issues are mainly present 
in O2) Or is it considered worthless because the performance gain would be 
really low ?


Archive: https://lists.debian.org/20140529183802.GA30349@proliant.localnet