Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Bin.Cheng
On Fri, Feb 15, 2019 at 3:30 AM Steve Ellcey  wrote:
>
> I have a question about SPEC CPU 2017 and what GCC can and cannot do
> with -flto.  As part of some SPEC analysis I am doing I found that with
> -Ofast, ICC and GCC were not that far apart (especially spec int rate,
> spec fp rate was a slightly larger difference).
>
> But when I added -ipo to the ICC command and -flto to the GCC command,
> the difference got larger.  In particular the 519.lbm_r was more than
> twice as fast with ICC and -ipo, but -flto did not help GCC at all.
>
> There are other tests that also show this type of improvement with -ipo
> like 538.imagick_r, 544.nab_r, 525.x264_r, 531.deepsjeng_r, and
> 548.exchange2_r, but none are as dramatic as 519.lbm_r.  Anyone have
> any idea on what ICC is doing that GCC is missing?  Is GCC just not
> aggressive enough with its inlining?

IIRC Jun did some investigation before? CCing.

Thanks,
bin
>
> Steve Ellcey
> sell...@marvell.com


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Jun Ma
On Fri, Feb 15, 2019 at 5:12 PM Bin.Cheng wrote:

> On Fri, Feb 15, 2019 at 3:30 AM Steve Ellcey  wrote:
> >
> > I have a question about SPEC CPU 2017 and what GCC can and cannot do
> > with -flto.  As part of some SPEC analysis I am doing I found that with
> > -Ofast, ICC and GCC were not that far apart (especially spec int rate,
> > spec fp rate was a slightly larger difference).
> >
> > But when I added -ipo to the ICC command and -flto to the GCC command,
> > the difference got larger.  In particular the 519.lbm_r was more than
> > twice as fast with ICC and -ipo, but -flto did not help GCC at all.
> >
> > There are other tests that also show this type of improvement with -ipo
> > like 538.imagick_r, 544.nab_r, 525.x264_r, 531.deepsjeng_r, and
> > 548.exchange2_r, but none are as dramatic as 519.lbm_r.  Anyone have
> > any idea on what ICC is doing that GCC is missing?  Is GCC just not
> > aggressive enough with its inlining?
>
> IIRC Jun did some investigation before? CCing.
>
> Thanks,
> bin
> >
> > Steve Ellcey
> > sell...@marvell.com

ICC does much more than GCC with -ipo, especially memory layout
optimizations. See https://software.intel.com/en-us/node/522667.
ICC is more aggressive about array transposition, structure splitting,
and field reordering. However, these optimizations were removed
from GCC a long time ago.
As for lbm_r, IIRC the most time-consuming part is a loop whose memory
accesses have a stride of 20.  ICC will optimize the array (maybe the
structure?) and vectorize the loop under -ipo.

Thanks
Jun
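
A minimal sketch of the stride problem described above, assuming
(hypothetically) a grid that stores 20 doubles per cell.  The names
FIELDS_PER_CELL, sum_field_aos, sum_field_soa and transpose_aos_to_soa
are illustrative only and are not taken from lbm, GCC or ICC.  It shows
why a loop over one field of every cell has a stride of 20 in the
original layout and becomes unit-stride (and so much easier to
vectorize) once the array is transposed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: 20 values per cell, chosen to match the stride mentioned above. */
enum { FIELDS_PER_CELL = 20 };

/* Array-of-structures layout: field f of cell i is grid[i * 20 + f],
   so walking one field across all cells uses a stride of 20 elements. */
double sum_field_aos(const double *grid, size_t n_cells, size_t field)
{
    assert(field < FIELDS_PER_CELL);
    double sum = 0.0;
    for (size_t i = 0; i < n_cells; i++)
        sum += grid[i * FIELDS_PER_CELL + field];
    return sum;
}

/* Transposed (structure-of-arrays) layout: field f of cell i is
   grid[f * n_cells + i], so the same walk is unit-stride. */
double sum_field_soa(const double *grid, size_t n_cells, size_t field)
{
    assert(field < FIELDS_PER_CELL);
    const double *column = grid + field * n_cells;
    double sum = 0.0;
    for (size_t i = 0; i < n_cells; i++)
        sum += column[i];
    return sum;
}

/* Copy an AoS grid into SoA order; a compiler doing array transposition
   would instead change the layout everywhere the array is used. */
void transpose_aos_to_soa(const double *aos, double *soa, size_t n_cells)
{
    for (size_t i = 0; i < n_cells; i++)
        for (size_t f = 0; f < FIELDS_PER_CELL; f++)
            soa[f * n_cells + i] = aos[i * FIELDS_PER_CELL + f];
}
```

Both sum functions return the same value for the same logical data; only
the memory access pattern differs.  Whether this is the transformation
ICC actually applies to lbm_r is not established in this thread.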


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Hi-Angel
I never understood why field reordering was removed from GCC. I
mean, I know that it's prohibited in C and C++, but surely GCC can
detect whether it could possibly influence application behavior and,
if not, just do the reordering.

The prohibition matters to C/C++ as programming languages, but not to
the machine code generated from them. As long as the app can't
detect, through means defined by C/C++, that its fields were reordered,
field reordering by the compiler is fine, isn't it?



Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Richard Biener
On February 15, 2019 1:45:10 PM GMT+01:00, Hi-Angel wrote:
>I never could understand, why field reordering was removed from GCC?

The implementation simply was seriously broken, bitrotten and unmaintained. 

Richard 




Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Jakub Jelinek
On Fri, Feb 15, 2019 at 02:12:27PM +0100, Richard Biener wrote:
> On February 15, 2019 1:45:10 PM GMT+01:00, Hi-Angel  
> wrote:
> >I never could understand, why field reordering was removed from GCC?
> 
> The implementation simply was seriously broken, bitrotten and unmaintained. 

Which of course doesn't mean somebody else can't submit a new
implementation, as long as it would be properly maintained and would avoid
the issues the old implementation had.  It is just better not to have it if
it causes lots of wrong-code issues and there is nobody to fix those.

Jakub


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Ramana Radhakrishnan
On Fri, Feb 15, 2019 at 1:16 PM Jakub Jelinek  wrote:
>
> On Fri, Feb 15, 2019 at 02:12:27PM +0100, Richard Biener wrote:
> > On February 15, 2019 1:45:10 PM GMT+01:00, Hi-Angel  
> > wrote:
> > >I never could understand, why field reordering was removed from GCC?
> >
> > The implementation simply was seriously broken, bitrotten and unmaintained.
>
> Which of course doesn't mean somebody else can't submit a new
> implementation, as long as it would be properly maintained and would avoid
> the issues the old implementation had.  Just it is better not to have it if
> it causes lots of wrong-code issues and there is nobody to fix those.

I also remember a Cauldron talk about this in the recent past. It was in Prague.
Ah, here's a YouTube video of it:

https://www.youtube.com/watch?v=vhV75sys0Nw



Ramana



>
> Jakub


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Ian Lance Taylor
On Fri, Feb 15, 2019 at 4:46 AM Hi-Angel  wrote:
>
> I never could understand, why field reordering was removed from GCC? I
> mean, I know that it's prohibited in C and C++, but, sure, GCC can
> detect whether it possibly can influence application behavior, and if
> not, just do the reorder.
>
> The veto is important to C/C++ as programming languages, but not to
> machine code that is being generated from them. As long as app can't
> detect that its fields were reordered through means defined by C/C++,
> field reordering by compiler is fine, isn't it?

In my opinion field reordering is very hard for the compiler to do
correctly and trivial for a human programmer to do correctly.  So in
practice the best approach is for the compiler, or some other tool, to
say "you should reorder the fields here."  As far as I can see, the
only real reason to implement field reordering in a compiler is for
benchmark cracking, since benchmarks typically don't let you modify
the source code.  It's not a useful optimization in practice other
than for benchmarks.

(Array transformations and struct splitting, on the other hand, can be useful.)

Ian
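
A minimal sketch of the manual reordering described above, using two
hypothetical structs (record_padded and record_packed are made-up names,
not from any benchmark).  Ordering members from the most strictly
aligned to the least removes the internal padding the first version
needs; the exact sizes depend on the target ABI, so only the inequality
is asserted:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record as first written: small members interleaved with
   large ones, so the compiler must insert padding before value and count. */
struct record_padded {
    char   tag;
    double value;
    char   flag;
    int    count;
};

/* Same members, reordered by hand from largest alignment to smallest. */
struct record_packed {
    double value;
    int    count;
    char   tag;
    char   flag;
};

/* Holds on any ABI: the reordered struct is never larger. */
static_assert(sizeof(struct record_packed) <= sizeof(struct record_padded),
              "reordering by alignment must not grow the struct");

/* Bytes of padding in each layout (total size minus the members' sizes). */
size_t padding_in_padded(void)
{
    return sizeof(struct record_padded)
           - (sizeof(double) + sizeof(int) + 2 * sizeof(char));
}

size_t padding_in_packed(void)
{
    return sizeof(struct record_packed)
           - (sizeof(double) + sizeof(int) + 2 * sizeof(char));
}
```

On a typical x86_64 Linux ABI the first struct occupies 24 bytes and the
second 16, but that is an assumption about the target, not a guarantee.
GCC's -Wpadded warning is one existing way to have the compiler point
out where padding was inserted.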





GCC 8.3 Release Candidate available from gcc.gnu.org

2019-02-15 Thread Jakub Jelinek
The first release candidate for GCC 8.3 is available from

 https://gcc.gnu.org/pub/gcc/snapshots/8.3.0-RC-20190215/
 ftp://gcc.gnu.org/pub/gcc/snapshots/8.3.0-RC-20190215/

and shortly its mirrors.  It has been generated from SVN revision 268935.

I have so far bootstrapped and tested the release candidate on
x86_64-linux and i686-linux.  Please test it and report any issues to
bugzilla.

If all goes well, I'd like to release 8.3 on Friday, February 22nd.


GCC 8.3 Status Report (2019-02-15)

2019-02-15 Thread Jakub Jelinek
Status
======

The GCC 8 branch is now frozen for blocking regressions and documentation
fixes only, all changes to the branch require a RM approval now.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2              193   -  11
P3               29   +   4
P4              163   -   2
P5               24
--------        ---   -----------------------
Total P1-P3     222   -   7
Total           409   -   9


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2019-02/msg00034.html


Re: [EXT] Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Steve Ellcey
On Fri, 2019-02-15 at 17:48 +0800, Jun Ma wrote:
> 
> ICC does much more than GCC with -ipo, especially memory layout
> optimizations. See https://software.intel.com/en-us/node/522667.
> ICC is more aggressive about array transposition, structure splitting,
> and field reordering. However, these optimizations were removed
> from GCC a long time ago.
> As for lbm_r, IIRC the most time-consuming part is a loop whose memory
> accesses have a stride of 20.  ICC will optimize the array (maybe the
> structure?) and vectorize the loop under -ipo.
>  
> Thanks
> Jun

Interesting.  I tried using '-qno-opt-mem-layout-trans' on ICC
along with '-Ofast -ipo' and that had no effect on the speed.  I also
tried '-no-vec' and that had no effect either.  The only thing that
slowed down ICC was '-ip-no-inlining' or '-fno-inline'.  I see that
'-Ofast -ipo' resulted in everything (except libc functions) getting
inlined into the main program when using ICC.  GCC did not do that, but
if I forced it to by using the always_inline attribute, GCC could
inline everything into main the way ICC does.  But that did not speed
up the GCC executable.

Steve Ellcey
sell...@marvell.com
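
For reference, a minimal sketch of how the always_inline attribute
mentioned above is applied in GCC.  The functions relax and relax_all
are hypothetical stand-ins and are not code from 519.lbm_r:

```c
#include <assert.h>

/* Hypothetical helper.  always_inline asks GCC to inline every call to
   this function regardless of its usual inlining heuristics. */
static inline __attribute__((always_inline))
double relax(double value, double equilibrium, double omega)
{
    return (1.0 - omega) * value + omega * equilibrium;
}

/* Hypothetical caller: after inlining, the loop body contains the
   arithmetic of relax() directly, with no call. */
double relax_all(double *cells, int n, double equilibrium, double omega)
{
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        cells[i] = relax(cells[i], equilibrium, omega);
        total += cells[i];
    }
    return total;
}
```

As the message above reports, forcing inlining this way changed which
calls GCC inlined but did not by itself make the GCC build faster.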


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Joel Sherrill
On Fri, Feb 15, 2019 at 9:02 AM Ian Lance Taylor  wrote:

> On Fri, Feb 15, 2019 at 4:46 AM Hi-Angel  wrote:
> >
> > I never could understand, why field reordering was removed from GCC? I
> > mean, I know that it's prohibited in C and C++, but, sure, GCC can
> > detect whether it possibly can influence application behavior, and if
> > not, just do the reorder.
> >
> > The veto is important to C/C++ as programming languages, but not to
> > machine code that is being generated from them. As long as app can't
> > detect that its fields were reordered through means defined by C/C++,
> > field reordering by compiler is fine, isn't it?
>
> In my opinion field reordering is very hard for the compiler to do
> correctly and trivial for a human programmer to do correctly.  So in
> practice the best approach is for the compiler, or some other tool, to
> say "you should reorder the fields here."  As far as I can see, the
> only real reason to implement field reordering in a compiler is for
> benchmark cracking, since benchmarks typically don't let you modify
> the source code.  It's not a useful optimization in practice other
> than for benchmarks.
>

Hasn't GNAT sorted Ada elements in records (e.g. structures) by size
since near its initial addition to GCC in the mid-90s? This results in the
largest elements up front and minimizes the need for alignment gaps.

I know Ada is traditionally more strongly typed than C/C++, but if it can
be done for Ada programs reliably, why could it not be reliable in C?

>
> (Array transformations and struct splitting, on the other hand, can be
> useful.)
>

--joel



Re: riscv64 dep. computation

2019-02-15 Thread Jim Wilson
On Thu, Feb 14, 2019 at 11:33 PM Paulo Matos  wrote:
> Are global variables not supposed to alias each other?
> If I indeed do that, gcc still won't group loads and stores:
> https://cx.rv8.io/g/rFjGLa

I meant something like
struct foo_t x, y;
and now they clearly don't alias.  As global pointers they may still alias.

Jim
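
A minimal sketch of the distinction drawn above, assuming a made-up
definition of struct foo_t (its fields a and b are hypothetical).  With
two distinct global objects the compiler knows the stores cannot
overlap; with two global pointers it has to allow for both pointing at
the same object:

```c
#include <assert.h>

struct foo_t { int a; int b; };   /* hypothetical fields */

struct foo_t x, y;       /* distinct objects: x.a and y.a never alias */
struct foo_t *px, *py;   /* pointers: *px and *py may be the same object */

int store_through_objects(void)
{
    x.a = 1;
    y.a = 2;
    return x.a;          /* the compiler may assume this is still 1 */
}

int store_through_pointers(void)
{
    assert(px && py);
    px->a = 1;
    py->a = 2;
    return px->a;        /* must be reloaded: it is 2 when px == py */
}
```

The second function returns 1 or 2 depending on whether the two
pointers refer to the same object, which is why the compiler cannot
freely reorder or group those loads and stores.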


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Richard Kenner
> Hasn't GNAT sorted Ada elements in records (e.g. structures) by size
> since near its initial addition to GCC in the mid-90s? 

No, it wasn't done early on, and even now it is not done in that major
a way.  Most reordering (possibly all; I'm not sure) is done between
objects of variable and fixed size, not between objects of differing
fixed sizes.

> I know Ada is traditionally more strongly typed than C/C++, but if it can
> be done for Ada programs reliably, why could it not be reliable in C?

I don't see it as a reliability issue, but one of expectations.  One might
be using a struct to map some hardware layout or records in a file so that
reordering fields could break things.
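
A minimal sketch of that expectation, using a hypothetical file record
header (struct file_header and read_header are made-up names).  The
code relies on each field sitting at a fixed offset, which the
static_asserts spell out for a typical ABI; a compiler that silently
reordered the fields would break it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record header: readers and writers agree on
   these offsets, so the declaration order is part of the format. */
struct file_header {
    uint32_t magic;        /* bytes 0..3  */
    uint16_t version;      /* bytes 4..5  */
    uint16_t flags;        /* bytes 6..7  */
    uint64_t payload_len;  /* bytes 8..15 */
};

/* Assumed layout on common ABIs; these would fail to compile if the
   fields were laid out in any other order. */
static_assert(offsetof(struct file_header, magic) == 0, "magic offset");
static_assert(offsetof(struct file_header, version) == 4, "version offset");
static_assert(offsetof(struct file_header, flags) == 6, "flags offset");
static_assert(offsetof(struct file_header, payload_len) == 8, "length offset");
static_assert(sizeof(struct file_header) == 16, "header size");

/* Interpret raw bytes (in host byte order) as a header. */
struct file_header read_header(const unsigned char *bytes)
{
    struct file_header h;
    memcpy(&h, bytes, sizeof h);
    return h;
}
```

Byte order is deliberately ignored here to keep the sketch short; a
real file format would also have to fix the endianness of each field.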


Re: riscv64 dep. computation

2019-02-15 Thread Paulo Matos



On 15/02/2019 19:15, Jim Wilson wrote:
> On Thu, Feb 14, 2019 at 11:33 PM Paulo Matos  wrote:
>> Are global variables not supposed to alias each other?
>> If I indeed do that, gcc still won't group loads and stores:
>> https://cx.rv8.io/g/rFjGLa
> 
> I meant something like
> struct foo_t x, y;
> and now they clearly don't alias.  As global pointers they may still alias.
> 

Ah ok, of course. Like that it makes sense they don't alias.

Thanks,

-- 
Paulo Matos


Re: GCC missing -flto optimizations? SPEC lbm benchmark

2019-02-15 Thread Eric Botcazou
> Hasn't GNAT sorted Ada elements in records (e.g. structures) by size
> since near its initial addition to GCC in the mid-90s? This results in the
> largest elements up front and minimizes the need for alignment gaps.

No, that's a serious misconception, since one of the features of GNAT is to be 
compatible with C by default as much as possible.  But we started to do some 
reordering recently when the records don't have (direct) equivalents in C.

-- 
Eric Botcazou


Re: GCC 8.3 Release Candidate available from gcc.gnu.org

2019-02-15 Thread Bill Seurer

On 02/15/19 10:13, Jakub Jelinek wrote:

> The first release candidate for GCC 8.3 is available from
>
>   https://gcc.gnu.org/pub/gcc/snapshots/8.3.0-RC-20190215/
>   ftp://gcc.gnu.org/pub/gcc/snapshots/8.3.0-RC-20190215/
>
> and shortly its mirrors.  It has been generated from SVN revision 268935.
>
> I have so far bootstrapped and tested the release candidate on
> x86_64-linux and i686-linux.  Please test it and report any issues to
> bugzilla.
>
> If all goes well, I'd like to release 8.3 on Friday, February 22nd.



I bootstrapped and tested on powerpc64.  power 7 BE, power 8 BE, power 8 
LE, and power 9 LE all went well.


--

-Bill Seurer



gcc-8-20190215 is now available

2019-02-15 Thread gccadmin
Snapshot gcc-8-20190215 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20190215/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-8-branch 
revision 268949

You'll find:

 gcc-8-20190215.tar.xz    Complete GCC

  SHA256=27a88eec101063e31b67057d6870cc0641639191a94c3aa36a497b5f746fc20e
  SHA1=95873a3d6e64f4d0f2ec3f73ef7ffda270d79931

Diffs from 8-20190208 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Parallelize the compilation using Threads

2019-02-15 Thread Oleg Endo
On Tue, 2019-02-12 at 15:12 +0100, Richard Biener wrote:
> On Mon, Feb 11, 2019 at 10:46 PM Giuliano Belinassi
>  wrote:
> > 
> > Hi,
> > 
> > I was just wondering what API should I use to spawn threads and
> > control
> > its flow. Should I use OpenMP, pthreads, or something else?
> > 
> > My point is, what if we break compatibility with something? If we use
> > OpenMP, I'm afraid that we will break compatibility with compilers
> > not
> > supporting it. On the other hand, If we use pthread, we will break
> > compatibility with non-POSIX systems (Windows).
> 
> I'm not sure we have a thread abstraction for the host - we do have
> one for the target via libgcc gthr.h though.  For prototyping I'd
> resort
> to this same interface and fixup the host != target case as needed.

Or maybe, in the year 2019, we could assume that most C++ compilers
which are used to compile GCC support C++11 and come with an adequate
implementation...  yeah, I know, sounds jacked :)

Cheers,
Oleg