[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-10 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Alan Modra  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Alan Modra  ---
Fixed, and the missing testcase went in rev 239343

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-10 Thread amodra at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #17 from Alan Modra  ---
Author: amodra
Date: Wed Aug 10 23:12:11 2016
New Revision: 239342

URL: https://gcc.gnu.org/viewcvs?rev=239342&root=gcc&view=rev
Log:
[LRA] Reload of slow mems

pr71680.c -m64 -O1 -mlra, ira output showing two problem insns.
(insn 7 5 26 3 (set (reg:SI 159 [ a ])
(mem/c:SI (reg/f:DI 158) [1 a+0 S4 A8])) pr71680.c:13 464
{*movsi_internal1}
 (expr_list:REG_EQUIV (mem/c:SI (reg/f:DI 158) [1 a+0 S4 A8])
(nil)))
(insn 26 7 27 3 (set (reg:DI 162)
(unspec:DI [
(fix:SI (subreg:SF (reg:SI 159 [ a ]) 0))
] UNSPEC_FCTIWZ)) pr71680.c:13 372 {fctiwz_sf}
 (expr_list:REG_DEAD (reg:SI 159 [ a ])
(nil)))
Insn 26 requires that reg 159 be of class FLOAT_REGS.

first lra action:
deleting insn with uid = 7.
Changing pseudo 159 in operand 1 of insn 26 on equiv [r158:DI]
  Creating newreg=164, assigning class ALL_REGS to subreg reg r164
   26: r162:DI=unspec[fix(r164:SI#0)] 7
  REG_DEAD r159:SI
Inserting subreg reload before:
   30: r164:SI=[r158:DI]
[snip]
  Change to class FLOAT_REGS for r164

Well, that didn't do much.  lra tried the equiv mem, found that didn't
work, and had to reload, effectively getting back to the two original
insns but with r159 replaced by r164.  simplify_operand_subreg did not do
anything in this case because SLOW_UNALIGNED_ACCESS was true (wrongly
for power8, but that's beside the point).  So now we have, using
abbreviated rtl notation:
r164:SI=[r158:DI]
r162:DI=unspec[fix(r164:SI)]
The problem here is that the first insn isn't valid, due to the rs6000
backend not supporting SImode in fprs, and r164 must be an fpr to make
the second insn valid.

next lra action:
  Creating newreg=165 from oldreg=164, assigning class GENERAL_REGS to r165
   30: r165:SI=[r158:DI]
Inserting insn reload after:
   31: r164:SI=r165:SI
so now we have
r165:SI=[r158:DI]
r164:SI=r165:SI
r162:DI=unspec[fix(r164:SI)]

This ought to be good on power8, except for one little thing.
r165 is GENERAL_REGS so the first insn is good, a gpr load from mem.
r164 is FLOAT_REGS, making the last insn good, a fctiwz.
The second insn ought to be a sldi, mtvsrd, xscvspdpn combination, but
that is only supported for SFmode.  So lra continues reloading the
second insn, but in vain, because it never tries anything other than
SImode and, as noted above, SImode is not valid in fprs.

What this patch does is arrange to emit the two reloads needed for the
SLOW_UNALIGNED_ACCESS case at once, moving the subreg to the second
insn in order to switch modes, producing:

r164:SI=[r158:DI]
r165:SF=r164:SI#0
r162:DI=unspec[fix(r165:SF)]
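
As a rough C-level analogy of the three insns above (not the RTL that lra emits), the sequence is: load the 32 bits with an integer load, reinterpret them as a float, then convert to int. The struct and all function names below are hypothetical, chosen only for illustration, and the cast matches fctiwz only for values that fit in an int.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type mirroring the byte-aligned global in the testcase. */
#pragma pack(1)
struct packed_float { float f0; };
#pragma pack()

/* Step 1, like r164:SI=[r158:DI]: an integer load from memory that is
   only byte aligned.  memcpy keeps the unaligned read well defined.  */
uint32_t load_bits(const void *p)
{
  uint32_t bits;
  memcpy(&bits, p, sizeof bits);
  return bits;
}

/* Step 2, like r165:SF=r164:SI#0: reinterpret the same 32 bits as a
   float without changing them.  */
float bits_as_float(uint32_t bits)
{
  float f;
  assert(sizeof f == sizeof bits);
  memcpy(&f, &bits, sizeof f);
  return f;
}

/* Step 3, like the fctiwz insn: convert to int, truncating toward
   zero (in-range values only).  */
int load_and_fix(const struct packed_float *p)
{
  return (int) bits_as_float(load_bits(p));
}
```

For a `struct packed_float` holding 3.75f, `load_and_fix` returns 3. The point of the analogy is that the memory access itself happens in the integer mode, and the mode switch is a separate register-to-register step.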

I've also tidied a couple of other things:
1) "old" is unnecessary as it duplicated "operand".
2) Rejecting mem subregs due to SLOW_UNALIGNED_ACCESS only makes sense
if the original mode was not slow.

PR target/71680
* lra-constraints.c (simplify_operand_subreg): Allow subreg
mode for mem when SLOW_UNALIGNED_ACCESS if inner mode is also
slow.  Emit two reloads for slow mem case, first loading in
fast innermode, then converting to required mode.
testsuite/
* gcc.target/powerpc/pr71680.c: New.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-constraints.c
trunk/gcc/testsuite/ChangeLog

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-09 Thread amodra at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #16 from Alan Modra  ---
Author: amodra
Date: Wed Aug 10 05:43:36 2016
New Revision: 239317

URL: https://gcc.gnu.org/viewcvs?rev=239317&root=gcc&view=rev
Log:
[RS6000] e500 part of pr71680

The fallback part of HARD_REGNO_CALLER_SAVE_MODE, choose_hard_reg_mode,
returns DFmode for SImode when TARGET_E500_DOUBLE.  This confuses
lra when attempting to save ctr around a call.

PR target/71680
* config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Return
SImode for TARGET_E500_DOUBLE when given SImode.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.h

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-09 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #15 from Arseny Solokha  ---
(In reply to Alan Modra from comment #14)
> Arseny, you might like to try this.  I don't have the means at the moment to
> properly test e500 support (ie. run the gcc testsuite) without building a
> whole lot of infrastructure.

This patch fixed the original issue and survived regression testing w/o
introducing any new regression.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-05 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #14 from Alan Modra  ---
Created attachment 39056
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39056&action=edit
save SImode regs in SImode

Arseny, you might like to try this.  I don't have the means at the moment to
properly test e500 support (ie. run the gcc testsuite) without building a whole
lot of infrastructure.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-02 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #13 from Alan Modra  ---
The e500 issue is quite different, and is not fixed by my lra patch.  From the
lra dump,

  Creating newreg=436, assigning class NO_REGS to save r436
  536: r192:SI=0x1
  REG_EQUAL 0x1
Add reg<-save after:
  621: r362:SI#0=r436:DF

  184: NOTE_INSN_BASIC_BLOCK 21
Add save<-reg after:
  620: r436:DF=r362:SI#0

So r362 is being saved for some reason, then later:

  Reassigning non-reload pseudos
   Assign 66 to r362 (freq=4000)

so r362 is finding its way into ctr.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-08-02 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Alan Modra  changed:

   What|Removed |Added

   Keywords||patch
 Status|NEW |ASSIGNED
URL||https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00113.html
   Assignee|unassigned at gcc dot gnu.org  |amodra at gmail dot com

--- Comment #12 from Alan Modra  ---
Bug reproduced on powerpc-e500v2-linux-gnuspe.  The testcase needs -Os -mlra
-fstack-protector -fPIC, or you need to configure gcc so that -fPIC and
-fstack-protector are on by default.  I haven't yet tested whether the lra
patch I posted for comment #1 testcase also fixes the e500 testcase.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-07-31 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #11 from Arseny Solokha  ---
(In reply to Alan Modra from comment #10)
> Arseny, I could not reproduce the problem using your testcase, and I tried a
> dozen or so revisions around 20160626 building powerpc-e500v2-linux-gnuspe
> cross-compilers on an x86_64-linux host.  Please specify the svn revision
> number or git commit that you used, and your gcc configure parameters.

I test weekly gcc snapshots from ftp://gcc.gnu.org/pub/gcc/snapshots. I'm still
able to reproduce it w/ 7.0.0_alpha20160731. svn revisions for 20160626 and
20160731 are r237793 and r238930, respectively, according to [1,2].

% powerpc-e500v2-linux-gnuspe-gcc-7.0.0-alpha20160731 -v 
Using built-in specs.
COLLECT_GCC=powerpc-e500v2-linux-gnuspe-gcc-7.0.0-alpha20160731
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731/lto-wrapper
Target: powerpc-e500v2-linux-gnuspe
Configured with:
/var/tmp/portage/cross-powerpc-e500v2-linux-gnuspe/gcc-7.0.0_alpha20160731/work/gcc-7-20160731/configure
--host=x86_64-pc-linux-gnu --target=powerpc-e500v2-linux-gnuspe
--build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/powerpc-e500v2-linux-gnuspe/gcc-bin/7.0.0-alpha20160731
--includedir=/usr/lib/gcc/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731/include
--datadir=/usr/share/gcc-data/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731
--mandir=/usr/share/gcc-data/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731/man
--infodir=/usr/share/gcc-data/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731/include/g++-v7
--with-python-dir=/share/gcc-data/powerpc-e500v2-linux-gnuspe/7.0.0-alpha20160731/python
--enable-languages=c,c++ --enable-obsolete --enable-secureplt --disable-werror
--with-system-zlib --disable-nls --enable-checking=yes --enable-libstdcxx-time
--enable-poison-system-directories
--with-sysroot=/usr/powerpc-e500v2-linux-gnuspe --disable-bootstrap
--enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --disable-altivec
--disable-fixed-point --enable-e500-double --enable-targets=all
--disable-libgcj --enable-libgomp --disable-libmudflap --disable-libssp
--disable-libcilkrts --disable-libmpx --disable-vtable-verify --disable-libvtv
--disable-libquadmath --enable-lto --with-isl --disable-isl-version-check
--enable-libsanitizer --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 7.0.0-alpha20160731 20160731 (experimental) (GCC)

I can attach RTL dumps if necessary.


[1] https://gcc.gnu.org/ml/gcc/2016-06/msg00155.html
[2] https://gcc.gnu.org/ml/gcc/2016-07/msg00253.html

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-07-29 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #10 from Alan Modra  ---
Arseny, I could not reproduce the problem using your testcase, and I tried a
dozen or so revisions around 20160626 building powerpc-e500v2-linux-gnuspe
cross-compilers on an x86_64-linux host.  Please specify the svn revision
number or git commit that you used, and your gcc configure parameters.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-07-27 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #9 from Alan Modra  ---
lra doesn't load in SFmode due to the following condition in
lra-constraints.c:simplify_operand_subreg

  /* If we change address for paradoxical subreg of memory, the
 address might violate the necessary alignment or the access might
 be slow.  So take this into consideration.  We should not worry
 about access beyond allocated memory for paradoxical memory
 subregs as we don't substitute such equiv memory (see processing
 equivalences in function lra_constraints) and because for spilled
 pseudos we allocate stack memory enough for the biggest
 corresponding paradoxical subreg.  */
  if (MEM_P (reg)
  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg))
  || MEM_ALIGN (reg) >= GET_MODE_ALIGNMENT (mode)))

MEM_ALIGN here is 8 bits (from #pragma pack, and yes, the mem really is only
byte aligned), and rs6000.h does say that this access might be slow if in
SFmode.  It's true that an unaligned floating point storage access on power
might cause an alignment trap, so leaving aside the issue that mode only maps
loosely to register class, I think the rs6000.h definition of
SLOW_UNALIGNED_ACCESS is correct and lra is doing the right thing here.  Reload
is wrong to use a fp load (if mem align was always accurate, which it isn't).

Hmm, a change that doesn't cure this problem, but the condition might be better
as

  if (MEM_P (reg)
  && (! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg))
  || SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg))
  || MEM_ALIGN (reg) >= GET_MODE_ALIGNMENT (mode)))

ie. if both innermode and mode are slow then lra may as well go ahead and use
the subreg mode.
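
To make the difference between the two conditions concrete, here is a small standalone model in C. It is only a sketch: `struct mode_info`, `slow_unaligned_access` and the two `may_use_subreg_mode_*` functions are hypothetical stand-ins for the GCC macros, and the "slow" flag per mode is an assumption supplied by the caller rather than the real rs6000.h definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a machine mode for this model.  */
struct mode_info {
  unsigned natural_align_bits;  /* stands in for GET_MODE_ALIGNMENT */
  bool slow_when_unaligned;     /* is an under-aligned access slow? */
};

/* Stand-in for SLOW_UNALIGNED_ACCESS (mode, align).  */
bool slow_unaligned_access(struct mode_info m, unsigned mem_align_bits)
{
  assert(mem_align_bits >= 8);  /* memory is at least byte aligned */
  return m.slow_when_unaligned && mem_align_bits < m.natural_align_bits;
}

/* The existing condition: use the subreg mode for the mem only if that
   access is not slow, or the mem is sufficiently aligned.  */
bool may_use_subreg_mode_old(struct mode_info mode, unsigned mem_align_bits)
{
  return !slow_unaligned_access(mode, mem_align_bits)
         || mem_align_bits >= mode.natural_align_bits;
}

/* The suggested condition: additionally allow it when the inner mode
   is slow too, since nothing is lost by switching.  */
bool may_use_subreg_mode_new(struct mode_info mode,
                             struct mode_info innermode,
                             unsigned mem_align_bits)
{
  return !slow_unaligned_access(mode, mem_align_bits)
         || slow_unaligned_access(innermode, mem_align_bits)
         || mem_align_bits >= mode.natural_align_bits;
}
```

With an SFmode-like mode that is slow below 32 bits, an SImode-like inner mode that is not slow, and an 8-bit aligned mem, both functions return false, which matches the remark that the change does not cure this problem. The new condition only differs when the inner mode is slow as well.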

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-07-27 Thread amodra at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Alan Modra  changed:

   What|Removed |Added

 CC||amodra at gmail dot com

--- Comment #8 from Alan Modra  ---
From -m32 -O2 dumps of Anton's testcase:

(insn 8 9 33 3 (set (reg:SI 160 [ a ])
(mem/c:SI (reg/f:SI 162) [1 a+0 S4 A8]))
/home/alan/src/tmp/pr71680.c:13 464 {*movsi_internal1}
 (expr_list:REG_EQUIV (mem/c:SI (reg/f:SI 162) [1 a+0 S4 A8])
(expr_list:REG_EQUAL (mem/c:SI (symbol_ref:SI ("a") [flags 0x84] 
) [1 a+0 S4 A8])
(nil))))

Looking at the insn that loads reg 160 above, you'll notice movsi_internal1,
which doesn't have an alternative to allow an fpr as is required by insn 32. 
What's more, SImode isn't allowed in fprs (see rs6000_hard_regno_mode_ok) so it
doesn't make sense to add such an alternative.

What happens in reload is that reg 160 equiv mem is substituted into insn 32
then reload cleverly reloads the subreg:
Reloads for insn # 32
Reload 0: reload_in (SF) = (mem/c:SF (reg/f:SI 31 31 [162]) [1 a+0 S4 A8])
FLOAT_REGS, RELOAD_FOR_INPUT (opnum = 1), can't combine
reload_in_reg: (subreg:SF (reg:SI 160 [ a ]) 0)
reload_reg_rtx: (reg:SF 44 12)

So we load from mem in SFmode, and insn 8 is deleted.  lra apparently doesn't
use the trick of changing to the mode of the subreg.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #7 from Segher Boessenkool  ---
We have an insn:

(insn 32 33 34 3 (set (reg:DI 165)
(unspec:DI [
(fix:SI (subreg:SF (reg:SI 160 [ a ]) 0))
] UNSPEC_FCTIWZ)) 71680.c:11 334 {fctiwz_sf}
 (expr_list:REG_DEAD (reg:SI 160 [ a ])
(nil)))

160 is allocated memory by IRA.  LRA does:

Changing pseudo 160 in operand 1 of insn 32 on equiv [r162:SI]
  Creating newreg=167, assigning class ALL_REGS to subreg reg r167
   32: r165:DI=unspec[fix(r167:SI#0)] 7
  REG_DEAD r160:SI
Inserting subreg reload before:
   37: r167:SI=[r162:SI]

and from then on it keeps loading 167 into another (new) SImode reg,
which never magically becomes a float reg ;-)
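
The loop described here can be mimicked with a toy model in C. This is not lra code: the names are hypothetical, the limit of 90 is simply the number quoted in the ICE message, and the only rule modelled is that SImode is not valid in a float register while SFmode is.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_RELOADS 90  /* the limit quoted in the ICE message */

enum toy_mode { TOY_SI, TOY_SF };

/* Toy rule taken from the thread: SImode is not allowed in fprs.  */
bool toy_mode_ok_in_fpr(enum toy_mode m)
{
  return m == TOY_SF;
}

/* Keep generating reloads until the operand is valid in an fpr.
   With switch_mode false, each reload copies into a new pseudo of the
   same mode, so nothing ever improves.  With switch_mode true, the
   reload changes to SFmode.  Returns the number of reloads generated,
   or -1 when the limit is reached.  */
int toy_reload_until_valid(enum toy_mode start, bool switch_mode)
{
  enum toy_mode m = start;
  int reloads = 0;

  while (!toy_mode_ok_in_fpr(m)) {
    if (reloads == TOY_MAX_RELOADS)
      return -1;
    if (switch_mode)
      m = TOY_SF;
    reloads++;
    assert(reloads <= TOY_MAX_RELOADS);
  }
  return reloads;
}
```

Starting from `TOY_SI` without a mode switch, the function gives up with -1 after 90 attempts, which is the shape of the failure reported in this bug. With a mode switch it finishes after a single reload.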

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #6 from Segher Boessenkool  ---
Yes, but it fails with -m32 too.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #5 from Arseny Solokha  ---
(In reply to Segher Boessenkool from comment #4)
> Oh never mind, I forgot -mlra, duh.
> 
> Confirmed, also on BE, -O2 -mlra fails already.

I can't make it fail on 32-bit BE, though. Segher, is your machine powerpc64?

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Segher Boessenkool  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-06-28
 Ever confirmed|0   |1

--- Comment #4 from Segher Boessenkool  ---
Oh never mind, I forgot -mlra, duh.

Confirmed, also on BE, -O2 -mlra fails already.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

--- Comment #3 from Segher Boessenkool  ---
anton: What flags for your test case?  I fail to reproduce it.

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread jiwang at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Jiong Wang  changed:

   What|Removed |Added

 CC||jiwang at gcc dot gnu.org

--- Comment #2 from Jiong Wang  ---
Does a revert of r237277 fix this issue?

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread anton at samba dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Anton Blanchard  changed:

   What|Removed |Added

 CC||anton at samba dot org

--- Comment #1 from Anton Blanchard  ---
I'm hitting this on powerpc64le with the following test case:

#pragma pack(1)
struct {
float f0;
} a;

void foo(int);

int main(void)
{
for (;;)
foo((int)a.f0);
}
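
The `#pragma pack(1)` in this testcase is what reduces the alignment of `a` to a single byte. A minimal sketch to check that on a given compiler follows. The struct and function names are hypothetical, and the exact natural alignment of `float` is target dependent, so only the packed value is stated as expected.

```c
#include <assert.h>
#include <limits.h>
#include <stdalign.h>

/* Same layout as the testcase's anonymous struct, but named.  */
#pragma pack(1)
struct packed_s { float f0; };
#pragma pack()

/* The same member without packing, for comparison.  */
struct natural_s { float f0; };

/* Packing changes alignment here, not size.  */
static_assert(sizeof(struct packed_s) == sizeof(float),
              "packing a single float should not change its size");

/* Alignment in bits, the unit in which the thread quotes it.  */
unsigned packed_align_bits(void)
{
  return CHAR_BIT * (unsigned) alignof(struct packed_s);
}

unsigned natural_align_bits(void)
{
  return CHAR_BIT * (unsigned) alignof(struct natural_s);
}
```

On GCC with 8-bit bytes, `packed_align_bits()` is 8, while `natural_align_bits()` is whatever the target requires for `float`.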

[Bug target/71680] [7 Regression] ICE: Max. number of generated reload insns per insn is achieved (90) w/ -Os -mlra

2016-06-28 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||vmakarov at gcc dot gnu.org
   Target Milestone|--- |7.0