Hi!

On Sat, Jan 27, 2018 at 11:16:00AM -0500, Michael Meissner wrote:
> By allowing the ++/-- support for int indexes, it causes IV-OPTS to not
> properly optimize the loop.
>
> This patch restores the 240841 behavior of not allowing ++/-- for scalar
> floating point types.  The loop looks like:
>
> .L3:
>         lfdx 0,4,9
>         lfdx 12,5,9
>         fadd 0,0,12
>         stfdx 0,3,9
>         addi 9,9,8
>         bdnz .L3
>
> Which is small enough to align the loop to 32 bytes.
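
For anyone reading along without the PR open: judging only from the
assembly above, the source loop is presumably of the shape below.  This
is a guess, not the testcase from the PR; the function and parameter
names are made up.  The second function spells out the "three pointers,
each bumped every iteration" form at the source level, purely for
comparison.  Which instructions either one compiles to depends on the
compiler version and flags.

```c
#include <stddef.h>

/* Hypothetical reconstruction: one shared index for all three arrays.
   The quoted assembly has this shape (three indexed accesses off one
   register that is bumped by 8 each iteration).  */
void
add_indexed (double *a, const double *b, const double *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}

/* Same computation written with three separately advancing pointers,
   i.e. three increments per iteration at the source level.  */
void
add_three_pointers (double *a, const double *b, const double *c, size_t n)
{
  while (n-- > 0)
    *a++ = *b++ + *c++;
}
```

Both functions compute the same result; only the induction variables
differ.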

More importantly ;-), this is better than doing three increments in the
loop (as the other case does).  Which then as a bonus makes an IV choice
that allows bdnz win.

> However, it does cause slight changes in running Spec 2006 on a little endian
> power8 system:
>
>     471.omnetpp:    1.3% faster
>     434.zeusmp:     1.9% slower
>     435.gromacs:    2.0% faster
>     482.sphinx3:    1.4% faster

That looks good.  Would be nice to figure out what causes the zeusmp
regression.

Also, with or without this patch, it seems we give pretty wrong costs to
ivopts.  We should investigate that, too (for GCC 9, unless there is some
obvious and simple thing we do wrong currently).

>         PR target/81550
>         * config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): If
>         floating point scalars are allowed in traditional Altivec
>         registers, don't allow PRE_{INC,DEC,MODIFY} on those floating
>         point types.

> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c  (revision 257024)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -2990,6 +2990,8 @@ rs6000_setup_reg_addr_masks (void)
>                   && !VECTOR_MODE_P (m2)
>                   && !FLOAT128_VECTOR_P (m2)
>                   && !complex_p
> +                 && (m != E_DFmode || !TARGET_VSX)
> +                 && (m != E_SFmode || !TARGET_P8_VECTOR)
>                   && !small_int_vsx_p)
>                 {
>                   addr_mask |= RELOAD_REG_PRE_INCDEC;

(Maybe add a comment?)

This is okay for trunk.  Thanks!


Segher