On Tue, Nov 11, 2014 at 01:10:01AM -0600, Segher Boessenkool wrote:
> On Mon, Nov 10, 2014 at 05:36:24PM -0500, Michael Meissner wrote:
> > However, the double pattern is completely broken.  This cannot go in.
> 
> [snip]
> 
> > It is unacceptable to have to do the inner loop doing a load, vector add,
> > and store in the loop.
> 
> Before the patch, the final reduction used *vsx_reduc_splus_v2df; after
> the patch, it is *vsx_reduc_plus_v2df_scalar.  The former does a vector
> add, the latter a float add.  And it uses the same pseudoregister for the
> accumulator throughout.  IRA decides a register is more expensive than
> memory for this, I suppose because it wants both V2DF and DF?  It doesn't
> seem to like the subreg very much.

I haven't looked into it in detail (I've been a little busy with the upper regs
patch), but I suspect the problem is that 128-bit and 64-bit types cannot
overlap (i.e. rs6000_cannot_change_mode_class returns true).  This is because
scalars in VSX registers occupy the upper 64 bits, which does not match the
compiler's notion that a scalar should be in the bottom 64 bits.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
