Re: [PATCH], V7, #6 of 7, Fix issues with vector extract and prefixed instructions

Michael Meissner Tue, 03 Dec 2019 10:21:02 -0800

On Tue, Nov 26, 2019 at 01:20:20PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Nov 14, 2019 at 06:09:09PM -0500, Michael Meissner wrote:
> > In this case, the current code re-uses the temporary for calculating the 
> > offset
> > of the element to load up the address of the vector, losing the offset.
> 
> Reusing the same pseudo is *always* a bad idea.  You get better
> optimisation if most code is "SSA-like": write to every pseudo only
> once.  Of course you need to violate this where you woule have PHIs in
> actual SSA.


Yes.  I was describing what the current code does (and why I'm fixing it).  It
is a bug.

> > I needed to add a new constraint (em) in addition to new predicate 
> > functions.
> > I discovered that with the predicate function alone, the register allocator
> > would re-create the address.  The constraint prevents this combination.
> 
> It's a reasonable thing to have as a contraint, too.
> 
> Once again, how should things work in inline assembler?  Can we allow
> prefixed addressing there, or should "m" in inline asm mean "em"?

At the moment, I think we should just not allow prefixed addresses in asm
constructs at all.

> 
> > I also modified the vector extract code to generate a single PC-relative 
> > load
> > if the vector has a PC-relative address and the offset is constant.
> 
> That should be handled by the RTL optimisers anyway, no?  Or is this
> for post-reload splitters (a bad idea always).

You have it backwards.  It is the RTL optimizers that combines the vector
extract with the load in the first place.  My code allows the combination and
generates a single instruction.  Otherwise, it would do a PLA and then do the
vector extract with the address loaded.

> >     * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
> >     for optimizing extracting a constant vector element from a vector
> >     that uses a prefixed address.  If the element number is variable
> >     and the address uses a prefixed address, abort.
> 
> It doesn't abort.  Erm, wait, it *does* ICE.  Please make that more
> prominent (in the code).  It's not clear why you mention it in the
> changelog while allt he other details are just "add support"?
> 
> I find the control flow very hard to read here.
> 
> > +  /* Optimize PC-relative addresses with a constant offset.  */
> > +  else if (pcrel_p && CONST_INT_P (element_offset))
> > +    {
> > +      rtx addr2 = addr;
> 
> addr isn't used after this anyway, so you can clobber it just fine.

Generally I prefer not to reuse variables.

> > +  /* With only one temporary base register, we can't support a PC-relative
> > +     address added to a variable offset.  This is because the PADDI 
> > instruction
> > +     requires RA to be 0 when doing a PC-relative add (i.e. no register to 
> > add
> > +     to).  */
> > +  else if (pcrel_p)
> > +    gcc_unreachable ();
> 
> That comment suggests we just ICE when we get unwanted input.  Such a
> comment belongs where we prevent such code from being formed in the
> first place (or nowhere, if it is obvious).

The constraint and predicate is where we prevent it from occuring.  The
gcc_unreachable is just there as insurance that it didn't get recreated later.
During testing before I added the constraint, I found the register allocator
combining the two, even though the predicate didn't allow the combination.  So
the test is just to ICE out if the combination took place.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH], V7, #6 of 7, Fix issues with vector extract and prefixed instructions

Reply via email to