https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97205

--- Comment #11 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
(In reply to rguent...@suse.de from comment #10)
> On Wed, 28 Oct 2020, bernd.edlinger at hotmail dot de wrote:
> 
> > --- Comment #9 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
> > (In reply to rguent...@suse.de from comment #7)
> > 
> > Ah, yes, using a similar strategy as we did in r274986
> > where we aligned param_decls, I would say this
> > completely untested patch goes in the same direction:
> > 
> > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> > index f3f17d3..030d0cb 100644
> > --- a/gcc/cfgexpand.c
> > +++ b/gcc/cfgexpand.c
> > @@ -366,7 +366,18 @@ align_local_variable (tree decl, bool really_expand)
> >    unsigned int align;
> > 
> >    if (TREE_CODE (decl) == SSA_NAME)
> > -    align = TYPE_ALIGN (TREE_TYPE (decl));
> > +    {
> > +      tree type = TREE_TYPE (decl);
> > +      machine_mode mode = TYPE_MODE (type);
> > +
> > +      align = TYPE_ALIGN (type);
> > +      if (mode != BLKmode
> > +         && align < GET_MODE_ALIGNMENT (mode)
> > +         && ((optab_handler (movmisalign_optab, mode)
> > +              != CODE_FOR_nothing)
> > +             || targetm.slow_unaligned_access (mode, align)))
> 
> I wouldn't even restrict it this way - after all if we have an
> SSA_NAME its address hasn't been exposed and thus any different
> user requested alignment isn't exposed.  Instead we can work
> to produce optimal code which means aligning according to the mode
> (that's what the RA would do when spilling the 'register').
> 

Agreed.

> Note that this would argue we could fix the types for SSA names
> upfront but there might be some complication when we have to
> adjust the underlying decl and that's a PARM_DECL.
> 

I would like to avoid changing the types to be more aligned,
just make the MEM_ALIGN larger than what the TYPE_ALIGN
requires.  Also in the case of the incoming param_decls,
the decl_rtl may reflect the true alignment of the stack,
which may be higher than required by the type.
The backend looks at the MEM_ALIGN and can in some cases
generate better code, based on the MEM_ALIGN and, it
can assert when we have a misaligned MEM_P, also based
on the value in MEM_ALIGN.

> > +       align = GET_MODE_ALIGNMENT (mode);
> > +    }
> >    else
> >      {
> >        align = LOCAL_DECL_ALIGNMENT (decl);
> > @@ -1022,6 +1033,17 @@ expand_one_stack_var_at (tree decl, rtx base, 
> > unsigned
> > base_align,
> >      }
> > 
> >    set_rtl (decl, x);
> > +
> > +  if (TREE_CODE (decl) == SSA_NAME
> > +      && GET_MODE (x) != BLKmode
> > +      && MEM_ALIGN (x) < GET_MODE_ALIGNMENT (GET_MODE (x))
> > +      && ((optab_handler (movmisalign_optab, GET_MODE (x))
> > +          != CODE_FOR_nothing)
> > +         || targetm.slow_unaligned_access (GET_MODE (x), MEM_ALIGN (x))))
> 
> I failed to track down where we'd expand this to a possibly
> unaligned mem - but is this just bogus MEM_ALIGN set?  Can we instead
> fix that somehow?
> 

No, this fixes the assertion in the back-end, while the first hunk
makes the memory at least mode aligned, the MEM_ALIGN needs to
be adjusted here, GET_MODE_ALIGNMENT (GET_MODE (x)) is a lower
bound of the true alignment.

Also in the case of the PARM_DECLs it is possible that we take
advantage from a MEM_ALIGN which happens to be larger than
what's implied by TYPE_ALIGN.

> > +    {
> > +      gcc_checking_assert (GET_MODE_ALIGNMENT (GET_MODE (x)) <= 
> > base_align);
> > +      set_mem_align (x, GET_MODE_ALIGNMENT (GET_MODE (x)));
> > +    }
> >  }
> > 
> >  class stack_vars_data
> > @@ -1327,13 +1349,11 @@ expand_one_stack_var_1 (tree var)
> >      {
> >        tree type = TREE_TYPE (var);
> >        size = tree_to_poly_uint64 (TYPE_SIZE_UNIT (type));
> > -      byte_align = TYPE_ALIGN_UNIT (type);
> >      }
> >    else
> > -    {
> > -      size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
> > -      byte_align = align_local_variable (var, true);
> > -    }
> > +    size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
> > +
> > +  byte_align = align_local_variable (var, true);
> > 
> >    /* We handle highly aligned variables in expand_stack_vars.  */
> >    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
> > 
> > 
> > > Also, when an SSA name gets a stack location, we should instead use
> > > the underlying decl for the MEM_EXPR, not the SSA name.
> > > 
> > 
> > At least the MEM_EXPRs from this test case don't do that.
> 
> Figured that.  Since we don't always have an underlying decl it's
> probably error-prone.
> 

I tried this, but it did not find anything so far:

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 3706f0a..e7e9b3d 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -2196,6 +2196,7 @@ set_mem_expr (rtx mem, tree expr)
 {
   mem_attrs attrs (*get_mem_attrs (mem));
   attrs.expr = expr;
+  gcc_checking_assert (expr == NULL || TREE_CODE (expr) != SSA_NAME);
   set_mem_attrs (mem, &attrs);
 }


> > > > But isn't it possible that expand_expr_real_1
> > > > returns the original unaligned decl here?
> > > > 
> > > >             case GIMPLE_SINGLE_RHS:
> > > >               {
> > > >                 r = expand_expr_real (gimple_assign_rhs1 (g), target,
> > > >                                       tmode, modifier, alt_rtl,
> > > >                                       inner_reference_p);
> > > >                 break;
> > > >               }
> > > >             default:
> > > >               gcc_unreachable ();
> > > >             }
> > > >           set_curr_insn_location (saved_loc);
> > > >           if (REG_P (r) && !REG_EXPR (r))
> > > >             set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
> > > >           return r;
> > 
> > My only concern left is if this code path might somehow return an
> > misaligned MEM_P from that gimple_assign_rhs, but probably
> > there is an obvious reason why that may never happen.
> 
> never say never ... but unless we have a testcase we just claim so.

Reply via email to