https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97205
--- Comment #11 from Bernd Edlinger <bernd.edlinger at hotmail dot de> --- (In reply to rguent...@suse.de from comment #10) > On Wed, 28 Oct 2020, bernd.edlinger at hotmail dot de wrote: > > > --- Comment #9 from Bernd Edlinger <bernd.edlinger at hotmail dot de> --- > > (In reply to rguent...@suse.de from comment #7) > > > > Ah, yes, using a similar strategy as we did in r274986 > > where we aligned param_decls, I would say this > > completely untested patch goes in the same direction: > > > > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > > index f3f17d3..030d0cb 100644 > > --- a/gcc/cfgexpand.c > > +++ b/gcc/cfgexpand.c > > @@ -366,7 +366,18 @@ align_local_variable (tree decl, bool really_expand) > > unsigned int align; > > > > if (TREE_CODE (decl) == SSA_NAME) > > - align = TYPE_ALIGN (TREE_TYPE (decl)); > > + { > > + tree type = TREE_TYPE (decl); > > + machine_mode mode = TYPE_MODE (type); > > + > > + align = TYPE_ALIGN (type); > > + if (mode != BLKmode > > + && align < GET_MODE_ALIGNMENT (mode) > > + && ((optab_handler (movmisalign_optab, mode) > > + != CODE_FOR_nothing) > > + || targetm.slow_unaligned_access (mode, align))) > > I wouldn't even restrict it this way - after all if we have an > SSA_NAME its address hasn't been exposed and thus any different > user requested alignment isn't exposed. Instead we can work > to produce optimal code which means aligning according to the mode > (that's what the RA would do when spilling the 'register'). > Agreed. > Note that this would argue we could fix the types for SSA names > upfront but there might be some complication when we have to > adjust the underlying decl and that's a PARM_DECL. > I would like to avoid changing the types to be more aligned, just make the MEM_ALIGN larger than what the TYPE_ALIGN requires. Also in the case of the incoming param_decls, the decl_rtl may reflect the true alignment of the stack, which may be higher than required by the type. The backend looks at the MEM_ALIGN and can in some cases generate better code, based on the MEM_ALIGN and, it can assert when we have a misaligned MEM_P, also based on the value in MEM_ALIGN. > > + align = GET_MODE_ALIGNMENT (mode); > > + } > > else > > { > > align = LOCAL_DECL_ALIGNMENT (decl); > > @@ -1022,6 +1033,17 @@ expand_one_stack_var_at (tree decl, rtx base, > > unsigned > > base_align, > > } > > > > set_rtl (decl, x); > > + > > + if (TREE_CODE (decl) == SSA_NAME > > + && GET_MODE (x) != BLKmode > > + && MEM_ALIGN (x) < GET_MODE_ALIGNMENT (GET_MODE (x)) > > + && ((optab_handler (movmisalign_optab, GET_MODE (x)) > > + != CODE_FOR_nothing) > > + || targetm.slow_unaligned_access (GET_MODE (x), MEM_ALIGN (x)))) > > I failed to track down where we'd expand this to a possibly > unaligned mem - but is this just bogus MEM_ALIGN set? Can we instead > fix that somehow? > No, this fixes the assertion in the back-end, while the first hunk makes the memory at least mode aligned, the MEM_ALIGN needs to be adjusted here, GET_MODE_ALIGNMENT (GET_MODE (x)) is a lower bound of the true alignment. Also in the case of the PARM_DECLs it is possible that we take advantage from a MEM_ALIGN which happens to be larger than what's implied by TYPE_ALIGN. > > + { > > + gcc_checking_assert (GET_MODE_ALIGNMENT (GET_MODE (x)) <= > > base_align); > > + set_mem_align (x, GET_MODE_ALIGNMENT (GET_MODE (x))); > > + } > > } > > > > class stack_vars_data > > @@ -1327,13 +1349,11 @@ expand_one_stack_var_1 (tree var) > > { > > tree type = TREE_TYPE (var); > > size = tree_to_poly_uint64 (TYPE_SIZE_UNIT (type)); > > - byte_align = TYPE_ALIGN_UNIT (type); > > } > > else > > - { > > - size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var)); > > - byte_align = align_local_variable (var, true); > > - } > > + size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var)); > > + > > + byte_align = align_local_variable (var, true); > > > > /* We handle highly aligned variables in expand_stack_vars. */ > > gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT); > > > > > > > Also, when an SSA name gets a stack location, we should instead use > > > the underlying decl for the MEM_EXPR, not the SSA name. > > > > > > > At least the MEM_EXPRs from this test case don't do that. > > Figured that. Since we don't always have an underlying decl it's > probably error-prone. > I tried this, but it did not find anything so far: diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 3706f0a..e7e9b3d 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -2196,6 +2196,7 @@ set_mem_expr (rtx mem, tree expr) { mem_attrs attrs (*get_mem_attrs (mem)); attrs.expr = expr; + gcc_checking_assert (expr == NULL || TREE_CODE (expr) != SSA_NAME); set_mem_attrs (mem, &attrs); } > > > > But isn't it possible that expand_expr_real_1 > > > > returns the original unaligned decl here? > > > > > > > > case GIMPLE_SINGLE_RHS: > > > > { > > > > r = expand_expr_real (gimple_assign_rhs1 (g), target, > > > > tmode, modifier, alt_rtl, > > > > inner_reference_p); > > > > break; > > > > } > > > > default: > > > > gcc_unreachable (); > > > > } > > > > set_curr_insn_location (saved_loc); > > > > if (REG_P (r) && !REG_EXPR (r)) > > > > set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r); > > > > return r; > > > > My only concern left is if this code path might somehow return an > > misaligned MEM_P from that gimple_assign_rhs, but probably > > there is an obvious reason why that may never happen. > > never say never ... but unless we have a testcase we just claim so.