On Tue, Jul 21, 2020 at 7:16 AM Sunil Pandey <skpg...@gmail.com> wrote:
>
> On Mon, Jul 20, 2020 at 5:06 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Sat, Jul 18, 2020 at 7:57 AM Sunil Pandey <skpg...@gmail.com> wrote:
> > >
> > > On Fri, Jul 17, 2020 at 1:22 AM Richard Biener
> > > <richard.guent...@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 17, 2020 at 7:15 AM Sunil Pandey <skpg...@gmail.com> wrote:
> > > > >
> > > > > Any comment on revised patch? At least,  in finish_decl, decl global 
> > > > > attributes are populated.
> > > >
> > > > +static void
> > > > +ix86_lower_local_decl_alignment (tree decl)
> > > > +{
> > > > +  unsigned new_align = LOCAL_DECL_ALIGNMENT (decl);
> > > >
> > > > please use the macro-expanded call here since we want to amend
> > > > ix86_local_alignment to _not_ return a lower alignment when
> > > > called as LOCAL_DECL_ALIGNMENT (by adding a new parameter
> > > > to ix86_local_alignment).  Can you also amend the patch in this
> > > > way?
> > > >
> > > > +  if (new_align < DECL_ALIGN (decl))
> > > > +    SET_DECL_ALIGN (decl, new_align);
> > > >
> > > > diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> > > > index 81bd2ee94f0..1ae99e30ed1 100644
> > > > --- a/gcc/c/c-decl.c
> > > > +++ b/gcc/c/c-decl.c
> > > > @@ -5601,6 +5601,8 @@ finish_decl (tree decl, location_t init_loc, tree 
> > > > init,
> > > >      }
> > > >
> > > >    invoke_plugin_callbacks (PLUGIN_FINISH_DECL, decl);
> > > > +  /* Lower local decl alignment.  */
> > > > +  lower_decl_alignment (decl);
> > > >  }
> > > >
> > > > should come before plugin hook invocation, likewise for the 
> > > > cp_finish_decl case.
> > > >
> > > > +/* Lower DECL alignment.  */
> > > > +
> > > > +void
> > > > +lower_decl_alignment (tree decl)
> > > > +{
> > > > +  if (VAR_P (decl)
> > > > +      && !is_global_var (decl)
> > > > +      && !DECL_HARD_REGISTER (decl))
> > > > +    targetm.lower_local_decl_alignment (decl);
> > > > +}
> > > >
> > > > please avoid this function, it's name sounds too generic and it's not 
> > > > worth
> > > > adding a public API for two calls.
> > > >
> > > > Alltogether this should avoid the x86 issue leaving left-overs (your 
> > > > identified
> > > > inliner case) as missed optimization [for the linux kernel which 
> > > > appearantly
> > > > decided that -mpreferred-stack-boundary=2 is a good ABI to use].
> > > >
> > > > Richard.
> > > >
> > > >
> > > Revised patch attached.
> >
> > @@ -16776,7 +16783,7 @@ ix86_data_alignment (tree type, unsigned int
> > align, bool opt)
> >
> >  unsigned int
> >  ix86_local_alignment (tree exp, machine_mode mode,
> > -                     unsigned int align)
> > +                     unsigned int align, bool setalign)
> >  {
> >    tree type, decl;
> >
> > @@ -16801,6 +16808,10 @@ ix86_local_alignment (tree exp, machine_mode mode,
> >        && (!decl || !DECL_USER_ALIGN (decl)))
> >      align = 32;
> >
> > +  /* Lower decl alignment.  */
> > +  if (setalign && align < DECL_ALIGN (decl))
> > +    SET_DECL_ALIGN (decl, align);
> > +
> >    /* If TYPE is NULL, we are allocating a stack slot for caller-save
> >       register in MODE.  We will return the largest alignment of XF
> >       and DF.  */
> >
> > sorry for not being clear - the parameter should indicate whether an
> > alignment lower
> > than natural alignment is OK to return thus sth like
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 31757b044c8..19703cbceb9 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -16641,7 +16641,7 @@ ix86_data_alignment (tree type, unsigned int
> > align, bool opt)
> >
> >  unsigned int
> >  ix86_local_alignment (tree exp, machine_mode mode,
> > -                     unsigned int align)
> > +                     unsigned int align, bool may_lower)
> >  {
> >    tree type, decl;
> >
> > @@ -16658,7 +16658,8 @@ ix86_local_alignment (tree exp, machine_mode mode,
> >
> >    /* Don't do dynamic stack realignment for long long objects with
> >       -mpreferred-stack-boundary=2.  */
> > -  if (!TARGET_64BIT
> > +  if (may_lower
> > +      && !TARGET_64BIT
> >        && align == 64
> >        && ix86_preferred_stack_boundary < 64
> >        && (mode == DImode || (type && TYPE_MODE (type) == DImode))
> >
> > I also believe that spill_slot_alignment () should be able to get the
> > lower alignment
> > for long long but not get_stack_local_alignment () (both use
> > STACK_SLOT_ALIGNMENT).
> > Some uses of STACK_SLOT_ALIGNMENT also look fishy with respect to mem 
> > attributes
> > and alignment.
> >
> > Otherwise the patch looks reasonable to salvage a misguided optimization for
> > a non-standard ABI.  If it is sufficient to make the people using that ABI 
> > happy
> > is of course another question.  I'd rather see them stop using it ...
> >
> > That said, I'm hesitant to be the only one OKing this ugliness but I'd
> > immediately
> > OK a patch removing the questionable hunk from ix86_local_alignment ;)
> >
> > Jakub, Jeff - any opinion?
> >
> > Richard.
> >
>
> Revised patch attached.

You are now passing 'true' to ix86_local_alignment for all callers,
that's not correct.
I said at most STACK_SLOT_ALIGNMENT _might_ be able to take the lower alignment
but some callers look suspicious so I wasn't sure.

Which means - please pass false for LOCAL_ALIGNMENT, STACK_SLOT_ALIGNMENT
and LOCAL_DECL_ALIGNMENT.

+  /* Lower local decl alignment.  */
+
+  if (VAR_P (decl)
+      && !is_global_var (decl)
+      && !DECL_HARD_REGISTER (decl))
+    targetm.lower_local_decl_alignment (decl);

the comment is quite useless, just repeating what the code does.
Please rephrase it
as

  /* This is the last point we can lower alignment so give the target the
     chance to do so.  */

and remove the vertical space after the comment.

OK with those changes.
Richard.

> > > > > On Tue, Jul 14, 2020 at 8:37 AM Sunil Pandey <skpg...@gmail.com> 
> > > > > wrote:
> > > > >>
> > > > >> On Sat, Jul 4, 2020 at 9:11 AM Richard Biener
> > > > >> <richard.guent...@gmail.com> wrote:
> > > > >> >
> > > > >> > On July 3, 2020 11:16:46 PM GMT+02:00, Jason Merrill 
> > > > >> > <ja...@redhat.com> wrote:
> > > > >> > >On 6/29/20 5:00 AM, Richard Biener wrote:
> > > > >> > >> On Fri, Jun 26, 2020 at 10:11 PM H.J. Lu <hjl.to...@gmail.com> 
> > > > >> > >> wrote:
> > > > >> > >>>
> > > > >> > >>> On Thu, Jun 25, 2020 at 1:10 AM Richard Biener
> > > > >> > >>> <richard.guent...@gmail.com> wrote:
> > > > >> > >>>>
> > > > >> > >>>> On Thu, Jun 25, 2020 at 2:53 AM Sunil Pandey 
> > > > >> > >>>> <skpg...@gmail.com>
> > > > >> > >wrote:
> > > > >> > >>>>>
> > > > >> > >>>>> On Wed, Jun 24, 2020 at 12:30 AM Richard Biener
> > > > >> > >>>>> <richard.guent...@gmail.com> wrote:
> > > > >> > >>>>>>
> > > > >> > >>>>>> On Tue, Jun 23, 2020 at 5:31 PM Sunil K Pandey via 
> > > > >> > >>>>>> Gcc-patches
> > > > >> > >>>>>> <gcc-patches@gcc.gnu.org> wrote:
> > > > >> > >>>>>>>
> > > > >> > >>>>>>> From: Sunil K Pandey <skpg...@gmail.com>
> > > > >> > >>>>>>>
> > > > >> > >>>>>>> Default for this hook is NOP. For x86, in 32 bit mode, 
> > > > >> > >>>>>>> this hook
> > > > >> > >>>>>>> sets alignment of long long on stack to 32 bits if 
> > > > >> > >>>>>>> preferred
> > > > >> > >stack
> > > > >> > >>>>>>> boundary is 32 bits.
> > > > >> > >>>>>>>
> > > > >> > >>>>>>>   - This patch fixes
> > > > >> > >>>>>>>          gcc.target/i386/pr69454-2.c
> > > > >> > >>>>>>>          gcc.target/i386/stackalign/longlong-1.c
> > > > >> > >>>>>>>   - Regression test on x86-64, no new fail introduced.
> > > > >> > >>>>>>
> > > > >> > >>>>>> I think the name is badly chosen,
> > > > >> > >TARGET_LOWER_LOCAL_DECL_ALIGNMENT
> > > > >> > >>>>>
> > > > >> > >>>>> Yes, I can change the target hook name.
> > > > >> > >>>>>
> > > > >> > >>>>>> would be better suited (and then asks for 
> > > > >> > >>>>>> LOCAL_DECL_ALIGNMENT to
> > > > >> > >be
> > > > >> > >>>>>> renamed to INCREASE_LOCAL_DECL_ALIGNMENT).
> > > > >> > >>>>>
> > > > >> > >>>>> It seems like LOCAL_DECL_ALIGNMENT macro documentation is
> > > > >> > >incorrect.
> > > > >> > >>>>> It increases as well as decreases alignment based on
> > > > >> > >condition(-m32
> > > > >> > >>>>> -mpreferred-stack-boundary=2)
> > > > >> > >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95885
> > > > >> > >>>>>
> > > > >> > >>>>>>
> > > > >> > >>>>>> You're calling it from do_type_align which IMHO is dangerous
> > > > >> > >since that's
> > > > >> > >>>>>> invoked from FIELD_DECL layout as well.  Instead invoke it 
> > > > >> > >>>>>> from
> > > > >> > >>>>>> layout_decl itself where we do
> > > > >> > >>>>>>
> > > > >> > >>>>>>    if (code != FIELD_DECL)
> > > > >> > >>>>>>      /* For non-fields, update the alignment from the type. 
> > > > >> > >>>>>>  */
> > > > >> > >>>>>>      do_type_align (type, decl);
> > > > >> > >>>>>>
> > > > >> > >>>>>> and invoke the hook _after_ do_type_align.  Also avoid
> > > > >> > >>>>>> invoking the hook on globals or hard regs and only
> > > > >> > >>>>>> invoke it on VAR_DECLs, thus only
> > > > >> > >>>>>>
> > > > >> > >>>>>>    if (VAR_P (decl) && !is_global_var (decl) &&
> > > > >> > >!DECL_HARD_REGISTER (decl))
> > > > >> > >>>>>
> > > > >> > >>>>> It seems like decl property is not fully populated at this 
> > > > >> > >>>>> point
> > > > >> > >call
> > > > >> > >>>>> to is_global_var (decl) on global variable return false.
> > > > >> > >>>>>
> > > > >> > >>>>> $ cat foo.c
> > > > >> > >>>>> long long x;
> > > > >> > >>>>> int main()
> > > > >> > >>>>> {
> > > > >> > >>>>> if (__alignof__(x) != 8)
> > > > >> > >>>>>    __builtin_abort();
> > > > >> > >>>>> return 0;
> > > > >> > >>>>> }
> > > > >> > >>>>>
> > > > >> > >>>>> Breakpoint 1, layout_decl (decl=0x7ffff7ffbb40, 
> > > > >> > >>>>> known_align=0)
> > > > >> > >>>>>      at 
> > > > >> > >>>>> /local/skpandey/gccwork/gccwork/gcc/gcc/stor-layout.c:674
> > > > >> > >>>>> 674     do_type_align (type, decl);
> > > > >> > >>>>> Missing separate debuginfos, use: dnf debuginfo-install
> > > > >> > >>>>> gmp-6.1.2-10.fc31.x86_64 isl-0.16.1-9.fc31.x86_64
> > > > >> > >>>>> libmpc-1.1.0-4.fc31.x86_64 mpfr-3.1.6-5.fc31.x86_64
> > > > >> > >>>>> zlib-1.2.11-20.fc31.x86_64
> > > > >> > >>>>> (gdb) call debug_tree(decl)
> > > > >> > >>>>>   <var_decl 0x7ffff7ffbb40 x
> > > > >> > >>>>>      type <integer_type 0x7fffea801888 long long int DI
> > > > >> > >>>>>          size <integer_cst 0x7fffea7e8d38 constant 64>
> > > > >> > >>>>>          unit-size <integer_cst 0x7fffea7e8d50 constant 8>
> > > > >> > >>>>>          align:64 warn_if_not_align:0 symtab:0 alias-set -1
> > > > >> > >>>>> canonical-type 0x7fffea801888 precision:64 min <integer_cst
> > > > >> > >>>>> 0x7fffea7e8fd8 -9223372036854775808> max <integer_cst
> > > > >> > >0x7fffea806000
> > > > >> > >>>>> 9223372036854775807>
> > > > >> > >>>>>          pointer_to_this <pointer_type 0x7fffea8110a8>>
> > > > >> > >>>>>      DI foo.c:1:11 size <integer_cst 0x7fffea7e8d38 64> 
> > > > >> > >>>>> unit-size
> > > > >> > >>>>> <integer_cst 0x7fffea7e8d50 8>
> > > > >> > >>>>>      align:1 warn_if_not_align:0>
> > > > >> > >>>>>
> > > > >> > >>>>> (gdb) p is_global_var(decl)
> > > > >> > >>>>> $1 = false
> > > > >> > >>>>> (gdb)
> > > > >> > >>>>>
> > > > >> > >>>>>
> > > > >> > >>>>> What about calling hook here
> > > > >> > >>>>>
> > > > >> > >>>>>   603 do_type_align (tree type, tree decl)
> > > > >> > >>>>>   604 {
> > > > >> > >>>>>   605   if (TYPE_ALIGN (type) > DECL_ALIGN (decl))
> > > > >> > >>>>>   606     {
> > > > >> > >>>>>   607       SET_DECL_ALIGN (decl, TYPE_ALIGN (type));
> > > > >> > >>>>>   608       if (TREE_CODE (decl) == FIELD_DECL)
> > > > >> > >>>>>   609         DECL_USER_ALIGN (decl) = TYPE_USER_ALIGN 
> > > > >> > >>>>> (type);
> > > > >> > >>>>>   610       else
> > > > >> > >>>>>   611         /* Lower local decl alignment */
> > > > >> > >>>>>   612         if (VAR_P (decl)
> > > > >> > >>>>>   613             && !is_global_var (decl)
> > > > >> > >>>>>   614             && !DECL_HARD_REGISTER (decl)
> > > > >> > >>>>>   615             && cfun != NULL)
> > > > >> > >>>>>   616           targetm.lower_local_decl_alignment (decl);
> > > > >> > >>>>>   617     }
> > > > >> > >>>>
> > > > >> > >>>> But that doesn't change anything (obviously).  layout_decl
> > > > >> > >>>> is called quite early, too early it looks like.
> > > > >> > >>>>
> > > > >> > >>>> Now there doesn't seem to be any other good place where
> > > > >> > >>>> we are sure to catch the decl before we evaluate things
> > > > >> > >>>> like __alignof__
> > > > >> > >>>>
> > > > >> > >>>> void __attribute__((noipa))
> > > > >> > >>>> foo (__SIZE_TYPE__ align, long long *p)
> > > > >> > >>>> {
> > > > >> > >>>>    if ((__SIZE_TYPE__)p & (align-1))
> > > > >> > >>>>      __builtin_abort ();
> > > > >> > >>>> }
> > > > >> > >>>> int main()
> > > > >> > >>>> {
> > > > >> > >>>>    long long y;
> > > > >> > >>>>    foo (_Alignof y, &y);
> > > > >> > >>>>    return 0;
> > > > >> > >>>> }
> > > > >> > >>>>
> > > > >> > >>>> Joseph/Jason - do you have a good recommendation
> > > > >> > >>>> how to deal with targets where natural alignment
> > > > >> > >>>> is supposed to be lowered for optimization purposes?
> > > > >> > >>>> (this case is for i?86 to avoid dynamic stack re-alignment
> > > > >> > >>>> to align long long to 8 bytes with 
> > > > >> > >>>> -mpreferred-stack-boundary=2)
> > > > >> > >>>>
> > > > >> > >>>> I note that for -mincoming-stack-boundary=2 we do perform
> > > > >> > >>>> dynamic stack re-alignment already.
> > > > >> > >>>>
> > > > >> > >>>> I can't find a suitable existing target macro/hook for this,
> > > > >> > >>>> but my gut feeling is that the default alignment should
> > > > >> > >>>> instead be the lower one and instead the alignment for
> > > > >> > >>>> globals should be raised as optimization?
> > > > >> > >>>>
> > > > >> > >>>
> > > > >> > >>> Here is the updated patch from Sunil.
> > > > >> > >>
> > > > >> > >> It does not address the fundamental issue that during
> > > > >> > >> do_type_align the is_global_var predicate is not
> > > > >> > >> reliable.  This means that for
> > > > >> > >>
> > > > >> > >> int main()
> > > > >> > >> {
> > > > >> > >>    extern long z;
> > > > >> > >> }
> > > > >> > >>
> > > > >> > >> the new hook (with -m32 -mpreferred-stack-boundary=2)
> > > > >> > >> will lower the alignment of 'z' which looks wrong.  During
> > > > >> > >> layout_decl we can unfortunately not distinguish between
> > > > >> > >> locals and globals.  We need to find another spot to adjust
> > > > >> > >> alignment of locals.  For C that might be in finish_decl,
> > > > >> > >> for C++ there's probably another suitable place.
> > > > >> > >
> > > > >> > >cp_finish_decl could work, but layout_decl seems like the right 
> > > > >> > >spot;
> > > > >> > >if
> > > > >> > >the problem is that the appropriate flags currently aren't being 
> > > > >> > >set in
> > > > >> > >
> > > > >> > >time, can't we fix that?
> > > > >> >
> > > > >> > The first and usually only call to layout_decl is from build_decl 
> > > > >> > which gets nothing more than the type... But yes, I also initially 
> > > > >> > thought that's the correct spot but it turns out it isn't.
> > > > >>
> > > > >> I added a new function lower_decl_alignment and invoked from
> > > > >> cp_decl_finish/decl_finish. Attached is revised patch.
> > > > >> >
> > > > >> > >> Note it needs to be a place before the frontends possibly
> > > > >> > >> inspect the alignment of the decl
> > > > >> > >> In C++ constexpr evalualtion might also expose alignment
> > > > >> > >> "early" so we really need a frontend solution here.
> > > > >> > >
> > > > >> > >Yes, we need to know the alignment right after the declaration.
> > > > >> > >
> > > > >> > >Jason
> > > > >> >

Reply via email to