On Tue, Mar 1, 2022 at 5:23 PM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Wed, Mar 2, 2022 at 6:49 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Tue, Mar 1, 2022 at 7:06 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Mon, Feb 28, 2022 at 9:36 PM Hongtao Liu <crazy...@gmail.com> wrote:
> > > >
> > > > On Tue, Mar 1, 2022 at 10:39 AM H.J. Lu via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > On Mon, Feb 28, 2022 at 6:26 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > >
> > > > > > > On Mon, Feb 28, 2022 at 6:03 PM liuhongt <hongtao....@intel.com> wrote:
> > > > > > >
> > > > > > > > .. in ix86_expand_vector_move and
> > > > > > > > ix86_convert_const_wide_int_to_broadcast (called by the former).
> > > > > > > >
> > > > > > > > ix86_expand_vector_move is called by emit_move_insn, which is used by
> > > > > > > > many pre-reload passes; ix86_gen_scratch_sse_rtx will break data flow
> > > > > > > > when there's explicit usage of xmm7/xmm15/xmm31.
> > > > > > >
> > > > > > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,},
> > > > > > > > both with and without --with-cpu=native --with-arch=native.
> > > > > > >
> > > > > > > Ok for trunk?
> > > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > >         PR target/104704
> > > > > > >         * config/i386/i386-expand.cc
> > > > > > >         (ix86_convert_const_wide_int_to_broadcast): Replace
> > > > > > >         ix86_gen_scratch_sse_rtx with gen_reg_rtx.
> > > > > > >         (ix86_expand_vector_move): Ditto.
> > > > > > > >         * config/i386/sse.md (*vec_dupv4si): Add alternative $r and
> > > > > > > >         corresponding splitter after it.
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > > >         * gcc.target/i386/incoming-11.c: Revert
> > > > > > > >         r12-2665-g7f4c3943f795fd.
> > > > > > > >         * gcc.target/i386/pr100865-11b.c: Expect vmovdqa or
> > > > > > > >         vmovdqa64.
> > > > > > >         * gcc.target/i386/pr100865-12b.c: Ditto.
> > > > > > >         * gcc.target/i386/pr100865-8b.c: Ditto.
> > > > > > >         * gcc.target/i386/pr100865-9b.c: Ditto.
> > > > > > > >         * gcc.target/i386/pr82941-1.c: Expect vzeroupper for ! ia32.
> > > > > > >         * gcc.target/i386/pr82942-1.c: Ditto.
> > > > > > >         * gcc.target/i386/pr82990-1.c: Ditto.
> > > > > > >         * gcc.target/i386/pr82990-3.c: Ditto.
> > > > > > >         * gcc.target/i386/pr82990-5.c: Ditto.
> > > > > > > ---
> > > > > > >  gcc/config/i386/i386-expand.cc               |  6 +--
> > > > > > > >  gcc/config/i386/sse.md                       | 41 +++++++++++++++-----
> > > > > > >  gcc/testsuite/gcc.target/i386/incoming-11.c  |  2 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr100865-11b.c |  2 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr100865-12b.c |  2 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr100865-8b.c  |  2 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr100865-9b.c  |  2 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr82941-1.c    |  3 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr82942-1.c    |  3 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr82990-1.c    |  3 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr82990-3.c    |  3 +-
> > > > > > >  gcc/testsuite/gcc.target/i386/pr82990-5.c    |  3 +-
> > > > > > >  12 files changed, 45 insertions(+), 27 deletions(-)
> > > > > > >
> > > > > > > > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> > > > > > > index faa0191c6dd..75a28cdd89d 100644
> > > > > > > --- a/gcc/config/i386/i386-expand.cc
> > > > > > > +++ b/gcc/config/i386/i386-expand.cc
> > > > > > > @@ -257,7 +257,7 @@ ix86_convert_const_wide_int_to_broadcast 
> > > > > > > (machine_mode mode, rtx op)
> > > > > > >    machine_mode vector_mode;
> > > > > > > >    if (!mode_for_vector (broadcast_mode, nunits).exists (&vector_mode))
> > > > > > >      gcc_unreachable ();
> > > > > > > -  rtx target = ix86_gen_scratch_sse_rtx (vector_mode);
> > > > > > > +  rtx target = gen_reg_rtx (vector_mode);
> > > > > >
> > > > > > I think ix86_gen_scratch_sse_rtx should check
> > > > > > currently_expanding_gimple_stmt == NULL
> > > > > > to return gen_reg_rtx (vector_mode) instead.
> > > > >
> > > > > Like this:
> > > > >
> > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > > index b2bf90576d5..6c0e4929914 100644
> > > > > --- a/gcc/config/i386/i386.cc
> > > > > +++ b/gcc/config/i386/i386.cc
> > > > > @@ -23786,7 +23786,7 @@ ix86_optab_supported_p (int op, machine_mode
> > > > > mode1, machine_mode,
> > > > >  rtx
> > > > >  ix86_gen_scratch_sse_rtx (machine_mode mode)
> > > > >  {
> > > > > -  if (TARGET_SSE && !lra_in_progress)
> > > > > +  if (TARGET_SSE && currently_expanding_gimple_stmt)
> > > > >      {
> > > > >        unsigned int regno;
> > > > >        if (TARGET_64BIT)
> > > > > (END)
> > > > Looks like it relies on PR104721.
> > >
> > > I have checked the fix for PR104721.
> > >
> >
> > The proposed patch doesn't fix the testcase in:
> >
> The original patch does fix it, so I prefer my patch to the
> currently_expanding_gimple_stmt check.
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704
> >
> > I am testing:
> >
> > https://gitlab.com/x86-gcc/gcc/-/merge_requests/28
> >
> > --
> > H.J.

There are 2 kinds of issues in

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

1.

__m512d y, z;

int i;

int
do_test (void)
{
  register int xmm31 __asm ("xmm31") = i;
  asm volatile ("" : "+v" (xmm31));
  z = y;
  register int xmm2 __asm ("xmm2") = xmm31;
  asm volatile ("" : "+v" (xmm2));
  return xmm2;
}

2.

char z[128];

int i;

__attribute__((noipa))
int
do_test (void)
{
  register int xmm31 __asm ("xmm31") = i;
  asm volatile ("" : "+v" (xmm31));
  __builtin_memset (&z, 0, sizeof (z));
  register int xmm2 __asm ("xmm2") = xmm31;
  asm volatile ("" : "+v" (xmm2));
  return xmm2;
}

Your patch fixes #1.  I don't think it fixes #2.

-- 
H.J.
