"H.J. Lu" <hjl.to...@gmail.com> writes:
> On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>> >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
>> >> <gcc-patches@gcc.gnu.org> wrote:
>> >> >
>> >> > When recording store for RTL dead store elimination, check if the source
>> >> > register is set only once to a constant.  If yes, record the constant
>> >> > as the store source.  It eliminates unrolled zero stores after memset 0
>> >> > in a loop where a vector register is used as the zero store source.
>> >> >
>> >> > gcc/
>> >> >
>> >> >         PR rtl-optimization/105638
>> >> >         * dse.cc (record_store): Use the constant source if the source
>> >> >         register is set only once.
>> >> >
>> >> > gcc/testsuite/
>> >> >
>> >> >         PR rtl-optimization/105638
>> >> >         * g++.target/i386/pr105638.C: New test.
>> >> > ---
>> >> >  gcc/dse.cc                               | 19 ++++++++++
>> >> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>> >> >  2 files changed, 63 insertions(+)
>> >> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>> >> >
>> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc
>> >> > index 30c11cee034..0433dd3d846 100644
>> >> > --- a/gcc/dse.cc
>> >> > +++ b/gcc/dse.cc
>> >> > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info)
>> >> >
>> >> >           if (tem && CONSTANT_P (tem))
>> >> >             const_rhs = tem;
>> >> > +         else
>> >> > +           {
>> >> > +             /* If RHS is set only once to a constant, set CONST_RHS
>> >> > +                to the constant.  */
>> >> > +             df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>> >> > +             if (def != nullptr
>> >> > +                 && !DF_REF_IS_ARTIFICIAL (def)
>> >> > +                 && !DF_REF_NEXT_REG (def))
>> >> > +               {
>> >> > +                 rtx_insn *def_insn = DF_REF_INSN (def);
>> >> > +                 rtx def_body = PATTERN (def_insn);
>> >> > +                 if (GET_CODE (def_body) == SET)
>> >> > +                   {
>> >> > +                     rtx def_src = SET_SRC (def_body);
>> >> > +                     if (CONSTANT_P (def_src))
>> >> > +                       const_rhs = def_src;
>> >>
>> >> doesn't DSE have its own tracking of stored values?  Shouldn't we
>> >
>> > It tracks stored values only within the basic block.  When RTL loop
>> > invariant motion hoists a constant initialization out of the loop into
>> > a separate basic block, the constant store value becomes unknown
>> > within the original basic block.
>> >
>> >> improve _that_ if it is not enough?  I also wonder if you need to
>> >
>> > My patch extends DSE stored value tracking to include the constant which
>> > is set only once in another basic block.
>> >
>> >> verify the SET isn't partial?
>> >>
>> >
>> > Here is the v2 patch to check that the constant is set by a non-partial
>> > unconditional load.
>> >
>> > OK for master?
>> >
>> > Thanks.
>> >
>> > H.J.
>> > ---
>> > RTL DSE tracks redundant constant stores within a basic block.  When RTL
>> > loop invariant motion hoists a constant initialization out of the loop
>> > into a separate basic block, the constant store value becomes unknown
>> > within the original basic block.  When recording store for RTL DSE, check
>> > if the source register is set only once to a constant by a non-partial
>> > unconditional load.  If yes, record the constant as the constant store
>> > source.  It eliminates unrolled zero stores after memset 0 in a loop
>> > where a vector register is used as the zero store source.
>> >
>> > gcc/
>> >
>> >       PR rtl-optimization/105638
>> >       * dse.cc (record_store): Use the constant source if the source
>> >       register is set only once.
>> >
>> > gcc/testsuite/
>> >
>> >       PR rtl-optimization/105638
>> >       * g++.target/i386/pr105638.C: New test.
>> > ---
>> >  gcc/dse.cc                               | 22 ++++++++++++
>> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>> >  2 files changed, 66 insertions(+)
>> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>> >
>> > diff --git a/gcc/dse.cc b/gcc/dse.cc
>> > index 30c11cee034..af8e88dac32 100644
>> > --- a/gcc/dse.cc
>> > +++ b/gcc/dse.cc
>> > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info)
>> >
>> >         if (tem && CONSTANT_P (tem))
>> >           const_rhs = tem;
>> > +       else
>> > +         {
>> > +           /* If RHS is set only once to a constant, set CONST_RHS
>> > +              to the constant.  */
>> > +           df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>> > +           if (def != nullptr
>> > +               && !DF_REF_IS_ARTIFICIAL (def)
>> > +               && !(DF_REF_FLAGS (def)
>> > +                    & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
>> > +               && !DF_REF_NEXT_REG (def))
>>
>> Can we introduce a helper for this?  There are already similar tests
>> in ira and loop-iv, and it seems a bit too complex to have to open-code
>> each time.
>
> I can use find_single_def_src in loop-iv.cc:
>
> /* If REGNO has a single definition, return its known value, otherwise return
>    null.  */
>
> rtx
> find_single_def_src (unsigned int regno)

Yeah, reusing that sounds good.  Perhaps we should move it into df-core.cc,
alongside the df_reg_used group of functions.

I think the mode check in your original patch should be kept though,
so how about we change the parameter to an rtx reg and use rtx_equal in:

      rtx set = single_set (DF_REF_INSN (adef));
      if (set == NULL || !REG_P (SET_DEST (set))
          || REGNO (SET_DEST (set)) != regno)
        return NULL_RTX;

rather than the existing !REG_P and REGNO checks.  We should also add:

>
> and do
>
>              /* If RHS is set only once to a constant, set CONST_RHS
>                  to the constant.  */
>               rtx def_src = find_single_def_src (REGNO (rhs));
>               if (def_src != nullptr && CONSTANT_P (def_src))
>                 {
>                   df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>                   if (!(DF_REF_FLAGS (def)
>                         & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))

…this check to the function, since it's needed for correctness even
in the loop-iv.cc use.

Thanks,
Richard

>                   {
>                       rtx_insn *def_insn = DF_REF_INSN (def);
>                       rtx def_body = single_set (def_insn);
>                       if (rhs == SET_DEST (def_body))
>                         const_rhs = def_src;
>                     }
>                 }
>             }
>
> Will it work?
>
> Thanks.
>
>> Thanks,
>> Richard
>>
>> > +             {
>> > +               rtx_insn *def_insn = DF_REF_INSN (def);
>> > +               rtx def_body = PATTERN (def_insn);
>> > +               if (GET_CODE (def_body) == SET)
>> > +                 {
>> > +                   rtx def_src = SET_SRC (def_body);
>> > +                   if (CONSTANT_P (def_src)
>> > +                       && GET_MODE (def_src) == GET_MODE (rhs))
>> > +                     const_rhs = def_src;
>> > +                 }
>> > +             }
>> > +         }
>> >       }
>> >      }
>> >
>> > diff --git a/gcc/testsuite/g++.target/i386/pr105638.C 
>> > b/gcc/testsuite/g++.target/i386/pr105638.C
>> > new file mode 100644
>> > index 00000000000..ff40a459de1
>> > --- /dev/null
>> > +++ b/gcc/testsuite/g++.target/i386/pr105638.C
>> > @@ -0,0 +1,44 @@
>> > +/* { dg-do compile { target { ! ia32 } } } */
>> > +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */
>> > +/* { dg-final { scan-assembler-not "vpxor" } } */
>> > +
>> > +#include <stdint.h>
>> > +#include <vector>
>> > +#include <tr1/array>
>> > +
>> > +class FastBoard {
>> > +public:
>> > +    typedef std::pair<int, int> movescore_t;
>> > +    typedef std::tr1::array<movescore_t, 24> scoredlist_t;
>> > +
>> > +protected:
>> > +    std::vector<int> m_critical;
>> > +
>> > +    int m_boardsize;
>> > +};
>> > +
>> > +class FastState {
>> > +public:
>> > +    FastBoard board;
>> > +
>> > +    int movenum;
>> > +protected:
>> > +    FastBoard::scoredlist_t scoredmoves;
>> > +};
>> > +
>> > +class KoState : public FastState {
>> > +private:
>> > +    std::vector<uint64_t> ko_hash_history;
>> > +    std::vector<uint64_t> hash_history;
>> > +};
>> > +
>> > +class GameState : public KoState {
>> > +public:
>> > +    void foo ();
>> > +private:
>> > +    std::vector<KoState> game_history;
>> > +};
>> > +
>> > +void GameState::foo() {
>> > +    game_history.resize(movenum);
>> > +}

Reply via email to