On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy...@gmail.com> wrote:
> >
> > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> wrote on Thu, Aug 31, 2023 at 18:01:
> > >
> > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > > > > From: Kong Lingling <lingling.k...@intel.com>
> > > > >
> > > > > > In inline asm, we do not know if the insn can use EGPR, so disable
> > > > > > EGPR usage by default by mapping the common reg/mem constraints to
> > > > > > non-EGPR constraints. Use a flag mapx-inline-asm-use-gpr32 to enable
> > > > > > EGPR usage for inline asm.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >       * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > > > >       ix86_md_asm_adjust.
> > > > >       (ix86_md_asm_adjust): When APX EGPR enabled without specifying 
> > > > > the
> > > > >       target option, map reg/mem constraints to non-EGPR constraints.
> > > > >       * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >       * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > > > ---
> > > > >  gcc/config/i386/i386.cc                       |  44 +++++++
> > > > >  gcc/config/i386/i386.opt                      |   5 +
> > > > > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++++++++++++++++++
> > > > > >  3 files changed, 156 insertions(+)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > > > >
> > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > > index d26d9ab0d9d..9460ebbfda4 100644
> > > > > --- a/gcc/config/i386/i386.cc
> > > > > +++ b/gcc/config/i386/i386.cc
> > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General 
> > > > > Public License
> > > > >  along with GCC; see the file COPYING3.  If not see
> > > > >  <http://www.gnu.org/licenses/>.  */
> > > > >
> > > > > +#define INCLUDE_STRING
> > > > >  #define IN_TARGET_CODE 1
> > > > >
> > > > >  #include "config.h"
> > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, 
> > > > > vec<rtx> & /*inputs*/,
> > > > >    bool saw_asm_flag = false;
> > > > >
> > > > >    start_sequence ();
> > > > > > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> > > > > > +   constraints, will eventually map all the usable constraints in the future. */
> > > >
> > > > > I think there should be some constraint which explicitly has all the 32
> > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> > > >
> > > > Also, what about the "g" constraint?  Shouldn't there be another for "g"
> > > > without r16..r31?  What about the various other memory
> > > > constraints ("<", "o", ...)?
> > >
> > > I think we should leave all existing constraints as they are, so "r"
> > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > introduce "h" to instructions that have the ability to handle EGPR.
> > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > register class for EVEX SSE registers. This way, asm will be
> > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > constraint "Bt", should allow new registers, and should be added to
> > > the constraint string as a separate constraint, and conditionally
> > > enabled by relevant "isa" (AKA "enabled") attribute.
> >
> > The extended constraint can work for registers, but for memory it is more
> > complicated.
>
> Yes, unfortunately. The compiler assumes that an unchangeable register
> class is used for BASE/INDEX registers. I have hit this limitation
> when trying to implement memory support for instructions involving
> 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> registers, also inside memory operand. (You can see the "hack" in e.g.
> "*extzvqi_mem_rex64" and corresponding peephole2 with the original
> *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> register class is the major limitation in the compiler, so perhaps the
> strategy on how to override this limitation should be discussed with
> the register allocator author first. Perhaps adding an insn attribute
> to insn RTX pattern to specify different BASE/INDEX register sets can
> be a better solution than passing insn RTX to the register allocator.
>
> The above idea still does not solve the asm problem on how to select
> correct BASE/INDEX register set for memory operands.
The current approach disables gpr32 for memory operands in asm_operand
by default, but it can be turned on with the option
-mapx-inline-asm-use-gpr32 (ix86_apx_inline_asm_use_gpr32); users then
need to guarantee that the instruction supports gpr32.
Only ~5% of all instructions don't support gpr32, so the reversed
approach would only get more complicated.
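
To make the default mapping concrete, here is a rough standalone sketch
of the rewrite discussed in this thread ("r" -> "h", "m" -> "Bt",
"Bm" -> "BT"), applied to every alternative rather than only the first
hit. It is an illustration only, not the code in the patch: the function
name map_constraint_no_egpr is made up, it uses std::string instead of
ggc_strdup/auto_vec, and it assumes that only 'B' and 'Y' start
two-letter constraints, where real GCC code would use CONSTRAINT_LEN.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of GCC: rewrite one inline-asm constraint
// string so that it no longer admits r16-r31.
// Assumption: only 'B' and 'Y' start two-letter constraints here; real GCC
// code would query CONSTRAINT_LEN instead of using this shortcut.
static std::string
map_constraint_no_egpr (const std::string &in)
{
  std::string out;
  out.reserve (in.size () * 2);
  for (std::size_t i = 0; i < in.size (); )
    {
      char c = in[i];
      if ((c == 'B' || c == 'Y') && i + 1 < in.size ())
        {
          // Two-letter constraint: only "Bm" is remapped (to "BT");
          // "Br", "Yr" and the rest are copied unchanged.
          if (c == 'B' && in[i + 1] == 'm')
            out += "BT";
          else
            out.append (in, i, 2);
          i += 2;
        }
      else
        {
          if (c == 'r')
            out += 'h';    // register constraint without r16-r31
          else if (c == 'm')
            out += "Bt";   // memory constraint without EGPR base/index
          else
            out += c;      // modifiers (=, +, &), ',' and other letters
          i += 1;
        }
    }
  return out;
}
```

With this sketch, "=r,r,m" becomes "=h,h,Bt", while "Yr" and "Br" are
left alone. The "memory" clobber is deliberately not handled, since it
is not an operand constraint.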

>
> Uros.
> >
> > If we want to use new mem constraints that allow gpr32, then BASE/INDEX
> > reg class still requires per-insn verification, so it means changes
> > on all patterns with vm, and those SSE patterns on opcode map0/1. Also,
> > several legacy insns that are promoted to EVEX encoding space need to be
> > changed. The overall implementation could be 10 times larger than current,
> > which would be quite hard for maintenance.
> >
> > >
> > > Uros.
> > >
> > > > > +  if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32)
> > > > > +    {
> > > > > > +      /* Map "r" constraint in inline asm to "h" that disallows r16-r31
> > > > > > +      and replace only r, exclude Br and Yr.  */
> > > > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > > > +     {
> > > > > +       std::string *s = new std::string (constraints[i]);
> > > >
> > > > Doesn't this leak memory (all the time)?
> > > > > I must say I don't really understand why you need to use std::string here,
> > > > > but certainly it shouldn't leak.
> > > >
> > > > > +       size_t pos = s->find ('r');
> > > > > +       while (pos != std::string::npos)
> > > > > +         {
> > > > > +           if (pos > 0
> > > > > +               && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B'))
> > > > > +             pos = s->find ('r', pos + 1);
> > > > > +           else
> > > > > +             {
> > > > > +               s->replace (pos, 1, "h");
> > > > > +               constraints[i] = (const char*) s->c_str ();
> > > >
> > > > > Formatting (space before *).  The usual way for constraints is
> > > > > ggc_strdup on some string in a buffer.  Also, one could have several
> > > > > copies of r (or m, memory (doesn't that appear just in clobbers?  And
> > > > > that doesn't look like something that should be replaced), Bm), e.g. in
> > > > > various alternatives.  So, you need to change them all, not just the
> > > > > first hit.  "r,r,r,m" and the like.
> > > > > Normally, one would simply walk the constraint string, parsing the
> > > > > special letters (+, =, & etc.) and single letter constraints and 2
> > > > > letter constraints using CONSTRAINT_LEN macro (tons of examples in GCC
> > > > > sources).
> > > > > Either do it in 2 passes, first one counts how long constraint string
> > > > > one will need after the adjustments (and whether to adjust something at
> > > > > all), then if needed XALLOCAVEC it and adjust in there, or say use a
> > > > > auto_vec<char, 32> for it.
> > > >
> > > > > +               break;
> > > > > +             }
> > > > > +         }
> > > > > +     }
> > > > > > +      /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with
> > > > > > +      "Bt/Bt/BT".  */
> > > > > +      for (unsigned i = 0; i < constraints.length (); i++)
> > > > > +     {
> > > > > +       std::string *s = new std::string (constraints[i]);
> > > > > +       size_t pos = s->find ("m");
> > > > > +       size_t pos2 = s->find ("memory");
> > > > > +       if (pos != std::string::npos)
> > > > > +         {
> > > > > +           if (pos > 0 && (s->at (pos - 1) == 'B'))
> > > > > +               s->replace (pos - 1, 2, "BT");
> > > > > +           else if (pos2 != std::string::npos)
> > > > > +               s->replace (pos, 6, "Bt");
> > > > > +           else
> > > > > +               s->replace (pos, 1, "Bt");
> > > >
> > > > Formatting, the s->replace calls are indented too much.
> > > >
> > > >         Jakub
> > > >



-- 
BR,
Hongtao
