Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

Jeff Law Sat, 23 Dec 2023 21:27:29 -0800



On 12/23/23 15:46, YunQiang Su wrote:

Jeff Law <jeffreya...@gmail.com> wrote on Sun, Dec 24, 2023 at 00:51:




On 12/23/23 01:58, YunQiang Su wrote:

On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true,
if bit 31 or any higher bit is polluted by a bit operation, we need a
truncate. Let's emit one that uses the same hard register as both input
and output; the RTL may look like:

(insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
          (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
       (nil))

We use the /s/u flags to mark the truncate as really needed:
combine_simplify_rtx may otherwise consider the value already truncated,
so let's skip this combination.

gcc/ChangeLog:
          PR: 104914.
          * combine.cc (try_combine): Skip combine with truncate if
       dest is subreg and has /u/s flags on platforms
       TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true.
       * expr.cc (expand_assignment): Emit a truncate insn, if
       31+ bits is polluted for SImode.

gcc/testsuite/ChangeLog:
       PR: 104914.
       * gcc.target/mips/pr104914.c: New testcase.

I would suggest you show the RTL before/after whatever transformation
has caused problems on your target and explain why you think the
transformation is incorrect.


Before this patch, the RTL looks like this:
      (insn 19 18 20 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
                (const_int 8 [0x8])
                (const_int 24 [0x18]))
            (subreg:DI (reg:QI 205) 0)) "../xx.c":7:29 -1
         (nil))
       (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
            (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "../xx.c":7:29 -1
        (nil))
      (jump_insn 23 20 24 2 (set (pc)
            (if_then_else (lt (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
                    (const_int 0 [0]))
                 (label_ref 32)
                (pc))) "../xx.c":10:5 -1
        (int_list:REG_BR_PROB 440234148 (nil))
       -> 32)

Then, in the combine pass,
       (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
              (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "../xx.c":7:29 -1
       (nil))
is converted to
           (note 20 19 23 2 NOTE_INSN_DELETED)
MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true
on the basis that the hard register is always sign-extended, but here
the hard register has been polluted by the zero_extract.

If we just patch combine.cc so that it doesn't eat the sign_extend here,
the sign_extend will still disappear in later passes, because MIPS defines
sign_extend as "emit_note (NOTE_INSN_DELETED)".

So I tried to insert a new truncate RTX here,
     (insn 21 20 24 2 (set (reg/v:DI 200 [ val ])
              (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
          (nil))
This is the RTL for this C code
      int32_t fun (int64_t arg) {
           int32_t a = (int32_t) arg;
           return a;
      }
But the `reload` pass then hits an ICE. I haven't dug into the real problem.
If the new RTX is instead
     (insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
            (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
        (nil))
the `reload` pass happily accepts it, and it is then converted to the
      # this instruction makes sure the register is properly sign-extended.
      `sll $rN, $rN, 0`
hardware instruction.

The problem is that simplify-rtx (called by combine) will believe that
REG 200 has already been truncated to SImode, because the dest is a
subreg:SI.

So I use the /s/u flags to tell combine not to do so.

Focus on the RTL semantics as well as the target specific semantics
because both are critically important here.

I strongly suspect you're just papering over a problem elsewhere.


Yes, I suspect so too.  Any new ideas?

Well, I see multiple intertwined issues and I think MIPS has largely mucked this up.

At a high level DI -> SI truncation is not a nop on MIPS64. We must explicitly sign extend the value from SI->DI to preserve the invariant that SI mode objects are extended to DImode. If we fail to do that, then the SImode conditional branch patterns simply aren't going to work.

What doesn't make sense to me is that for truncation, the output mode is going to be smaller than the input mode. Which makes logical sense and is codified in the documentation:

@deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (poly_uint64 
@var{outprec}, poly_uint64 @var{inprec})
This hook returns true if it is safe to ``convert'' a value of
@var{inprec} bits to one of @var{outprec} bits (where @var{outprec} is
smaller than @var{inprec}) by merely operating on it as if it had only
@var{outprec} bits.  The default returns true unconditionally, which
is correct for most machines.  When @code{TARGET_TRULY_NOOP_TRUNCATION}
returns false, the machine description should provide a @code{trunc}
optab to specify the RTL that performs the required truncation.



Yet the implementation in the mips backend:

static bool
mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
{
  return !TARGET_64BIT || inprec <= 32 || outprec > 32;
}

Can you verify what values are getting in here? If we're being called with inprec as 32 and outprec as 64, we're going to return true which makes absolutely no sense at all.


Jeff
