> From: Uros Bizjak <ubiz...@gmail.com>
> Sent: 13 July 2023 19:21
> 
> On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle <ro...@nextmovesoftware.com>
> wrote:
> >
> > This patch resolves PR target/110588 to catch another case in combine
> > where the i386 backend should be generating a btl instruction.  This
> > adds another define_insn_and_split to recognize the RTL representation
> > for this case.
> >
> > I also noticed that two related define_insn_and_split weren't using
> > the preferred string style for single statement
> > preparation-statements, so I've reformatted these to be consistent in style 
> > with
> the new one.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> >
> > 2023-07-13  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR target/110588
> >         * config/i386/i386.md (*bt<mode>_setcqi): Prefer string form
> >         preparation statement over braces for a single statement.
> >         (*bt<mode>_setncqi): Likewise.
> >         (*bt<mode>_setncqi_2): New define_insn_and_split.
> >
> > gcc/testsuite/ChangeLog
> >         PR target/110588
> >         * gcc.target/i386/pr110588.c: New test case.
> 
> +;; Help combine recognize bt followed by setnc (PR target/110588)
> +(define_insn_and_split "*bt<mode>_setncqi_2"
> +  [(set (match_operand:QI 0 "register_operand")  (eq:QI
> +  (zero_extract:SWI48
> +    (match_operand:SWI48 1 "register_operand")
> +    (const_int 1)
> +    (zero_extend:SI (match_operand:QI 2 "register_operand")))
> +  (const_int 0)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "TARGET_USE_BT && ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(set (reg:CCC FLAGS_REG)
> +        (compare:CCC
> +         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> +         (const_int 0)))
> +   (set (match_dup 0)
> +        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]
> +  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
> 
> I don't think the above transformation is 100% correct, mainly due to the use 
> of
> paradoxical subreg.
> 
> The combined instruction is operating with a zero_extended QImode register, so
> all bits of the register are well defined. You are splitting using 
> paradoxical subreg,
> so you don't know what garbage is there in the highpart of the count register.
> However, BTL/BTQ uses modulo 64 (or 32) of this register, so even with a 
> slightly
> invalid RTX, everything checks out.
> 
> +  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
> 
> You probably need <SWI48>mode instead of SImode here.

The define_insn for *bt is:

(define_insn "*bt<mode>"
  [(set (reg:CCC FLAGS_REG)
        (compare:CCC
          (zero_extract:SWI48
            (match_operand:SWI48 0 "nonimmediate_operand" "r,m")
            (const_int 1)
            (match_operand:SI 1 "nonmemory_operand" "r<S>,<S>"))
          (const_int 0)))]

So <SWI48> isn't appropriate here.

But now you've made me think about it, it's inconsistent that all of the shifts
and rotates in i386.md standardize on QImode for shift counts, but the bit test
instructions use SImode?  I think this explains where the paradoxical SUBREGs
come from, and in theory any_extend from QImode to SImode here could/should 
be handled/unnecessary.

Is it worth investigating a follow-up patch to convert all ZERO_EXTRACTs and
SIGN_EXTRACTs in i386.md to use QImode (instead of SImode)?

Thanks in advance,
Roger
--


Reply via email to