https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124892

--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <[email protected]>:

https://gcc.gnu.org/g:6d3dc4e82f24d7d5199f99bb82cfe4888063fb1b

commit r16-8696-g6d3dc4e82f24d7d5199f99bb82cfe4888063fb1b
Author: Jakub Jelinek <[email protected]>
Date:   Thu Apr 16 10:03:21 2026 +0200

    i386: Fix up TARGET_AVOID_FALSE_DEP_FOR_BMI APX NF splitters [PR124892]

    The following testcase is miscompiled because the 3
    TARGET_AVOID_FALSE_DEP_FOR_BMI APX NF splitters use ix86_expand_clear.
    All other uses of ix86_expand_clear are on either splitters where we know
    something clobbers flags register or sets it at the end of pattern (so
    clearly flags register is not live across the pattern) or in
    define_peephole2 where we explicitly check
    peep2_regno_dead_p (?, FLAGS_REG).
    Now, ix86_expand_clear correctly handles the QI/HImode cases by setting
    SImode instead, and decides based on TARGET_USE_MOV0 and/or
    optimize_insn_for_size_p whether to use the xor reg, reg form or
    mov $0, reg.
    Now, for these 3 APX NF splitters there is actually no flags clobber nor
    set in the pattern and because it is a splitter, we don't know if flags
    register is live across it (likely yes, otherwise why the APX NF pattern
    would be used) or not.  So, we can't use ix86_expand_clear which could
    clobber flags.
    As the splitters are only SWI48, we don't have to worry about QI/HImode
    clearing and so IMHO we just want to always emit the mov $0, reg form
    by hand.
    If flags actually isn't live across it, we have
    ;; Attempt to always use XOR for zeroing registers (including FP modes).
    (define_peephole2
      [(set (match_operand 0 "general_reg_operand")
            (match_operand 1 "const0_operand"))]
      "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
       && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
       && peep2_regno_dead_p (0, FLAGS_REG)"
      [(parallel [(set (match_dup 0) (const_int 0))
                  (clobber (reg:CC FLAGS_REG))])]
      "operands[0] = gen_lowpart (word_mode, operands[0]);")
    peephole2 which would turn the mov $0, reg back to xor reg, reg.
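
The reasoning above can be sketched as a small C model. This is only an illustration and not GCC code: the enum and the functions clobbering_clear_form, nf_splitter_form, after_peephole2 and clobbers_flags are hypothetical names, and the xor-versus-mov condition is assumed to match the one in the peephole2 quoted above.

```c
#include <stdbool.h>

/* Hypothetical model of the two ways to zero a register.  */
enum zero_form { ZERO_MOV_IMM, ZERO_XOR };

/* xor reg, reg writes the flags register; mov $0, reg does not.  */
bool
clobbers_flags (enum zero_form form)
{
  return form == ZERO_XOR;
}

/* Assumed model of a flag-clobbering clear: prefer xor unless
   TARGET_USE_MOV0 is set and we are not optimizing for size.  */
enum zero_form
clobbering_clear_form (bool use_mov0, bool optimize_size)
{
  return (!use_mov0 || optimize_size) ? ZERO_XOR : ZERO_MOV_IMM;
}

/* Model of the fixed NF splitters: always the mov $0, reg form,
   because flags liveness is unknown at split time.  */
enum zero_form
nf_splitter_form (void)
{
  return ZERO_MOV_IMM;
}

/* Model of the quoted peephole2: turn mov $0, reg back into xor only
   when the flags register is known to be dead at that point.  */
enum zero_form
after_peephole2 (enum zero_form form, bool flags_dead,
                 bool use_mov0, bool optimize_size)
{
  if (form == ZERO_MOV_IMM && flags_dead && (!use_mov0 || optimize_size))
    return ZERO_XOR;
  return form;
}
```

In this model the splitter output never clobbers flags, and the xor form only reappears through after_peephole2 when flags_dead is true, which is the property the fix relies on.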

    2026-04-16  Jakub Jelinek  <[email protected]>

            PR target/124892
            * config/i386/i386.md (clz<mode>2_lzcnt_nf,
            <lt_zcnt>_<mode>_nf, popcount<mode>2_nf): Emit explicit
            set of (match_dup 0) to (const_int 0) without flags clobber
            instead of using ix86_expand_clear.

            * gcc.target/i386/apx-pr124892.c: New test.

    Reviewed-by: Hongtao Liu <[email protected]>
