On Wed, Aug 19, 2020 at 4:17 AM Hongtao Liu <crazy...@gmail.com> wrote: > > On Wed, Aug 19, 2020 at 10:17 AM Hongtao Liu <crazy...@gmail.com> wrote: > > > > On Mon, Aug 17, 2020 at 5:34 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > > > On Fri, Aug 14, 2020 at 10:24 AM Hongtao Liu <crazy...@gmail.com> wrote: > > > > > > > > Enable direct move between masks and gprs in pass_reload with > > > > consideration of cost model. > > > > > > > > Changelog > > > > gcc/ > > > > * config/i386/i386.c (inline_secondary_memory_needed): > > > > No memory is needed between mask regs and gpr. > > > > (ix86_hard_regno_mode_ok): Add condition TARGET_AVX512F for > > > > mask regno. > > > > * config/i386/i386.h (enum reg_class): Add INT_MASK_REGS. > > > > (REG_CLASS_NAMES): Ditto. > > > > (REG_CLASS_CONTENTS): Ditto. > > > > * config/i386/i386.md: Exclude mask register in > > > > define_peephole2 which is available only for gpr. > > > > > > > > gcc/testsuites/ > > > > * gcc.target/i386/pr71453-1.c: New tests. > > > > * gcc.target/i386/pr71453-2.c: Ditto. > > > > * gcc.target/i386/pr71453-3.c: Ditto. > > > > * gcc.target/i386/pr71453-4.c: Ditto. > > > > > > @@ -18571,9 +18571,7 @@ inline_secondary_memory_needed (machine_mode > > > mode, reg_class_t class1, > > > || MAYBE_SSE_CLASS_P (class1) != SSE_CLASS_P (class1) > > > || MAYBE_SSE_CLASS_P (class2) != SSE_CLASS_P (class2) > > > || MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1) > > > - || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2) > > > - || MAYBE_MASK_CLASS_P (class1) != MASK_CLASS_P (class1) > > > - || MAYBE_MASK_CLASS_P (class2) != MASK_CLASS_P (class2)) > > > + || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2)) > > > { > > > gcc_assert (!strict || lra_in_progress); > > > return true; > > > > > > No, this is still needed, the reason is explained in the comment above > > > > Remove my change here. > > > > > inline_secondary_memory_needed: > > > > > > The function can't work reliably when one of the CLASSES is a class > > > containing registers from multiple sets. We avoid this by never > > > combining > > > different sets in a single alternative in the machine description. > > > Ensure that this constraint holds to avoid unexpected surprises. > > > > > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > > > index b24a4557871..74d207c3711 100644 > > > --- a/gcc/config/i386/i386.md > > > +++ b/gcc/config/i386/i386.md > > > @@ -15051,7 +15051,7 @@ > > > (parallel [(set (reg:CC FLAGS_REG) > > > (unspec:CC [(match_dup 0)] UNSPEC_PARITY)) > > > (clobber (match_dup 0))])] > > > - "" > > > + "!MASK_REGNO_P (REGNO (operands[0]))" > > > [(set (reg:CC FLAGS_REG) > > > (unspec:CC [(match_dup 1)] UNSPEC_PARITY))]) > > > > > > @@ -15072,6 +15072,7 @@ > > > (label_ref (match_operand 5)) > > > (pc)))] > > > "REGNO (operands[2]) == REGNO (operands[3]) > > > + && !MASK_REGNO_P (REGNO (operands[1])) > > > && peep2_reg_dead_p (3, operands[0]) > > > && peep2_reg_dead_p (3, operands[2]) > > > && peep2_regno_dead_p (4, FLAGS_REG)" > > > > > > > Changed for upper two define_peepholes. > > > > > Actually, there are several (historic?) peephole2 patterns that assume > > > > I didn't find those patterns. > > > > I looked through i386.md, there are 3 cases > > 1. Since mask registers are only used for "mov/zero_extend/bitwise" > > patterns, peephole2 patterns involving only other patterns are safe to > > use general_registers. > > 2. For those peephole2 patterns containing both > > mov/zero_extend/bitwise and other patterns, implicit conditions such > > as match_dup used in other patterns will make them the same as case 1. > > 3. Peephole2 patterns are safe to use mask registers, since they would > > be eliminated in output patterns. > > > > > register_operand means only integer registers. Just change > > > register_operand to general_reg_operand and eventually > > > nonimmediate_operand to nonimmediate_gr_operand. Do not put additional > > > predicates into insn predicate. > > > > > > > > > > > > Uros. > > > > Update patch. > > > > -- > > BR, > > Hongtao > > This patch.
OK, modulo: +/* { dg-final { scan-assembler-not "%xmm" } } */ It is not clear to me what the testcase is testing here. The scan string is probably too broad and can generate false positives. Is it possible to refine the scans? Uros.