On Mon, Jun 21, 2021 at 10:08 AM Hongtao Liu <crazy...@gmail.com> wrote: > > On Mon, Jun 21, 2021 at 3:28 PM Uros Bizjak via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > On Mon, Jun 21, 2021 at 6:56 AM liuhongt <hongtao....@intel.com> wrote: > > > > > > The avx512 supports bitwise operations with mask registers, but the > > > throughput of those instructions is much lower than that of the > > > corresponding gpr version, so we would additionally disparages > > > slightly the mask register alternative for bitwise operations in the > > > LRA. > > > > This was the reason for UNSPEC tagged instructions with mask > > registers, used mainly for builtins. > > > > Also, care should be taken if we want mask registers to be used under > > GPR pressure, or it is better to spill GPR registers. In the past, > > DImode values caused a lot of problems with MMX registers on x86-64, > > but we were able to hand-tune the allocator in the way you propose. > > > > Let's try the proposed approach to see what happens. > > > > > Also when allocano cost of GENERAL_REGS is same as MASK_REGS, allocate > > > MASK_REGS first since it has already been disparaged. > > > > > > gcc/ChangeLog: > > > > > > PR target/101142 > > > * config/i386/i386.md: (*anddi_1): Disparage slightly the mask > > > register alternative. > > > (*and<mode>_1): Ditto. > > > (*andqi_1): Ditto. > > > (*andn<mode>_1): Ditto. > > > (*<code><mode>_1): Ditto. > > > (*<code>qi_1): Ditto. > > > (*one_cmpl<mode>2_1): Ditto. > > > (*one_cmplsi2_1_zext): Ditto. > > > (*one_cmplqi2_1): Ditto. > > > * config/i386/i386.c (x86_order_regs_for_local_alloc): Change > > > the order of mask registers to be before general registers. > > > > > > gcc/testsuite/ChangeLog: > > > > > > PR target/101142 > > > * gcc.target/i386/spill_to_mask-1.c: Adjust testcase. > > > * gcc.target/i386/spill_to_mask-2.c: Adjust testcase. > > > * gcc.target/i386/spill_to_mask-3.c: Adjust testcase. > > > * gcc.target/i386/spill_to_mask-4.c: Adjust testcase. > > > > OK with a comment addition, see inline. > > > > Thanks, > > Uros. > > > > > --- > > > gcc/config/i386/i386.c | 8 +- > > > gcc/config/i386/i386.md | 20 ++--- > > > .../gcc.target/i386/spill_to_mask-1.c | 89 +++++++++++++------ > > > .../gcc.target/i386/spill_to_mask-2.c | 11 ++- > > > .../gcc.target/i386/spill_to_mask-3.c | 11 ++- > > > .../gcc.target/i386/spill_to_mask-4.c | 11 ++- > > > 6 files changed, 91 insertions(+), 59 deletions(-) > > > > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > > > index a61255857ff..a651853ca3b 100644 > > > --- a/gcc/config/i386/i386.c > > > +++ b/gcc/config/i386/i386.c > > > @@ -20463,6 +20463,10 @@ x86_order_regs_for_local_alloc (void) > > > int pos = 0; > > > int i; > > > > > > + /* Mask register. */ > > > + for (i = FIRST_MASK_REG; i <= LAST_MASK_REG; i++) > > > + reg_alloc_order [pos++] = i; > > > > Please add a comment why mask registers should come first. > Thanks for the review, this is the patch i'm check in.
This patch again caused unwanted mask instructions with -m32 in cpuid check code, e.g.: Running target unix/-m32 FAIL: gcc.target/i386/avx512bw-pr70329-1.c execution test FAIL: gcc.target/i386/pr96814.c execution test Debugging pr96814 failure: 0x0804921d <+66>: mov %edx,%ecx 0x0804921f <+68>: cpuid => 0x08049221 <+70>: kmovd %edx,%k0 0x08049225 <+74>: mov %eax,-0x8(%ebp) 0x08049228 <+77>: mov %ebx,-0xc(%ebp) It looks to me that putting mask registers in front of GPR is looking for trouble, since apparently disparaging mask registers does not offset register preference of the new order. Uros.