Hi! For -O1 and above, we force all operands and targets of x86 builtins into registers during expansion, because we expect combiner will do its job. But for -O0, we try to keep them in MEM if the predicate allows it, we just make sure that at most one input operand is a MEM. We still allow MEM destination and one MEM input operand, which is why we ICE - most insns/expanders don't allow that in the predicates and it is fine, but various insns/expanders have e.g. nonimmediate_operand on both target and some input operand and just have a condition that requires that one of those is a register or that both of them are not memory. The expanders don't check the condition though, it is checked only when the insn is being recognized in vregs pass and that is too late. The following patch fixes that by forcing all input operands at -O0 into non-memory if the target is memory.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or are there any define_insn/define_expand where it is desirable to have both input and output operand a MEM (and does it have to be matching)? For various scalar binary and unary expanders the backend already uses a helper that will force something into memory if dest and src are both memory and not rtx_equal_p, but do we have anything like that in anything these two builtin expanders emit? 2017-03-02 Jakub Jelinek <ja...@redhat.com> PR target/79807 * config/i386/i386.c (ix86_expand_multi_arg_builtin): If target is a memory operand, increase num_memory. (ix86_expand_args_builtin): Likewise. * gcc.target/i386/pr79807.c: New test. --- gcc/config/i386/i386.c.jj 2017-03-02 08:08:43.000000000 +0100 +++ gcc/config/i386/i386.c 2017-03-02 17:58:28.396999033 +0100 @@ -34249,6 +34249,8 @@ ix86_expand_multi_arg_builtin (enum insn || GET_MODE (target) != tmode || !insn_data[icode].operand[0].predicate (target, tmode)) target = gen_reg_rtx (tmode); + else if (memory_operand (target, tmode)) + num_memory++; gcc_assert (nargs <= 4); @@ -35534,6 +35536,8 @@ ix86_expand_args_builtin (const struct b || GET_MODE (target) != tmode || !insn_p->operand[0].predicate (target, tmode)) target = gen_reg_rtx (tmode); + else if (memory_operand (target, tmode)) + num_memory++; real_target = target; } else --- gcc/testsuite/gcc.target/i386/pr79807.c.jj 2017-03-02 18:03:30.032023436 +0100 +++ gcc/testsuite/gcc.target/i386/pr79807.c 2017-03-02 18:02:53.000000000 +0100 @@ -0,0 +1,12 @@ +/* PR target/79807 */ +/* { dg-do compile } */ +/* { dg-options "-O0 -mavx -ffloat-store" } */ + +typedef double __v2df __attribute__ ((__vector_size__ (16))); +typedef double __v4df __attribute__ ((__vector_size__ (32))); + +__v2df +foo (__v4df x) +{ + return __builtin_ia32_pd_pd256 (x); +} Jakub