On Tue, Feb 18, 2014 at 11:06 AM, Kirill Yukhin <kirill.yuk...@gmail.com> wrote:
>> >> >> Please don't change srcp pattern, it should be defined similar to >> >> >> vrcpss (aka sse_vmrcpv4sf). You need to switch operand order >> >> >> elsewhere. >> >> > >> >> > No, you are correct. Operands should be swapped as in your patch. >> >> >> >> Eh, sorry that after some more thinking, I have to again revert this >> >> decision. >> >> >> >> The srcp pattern should remain as is, and you should swap operands in >> >> avx512fintrin.h instead: >> > >> > In the bottom there's updated patch. >> > >> > Added "sse" type. mem operand made second. >> > Built-ins & tests fixed. >> > >> > Testing in progress. >> > >> > Is it ok for mainline if pass? >> >> No, you got operand order wrong. >> >> To correctly calculate "memory" attribute, all "sse" type insns expect >> the operands in the way sse_vmrcpv4sf2 is defined. You should keep >> nonimmedate operand as operand_1 and switch operands in builtins and >> insn mnemonics to fulfill required operand order *in the pattern*. > Patch updated. It is in the bottom. > gcc/ > * config/i386/avx512erintrin.h (_mm_rcp28_round_sd): Swap operands. > (_mm_rcp28_round_ss): Ditto. > (_mm_rsqrt28_round_sd): Ditto. > (_mm_rsqrt28_round_ss): Ditto. > * config/i386/avx512erintrin.h (_mm_rcp14_round_sd): Ditto. > (_mm_rcp14_round_ss): Ditto. > (_mm_rsqrt14_round_sd): Ditto. > (_mm_rsqrt14_round_ss): Ditto. > * config/i386/sse.md (rsqrt14<mode>): Make memory first operand. "Put nonimmediate operand as the first input operand." (and in similar way below). > (avx512er_exp2<mode><mask_name><round_saeonly_name>): Set type > attribute to sse. > (<mask_codefor>avx512er_rcp28<mode><mask_name><round_saeonly_name>): > Ditto. > (avx512er_vmrcp28<mode><round_saeonly_name>): Make memory first > operand, set type attribute. > (<mask_codefor>avx512er_rsqrt28<mode><mask_name><round_saeonly_name>): > Set type attribute. > (avx512er_vmrsqrt28<mode><round_saeonly_name>): Make memory first > operand, Set type attribute. > > gcc/testsuite/ > * gcc.target/i386/avx512er-vrcp28sd-2.c: Distinguish src1 and src2. > * gcc.target/i386/avx512er-vrcp28ss-2.c: Call correct intrinsic. > * gcc.target/i386/avx512er-vrsqrt28sd-2.c: Distinguish src1 and src2. > * gcc.target/i386/avx512er-vrsqrt28ss-2.c: Ditto. > * gcc.target/i386/avx512f-vrcp14sd-2.c: Fix reference calculation. > * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto. OK with a slight adjustement to vrcp14 patter below. > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -1551,13 +1551,13 @@ > [(set (match_operand:VF_128 0 "register_operand" "=v") > (vec_merge:VF_128 > (unspec:VF_128 > - [(match_operand:VF_128 1 "register_operand" "v") > - (match_operand:VF_128 2 "nonimmediate_operand" "vm")] > + [(match_operand:VF_128 2 "register_operand" "v") > + (match_operand:VF_128 1 "nonimmediate_operand" "vm")] > UNSPEC_RSQRT14) > (match_dup 1) > (const_int 1)))] > "TARGET_AVX512F" > - "vrsqrt14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}" > + "vrsqrt14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}" This pattern should probably read the same as other vmrsqrt patterns (e.g. sse_vmrsqrtv4sf2 and avx512er_vmrsqrt28...): (vec_merge:VF_128 (unspec:VF_128 [(match_operand:VF_128 1 "nonimmediate_operand" "vm")] UNSPEC_RSQRT14) (match_operand:VF_128 2 "register_operand" "v") (const_int 1)))] "TARGET_AVX512F" "vrsqrt14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}" OK with the change above. Thanks, Uros.