https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #2)
> __m128d h(__m128d x, __m128d y, __m128d z){
>     __m128d tem = _mm_mul_pd (x,y);
>     __m128d tem2 = tem + z;
>     __m128d tem3 = tem - z;
>     return __builtin_shuffle (tem2, tem3, (__m128i) {0, 3});
> }
> 
> doesn't quite work (the combiner pattern for fmaddsub is missing).  Tried
> {0, 2} as well.
> 
> :
> .LFB5021:
>         .cfi_startproc
>         vmovapd %xmm0, %xmm3
>         vfmsub132pd     %xmm1, %xmm2, %xmm0
>         vfmadd132pd     %xmm1, %xmm2, %xmm3
>         vshufpd $2, %xmm0, %xmm3, %xmm0

  tem2_6 = .FMA (x_2(D), y_3(D), z_5(D));
  # DEBUG tem2 => tem2_6
  # DEBUG BEGIN_STMT
  tem3_7 = .FMS (x_2(D), y_3(D), z_5(D));
  # DEBUG tem3 => NULL
  # DEBUG BEGIN_STMT
  _8 = VEC_PERM_EXPR <tem2_6, tem3_7, { 0, 3 }>;
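For reference, the lane selection performed by the VEC_PERM_EXPR above can be modelled in scalar C. This is only an illustrative sketch, not GCC code: the helper names perm2_model and h_model are hypothetical, and the product is computed with ordinary multiply and add/subtract rather than a single-rounding FMA. With the mask { 0, 3 }, lane 0 comes from tem2 (the .FMA result) and lane 1 from tem3 (the .FMS result).

```c
/* Scalar model of a two-input, two-lane permute in the style of
   __builtin_shuffle (a, b, sel): indices 0..1 pick from a, 2..3 from b.
   perm2_model and h_model are hypothetical names used for illustration. */
static void perm2_model(const double a[2], const double b[2],
                        const int sel[2], double out[2])
{
    for (int i = 0; i < 2; i++) {
        int s = sel[i] & 3;               /* mask taken modulo 2*N, N = 2 */
        out[i] = (s < 2) ? a[s] : b[s - 2];
    }
}

/* Scalar model of h() from the quoted test case.  A real FMA rounds once;
   this model rounds the product first, so results may differ in the last
   bit for inputs that are not exactly representable. */
static void h_model(const double x[2], const double y[2],
                    const double z[2], double out[2])
{
    double tem2[2], tem3[2];
    for (int i = 0; i < 2; i++) {
        tem2[i] = x[i] * y[i] + z[i];     /* corresponds to .FMA */
        tem3[i] = x[i] * y[i] - z[i];     /* corresponds to .FMS */
    }
    const int sel[2] = { 0, 3 };          /* same mask as in the dump */
    perm2_model(tem2, tem3, sel, out);
}
```

For x = {2, 3}, y = {4, 5}, z = {1, 2}, h_model yields lane 0 = 2*4 + 1 and lane 1 = 3*5 - 2.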

Can it be handled in match.pd? Rewriting the fmaddsub pattern into vec_merge fma fms
<addsub_cst> looks too complex.

Similarly for VEC_ADDSUB + MUL -> VEC_FMADDSUB.
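A rough scalar sketch of why that second fold is plausible follows. It assumes the lane convention I believe GCC documents for the vec_addsub and vec_fmaddsub optabs (even lanes subtract, odd lanes add); the function names are hypothetical, and the model ignores the single-rounding behaviour of a real FMA, so the fold would in practice be subject to the usual FP contraction rules.

```c
/* Hypothetical scalar models, two lanes each.
   Assumed convention: even lanes subtract, odd lanes add. */
static void addsub_model(const double a[2], const double b[2], double out[2])
{
    for (int i = 0; i < 2; i++)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

static void fmaddsub_model(const double x[2], const double y[2],
                           const double z[2], double out[2])
{
    for (int i = 0; i < 2; i++)
        out[i] = (i % 2 == 0) ? x[i] * y[i] - z[i]
                              : x[i] * y[i] + z[i];
}

/* ADDSUB applied to the result of a MUL: the shape that would be folded. */
static void mul_then_addsub_model(const double x[2], const double y[2],
                                  const double z[2], double out[2])
{
    double prod[2];
    for (int i = 0; i < 2; i++)
        prod[i] = x[i] * y[i];            /* the separate MUL */
    addsub_model(prod, z, out);           /* followed by ADDSUB */
}
```

Lane by lane, mul_then_addsub_model and fmaddsub_model compute the same expression, which is what makes the single fused operation a candidate replacement.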
