Hi!

On Thu, Jun 11, 2020 at 11:22:33PM -0500, will schmidt wrote:
>   Fix codegen implementation for the builtin vec_pack_to_short_fp32.

> +;; Convert two vector F32 to packed vector f16.
> +(define_expand "convert_4f32_8f16"
> +  [(set (match_operand:V8HI 0 "register_operand" "=v")
> +     (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "v")
> +                   (match_operand:V4SF 2 "register_operand" "v")]
> +                  UNSPEC_CONVERT_4F32_8F16))]
> +  "TARGET_P9_VECTOR"
> +{
> +  rtx rtx_tmp_hi = gen_reg_rtx (V4SImode);
> +  rtx rtx_tmp_lo = gen_reg_rtx (V4SImode);
> +
> +  emit_insn (gen_vsx_xvcvsphp (rtx_tmp_hi, operands[1] ));
> +  emit_insn (gen_vsx_xvcvsphp (rtx_tmp_lo, operands[2] ));
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_lo, rtx_tmp_hi));
> +  else
> +    emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_hi, rtx_tmp_lo));
> +  DONE;
> +})

Why is this different from the 8i16, which doesn't have the swap of the
operands for BE?

If the comment before the pattern would say how it orders the elements,
that might clarify things.  (This really is documenting the unspec, but
there is no better place for documenting this afaics).


> +;; Generate xvcvsphp
> +(define_insn "vsx_xvcvsphp"
> +  [(set (match_operand:V4SI 0 "register_operand" "=wa")
> +        (unspec:V4SI [(match_operand:V4SF 1 "vsx_register_operand" "wa")]
> +                  UNSPEC_VSX_XVCVSPHP))]

(Should be a tab before the unspec as well, not eight spaces).

> --- a/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
> @@ -1,25 +1,37 @@
>  /* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } 
> */

Why the lp64 here?

>  /* { dg-require-effective-target powerpc_p9vector_ok } */

That is implied by p9vector_hw (or it should be, at least :-) )


Segher

Reply via email to