On Thu, Mar 11, 2021 at 07:57:23AM +0800, Xionghu Luo via Gcc-patches wrote:
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index ec068c58aa5..48eb91132a9 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -7000,11 +7000,15 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, 
> rtx idx)
>  
>    gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
>  
> -  gcc_assert (GET_MODE (idx) == E_SImode);
> -
>    machine_mode inner_mode = GET_MODE (val);
>  
> -  rtx tmp = gen_reg_rtx (GET_MODE (idx));
> +  machine_mode idx_mode = GET_MODE (idx);
> +  rtx tmp = gen_reg_rtx (DImode);
> +  if (idx_mode != DImode)
> +    tmp = convert_modes (DImode, idx_mode, idx, 0);
> +  else
> +    tmp = idx;
> +
>    int width = GET_MODE_SIZE (inner_mode);
>  
>    gcc_assert (width >= 1 && width <= 8);
> @@ -7012,9 +7016,7 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, 
> rtx idx)
>    int shift = exact_log2 (width);
>    /* Generate the IDX for permute shift, width is the vector element size.
>       idx = idx * width.  */
> -  emit_insn (gen_ashlsi3 (tmp, idx, GEN_INT (shift)));
> -
> -  tmp = convert_modes (DImode, SImode, tmp, 1);
> +  emit_insn (gen_ashldi3 (tmp, tmp, GEN_INT (shift)));

This looks broken.
The gen_reg_rtx (DImode); call result is completely ignored,
so it wastes one pseudo, and I'm not convinced that idx nor tmp
is guaranteed to be usable as output of shift.
So, shouldn't it be instead:
  rtx tmp = gen_reg_rtx (DImode);
  if (idx_mode != DImode)
    idx = convert_modes (DImode, idx_mode, idx, 0);

...
  emit_insn (gen_ashldi3 (tmp, idx, GEN_INT (shift));
?
Also, dunno what is less and what is more expensive on
rs6000, whether convert_modes with unsigned = 0 or 1.
I think rs6000 only supports very narrow vectors, so even for
say QImode idx anything with MSB set will be out of out of bounds and UB,
so you can sign or zero extend, whatever is more efficient.

> +  machine_mode idx_mode = GET_MODE (idx);
> +  rtx tmp = gen_reg_rtx (DImode);
> +  if (idx_mode != DImode)
> +    tmp = convert_modes (DImode, idx_mode, idx, 0);
> +  else
> +    tmp = idx;
> +
>    gcc_assert (width >= 1 && width <= 4);
>  
>    if (!BYTES_BIG_ENDIAN)
>      {
>        /*  idx = idx * width.  */
> -      emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
> +      emit_insn (gen_muldi3 (tmp, tmp, GEN_INT (width)));
>        /*  idx = idx + 8.  */
> -      emit_insn (gen_addsi3 (tmp, tmp, GEN_INT (8)));
> +      emit_insn (gen_adddi3 (tmp, tmp, GEN_INT (8)));
>      }
>    else
>      {
> -      emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
> -      emit_insn (gen_subsi3 (tmp, GEN_INT (24 - width), tmp));
> +      emit_insn (gen_muldi3 (tmp, idx, GEN_INT (width)));
> +      emit_insn (gen_subdi3 (tmp, GEN_INT (24 - width), tmp));

Ditto.  But the above was even inconsistent, sometimes it
used tmp (for LE) and otherwise idx in whatever mode it has.

        Jakub

Reply via email to