> 1. This patch sets the scalar_to_vec cost to 2 instead of 1, since a scalar
>    move instruction is slightly more costly than normal RVV instructions
>    (e.g. vadd.vv).

We can go with 2 or 3 (if needed) for now, but IMHO we should later
really incorporate reg-move costs here.  Just like e.g. aarch64's

static const struct cpu_regmove_cost cortexa57_regmove_cost =
{
  1, /* GP2GP  */
  /* Avoid the use of slow int<->fp moves for spilling by setting
     their cost higher than memmov_cost.  */
  5, /* GP2FP  */
  ...
};

we can add V2FP, V2GP and the reverse.  Then add those to
scalar_to_vec (later vec_to_scalar as well) in adjust_stmt_cost
according to the mode.
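
To make that a bit more concrete, here is a rough standalone sketch of
what I have in mind.  The struct, its field names, the helper and all
the numbers are made up for illustration; none of this exists in the
riscv backend, and the real thing would of course live in the tuning
structures and be used from adjust_stmt_cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register-move cost table with vector entries added.
   Field names and values are placeholders, not taken from any backend.  */
struct regmove_cost_sketch
{
  int gp2v; /* GPR -> vector register, e.g. splatting an integer.  */
  int fp2v; /* FPR -> vector register, e.g. splatting a float.  */
  int v2gp; /* Vector register -> GPR.  */
  int v2fp; /* Vector register -> FPR.  */
};

static const struct regmove_cost_sketch example_regmove_cost = { 2, 2, 2, 2 };

/* Add the mode-dependent move cost on top of a base scalar_to_vec cost.
   FLOAT_MODE_P stands in for a check on the element mode.  */
static int
scalar_to_vec_cost_sketch (int base_cost, bool float_mode_p,
                           const struct regmove_cost_sketch *costs)
{
  assert (costs != NULL);
  return base_cost + (float_mode_p ? costs->fp2v : costs->gp2v);
}
```

vec_to_scalar would later pick v2gp or v2fp in the same way.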

> 2. Adjust scalar_to_vec cost accurately according to the splat value, for
>    example, a value like 32872 needs 2 more scalar instructions:
>    so the cost = 2 (scalar instructions) + 2 (scalar move).
>
>    We adjust the cost like this since it does need that many instructions
>    in vectorized code, whereas they are not needed in scalar code.

I'm afraid the issue I mentioned (we don't count the constant
synthesis for scalar but would for vector with the change) is
still present.
Even if it does not cause any regressions or problems now, it
certainly might in the future, especially with complex constants.
Basically we would not vectorize something containing several
synthesized constants (like popcount) anymore.
Therefore I would advise against it even though the given
example cannot be "solved" unconditionally then.
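
As a toy illustration of the asymmetry (this is not how the vectorizer
costs things, just a made-up model with hypothetical parameters): the
scalar side never pays for constant synthesis, while the vector side
would pay for every splatted constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC's cost model: compare one vector iteration against
   VF scalar iterations.  All names and parameters are hypothetical.  */
static bool
toy_vectorize_p (int scalar_iter_cost, int vf, int vector_iter_cost,
                 int n_constants, int synth_insns_per_constant,
                 bool count_synthesis_for_vector)
{
  assert (vf > 0);

  /* Scalar side: constant synthesis is not counted at all.  */
  int scalar_total = scalar_iter_cost * vf;

  /* Vector side: optionally charge synthesis for every splatted constant.  */
  int vector_total = vector_iter_cost;
  if (count_synthesis_for_vector)
    vector_total += n_constants * synth_insns_per_constant;

  return vector_total < scalar_total;
}
```

With e.g. toy_vectorize_p (3, 4, 8, 3, 2, false) the vector version wins
(8 vs. 12), but with the last argument set to true it loses (14 vs. 12),
although nothing changed on the scalar side.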

Regards
 Robin
