On Thu, 15 Jun 2023, Robin Dapp wrote:

> > the minus in 'operand 2 - operand 3' should be a plus if the
> > bias is really zero or -1.  I suppose
> 
> Yes, that somehow got lost from when the bias was still +1.  Maybe
> Juzhe can fix this in the course of his patch.
> 
> > that's quite conservative.  I think you can do better when the
> > loads are aligned, reading an extra byte when ignoring the bias
> > is OK and you at least know the very first element is used.
> > For stores you would need to emit compare&jump for all but
> > the first store of a group though ...
> 
> The implementation is a first shot and yes we could do a bit
> better but limiting to a single rgroup is IMHO the more severe
> restriction.  The pattern wasn't hit very often across SPEC
> either way.  I think overall proper masking is  more important for
> fixed-length vectors while length control might be more useful
> for variable-length vectors.  Just my gut feeling though, you're
> the expert there.
> 
> > That said, I'm still not seeing where you actually apply the bias.
> 
> We do
> 
> +
> +  int partial_load_bias = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
> +  if (partial_load_bias != 0)
> +    {
> +      tree adjusted_len = rgc->bias_adjusted_ctrl;
> +      gassign *minus = gimple_build_assign (adjusted_len, PLUS_EXPR,
> +                                           rgc->controls[0],
> +                                           build_int_cst
> +                                           (TREE_TYPE (rgc->controls[0]),
> +                                            partial_load_bias));
> +      gimple_seq_add_stmt (header_seq, minus);
> +    }
> +
> 
> as well as
> 
> +         if (use_bias_adjusted_len)
> +           {
> +             gcc_assert (i == 0);
> +             tree adjusted_len =
> +               make_temp_ssa_name (len_type, NULL, "adjusted_loop_len");
> +             SSA_NAME_DEF_STMT (adjusted_len) = gimple_build_nop ();
> +             rgl->bias_adjusted_ctrl = adjusted_len;
> +           }

Ah, OK.  It's a bit odd to have predicates on define_expand.  The
define_expand pattern is expected to only match either literal 0
or literal -1 (and consistently so for all len_ optabs) and thus
operand 2, the length, needs to be adjusted by the middle-end
to match up with the pattern supplied operand 3.

Richard.

Reply via email to