"juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
> Hi, Richard.  create_iv has been approved and will be committed soon,
> after we bootstrap && regression test.
>
> Now, I plan to send patch for "decrement IV".
>
> After reading your comments, I have several questions:
>
> 1. 
>>    if (use_bias_adjusted_len)
>>      return rgl->bias_adjusted_ctrl;
>> +  else if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
>> +    OPTIMIZE_FOR_SPEED))
>> +    {
>> +      tree loop_len = rgl->controls[index];
>> +      poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
>> +      poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
>> +      if (maybe_ne (nunits1, nunits2))
>> +        {
>> +          /* A loop len for data type X can be reused for data type Y
>> +             if X has N times more elements than Y and if Y's elements
>> +             are N times bigger than X's.  */
>> +          gcc_assert (multiple_p (nunits1, nunits2));
>> +          unsigned int factor = exact_div (nunits1, nunits2).to_constant ();
>> +          gimple_seq seq = NULL;
>> +          loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
>> +                                   build_int_cst (iv_type, factor));
>> +          if (seq)
>> +            gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
>> +        }
>> +      return loop_len;
>> +    }
>>    else
>>      return rgl->controls[index];
>>  }
>
>>  ...here.  That is, the key isn't whether SELECT_VL is available,
>>  but instead whether we've decided to use it for this loop (unless
>>  I'm missing something).
>
> Let me clarify it again:
>
> I do this here for Case 2 SLP:
>
> Generate for len: _61 = _75 / 2;
> I think it is similar to ARM SVE using VIEW_CONVERT_EXPR to view_convert
> the mask.
>
> You said we should not let the availability of SELECT_VL decide it here.
> Could you teach me how to handle this code here? Should I add a target hook
> like:
> TARGET_SLP_LOOP_LEN_RDIV_BY_FACTOR_P ?

No.  What I mean is: for each vectorised loop, we should make a decision,
in one place only, whether to use SELECT_VL-based control flow or
arithmetic-based control flow for that particular loop.  That decision
depends partly on direct_internal_fn_supported_p (a necessary but not
sufficient condition), partly on whether the loop contains SLP nodes, etc.
We should then record that decision in the loop_vec_info so that it is
available to whichever code needs it.

This is similar to LOOP_VINFO_USING_PARTIAL_VECTORS_P etc.
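
As a rough illustration of the shape I have in mind (not a patch): the
sketch below is a standalone model, not real GCC code.  The names
loop_info_sketch, len_control, decide_len_control and using_select_vl_p
are all made up for the example, and the policy shown (SLP nodes rule out
SELECT_VL) is only a placeholder assumption; the real conditions would
need to be worked out.

```cpp
#include <cassert>

// Hypothetical model of the per-loop state; not GCC's real loop_vec_info.
enum class len_control { undecided, select_vl, arithmetic };

struct loop_info_sketch
{
  bool select_vl_supported;  // stands in for direct_internal_fn_supported_p
  bool has_slp_nodes;        // stands in for "the loop contains SLP nodes"
  len_control control;       // the decision recorded for this loop

  loop_info_sketch (bool supported, bool slp)
    : select_vl_supported (supported), has_slp_nodes (slp),
      control (len_control::undecided) {}
};

// The single place where the choice is made for a given loop.
// ASSUMPTION: the policy below is a placeholder, not the real rule.
// Target support is necessary but not sufficient.
inline void
decide_len_control (loop_info_sketch &info)
{
  bool use_select_vl = info.select_vl_supported && !info.has_slp_nodes;
  info.control = (use_select_vl
                  ? len_control::select_vl
                  : len_control::arithmetic);
}

// Consumers read the recorded decision instead of re-querying the target.
inline bool
using_select_vl_p (const loop_info_sketch &info)
{
  assert (info.control != len_control::undecided);
  return info.control == len_control::select_vl;
}
```

Code such as the hunk quoted above would then test the recorded flag
rather than calling direct_internal_fn_supported_p itself.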

Thanks,
Richard
