Yeah. I see. Removing it will cause testcase run fail.
Now I found the issue, since you want to store the step in the iv_rgroup.

After I tried, the IR looks correct but create ICE:
0x18c8d41 process_bb
        ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:7933
0x18cb6d9 do_rpo_vn_1
        ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8544
0x18cbd35 do_rpo_vn(function*, edge_def*, bitmap_head*, bool, bool, vn_lookup_ki
        ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8646
0x19d42d2 execute
        ../../../riscv-gcc/gcc/tree-vectorizer.cc:1385

This is the IR:

loop_len_76 = MIN_EXPR <ivtmp_98, 8>;

  loop_len_66 = MIN_EXPR <ivtmp_101, 16>;   -> store the step in rgroup instead 
of LOOP_VINFO
  _103 = loop_len_66;  ---->reuse the MIN VALUE

  loop_len_66 = MIN_EXPR <loop_len_66, 4>;

  _104 = _103 - loop_len_66;  ->use MIN - loop_len_66

  loop_len_65 = MIN_EXPR <_104, 4>;
  _105 = _104 - loop_len_65;
  loop_len_64 = MIN_EXPR <_105, 4>;
  loop_len_63 = _105 - loop_len_64;

Since previously I store the "MIN_EXPR <ivtmp_101, 16>;" in the LOOP_VINFO, not 
the rgroup.

So previously is correct and no ICE:

  loop_len_76 = MIN_EXPR <ivtmp_98, 8>;

 _103 = MIN_EXPR <ivtmp_101, 16>;    -> Step store in the LOOP_VINFO (S)

  loop_len_66 = MIN_EXPR <_103, 4>; 

  _104 = _103 - loop_len_66;  ->  use MIN - loop_len_66

  loop_len_65 = MIN_EXPR <_104, 4>;
  _105 = _104 - loop_len_65;
  loop_len_64 = MIN_EXPR <_105, 4>;
  loop_len_63 = _105 - loop_len_64;

Could you help me with this ?
Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-25 18:19
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V15] VECT: Add decrement IV iteration loop control by 
variable amount support
"juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
> Hi, Richard. Thanks for the comments.
>
>>> if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)
>>>     || !iv_rgc
>>>     || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
>>> != rgc->max_nscalars_per_iter * rgc->factor))
>>>           {
>   >>           /* See whether zero-based IV would ever generate all-false 
> masks
>    >>             or zero length before wrapping around.  */
>    >>          bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, 
> rgc);
>  
>    >>          /* Set up all controls for this group.  */
>      >>        test_ctrl = vect_set_loop_controls_directly (loop, loop_vinfo,
>     >>                                                      &preheader_seq,
>         >>                                                  &header_seq,
>     >>                                                      loop_cond_gsi, 
> rgc,
>     >>                                                      niters, 
> niters_skip,
>     >>                                                      might_wrap_p);
>  
>    >>  iv_rgc = rgc;
>   >> }
>
>
> Could you tell me why you add:
> (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
>>> != rgc->max_nscalars_per_iter * rgc->factor) ?
 
The patch creates IVs with the following step:
 
      gimple_seq_add_stmt (header_seq, gimple_build_assign (step, MIN_EXPR,
    index_before_incr,
    nitems_step));
 
If nitems_step is the same for two IVs, those IVs will always be equal.
 
So having multiple IVs with the same nitems_step is redundant.
 
nitems_step is calculated as follows:
 
  unsigned int nitems_per_iter = rgc->max_nscalars_per_iter * rgc->factor;
  ...
  poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
  ...
 
  if (nitems_per_iter != 1)
    {
      ...
      tree iv_factor = build_int_cst (iv_type, nitems_per_iter);
      ...
      nitems_step = gimple_build (preheader_seq, MULT_EXPR, iv_type,
  nitems_step, iv_factor);
      ...
    }
 
so nitems_per_step is equal to:
 
  rgc->max_nscalars_per_iter * rgc->factor * VF
 
VF is fixed for a loop, so nitems_step is equal for two different
rgroup_controls if:
 
  rgc->max_nscalars_per_iter * rgc->factor
 
is the same for those rgroup_controls.
 
Please try the example I posted earlier today. I think you'll see that,
without the:
 
  (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
   != rgc->max_nscalars_per_iter * rgc->factor)
 
you'll have two IVs with the same step (because their MIN_EXPRs have
the same bound).
 
Thanks,
Richard
 

Reply via email to