https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104015

--- Comment #3 from avieira at gcc dot gnu.org ---
Hi Kewen,

Thanks for the analysis. The param_vect_partial_vector_usage suggestion seems
valid, but that shouldn't be the root cause. 

I would expect an unpredicated V8HI epilogue to fail for a V8HI main loop
(unless the main loop was unrolled).

That is what the following code in vect_analyze_loop_2 is responsible for:
  /* If we're vectorizing an epilogue loop, the vectorized loop either needs
     to be able to handle fewer than VF scalars, or needs to have a lower VF
     than the main loop.  */
  if (LOOP_VINFO_EPILOGUE_P (loop_vinfo)
      && !LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
      && maybe_ge (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
                   LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo)))
    return opt_result::failure_at (vect_location,
                                   "Vectorization factor too high for"
                                   " epilogue loop.\n");
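
To make the condition concrete, here is a minimal standalone sketch of the
same test. It is not GCC code: LoopInfoModel and epilogue_vf_too_high are
hypothetical names for illustration, and the vectorization factor is modelled
as a plain integer rather than a poly_uint64, so maybe_ge reduces to >=.

```cpp
// Hypothetical, simplified stand-in for the loop_vec_info fields that the
// quoted check reads.  Real GCC uses poly_uint64 vectorization factors; a
// constant VF is assumed here, so maybe_ge (a, b) becomes a >= b.
struct LoopInfoModel
{
  bool is_epilogue;             // models LOOP_VINFO_EPILOGUE_P
  bool can_use_partial_vectors; // models LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P
  unsigned vf;                  // models LOOP_VINFO_VECT_FACTOR
};

// Returns true when the epilogue candidate has to be rejected: it cannot
// handle fewer than VF scalars and its VF is not lower than the main loop's.
bool
epilogue_vf_too_high (const LoopInfoModel &candidate,
                      const LoopInfoModel &main_loop)
{
  return candidate.is_epilogue
         && !candidate.can_use_partial_vectors
         && candidate.vf >= main_loop.vf;
}
```

With a main loop of VF 8, an unpredicated epilogue of VF 8 is rejected by this
model, while the same epilogue is accepted if the main loop was unrolled to
VF 16 or if the epilogue can use partial vectors.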

So PR103997 is looking at fixing the skipping, because we skip too much now.
You seem to be describing a case where it doesn't skip enough, but like I said
that should be dealt with by the code above, so I have a feeling there may be
some confusion here.

I have a patch for the earlier bug at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588330.html
This is still under review while we work out a better way of dealing with the
issue. Could you maybe check whether that fixes your failures? I'll start a
cross build for powerpc in the meantime to see if I can check out these tests.

As for why I don't use LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P on the first loop
vinfo to skip epilogue modes, that's because it is possible to have a
non-predicated main loop with a predicated epilogue. The test I added for
aarch64 with that patch is a motivating case.

On another note, unfortunately LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P only
'forces' the use of partial vectors; AFAIU it doesn't tell us whether they are
possible or not. That is why I introduced the new function, which only checks
whether the target is at all capable of partial vector generation: if we know
it's not possible at all, we can skip more modes and avoid unnecessary
analysis.
