Hi,
Rebased the patch on top of Richard Sandiford's patches, with his fixes
I can now allow for vectorization of epilogues after we match simdlen.
This will however not turn on epilogue vectorization in cases where we
specify a desired simdlen that is never matched. This would require
more work as before simdlen is matched we would need to analyze each
vector_size after creating a "first_loop_vinfo" twice: once as an
epilogue (for in the case we never match simdlen) and once as a main
loop (in case simdlen would match its VF). Maybe there is a different
way of doing it but I don't see it right now.
Bootstrapped and regression tested (also ran libgomp tests) for x86_64
and aarch64. Currently libgomp has 5 failures for aarch64, these are all
openacc tests. The first one I looked at is due to a reduction seemingly
performing too many iterations when defining '$acc parallel
vector_length(vl)' I am looking into it.
Is this OK for trunk?
Cheers,
Andre
gcc/ChangeLog:
2019-11-07 Andre Vieira <andre.simoesdiasvie...@arm.com>
* tree-vect-loop.c (vect_analyze_loop): Disable epilogue
vectorization for loops with SIMDUID set. Enable epilogue
vectorization for loops with SIMDLEN set after finding a main
loop with a VF that matches it.
On 05/11/2019 07:16, Jakub Jelinek wrote:
On Tue, Nov 05, 2019 at 08:07:53AM +0100, Richard Biener wrote:
I was using loop->simdlen to detect whether it was a SIMD loop and I don't
believe that was correct, as can be witnessed by the mass failures in libgomp.
My apologies for not running this, didn't think of it!
I found that these were failing because we do not handle vectorization of
epilogues correctly when SIMDUID is set. For now Jakub and I agreed to disable
epilogue vectorization for loops where SIMDUID is set until we have fixed
this. See further comments inline.
I bootstrapped it on aarch64 and x86_64, ran libgomp on both.
This OK for trunk?
OK. Can you remove the simdlen == 0 check as a followup?
Yeah, exactly, I wanted to ask what the point of the simdlen == 0 check is.
All a non-zero simdlen says is a user assertion that certain inter-loop
depencencies don't exist.
Jakub
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index dfa087ebb2cf01a5d21da0a921f8b6fc3d691ce9..22550ca2d6c56cce201ea422bfae5472a0d85f3a 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2455,11 +2455,15 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
delete loop_vinfo;
/* Only vectorize epilogues if PARAM_VECT_EPILOGUES_NOMASK is
- enabled, this is not a simd loop and it is the innermost loop. */
- vect_epilogues = (!loop->simdlen
+ enabled, SIMDUID is not set, it is the innermost loop and we have
+ either already found the loop's SIMDLEN or there was no SIMDLEN to
+ begin with.
+ TODO: Enable epilogue vectorization for loops with SIMDUID set. */
+ vect_epilogues = (!simdlen
&& loop->inner == NULL
&& PARAM_VALUE (PARAM_VECT_EPILOGUES_NOMASK)
&& LOOP_VINFO_PEELING_FOR_NITER (first_loop_vinfo)
+ && !loop->simduid
/* For now only allow one epilogue loop. */
&& first_loop_vinfo->epilogue_vinfos.is_empty ());