https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, I can't get the test to FAIL with a cross compiler; somehow the dejagnu
tests always end up UNSUPPORTED. The testcase for x86_64 has
/* With AVX256 or more we do not pull off the trick eliding the epilogue. */
/* { dg-additional-options "-mprefer-avx128" { target { x86_64-*-* i?86-*-* } } } */
so we require the use of V16QImode -> V4SImode SAD with the V16QImode loads
split into two V8QImode ones. There were insufficient dejagnu effective
targets to model the restriction in
+  /* If the gap splits the vector in half and the target
+     can do half-vector operations avoid the epilogue peeling
+     by simply loading half of the vector only.  Usually
+     the construction with an upper zero half will be elided.  */
+  dr_alignment_support alignment_support_scheme;
+  scalar_mode elmode = SCALAR_TYPE_MODE (TREE_TYPE (vectype));
+  machine_mode vmode;
+  if (overrun_p
+      && !masked_p
+      && (((alignment_support_scheme
+            = vect_supportable_dr_alignment (first_dr_info, false)))
+           == dr_aligned
+          || alignment_support_scheme == dr_unaligned_supported)
+      && known_eq (nunits, (group_size - gap) * 2)
+      && mode_for_vector (elmode, (group_size - gap)).exists (&vmode)
+      && VECTOR_MODE_P (vmode)
+      && targetm.vector_mode_supported_p (vmode)
+      && (convert_optab_handler (vec_init_optab,
+                                 TYPE_MODE (vectype), vmode)
+          != CODE_FOR_nothing))
+    overrun_p = false;
I see we probably need vect_hw_misalign, so does
Index: gcc/testsuite/gcc.dg/vect/slp-reduc-sad-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-reduc-sad-2.c (revision 270899)
+++ gcc/testsuite/gcc.dg/vect/slp-reduc-sad-2.c (working copy)
@@ -25,5 +25,5 @@ int x264_pixel_sad_8x8( uint8_t *pix1, u
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
-/* { dg-final { scan-tree-dump-not "access with gaps requires scalar epilogue loop" "vect" } } */
+/* { dg-final { scan-tree-dump-not "access with gaps requires scalar epilogue loop" "vect" { xfail { ! vect_hw_misalign } } } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
fix everything?
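
For reference, the purely arithmetic part of the quoted condition can be
sketched in plain C as below. This is only a model, not vectorizer code: the
function names are made up for illustration, and it ignores the alignment
scheme, masking, mode and optab checks entirely. It assumes a group of 16
byte elements of which the last 8 are a gap, which is how I read the
testcase; with 16-element vectors the gap splits the vector in half, with
32-element vectors (AVX256) it does not.

```c
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: models only the
   known_eq (nunits, (group_size - gap) * 2) part of the condition.
   All other checks from the quoted patch are left out.  */
static bool
gap_splits_vector_in_half (unsigned nunits, unsigned group_size,
                           unsigned gap)
{
  /* Guard against a gap that covers the whole group (assumed invalid).  */
  if (gap >= group_size)
    return false;
  return nunits == (group_size - gap) * 2;
}

/* Hypothetical helper: number of elements the half-vector load would
   read when the condition above holds.  */
static unsigned
half_vector_elems (unsigned group_size, unsigned gap)
{
  return group_size - gap;
}
```

With the assumed numbers, gap_splits_vector_in_half (16, 16, 8) holds and the
half-vector load covers 8 elements, while gap_splits_vector_in_half (32, 16, 8)
does not hold, which would match the -mprefer-avx128 requirement in the
testcase.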