On 27/02/2024 13:56, Andre Vieira wrote:
>
> This patch adds support for MVE Tail-Predicated Low Overhead Loops by using
> the doloop functionality added to support predicated vectorized hardware
> loops.
>
> gcc/ChangeLog:
>
> 	* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
> 	declaration to pass basic_block.
> 	(arm_attempt_dlstp_transform): New declaration.
> 	* config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define targethook.
> 	(TARGET_PREDICT_DOLOOP_P): Likewise.
> 	(arm_target_bb_ok_for_lob): Adapt condition.
> 	(arm_mve_get_vctp_lanes): New function.
> 	(arm_dl_usage_type): New internal enum.
> 	(arm_get_required_vpr_reg): New function.
> 	(arm_get_required_vpr_reg_param): New function.
> 	(arm_get_required_vpr_reg_ret_val): New function.
> 	(arm_mve_get_loop_vctp): New function.
> 	(arm_mve_insn_predicated_by): New function.
> 	(arm_mve_across_lane_insn_p): New function.
> 	(arm_mve_load_store_insn_p): New function.
> 	(arm_mve_impl_pred_on_outputs_p): New function.
> 	(arm_mve_impl_pred_on_inputs_p): New function.
> 	(arm_last_vect_def_insn): New function.
> 	(arm_mve_impl_predicated_p): New function.
> 	(arm_mve_check_reg_origin_is_num_elems): New function.
> 	(arm_mve_dlstp_check_inc_counter): New function.
> 	(arm_mve_dlstp_check_dec_counter): New function.
> 	(arm_mve_loop_valid_for_dlstp): New function.
> 	(arm_predict_doloop_p): New function.
> 	(arm_loop_unroll_adjust): New function.
> 	(arm_emit_mve_unpredicated_insn_to_seq): New function.
> 	(arm_attempt_dlstp_transform): New function.
> 	* config/arm/arm.opt (mdlstp): New option.
> 	* config/arm/iterators.md (dlstp_elemsize, letp_num_lanes,
> 	letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
> 	(DLSTP, LETP): New iterators.
> 	(predicated_doloop_end_internal<letp_num_lanes>): New pattern.
> 	(dlstp<dlstp_elemsize>_insn): New pattern.
> 	* config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
> 	loops.
> 	(doloop_begin): Likewise.
> 	* config/arm/types.md (mve_misc): New mve type to represent
> 	predicated_loop_end insn sequences.
> 	* config/arm/unspecs.md:
> 	(DLSTP8, DLSTP16, DLSTP32, DLSTP64,
> 	LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/arm/lob.h: Add new helpers.
> 	* gcc.target/arm/lob1.c: Use new helpers.
> 	* gcc.target/arm/lob6.c: Likewise.
> 	* gcc.target/arm/dlstp-compile-asm-1.c: New test.
> 	* gcc.target/arm/dlstp-compile-asm-2.c: New test.
> 	* gcc.target/arm/dlstp-compile-asm-3.c: New test.
> 	* gcc.target/arm/dlstp-int8x16.c: New test.
> 	* gcc.target/arm/dlstp-int8x16-run.c: New test.
> 	* gcc.target/arm/dlstp-int16x8.c: New test.
> 	* gcc.target/arm/dlstp-int16x8-run.c: New test.
> 	* gcc.target/arm/dlstp-int32x4.c: New test.
> 	* gcc.target/arm/dlstp-int32x4-run.c: New test.
> 	* gcc.target/arm/dlstp-int64x2.c: New test.
> 	* gcc.target/arm/dlstp-int64x2-run.c: New test.
> 	* gcc.target/arm/dlstp-invalid-asm.c: New test.
>
> Co-authored-by: Stam Markianos-Wright <stam.markianos-wri...@arm.com>
> ---
>  gcc/config/arm/arm-protos.h                  |    4 +-
>  gcc/config/arm/arm.cc                        | 1249 ++++++++++++++++-
>  gcc/config/arm/arm.opt                       |    3 +
>  gcc/config/arm/iterators.md                  |   15 +
>  gcc/config/arm/mve.md                        |   50 +
>  gcc/config/arm/thumb2.md                     |  138 +-
>  gcc/config/arm/types.md                      |    6 +-
>  gcc/config/arm/unspecs.md                    |   14 +-
>  gcc/testsuite/gcc.target/arm/lob.h           |  128 +-
>  gcc/testsuite/gcc.target/arm/lob1.c          |   23 +-
>  gcc/testsuite/gcc.target/arm/lob6.c          |    8 +-
>  .../gcc.target/arm/mve/dlstp-compile-asm-1.c |  146 ++
>  .../gcc.target/arm/mve/dlstp-compile-asm-2.c |  749 ++++++++++
>  .../gcc.target/arm/mve/dlstp-compile-asm-3.c |   46 +
>  .../gcc.target/arm/mve/dlstp-int16x8-run.c   |   44 +
>  .../gcc.target/arm/mve/dlstp-int16x8.c       |   31 +
>  .../gcc.target/arm/mve/dlstp-int32x4-run.c   |   45 +
>  .../gcc.target/arm/mve/dlstp-int32x4.c       |   31 +
>  .../gcc.target/arm/mve/dlstp-int64x2-run.c   |   48 +
>  .../gcc.target/arm/mve/dlstp-int64x2.c       |   28 +
>  .../gcc.target/arm/mve/dlstp-int8x16-run.c   |   44 +
>  .../gcc.target/arm/mve/dlstp-int8x16.c       |   32 +
>  .../gcc.target/arm/mve/dlstp-invalid-asm.c   |  521 +++++++
>  23 files changed, 3321 insertions(+), 82 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8-run.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int32x4-run.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int32x4.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int64x2-run.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int64x2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int8x16-run.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int8x16.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-invalid-asm.c
>

This is OK once patch4 gets approved.

R.