> On 25 Sep 2024, at 7:38 PM, Andre Vieira (lists) > <[email protected]> wrote: > > External email: Use caution opening links or attachments > > > Hi, > > This patch restores missed optimizations for armv8.1-m.main targets that > were missed when the generation of csinc, csinv and csneg were enabled > or the same with patch series containing: > > commit c2bb84be4a6e581bbf45891457ee632a07416982 > Author: Sudi Das <[email protected]> > Date: Fri Sep 18 15:47:46 2020 +0100 > > [PATCH 2/5][Arm] New pattern for CSINV instructions > > The original patch series makes use of the "noce" machinery to transfor > RTL into patterns that later match the Armv8.1-M Mainline, by getting > the target hook TARGET_HAVE_CONDITIONAL_EXECUTION, to return FALSE for > such targets prior to reload_completed. The same machinery however was > transforming other RTL patterns which were later on causing the "ce" > pass post reload_completed to no longer optimize conditional execution > opportunities, which was causing the regression observed in PR > target/116444. > > This patch implements the target hook > TARGET_NOCE_CONVERSION_PROFITABLE_P to only allow "noce" to generate > patterns that match CSINV, CSINC and CSNEG. Thus ensuring that the > early "ce" passes do not ruin things for later ones. > > gcc/ChangeLog: > > * config/arm/arm-protos.h (arm_noce_conversion_profitable_p): New > declaration. > * config/arm/arm.cc (arm_is_v81m_cond_insn): New helper function used > in ... > (arm_noce_conversion_profitable_p): ... here. New function to implement > ... > (TARGET_NOCE_PROFITABLE_P): ... this target hook. New define. >
>
> Regression tested on arm-none-eabi, also specifically tested
> thumb-icvt-2.c for -mcpu=cortex-m55.
>
> OK for trunk and backport to gcc-14 (after a week and testing ofc)?
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index
> 50cae2b513a262880e4f6ee577341d7a5d2c39c6..b694589cab4b45730b257baacfdbd9db00584cdc
> 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -210,6 +210,7 @@ extern bool arm_pad_reg_upward (machine_mode, tree, int);
> #endif
> extern int arm_apply_result_size (void);
> extern opt_machine_mode arm_get_mask_mode (machine_mode mode);
> +extern bool arm_noce_conversion_profitable_p (rtx_insn *,struct noce_if_info
> *);
> #endif /* RTX_CODE */
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index
> de34e9867e67f5c4730e92a4bc6f347d83847e58..fe9b1ad5bf8560858d3996c0659d65c9887307d9
> 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -815,6 +815,9 @@ static const scoped_attribute_specs *const
> arm_attribute_table[] =
> #undef TARGET_MODES_TIEABLE_P
> #define TARGET_MODES_TIEABLE_P arm_modes_tieable_p
> +#undef TARGET_NOCE_CONVERSION_PROFITABLE_P
> +#define TARGET_NOCE_CONVERSION_PROFITABLE_P arm_noce_conversion_profitable_p
> +
> #undef TARGET_CAN_CHANGE_MODE_CLASS
> #define TARGET_CAN_CHANGE_MODE_CLASS arm_can_change_mode_class
> @@ -36074,6 +36077,90 @@ arm_get_mask_mode (machine_mode mode)
> return default_get_mask_mode (mode);
> }
> +/* Helper function to determine whether SEQ represents a sequence of
> + instructions representing the Armv8.1-M Mainline conditional arithmetic
> + instructions: csinc, csneg and csinv. The cinc instruction is generated
> + using a different mechanism. */
> +
> +static bool
> +arm_is_v81m_cond_insn (rtx_insn *seq)
> +{
> + rtx_insn *curr_insn = seq;
> + rtx set;
> + /* The pattern may start with a simple set with register operands. Skip
> + through any of those. */
> + while (curr_insn)
> + {
> + set = single_set (curr_insn);
> + if (!set
> + || !REG_P (SET_DEST (set)))
> + return false;
> +
> + if (!REG_P (SET_SRC (set)))
> + break;
> + curr_insn = NEXT_INSN (curr_insn);
Too late at night for me - but don’t you want to skip DEBUG_INSNS in some way
here ?
Ramana
> + }
> +
> + if (!set)
> + return false;
> +
> + /* The next instruction should be one of:
> + NEG: for csneg,
> + PLUS: for csinc,
> + NOT: for csinv. */
> + if (GET_CODE (SET_SRC (set)) != NEG
> + && GET_CODE (SET_SRC (set)) != PLUS
> + && GET_CODE (SET_SRC (set)) != NOT)
> + return false;
> +
> + curr_insn = NEXT_INSN (curr_insn);
> + if (!curr_insn)
> + return false;
> +
> + /* The next instruction should be a COMPARE. */
> + set = single_set (curr_insn);
> + if (!set
> + || !REG_P (SET_DEST (set))
> + || GET_CODE (SET_SRC (set)) != COMPARE)
> + return false;
> +
> + curr_insn = NEXT_INSN (curr_insn);
> + if (!curr_insn)
> + return false;
> +
> + /* And the last instruction should be an IF_THEN_ELSE. */
> + set = single_set (curr_insn);
> + if (!set
> + || !REG_P (SET_DEST (set))
> + || GET_CODE (SET_SRC (set)) != IF_THEN_ELSE)
> + return false;
> +
> + return !NEXT_INSN (curr_insn);
> +}
> +
> +/* For Armv8.1-M Mainline we have both conditional execution through IT
> blocks,
> + as well as conditional arithmetic instructions controlled by
> + TARGET_COND_ARITH. To generate the latter we rely on a special part of
> the
> + "ce" pass that generates code for targets that don't support conditional
> + execution of general instructions known as "noce". These transformations
> + happen before 'reload_completed'. However, "noce" also triggers for some
> + unwanted patterns [PR 116444] that prevent "ce" optimisations after
> reload.
> + To make sure we can get both we use the
> TARGET_NOCE_CONVERSION_PROFITABLE_P
> + hook to only allow "noce" to generate the patterns that are profitable.
> */
> +
> +bool
> +arm_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info
> *if_info)
> +{
> + if (!TARGET_COND_ARITH
> + || reload_completed)
> + return true;
> +
> + if (arm_is_v81m_cond_insn (seq))
> + return true;
> +
> + return false;
> +}
> +
> /* Output assembly to read the thread pointer from the appropriate TPIDR
> register into DEST. If PRED_P also emit the %? that can be used to
> output the predication code. */
>
> Kind Regards,
> Andre Vieira
> <pr116444.patch>
smime.p7s
Description: S/MIME cryptographic signature
