Re: [Patch, ARM] PR71061, length pop* pattern in epilogue correctly
Hi Jiong, On 02/06/16 15:44, Jiong Wang wrote: On 31/05/16 15:15, Kyrill Tkachov wrote: +/* Compute the atrribute "length" of insn. Currently, this function is used + for "*load_multiple_with_writeback", "*pop_multiple_with_return" and + "*pop_multiple_with_writeback_and_return". */ + s/atrribute/attribute/ Also, please add a description of the function arguments to the function comment. +int +arm_attr_length_pop_multi(rtx *operands, bool return_pc, bool write_back_p) +{ space between function name and '('. + /* ARM mode. */ + if (TARGET_ARM) +return 4; + /* Thumb1 mode. */ + if (TARGET_THUMB1) +return 2; + + /* For POP/PUSH/LDM/STM under Thumb2 mode, we can use 16-bit Thumb encodings + if the register list is 8-bit. Normally this means all registers in the + list must be LO_REGS, that is (R0 -R7). If any HI_REGS used, then we must + use 32-bit encodings. But there are two exceptions to this rule where + special HI_REGS used in PUSH/POP. + + 1. 16-bit Thumb encoding POP can use an 8-bit register list, and an +additional bit, P, that corresponds to the PC. This means it can access +any of {PC, R7-R0}. It took me a bit to figure it out but this bit 'P' you're talking about is a just a bit in the illustration in the ARM Architecture Reference Manual (section A8.8.131). I don't think it's useful to refer to it. This would be better phrased as "The encoding is 16 bits wide if the register list contains only the registers in {R0 - R7} or the PC". + 2. 16-bit Thumb encoding PUSH can use an 8-bit register list, and an +additional bit, M , that corresponds to the LR. This means it can +access any of {LR, R7-R0}. */ + This function only deals with POP instructions, so talking about PUSH encodings is confusing. I suggest you drop point 2). Thanks, all fixed. Meanwhile I splitted the comments to keep PUSH part in arm_attr_length_push_multi. Patch updated. OK for trunk? gcc/ 2016-06-02 Jiong WangPR target/71061 * config/arm/arm-protos.h (arm_attr_length_pop_multi): New declaration. * config/arm/arm.c (arm_attr_length_pop_multi): New function to return length for pop patterns. (arm_attr_length_push_multi): Update comments. * config/arm/arm.md (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. Ok if bootstrap and test with --with-mode=thumb is successful. Thanks, Kyrill
Re: [Patch, ARM] PR71061, length pop* pattern in epilogue correctly
On 31/05/16 15:15, Kyrill Tkachov wrote: +/* Compute the atrribute "length" of insn. Currently, this function is used + for "*load_multiple_with_writeback", "*pop_multiple_with_return" and + "*pop_multiple_with_writeback_and_return". */ + s/atrribute/attribute/ Also, please add a description of the function arguments to the function comment. +int +arm_attr_length_pop_multi(rtx *operands, bool return_pc, bool write_back_p) +{ space between function name and '('. + /* ARM mode. */ + if (TARGET_ARM) +return 4; + /* Thumb1 mode. */ + if (TARGET_THUMB1) +return 2; + + /* For POP/PUSH/LDM/STM under Thumb2 mode, we can use 16-bit Thumb encodings + if the register list is 8-bit. Normally this means all registers in the + list must be LO_REGS, that is (R0 -R7). If any HI_REGS used, then we must + use 32-bit encodings. But there are two exceptions to this rule where + special HI_REGS used in PUSH/POP. + + 1. 16-bit Thumb encoding POP can use an 8-bit register list, and an +additional bit, P, that corresponds to the PC. This means it can access +any of {PC, R7-R0}. It took me a bit to figure it out but this bit 'P' you're talking about is a just a bit in the illustration in the ARM Architecture Reference Manual (section A8.8.131). I don't think it's useful to refer to it. This would be better phrased as "The encoding is 16 bits wide if the register list contains only the registers in {R0 - R7} or the PC". + 2. 16-bit Thumb encoding PUSH can use an 8-bit register list, and an +additional bit, M , that corresponds to the LR. This means it can +access any of {LR, R7-R0}. */ + This function only deals with POP instructions, so talking about PUSH encodings is confusing. I suggest you drop point 2). Thanks, all fixed. Meanwhile I splitted the comments to keep PUSH part in arm_attr_length_push_multi. Patch updated. OK for trunk? gcc/ 2016-06-02 Jiong WangPR target/71061 * config/arm/arm-protos.h (arm_attr_length_pop_multi): New declaration. * config/arm/arm.c (arm_attr_length_pop_multi): New function to return length for pop patterns. (arm_attr_length_push_multi): Update comments. * config/arm/arm.md (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 34fd06a..c707fd5 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -163,6 +163,7 @@ extern const char *arm_output_iwmmxt_shift_immediate (const char *, rtx *, bool) extern const char *arm_output_iwmmxt_tinsr (rtx *); extern unsigned int arm_sync_loop_insns (rtx , rtx *); extern int arm_attr_length_push_multi(rtx, rtx); +extern int arm_attr_length_pop_multi(rtx *, bool, bool); extern void arm_expand_compare_and_swap (rtx op[]); extern void arm_split_compare_and_swap (rtx op[]); extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 16499ce..350e46e 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -27795,7 +27795,7 @@ arm_preferred_rename_class (reg_class_t rclass) return NO_REGS; } -/* Compute the atrribute "length" of insn "*push_multi". +/* Compute the attribute "length" of insn "*push_multi". So this function MUST be kept in sync with that insn pattern. */ int arm_attr_length_push_multi(rtx parallel_op, rtx first_op) @@ -27812,6 +27812,11 @@ arm_attr_length_push_multi(rtx parallel_op, rtx first_op) /* Thumb2 mode. */ regno = REGNO (first_op); + /* For PUSH/STM under Thumb2 mode, we can use 16-bit encodings if the register + list is 8-bit. Normally this means all registers in the list must be + LO_REGS, that is (R0 -R7). If any HI_REGS used, then we must use 32-bit + encodings. There is one exception for PUSH that LR in HI_REGS can be used + with 16-bit encoding. */ hi_reg = (REGNO_REG_CLASS (regno) == HI_REGS) && (regno != LR_REGNUM); for (i = 1; i < num_saves && !hi_reg; i++) { @@ -27824,6 +27829,56 @@ arm_attr_length_push_multi(rtx parallel_op, rtx first_op) return 4; } +/* Compute the attribute "length" of insn. Currently, this function is used + for "*load_multiple_with_writeback", "*pop_multiple_with_return" and + "*pop_multiple_with_writeback_and_return". OPERANDS is the toplevel PARALLEL + rtx, RETURN_PC is true if OPERANDS contains return insn. WRITE_BACK_P is + true if OPERANDS contains insn which explicit updates base register. */ + +int +arm_attr_length_pop_multi (rtx *operands, bool return_pc, bool write_back_p) +{ + /* ARM mode. */ + if (TARGET_ARM) +return 4; + /* Thumb1 mode. */ + if (TARGET_THUMB1) +return 2; + + rtx parallel_op = operands[0]; + /* Initialize to elements number of PARALLEL.
Re: [Patch, ARM] PR71061, length pop* pattern in epilogue correctly
Hi Jiong, On 19/05/16 09:19, Jiong Wang wrote: On 13/05/16 14:54, Jiong Wang wrote: For thumb mode, this is causing wrong size calculation and may affect some rtl pass, for example bb-order where copy_bb_p needs accurate insn length info. This have eventually part of the reason for https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00639.html where bb-order failed to do the bb copy. For the fix, I think we should extend arm_attr_length_push_multi to pop* pattern. OK for trunk? 2016-05-13 Jiong. Wanggcc/ PR target/71061 * config/arm/arm-protos.h (arm_attr_length_push_multi): Rename to "arm_attr_length_pp_multi". Add one parameter "first_index". * config/arm/arm.md (*push_multi): Use new function. (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. Wilco pointed out offline the new length function is wrong as for pop we should check PC_REG instead of LR_REG, meanwhile these pop patterns may allow ldm thus base register needs to be checked also. I updated this patch to address these issues. OK for trunk ? I agree with the idea of the patch, but I have some comments inline. gcc/ 2016-05-19 Jiong Wang PR target/71061 * config/arm/arm-protos.h (arm_attr_length_pop_multi): New declaration. * config/arm/arm.c (arm_attr_length_pop_multi): New function to return length for pop patterns. * config/arm/arm.md (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index d8179c4..5dd3e0d 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -163,6 +163,7 @@ extern const char *arm_output_iwmmxt_shift_immediate (const char *, rtx *, bool) extern const char *arm_output_iwmmxt_tinsr (rtx *); extern unsigned int arm_sync_loop_insns (rtx , rtx *); extern int arm_attr_length_push_multi(rtx, rtx); +extern int arm_attr_length_pop_multi(rtx *, bool, bool); extern void arm_expand_compare_and_swap (rtx op[]); extern void arm_split_compare_and_swap (rtx op[]); extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index c3f74dc..4d19d3a 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -27812,6 +27812,65 @@ arm_attr_length_push_multi(rtx parallel_op, rtx first_op) return 4; } +/* Compute the atrribute "length" of insn. Currently, this function is used + for "*load_multiple_with_writeback", "*pop_multiple_with_return" and + "*pop_multiple_with_writeback_and_return". */ + s/atrribute/attribute/ Also, please add a description of the function arguments to the function comment. +int +arm_attr_length_pop_multi(rtx *operands, bool return_pc, bool write_back_p) +{ space between function name and '('. + /* ARM mode. */ + if (TARGET_ARM) +return 4; + /* Thumb1 mode. */ + if (TARGET_THUMB1) +return 2; + + /* For POP/PUSH/LDM/STM under Thumb2 mode, we can use 16-bit Thumb encodings + if the register list is 8-bit. Normally this means all registers in the + list must be LO_REGS, that is (R0 -R7). If any HI_REGS used, then we must + use 32-bit encodings. But there are two exceptions to this rule where + special HI_REGS used in PUSH/POP. + + 1. 16-bit Thumb encoding POP can use an 8-bit register list, and an +additional bit, P, that corresponds to the PC. This means it can access + any of {PC, R7-R0}. It took me a bit to figure it out but this bit 'P' you're talking about is a just a bit in the illustration in the ARM Architecture Reference Manual (section A8.8.131). I don't think it's useful to refer to it. This would be better phrased as "The encoding is 16 bits wide if the register list contains only the registers in {R0 - R7} or the PC". + 2. 16-bit Thumb encoding PUSH can use an 8-bit register list, and an + additional bit, M , that corresponds to the LR. This means it can + access any of {LR, R7-R0}. */ + This function only deals with POP instructions, so talking about PUSH encodings is confusing. I suggest you drop point 2). + rtx parallel_op = operands[0]; + /* Initialize to elements number of PARALLEL. */ + unsigned indx = XVECLEN (parallel_op, 0) - 1; + /* Initialize the value to base register. */ + unsigned regno = REGNO (operands[1]); + /* Skip return and write back pattern. + We only need register pop pattern for later analysis. */ + unsigned first_indx = 0; + first_indx += return_pc ? 1 : 0; + first_indx += write_back_p ? 1 : 0; + + /* A pop operation can be done through LDM or POP. If the base register is SP + and if it's with write back, then a LDM will be alias of POP. */ + bool pop_p = (regno == SP_REGNUM &&
Re: [Patch, ARM] PR71061, length pop* pattern in epilogue correctly
On 13/05/16 14:54, Jiong Wang wrote: For thumb mode, this is causing wrong size calculation and may affect some rtl pass, for example bb-order where copy_bb_p needs accurate insn length info. This have eventually part of the reason for https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00639.html where bb-order failed to do the bb copy. For the fix, I think we should extend arm_attr_length_push_multi to pop* pattern. OK for trunk? 2016-05-13 Jiong. Wanggcc/ PR target/71061 * config/arm/arm-protos.h (arm_attr_length_push_multi): Rename to "arm_attr_length_pp_multi". Add one parameter "first_index". * config/arm/arm.md (*push_multi): Use new function. (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. Wilco pointed out offline the new length function is wrong as for pop we should check PC_REG instead of LR_REG, meanwhile these pop patterns may allow ldm thus base register needs to be checked also. I updated this patch to address these issues. OK for trunk ? gcc/ 2016-05-19 Jiong Wang PR target/71061 * config/arm/arm-protos.h (arm_attr_length_pop_multi): New declaration. * config/arm/arm.c (arm_attr_length_pop_multi): New function to return length for pop patterns. * config/arm/arm.md (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index d8179c4..5dd3e0d 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -163,6 +163,7 @@ extern const char *arm_output_iwmmxt_shift_immediate (const char *, rtx *, bool) extern const char *arm_output_iwmmxt_tinsr (rtx *); extern unsigned int arm_sync_loop_insns (rtx , rtx *); extern int arm_attr_length_push_multi(rtx, rtx); +extern int arm_attr_length_pop_multi(rtx *, bool, bool); extern void arm_expand_compare_and_swap (rtx op[]); extern void arm_split_compare_and_swap (rtx op[]); extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index c3f74dc..4d19d3a 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -27812,6 +27812,65 @@ arm_attr_length_push_multi(rtx parallel_op, rtx first_op) return 4; } +/* Compute the atrribute "length" of insn. Currently, this function is used + for "*load_multiple_with_writeback", "*pop_multiple_with_return" and + "*pop_multiple_with_writeback_and_return". */ + +int +arm_attr_length_pop_multi(rtx *operands, bool return_pc, bool write_back_p) +{ + /* ARM mode. */ + if (TARGET_ARM) +return 4; + /* Thumb1 mode. */ + if (TARGET_THUMB1) +return 2; + + /* For POP/PUSH/LDM/STM under Thumb2 mode, we can use 16-bit Thumb encodings + if the register list is 8-bit. Normally this means all registers in the + list must be LO_REGS, that is (R0 -R7). If any HI_REGS used, then we must + use 32-bit encodings. But there are two exceptions to this rule where + special HI_REGS used in PUSH/POP. + + 1. 16-bit Thumb encoding POP can use an 8-bit register list, and an +additional bit, P, that corresponds to the PC. This means it can access + any of {PC, R7-R0}. + 2. 16-bit Thumb encoding PUSH can use an 8-bit register list, and an + additional bit, M , that corresponds to the LR. This means it can + access any of {LR, R7-R0}. */ + + rtx parallel_op = operands[0]; + /* Initialize to elements number of PARALLEL. */ + unsigned indx = XVECLEN (parallel_op, 0) - 1; + /* Initialize the value to base register. */ + unsigned regno = REGNO (operands[1]); + /* Skip return and write back pattern. + We only need register pop pattern for later analysis. */ + unsigned first_indx = 0; + first_indx += return_pc ? 1 : 0; + first_indx += write_back_p ? 1 : 0; + + /* A pop operation can be done through LDM or POP. If the base register is SP + and if it's with write back, then a LDM will be alias of POP. */ + bool pop_p = (regno == SP_REGNUM && write_back_p); + bool ldm_p = !pop_p; + + /* Check base register for LDM. */ + if (ldm_p && REGNO_REG_CLASS (regno) == HI_REGS) +return 4; + + /* Check each register in the list. */ + for (; indx >= first_indx; indx--) +{ + regno = REGNO (XEXP (XVECEXP (parallel_op, 0, indx), 0)); + if (REGNO_REG_CLASS (regno) == HI_REGS + && (regno != PC_REGNUM || ldm_p)) + return 4; +} + + return 2; +} + /* Compute the number of instructions emitted by output_move_double. */ int arm_count_output_move_double_insns (rtx *operands) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 4049f10..d746690 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -10565,7 +10565,11 @@ } " [(set_attr "type" "load4") - (set_attr
[Patch, ARM] PR71061, length pop* pattern in epilogue correctly
For thumb mode, this is causing wrong size calculation and may affect some rtl pass, for example bb-order where copy_bb_p needs accurate insn length info. This have eventually part of the reason for https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00639.html where bb-order failed to do the bb copy. For the fix, I think we should extend arm_attr_length_push_multi to pop* pattern. OK for trunk? 2016-05-13 Jiong. Wanggcc/ PR target/71061 * config/arm/arm-protos.h (arm_attr_length_push_multi): Rename to "arm_attr_length_pp_multi". Add one parameter "first_index". * config/arm/arm.md (*push_multi): Use new function. (*load_multiple_with_writeback): Set "length" attribute. (*pop_multiple_with_writeback_and_return): Likewise. (*pop_multiple_with_return): Likewise. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index d8179c4..d9a09c0 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -162,7 +162,7 @@ extern const char *arm_output_shift(rtx *, int); extern const char *arm_output_iwmmxt_shift_immediate (const char *, rtx *, bool); extern const char *arm_output_iwmmxt_tinsr (rtx *); extern unsigned int arm_sync_loop_insns (rtx , rtx *); -extern int arm_attr_length_push_multi(rtx, rtx); +extern int arm_attr_length_pp_multi(rtx, rtx, int); extern void arm_expand_compare_and_swap (rtx op[]); extern void arm_split_compare_and_swap (rtx op[]); extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 71b5143..0ba98e1 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -27729,14 +27729,15 @@ arm_preferred_rename_class (reg_class_t rclass) return NO_REGS; } -/* Compute the atrribute "length" of insn "*push_multi". - So this function MUST be kept in sync with that insn pattern. */ +/* Compute the attribute "length" of an insn which performs a push/pop on + multiple registers. So this function MUST be kept in sync with that insn + pattern. PARALLEL_OP is the toplevel PARALLEL rtx, FIRST_OP is the first + push/pop register. FIRST_INDEX is the element index inside PARALLEL_OP for + the first register push/pop rtx. */ + int -arm_attr_length_push_multi(rtx parallel_op, rtx first_op) +arm_attr_length_pp_multi(rtx parallel_op, rtx first_op, int first_index) { - int i, regno, hi_reg; - int num_saves = XVECLEN (parallel_op, 0); - /* ARM mode. */ if (TARGET_ARM) return 4; @@ -27744,18 +27745,31 @@ arm_attr_length_push_multi(rtx parallel_op, rtx first_op) if (TARGET_THUMB1) return 2; - /* Thumb2 mode. */ - regno = REGNO (first_op); - hi_reg = (REGNO_REG_CLASS (regno) == HI_REGS) && (regno != LR_REGNUM); - for (i = 1; i < num_saves && !hi_reg; i++) + /* Thumb2 mode. + For the pattern "*push_multi", the register for the first push is kept in + the first UNSPEC rtx inside parallel, all other registers are kept in the + later USE rtxes. For pop* pattern, each register pop is cleanly + represented by an (set (reg) (mem)). + + So we can't always use REGNO (XEXP (input, 0)) to fetch the first register, + thus it's passed as argument. Then we iterate the register list from the + last to the first, as the high register is usually at the end so we can + return earlier. */ + + unsigned int regno = REGNO (first_op); + if ((REGNO_REG_CLASS (regno) == HI_REGS) && (regno != LR_REGNUM)) +return 4; + + int i = XVECLEN (parallel_op, 0) - 1; + gcc_assert (first_index >= 0 && first_index <= i); + for (; i > first_index; i--) { regno = REGNO (XEXP (XVECEXP (parallel_op, 0, i), 0)); - hi_reg |= (REGNO_REG_CLASS (regno) == HI_REGS) && (regno != LR_REGNUM); + if (REGNO_REG_CLASS (regno) == HI_REGS && (regno != LR_REGNUM)) + return 4; } - if (!hi_reg) -return 2; - return 4; + return 2; } /* Compute the number of instructions emitted by output_move_double. */ diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 7cf87ef..1e175f6 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -10488,7 +10488,7 @@ ;; expressions. For simplicity, the first register is also in the unspec ;; part. ;; To avoid the usage of GNU extension, the length attribute is computed -;; in a C function arm_attr_length_push_multi. +;; in a C function arm_attr_length_pp_multi. (define_insn "*push_multi" [(match_parallel 2 "multi_register_push" [(set (match_operand:BLK 0 "push_mult_memory_operand" "") @@ -10530,7 +10530,7 @@ }" [(set_attr "type" "store4") (set (attr "length") - (symbol_ref "arm_attr_length_push_multi (operands[2], operands[1])"))] + (symbol_ref "arm_attr_length_pp_multi (operands[2], operands[1], 0)"))] ) (define_insn "stack_tie" @@ -10565,7 +10565,9 @@ } " [(set_attr "type" "load4") - (set_attr "predicable" "yes")] + (set_attr "predicable" "yes") + (set (attr "length") +