> -----Original Message----- > From: Ramana Radhakrishnan > Sent: Thursday, April 11, 2013 4:40 PM > To: Joey Ye > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH][ARM][thumb1] Reduce lr save for leaf function with > non-far jump > > On 12/20/12 09:53, Joey Ye wrote: > > Current GCC thumb1 has an annoying problem that always assuming far > branch. > > So it forces to save lr, even when unnecessarily. The most extreme > > case complained by partner is: > > > > // compiled with "-mthumb -mcpu=cortex-m0 -Os". > > void foo() { for (;;); } > > => > > foo: > > push {lr} // Crazy!!! > > .L2: > > b .L2 > > > > The reason is that thumb1 far jump is only resolved in the very late > > pass "shorten_branch". Prologue/epilogue pass doesn't actually know a > > branch is far or not from its attribute. It has to conservatively > > save/restore lr whenever there is a branch. > > > > This patch tries to fix it with a simple heuristic, i.e., using > > function size to decide if a far jump will likely be used. Function > > size information is meaningful in prologue/epilogue pass. The > > heuristic uses following check to decide if lr should be saved for far jump: > > > > function_size * 3 >= 2048 // yes: save lr for possible far jump. No: > > don't save lr for far jump > > > > The scheme has an issue: if some corner case does break above > > condition, there is no chance to fix-up but to ICE. But the heuristic > > condition is very conservative. It is base on the worse normal > > condition that each instruction is associated with a 4 byte literal ( (2+4)/2=3, > blooming size by 3 times ). > > I can't think of a real case to trigger the ICE. So I think it should work. > > > > Other approaches than the heuristic scheme are too expensive to > > implement for this small size/performance issue. I did explored some > > but none of them persuaded myself. > > > > Tests passed: > > * build libgcc, libstdc++, newlib, libm > > * make check-gcc with cpu=cortex-m0 > > * Small and extreme test cases > > > > ChangeLog: > > > > 2012-12-20 Joey Ye <joey...@arm.com> > > > > * config/arm/arm.c(thumb1_final_prescan_insn): > > Assert lr save for real far jump. > > (thumb_far_jump_used_p): Count instruction size and set > > far_jump_used. > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index > > 327ef22..ad79451 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) > > else if (conds != CONDS_NOCOND) > > cfun->machine->thumb1_cc_insn = NULL_RTX; > > } > > + > > + /* Check if unexpected far jump is used. */ > > + if (cfun->machine->lr_save_eliminated > > + && get_attr_far_jump (insn) == FAR_JUMP_YES) > > + internal_error("Unexpected thumb1 far jump"); > > } > > > > int > > @@ -21815,6 +21887,8 @@ static int > > thumb_far_jump_used_p (void) > > { > > rtx insn; > > + bool far_jump = false; > > + unsigned int func_size = 0; > > > > /* This test is only important for leaf functions. */ > > /* assert (!leaf_function_p ()); */ @@ -21870,6 +21944,26 @@ > > thumb_far_jump_used_p (void) > > && get_attr_far_jump (insn) == FAR_JUMP_YES > > ) > > { > > + far_jump = true; > > + } > > + func_size += get_attr_length (insn); > > + } > > + > > + /* Attribute far_jump will always be true for thumb1 before > > shorten_branch > > + pass. So checking far_jump attribute before shorten_branch isn't much > > + useful. > > + > > + Following heuristic tries to estimate more accurately if a far > > + jump > > may > > + finally be used. The heuristic is very conservative as there is > > + no > > chance > > + to roll-back the decision of not to use far jump. > > + > > + Thumb1 long branch offset is -2048 to 2046. The worst case is > > + each > > 2-byte > > + insn is associated with a 4 byte constant pool. Using function size > > + 2048/3 as the threshold is conservative enough. */ if > > + (far_jump) > > + { > > + if ((func_size * 3) >= 2048) > > + { > > /* Record the fact that we have decided that > > the function does use far jumps. */ > > cfun->machine->far_jump_used = 1; > > > > > > > > Check for 80 character line length in the comments above - I can never tell if > it is my mail client or yours. Further shorten the lines. > > Otherwise ok if no regressions.. Make check targeting Cortex-M0/M3 with qemu. No regression.
Committed as 197956 - Joey