Hi Sudi,
On 22 March 2018 at 18:26, Sudakshina Das <sudi....@arm.com> wrote: > Hi > > > On 22/03/18 16:52, Kyrill Tkachov wrote: >> >> >> On 22/03/18 16:20, Sudakshina Das wrote: >>> >>> Hi Kyrill >>> >>> On 22/03/18 16:08, Kyrill Tkachov wrote: >>>> >>>> Hi Sudi, >>>> >>>> On 21/03/18 17:44, Sudakshina Das wrote: >>>>> >>>>> Hi >>>>> >>>>> The ICE in the bug report was happening because the macro >>>>> USE_RETURN_INSN (FALSE) was returning different values at different >>>>> points in the compilation. This was internally occurring because the >>>>> function arm_compute_static_chain_stack_bytes () which was dependent on >>>>> arm_r3_live_at_start_p () was giving a different value after the >>>>> cond_exec instructions were created in ce3 causing the liveness of r3 >>>>> to escape up to the start block. >>>>> >>>>> The function arm_compute_static_chain_stack_bytes () should really >>>>> only compute the value once during epilogue/prologue stage. This pass >>>>> introduces a new member 'static_chain_stack_bytes' to the target >>>>> definition of the struct machine_function which gets calculated in >>>>> expand_prologue and is the value that is returned by >>>>> arm_compute_static_chain_stack_bytes () beyond that. >>>>> >>>>> Testing done: Bootstrapped and regtested on arm-none-linux-gnueabihf >>>>> and added the reported test to the testsuite. >>>>> >>>>> Is this ok for trunk? >>>>> >>>> >>>> Thanks for working on this. >>>> I agree with the approach, I have a couple of comments inline. >>>> >>>>> Sudi >>>>> >>>>> >>>>> ChangeLog entries: >>>>> >>>>> *** gcc/ChangeLog *** >>>>> >>>>> 2018-03-21 Sudakshina Das <sudi....@arm.com> >>>>> >>>>> PR target/84826 >>>>> * config/arm/arm.h (machine_function): Add >>>>> static_chain_stack_bytes. >>>>> * config/arm/arm.c (arm_compute_static_chain_stack_bytes): >>>>> Avoid re-computing once computed. >>>>> (arm_expand_prologue): Compute >>>>> machine->static_chain_stack_bytes. >>>>> (arm_init_machine_status): Initialize >>>>> machine->static_chain_stack_bytes. >>>>> >>>>> *** gcc/testsuite/ChangeLog *** >>>>> >>>>> 2018-03-21 Sudakshina Das <sudi....@arm.com> >>>>> >>>>> PR target/84826 >>>>> * gcc.target/arm/pr84826.c: New test >>>> The new test fails on arm-none-linux-gnueabi --with-mode thumb --with-cpu cortex-a9 --with-fpu default Dejagnu flags: -march=armv5t Because: /gcc/testsuite/gcc.target/arm/pr84826.c: In function 'a': /gcc/testsuite/gcc.target/arm/pr84826.c:15:1: sorry, unimplemented: -fstack-check=specific for Thumb-1 compiler exited with status 1 FAIL: gcc.target/arm/pr84826.c (test for excess errors) You probably have to add a require-effective-target to skip the test in such cases. Thanks, Christophe >>>> >>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h >>>> index bbf3937..2809112 100644 >>>> --- a/gcc/config/arm/arm.h >>>> +++ b/gcc/config/arm/arm.h >>>> @@ -1384,6 +1384,9 @@ typedef struct GTY(()) machine_function >>>> machine_mode thumb1_cc_mode; >>>> /* Set to 1 after arm_reorg has started. */ >>>> int after_arm_reorg; >>>> + /* The number of bytes used to store the static chain register on the >>>> + stack, above the stack frame. */ >>>> + int static_chain_stack_bytes; >>>> } >>>> machine_function; >>>> #endif >>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c >>>> index cb6ab81..bc31810 100644 >>>> --- a/gcc/config/arm/arm.c >>>> +++ b/gcc/config/arm/arm.c >>>> @@ -19392,6 +19392,11 @@ arm_r3_live_at_start_p (void) >>>> static int >>>> arm_compute_static_chain_stack_bytes (void) >>>> { >>>> + /* Once the value is updated from the init value of -1, do not >>>> + re-compute. */ >>>> + if (cfun->machine->static_chain_stack_bytes != -1) >>>> + return cfun->machine->static_chain_stack_bytes; >>>> + >>>> >>>> My concern is that this approach caches the first value that is computed >>>> for static_chain_stack_bytes. >>>> I believe the layout frame code is called multiple times during register >>>> allocation as it goes through >>>> the motions and I think we want the last value it computes during reload >>>> >>>> How about we do something like: >>>> if (cfun->machine->static_chain_stack_bytes != -1 >>>> &&epilogue_completed) >>>> return cfun->machine->static_chain_stack_bytes; >>>> >>>> ?... >>>> >>>> /* See the defining assertion in arm_expand_prologue. */ >>>> if (IS_NESTED (arm_current_func_type ()) >>>> && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM) >>>> @@ -21699,6 +21704,11 @@ arm_expand_prologue (void) >>>> emit_insn (gen_movsi (stack_pointer_rtx, r1)); >>>> } >>>> >>>> + /* Let's compute the static_chain_stack_bytes required and store it. >>>> Right >>>> + now the value must the -1 as stored by arm_init_machine_status (). >>>> */ >>>> >>>> ... this comment would need to be tweaked as >>>> cfun->machine->static_chain_stack_bytes may hold >>>> an intermediate value computed in reload or some other point before >>>> epilogue_completed. >>>> >>>> + cfun->machine->static_chain_stack_bytes >>>> + = arm_compute_static_chain_stack_bytes (); >>>> + >>> >>> >>> Maybe I did not understand this completely, but my idea was that I am >>> initializing the value of cfun->machine->static_chain_stack_bytes to be -1 >>> in arm_init_machine_status () and then it stays as -1 all throughout reload >>> and hence the function arm_compute_static_chain_stack_bytes () will keep >>> computing the value instead of returning the cached value. Only during >>> expand_prologue (which I assumed occurs much after reload), I overwrite the >>> initial -1 and after that any call to arm_compute_static_chain_stack_bytes >>> () would return this cached value. >>> >>> I did start out writing the patch with a epilogue_completed check but >>> realized that even during this stage arm_compute_static_chain_stack_bytes () >>> was called several times and thus to avoid those re-computations, (again >>> assuming by this stage we already should have a fixed value) I re-wrote it >>> with the initialization to -1 approach. >>> >> >> Right, I had read the patch too quickly, sorry. >> You perform the caching of cfun->machine->static_chain_stack_bytes in >> arm_expand_prologue, not arm_compute_static_chain_stack_bytes. >> That does give you the right semantics I think. >> >> The patch is ok then with a small typo fix: > > > Thanks! Committed as r258777. > > Sudi > > >> >> + /* Let's compute the static_chain_stack_bytes required and store it. >> Right >> + now the value must the -1 as stored by arm_init_machine_status (). >> */ >> >> s/the/be/ >> >> + cfun->machine->static_chain_stack_bytes >> + = arm_compute_static_chain_stack_bytes (); >> + >> >> >> Thanks, >> Kyrill >> >