Hi Sudi,

On 22 March 2018 at 18:26, Sudakshina Das <sudi....@arm.com> wrote:
> Hi
>
>
> On 22/03/18 16:52, Kyrill Tkachov wrote:
>>
>>
>> On 22/03/18 16:20, Sudakshina Das wrote:
>>>
>>> Hi Kyrill
>>>
>>> On 22/03/18 16:08, Kyrill Tkachov wrote:
>>>>
>>>> Hi Sudi,
>>>>
>>>> On 21/03/18 17:44, Sudakshina Das wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> The ICE in the bug report was happening because the macro
>>>>> USE_RETURN_INSN (FALSE) was returning different values at different
>>>>> points in the compilation. This was internally occurring because the
>>>>> function arm_compute_static_chain_stack_bytes () which was dependent on
>>>>> arm_r3_live_at_start_p () was giving a different value after the
>>>>> cond_exec instructions were created in ce3 causing the liveness of r3
>>>>> to escape up to the start block.
>>>>>
>>>>> The function arm_compute_static_chain_stack_bytes () should really
>>>>> only compute the value once during epilogue/prologue stage. This pass
>>>>> introduces a new member 'static_chain_stack_bytes' to the target
>>>>> definition of the struct machine_function which gets calculated in
>>>>> expand_prologue and is the value that is returned by
>>>>> arm_compute_static_chain_stack_bytes () beyond that.
>>>>>
>>>>> Testing done: Bootstrapped and regtested on arm-none-linux-gnueabihf
>>>>> and added the reported test to the testsuite.
>>>>>
>>>>> Is this ok for trunk?
>>>>>
>>>>
>>>> Thanks for working on this.
>>>> I agree with the approach, I have a couple of comments inline.
>>>>
>>>>> Sudi
>>>>>
>>>>>
>>>>> ChangeLog entries:
>>>>>
>>>>> *** gcc/ChangeLog ***
>>>>>
>>>>> 2018-03-21  Sudakshina Das  <sudi....@arm.com>
>>>>>
>>>>>         PR target/84826
>>>>>         * config/arm/arm.h (machine_function): Add
>>>>>         static_chain_stack_bytes.
>>>>>         * config/arm/arm.c (arm_compute_static_chain_stack_bytes):
>>>>>         Avoid re-computing once computed.
>>>>>         (arm_expand_prologue): Compute
>>>>> machine->static_chain_stack_bytes.
>>>>>         (arm_init_machine_status): Initialize
>>>>>         machine->static_chain_stack_bytes.
>>>>>
>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>
>>>>> 2018-03-21  Sudakshina Das  <sudi....@arm.com>
>>>>>
>>>>>         PR target/84826
>>>>>         * gcc.target/arm/pr84826.c: New test
>>>>
The new test fails on
arm-none-linux-gnueabi
--with-mode thumb
--with-cpu cortex-a9
--with-fpu default
Dejagnu flags: -march=armv5t

Because:
/gcc/testsuite/gcc.target/arm/pr84826.c: In function 'a':
/gcc/testsuite/gcc.target/arm/pr84826.c:15:1: sorry, unimplemented:
-fstack-check=specific for Thumb-1
compiler exited with status 1
FAIL: gcc.target/arm/pr84826.c (test for excess errors)

You probably have to add a require-effective-target to skip the test
in such cases.

Thanks,

Christophe

>>>>
>>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>>> index bbf3937..2809112 100644
>>>> --- a/gcc/config/arm/arm.h
>>>> +++ b/gcc/config/arm/arm.h
>>>> @@ -1384,6 +1384,9 @@ typedef struct GTY(()) machine_function
>>>>     machine_mode thumb1_cc_mode;
>>>>     /* Set to 1 after arm_reorg has started.  */
>>>>     int after_arm_reorg;
>>>> +  /* The number of bytes used to store the static chain register on the
>>>> +     stack, above the stack frame.  */
>>>> +  int static_chain_stack_bytes;
>>>>   }
>>>>   machine_function;
>>>>   #endif
>>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>>> index cb6ab81..bc31810 100644
>>>> --- a/gcc/config/arm/arm.c
>>>> +++ b/gcc/config/arm/arm.c
>>>> @@ -19392,6 +19392,11 @@ arm_r3_live_at_start_p (void)
>>>>   static int
>>>>   arm_compute_static_chain_stack_bytes (void)
>>>>   {
>>>> +  /* Once the value is updated from the init value of -1, do not
>>>> +     re-compute.  */
>>>> +  if (cfun->machine->static_chain_stack_bytes != -1)
>>>> +    return cfun->machine->static_chain_stack_bytes;
>>>> +
>>>>
>>>> My concern is that this approach caches the first value that is computed
>>>> for static_chain_stack_bytes.
>>>> I believe the layout frame code is called multiple times during register
>>>> allocation as it goes through
>>>> the motions and I think we want the last value it computes during reload
>>>>
>>>> How about we do something like:
>>>>    if (cfun->machine->static_chain_stack_bytes != -1
>>>> &&epilogue_completed)
>>>>      return cfun->machine->static_chain_stack_bytes;
>>>>
>>>> ?...
>>>>
>>>>   /* See the defining assertion in arm_expand_prologue.  */
>>>>     if (IS_NESTED (arm_current_func_type ())
>>>>         && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
>>>> @@ -21699,6 +21704,11 @@ arm_expand_prologue (void)
>>>>         emit_insn (gen_movsi (stack_pointer_rtx, r1));
>>>>       }
>>>>
>>>> +  /* Let's compute the static_chain_stack_bytes required and store it.
>>>> Right
>>>> +     now the value must the -1 as stored by arm_init_machine_status ().
>>>> */
>>>>
>>>> ... this comment would need to be tweaked as
>>>> cfun->machine->static_chain_stack_bytes may hold
>>>> an intermediate value computed in reload or some other point before
>>>> epilogue_completed.
>>>>
>>>> +  cfun->machine->static_chain_stack_bytes
>>>> +    = arm_compute_static_chain_stack_bytes ();
>>>> +
>>>
>>>
>>> Maybe I did not understand this completely, but my idea was that I am
>>> initializing the value of cfun->machine->static_chain_stack_bytes to be -1
>>> in arm_init_machine_status () and then it stays as -1 all throughout reload
>>> and hence the function arm_compute_static_chain_stack_bytes () will keep
>>> computing the value instead of returning the cached value. Only during
>>> expand_prologue (which I assumed occurs much after reload), I overwrite the
>>> initial -1 and after that any call to arm_compute_static_chain_stack_bytes
>>> () would return this cached value.
>>>
>>> I did start out writing the patch with a epilogue_completed check but
>>> realized that even during this stage arm_compute_static_chain_stack_bytes ()
>>> was called several times and thus to avoid those re-computations, (again
>>> assuming by this stage we already should have a fixed value) I re-wrote it
>>> with the initialization to -1 approach.
>>>
>>
>> Right, I had read the patch too quickly, sorry.
>> You perform the caching of cfun->machine->static_chain_stack_bytes in
>> arm_expand_prologue, not arm_compute_static_chain_stack_bytes.
>> That does give you the right semantics I think.
>>
>> The patch is ok then with a small typo fix:
>
>
> Thanks! Committed as r258777.
>
> Sudi
>
>
>>
>> +  /* Let's compute the static_chain_stack_bytes required and store it.
>> Right
>> +     now the value must the -1 as stored by arm_init_machine_status ().
>> */
>>
>> s/the/be/
>>
>> +  cfun->machine->static_chain_stack_bytes
>> +    = arm_compute_static_chain_stack_bytes ();
>> +
>>
>>
>> Thanks,
>> Kyrill
>>
>

Reply via email to