On 14/12/2023 07:17, Surya Kumari Jangala via Gcc wrote:
> Hi Richard,
> Thanks a lot for your response!
>
> Another failure reported by the Linaro CI is as follows:
>
> Running gcc:gcc.dg/dg.exp ...
> FAIL: gcc.dg/ira-shrinkwrap-prep-1.c scan-rtl-dump pro_and_epilogue "Performing shrink-wrapping"
> FAIL: gcc.dg/pr10474.c scan-rtl-dump pro_and_epilogue "Performing shrink-wrapping"
>
> I analyzed the failures and the root cause is the same for both the failures.
>
> The test pr10474.c is as follows:
>
> void f(int *i)
> {
>   if (!i)
>     return;
>   else
>     {
>       __builtin_printf("Hi");
>       *i=0;
>     }
> }
>
>
> With the patch (for PR111673), x1 (volatile) is being assigned to hold value
> of x0 (first parameter). Since it is a volatile, x1 is saved to the stack as
> there is a call later on. The save to the stack is generated in the LRA pass.
> The save is generated in the entry basic block. Due to the usage of the stack
> pointer in the entry bb, the testcase fails to be shrink wrapped.

I'm not entirely sure I understand what you mean from a quick glance. Do you
mean that X1 has the /v flag marked on it (i.e. it's printed in dumps as
"reg/v")? If so, that's not volatile; it just means that the register is
associated with a user variable (as opposed to a compiler-generated temporary
variable).

From the manual:

@item REG_USERVAR_P (@var{x})
In a @code{reg}, nonzero if it corresponds to a variable present in
the user's source code. Zero for temporaries generated internally by
the compiler. Stored in the @code{volatil} field and printed as
@samp{/v}.
There are several other cases where we re-use this bit on different RTL
constructs to mean things other than 'volatile': it pretty much only has the
conventional meaning on MEM objects.
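
To make the bit re-use concrete, here is a rough sketch in C. It is a toy model only, not the real definitions from rtl.h: the names toy_rtx, toy_code, toy_reg_uservar_p and toy_mem_volatile_p are made up for illustration, and the real rtx structure has many more fields and codes. The point is just that a single flag bit is read through different accessors depending on the RTL code.

```c
#include <stdbool.h>

/* Toy stand-in for an RTL expression (hypothetical, not GCC's rtx). */
enum toy_code { TOY_REG, TOY_MEM };

struct toy_rtx {
  enum toy_code code;
  unsigned volatil : 1;   /* one shared flag bit, shown as "/v" in dumps */
};

/* On a REG the bit means "corresponds to a user variable".  */
static bool
toy_reg_uservar_p (const struct toy_rtx *x)
{
  return x->code == TOY_REG && x->volatil;
}

/* On a MEM the same bit has the conventional "volatile access" meaning.  */
static bool
toy_mem_volatile_p (const struct toy_rtx *x)
{
  return x->code == TOY_MEM && x->volatil;
}
```

So in this model a "reg/v" is a TOY_REG with the bit set: toy_reg_uservar_p is true for it, while toy_mem_volatile_p is false.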
>
> The reason why LRA generates the store insn in the entry bb is as follows:
> LRA emits insns to save volatile registers in the inheritance/splitting pass.
> In this pass, LRA builds EBBs (Extended Basic Blocks) and traverses the insns
> in the EBBs in reverse order from the last insn to the first insn. When LRA
> sees a write to a pseudo (that has been assigned a volatile register), and
> there is a read following the write, with an intervening call insn between
> the write and read, then LRA generates a spill immediately after the write
> and a restore immediately before the read. In the above test, there is an EBB
> containing the entry bb and the bb with the printf call. In the entry bb,
> there is a write to x1 (basically a copy from x0 to x1) and in the printf bb,
> there is a read of x1 after the call insn. So LRA generates a spill in the
> entry bb.
>
> Without the patch, x19 is chosen to hold the value of x0. Since x19 is a
> non-volatile, the input RTL to the shrink wrap pass does not have any code to
> save x19 to the stack. Only the insn that copies x0 to x19 is present in the
> entry bb. In the shrink wrap pass, this insn is moved down the cfg to the bb
> containing the call to printf, thereby allowing prolog to be allocated only
> where needed. Thus shrink wrap succeeds.
>
>
> Shrink wrap can be made to succeed if the save of x1 occurs just before the
> call insn, instead of generating it after the write to x1. This will ensure
> that the spill does not occur in the entry bb. In fact, it is more efficient
> if the save occurs only in the path containing the printf call instead of
> occurring in the entry bb.
>
> I have a patch (bootstrapped and regtested on powerpc) that makes changes in
> LRA to save volatile registers before a call instead of after the write to the
> volatile. With this patch, both the above tests pass.
>
> Since the patch for PR111673 has been approved by Vladimir, I plan to
> commit the patch to trunk. And I will fix the test failures after doing the
> commit.
>

I think I'd probably understand this better if you could give some example RTL
(before and after). Do you have that?

R.

> Regards,
> Surya
>
>
>
> On 28/11/23 7:18 pm, Richard Earnshaw wrote:
>>
>>
>> On 28/11/2023 12:52, Surya Kumari Jangala wrote:
>>> Hi Richard,
>>> Thanks a lot for your response!
>>>
>>> Another failure reported by the Linaro CI is as follows:
>>> (Note: I am planning to send a separate mail for each failure, as this
>>> will make the discussion easy to track)
>>>
>>> FAIL: gcc.target/aarch64/sve/acle/general/cpy_1.c -march=armv8.2-a+sve
>>> -moverride=tune=none check-function-bodies dup_x0_m
>>>
>>> Expected code:
>>>
>>> ...
>>> add (x[0-9]+), x0, #?1
>>> mov (p[0-7])\.b, p15\.b
>>> mov z0\.d, \2/m, \1
>>> ...
>>> ret
>>>
>>>
>>> Code obtained w/o patch:
>>> addvl sp, sp, #-1
>>> str p15, [sp]
>>> add x0, x0, 1
>>> mov p3.b, p15.b
>>> mov z0.d, p3/m, x0
>>> ldr p15, [sp]
>>>