Hongtao Liu via Gcc-patches <[email protected]> writes:
> On Mon, May 17, 2021 at 5:56 PM Richard Sandiford
> <[email protected]> wrote:
>> It looks like the rtx “used” flag is unused for INSNs, so we could
>> use that as a CALL_INSN flag that indicates a fake call. We would just
>> need to make:
>>
>> /* For all other RTXes clear the used flag on the copy. */
>> RTX_FLAG (copy, used) = 0;
>>
>> conditional on !INSN_P.
>>
> I got another error in
>
> @@ -83,6 +83,9 @@ control_flow_insn_p (const rtx_insn *insn)
> return true;
>
> case CALL_INSN:
> + /* CALL_INSN use "used" flag to indicate it's a fake call. */
> + if (RTX_FLAG (insn, used))
> + break;
I guess this is because of the nonlocal_goto condition? If so, that
could be fixed by adding a REG_EH_REGION note of INT_MIN. Even if we
don't do that, I think the fix belongs in nonlocal_goto instead.
> and a performance issue in
>
> modified gcc/final.c
> @@ -4498,7 +4498,8 @@ leaf_function_p (void)
> for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> {
> if (CALL_P (insn)
> - && ! SIBLING_CALL_P (insn))
> + && ! SIBLING_CALL_P (insn)
> + && !RTX_FLAG (insn, used))
> return 0;
> if (NONJUMP_INSN_P (insn)
>
> Also, I grepped for CALL_P and CALL_INSN in the GCC sources, and there
> are many places that assume CALL_P/CALL_INSN is a real call.
> Considering that vzeroupper is used a lot in the i386 backend, I'm a
> bit worried that this approach will turn out to be a bottomless
> pit.
Maybe, but I think the same is true for CLOBBER_HIGH. If we have
a third alternative then we should consider it, but I think the
call approach is still going to be less problematic than CLOBBER_HIGH.
The main advantage of the call approach is that the CALL_P handling
is (mostly) conservatively correct, and each performance problem is
just a one-line fix. The CLOBBER_HIGH approach instead requires
changes to the way that passes track liveness information for
non-call instructions (so is much more than a one-line change).
Also, treating a CLOBBER_HIGH like a CLOBBER isn't conservatively
correct, because other code might be relying on part of the register
being preserved.
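To make the "conservatively correct, one-line fix" point concrete, here
is a toy, self-contained sketch modelled on the leaf_function_p hunk
quoted above. It is not GCC code: toy_insn, toy_code and both functions
are hypothetical stand-ins, and the "used" field only imitates
RTX_FLAG (insn, used) being set on a fake call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for insn codes; these are NOT the real rtl.h definitions.  */
enum toy_code { TOY_INSN, TOY_CALL_INSN };

struct toy_insn
{
  enum toy_code code;
  bool sibling_call;      /* Imitates SIBLING_CALL_P.  */
  bool used;              /* Imitates RTX_FLAG (insn, used): a fake call.  */
  struct toy_insn *next;  /* Imitates NEXT_INSN.  */
};

/* Unmodified check: any non-sibling call makes the function non-leaf.
   A fake call is treated as a real one, which is pessimistic but safe.  */
bool
toy_leaf_function_p_conservative (const struct toy_insn *insn)
{
  for (; insn; insn = insn->next)
    if (insn->code == TOY_CALL_INSN && !insn->sibling_call)
      return false;
  return true;
}

/* Same check with the one extra condition from the quoted hunk:
   fake calls no longer stop the function from being a leaf.  */
bool
toy_leaf_function_p (const struct toy_insn *insn)
{
  for (; insn; insn = insn->next)
    if (insn->code == TOY_CALL_INSN
        && !insn->sibling_call
        && !insn->used)
      return false;
  return true;
}
```

For a function whose only call is a fake one, the first version answers
"not a leaf" (a missed optimisation, not wrong code) and the second
answers "leaf". A function with a real call is rejected by both.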
Thanks,
Richard