Jeff Law <[email protected]> writes:
> On 1/8/24 04:52, Richard Sandiford wrote:
>> Jeff Law <[email protected]> writes:
>>> The other issue that's been in the back of my mind is costing. But I
>>> think the model here is combine without regards to cost.
>>
>> No, it does take costing into account. For size, it's the usual
>> "sum up the before and after insn costs and see which one is lower".
>> For speed, the costs are weighted by execution frequency, so e.g.
>> two insns of cost 4 in the same block can be combined into a single
>> instruction of cost 8, but a hoisted invariant can only be combined
>> into a loop body instruction if the loop body instruction's cost
>> doesn't increase significantly.
>>
>> This is done by rtl_ssa::changes_are_worthwhile.
> You're absolutely correct. My bad.
>
> Interesting that's exactly where we do have a notable concern.
Gah.
> If you remember, there were a few ports that failed to build
> newlib/libgcc that we initially ignored. I went back and looked at one
> (arc-elf).
>
> What appears to be happening for arc-elf is we're testing to see if the
> change is profitable. On arc-elf the costing model is highly dependent
> on the length of the insns.
>
> We've got a very reasonable looking insn:
>
>> (insn 3674 753 2851 98 (set (reg/v:SI 18 r18 [orig:300 inex ] [300])
>> (ashift:SI (reg:SI 27 fp [548])
>> (const_int 4 [0x4])))
>> "../../../..//newlib-cygwin/newlib/libc/stdlib/gdtoa-gdtoa.c":437:10 120
>> {*ashlsi3_insn}
>> (nil))
>
> We call rtl_ssa::changes_are_profitable -> insn_cost -> arc_insn_cost ->
> get_attr_length -> get_attr_length_1 -> insn_default_length
>
> insn_default_length grubs around looking at the operands via recog_data
> which appears to be stale:
>
>
>
>> (gdb) p debug_rtx(recog_data.operand[0])
>> (reg/v:SI 18 r18 [orig:300 inex ] [300])
>> $4 = void
>> (gdb) p debug_rtx(recog_data.operand[1])
>> (reg/v:SI 3 r3 [orig:300 inex ] [300])
>> $5 = void
>> (gdb) p debug_rtx(recog_data.operand[2])
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x0000000001432955 in rtx_writer::print_rtx (this=0x7fffffffe0e0,
>> in_rtx=0xabababababababab) at /home/jlaw/test/gcc/gcc/print-rtl.cc:809
>> 809 else if (GET_CODE (in_rtx) > NUM_RTX_CODE)
>
> Note the 0xabab.... That was accessing operand #2, which should have
> been (const_int 4).
>
> Sure enough if I force re-recognition then look at the recog_data I get
> the right values.
>
> After LRA we have:
>
>> (insn 753 2434 3674 98 (set (reg/v:SI 3 r3 [orig:300 inex ] [300])
>> (ashift:SI (reg:SI 27 fp [548])
>> (const_int 4 [0x4])))
>> "../../../..//newlib-cygwin/newlib/libc/stdlib/gdtoa-gdtoa.c":437:10 120
>> {*ashlsi3_insn}
>> (nil))
>> (insn 3674 753 2851 98 (set (reg/v:SI 18 r18 [orig:300 inex ] [300])
>> (reg/v:SI 3 r3 [orig:300 inex ] [300]))
>> "../../../..//newlib-cygwin/newlib/libc/stdlib/gdtoa-gdtoa.c":437:10 3
>> {*movsi_insn}
>> (nil))
>
> In the emergency dump in late_combine2 (so cleanup hasn't been done):
>
>> (insn 753 2434 3674 98 (set (reg/v:SI 3 r3 [orig:300 inex ] [300])
>> (ashift:SI (reg:SI 27 fp [548])
>> (const_int 4 [0x4])))
>> "../../../..//newlib-cygwin/newlib/libc/stdlib/gdtoa-gdtoa.c":437:10 120
>> {*ashlsi3_insn}
>> (nil))
>> (insn 3674 753 2851 98 (set (reg/v:SI 18 r18 [orig:300 inex ] [300])
>> (ashift:SI (reg:SI 27 fp [548])
>> (const_int 4 [0x4])))
>> "../../../..//newlib-cygwin/newlib/libc/stdlib/gdtoa-gdtoa.c":437:10 120
>> {*ashlsi3_insn}
>> (nil))
>
>
> Which brings us to the question. If we change the form of an insn, then
> ask for its cost, don't we need to make sure the insn is re-recognized
> as the costing function may do things like query the insn's length which
> would use cached recog_data?
Yeah, this only happens once we've verified that the new instruction
is valid. And it looks from the emergency dump above that the insn
code has been correctly updated to *ashlsi3_insn.
This is a bit of a hopeful stab, but is the problem that recog_data still
had the previous contents of insn 3674, and so extract_insn_cached wrongly
thought that it doesn't need to do anything? If so, does something like:
diff --git a/gcc/recog.cc b/gcc/recog.cc
index a6799e3f5e6..8ba63c78179 100644
--- a/gcc/recog.cc
+++ b/gcc/recog.cc
@@ -267,6 +267,8 @@ validate_change_1 (rtx object, rtx *loc, rtx new_rtx, bool
in_group,
case invalid. */
changes[num_changes].old_code = INSN_CODE (object);
INSN_CODE (object) = -1;
+ if (recog_data.insn == object)
+ recog_data.insn = nullptr;
}
num_changes++;
fix it? I suppose there's an argument that this belongs in whatever code
sets INSN_CODE to a new nonnegative value (so recog_level2 for RTL-SSA).
But doing it in validate_change_1 seems more robust, since anything
calling that function is considering changing the insn code.
Thanks for debugging the problem.
Richard