https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99600

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, just to document what GCC 10 does:
(insn 38 37 15 3 (set (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
        (plus:DI (mult:DI (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                (const_int 4 [0x4]))
            (const_int 4 [0x4]))) "pr99600.c":8:25 182 {*leadi}
     (nil))
after RA before split2 (like in GCC 11).
split2 makes:
(insn 44 43 45 3 (parallel [
            (set (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                (ashift:DI (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                    (const_int 2 [0x2])))
            (clobber (reg:CC 17 flags))
        ]) "pr99600.c":8:25 592 {*ashldi3_1}
     (nil))
(insn 45 44 15 3 (parallel [
            (set (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                (plus:DI (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) "pr99600.c":8:25 186 {*adddi_1}
     (nil))
out of that because lea is expensive on atom.
Then peephole2 triggers and undoes that using the 2nd pattern mentioned in
there (but apparently not perfectly):
(insn 56 55 57 3 (set (reg:DI 1 dx)
        (const_int 4 [0x4])) "pr99600.c":8:25 -1
     (nil))
(insn 57 56 15 3 (set (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
        (plus:DI (reg:DI 1 dx)
            (mult:DI (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                (const_int 4 [0x4])))) "pr99600.c":8:25 -1
     (nil))
and finally split3 applies the lea split up again:
(insn 56 55 66 3 (set (reg:DI 1 dx)
        (const_int 4 [0x4])) "pr99600.c":8:25 66 {*movdi_internal}
     (nil))
(insn 66 56 67 3 (parallel [
            (set (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                (ashift:DI (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                    (const_int 2 [0x2])))
            (clobber (reg:CC 17 flags))
        ]) "pr99600.c":8:25 592 {*ashldi3_1}
     (nil))
(insn 67 66 15 3 (parallel [
            (set (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                (plus:DI (reg:DI 0 ax [orig:84 iftmp.1_3 ] [84])
                    (reg:DI 1 dx)))
            (clobber (reg:CC 17 flags))
        ]) "pr99600.c":8:25 186 {*adddi_1}
     (nil))
But because each of those do-it, undo-it, do-it-again operations happens in a
separate pass, the compiler does not hang.
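
For reference, the three instruction shapes in the dumps above all compute the
same value, x*4 + 4. The following is only a C-level sketch of that
equivalence, not anything taken from GCC; the function names are made up for
illustration, and unsigned 64-bit arithmetic is used on the assumption that it
models the wrap-around behaviour of the DImode operations.

```c
#include <stdint.h>

/* Shape of insn 38: a single lea computing x*4 + 4. */
uint64_t lea_form(uint64_t x)
{
    return x * 4 + 4;
}

/* Shape after split2 (insns 44 and 45): shift left by 2, then add the
   immediate 4. */
uint64_t shift_add_form(uint64_t x)
{
    x <<= 2;
    x += 4;
    return x;
}

/* Shape after peephole2 and split3 (insns 56, 66 and 67): the constant is
   first loaded into a scratch register (dx), then shift, then a
   register-register add. */
uint64_t scratch_form(uint64_t x)
{
    uint64_t dx = 4;
    x <<= 2;
    x += dx;
    return x;
}
```

The last form costs one extra instruction and one extra register compared with
the second, which is the "not perfectly" part noted above.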

This means that I think the best fix is to FAIL in the second peephole2 if the
constructed address for lea is undesirable.
And maybe, for GCC 12, optimize that peephole2 so that it doesn't force into
registers something that could be an immediate.
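
To illustrate why a FAIL in the peephole2 would break the cycle, here is a toy
model in C. It is not GCC code: every name in it is hypothetical, and it
assumes the hang comes from the splitter and the peephole feeding each other
without a pass boundary in between. The "guard" flag stands in for the
proposed check that the constructed lea address is undesirable.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in GCC. */
enum form { FORM_LEA, FORM_SHIFT_ADD };

/* Splitter: breaks an lea into shift + add when lea is expensive. */
enum form toy_split(enum form f, bool lea_expensive)
{
    return (f == FORM_LEA && lea_expensive) ? FORM_SHIFT_ADD : f;
}

/* Peephole: recombines shift + add into an lea.  With the guard enabled
   it "FAILs" (leaves the input alone) when the lea would be undesirable. */
enum form toy_peephole(enum form f, bool lea_expensive, bool guard)
{
    if (f != FORM_SHIFT_ADD)
        return f;
    if (guard && lea_expensive)
        return f;               /* models FAIL */
    return FORM_LEA;
}

/* Apply both rewrites back to back until a round changes nothing.
   Returns the number of rounds needed, or -1 if the limit is reached
   (standing in for a hang). */
int toy_rounds_to_fixpoint(bool lea_expensive, bool guard, int limit)
{
    enum form f = FORM_LEA;
    for (int round = 1; round <= limit; round++) {
        bool changed = false;
        enum form g = toy_split(f, lea_expensive);
        changed |= (g != f);
        f = g;
        g = toy_peephole(f, lea_expensive, guard);
        changed |= (g != f);
        f = g;
        if (!changed)
            return round;
    }
    return -1;
}
```

In this model, with lea expensive and no guard, every round both splits and
recombines, so the loop never settles; with the guard, the peephole declines
and the shift + add form is kept.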
