On Thu, Feb 13, 2025 at 5:31 PM Uros Bizjak <[email protected]> wrote:
>
> On Thu, Feb 13, 2025 at 1:58 AM H.J. Lu <[email protected]> wrote:
> >
> > x86 conditional branch (jcc) target can be either a label or a symbol.
> > Add a pass to fold tail call with jcc by turning:
> >
> > jcc .L6
> > ...
> > .L6:
> > jmp tailcall
> >
> > into:
> >
> > jcc tailcall
> >
> > After the basic block reordering pass, conditional branches look like:
> >
> > (jump_insn 7 6 14 2 (set (pc)
> > (if_then_else (eq (reg:CCZ 17 flags)
> > (const_int 0 [0]))
> > (label_ref:DI 23)
> > (pc))) "x.c":8:5 1458 {jcc}
> > (expr_list:REG_DEAD (reg:CCZ 17 flags)
> > (int_list:REG_BR_PROB 217325348 (nil)))
> > ...
> > (code_label 23 20 8 4 4 (nil) [1 uses])
> > (note 8 23 9 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> > (call_insn/j 9 8 10 4 (call (mem:QI (symbol_ref:DI ("bar") [flags 0x41]
> > <function_decl 0x7f4cff3c0b00 bar>) [0 bar S1 A8])
> > (const_int 0 [0])) "x.c":8:14 discrim 1 1469 {sibcall_di}
> > (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar") [flags 0x41]
> > <function_decl 0x7f4cff3c0b00 bar>)
> > (nil))
> > (nil))
> >
> > If the branch edge destination is a basic block that contains only a
> > direct sibcall, change the jcc target to the sibcall target and decrement
> > the use count of the destination basic block's entry label. Even though
> > the destination basic block is then unused, it must be kept: the RTL
> > control flow check requires it, and JUMP_LABEL of the conditional jump
> > can only point to a code label, not a code symbol. Dummy sibcall patterns
> > are added so that no code is generated for sibcalls in basic blocks
> > whose entry label use count is 0.
>
> This reads like you are trying to get around some checks in RTL
> control flow. So either the changes you are making to the RTX stream
> are not allowed (these checks are there for a reason), or the
> infrastructure is not (yet) prepared to handle this functionality.

The main issue is that JUMP_LABEL of the conditional jump can only
point to a code label, not a code symbol, so I have no choice but to
keep the destination basic block even when it is unused. If the
infrastructure allowed a symbol reference everywhere a label reference
is allowed, only x86 backend changes would be needed.

BTW, some targets, like arm, don't set the use count on referenced
labels. I will add a target hook so that targets can opt out of the
zero-use-count label handling.
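
As a rough illustration of the kind of source involved (a sketch only,
not one of the pr47253 tests; foo, bar and bar_calls are made-up names),
the call to bar below is in tail position and is reached through a
conditional branch. Built with -O2 on x86-64, I would expect this to
give the "jcc .L6 ... .L6: jmp bar" shape without the pass and "jcc bar"
with it, but the exact output depends on the compiler version and flags:

```c
/* Hypothetical example; all names here are illustrative assumptions. */
static int bar_calls;

/* noinline keeps bar as a real call target instead of being inlined. */
__attribute__((noinline))
void bar(void)
{
  bar_calls++;
}

__attribute__((noinline))
void foo(int x)
{
  /* The call is the last action on this path, so it can be emitted as
     a sibcall (a jump to bar) guarded by a conditional branch.  */
  if (x == 0)
    bar();
}
```

The counter is only there so the behavior can be checked at run time;
it plays no role in the transformation itself.
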
> Either way, please discuss with infrastructure maintainers (CC'd)
> first if the approach is correct and if these changes to RTX stream
> are allowed by the infra.
>
> Thanks,
> Uros.
>
> >
> > Jump tables like
> >
> > foo:
> > .cfi_startproc
> > cmpl $4, %edi
> > ja .L1
> > movl %edi, %edi
> > jmp *.L4(,%rdi,8)
> > .section .rodata
> > .L4:
> > .quad .L8
> > .quad .L7
> > .quad .L6
> > .quad .L5
> > .quad .L3
> > .text
> > .L5:
> > jmp bar3
> > .L3:
> > jmp bar4
> > .L8:
> > jmp bar0
> > .L7:
> > jmp bar1
> > .L6:
> > jmp bar2
> > .L1:
> > ret
> > .cfi_endproc
> >
> > can also be changed to:
> >
> > foo:
> > .cfi_startproc
> > cmpl $4, %edi
> > ja .L1
> > movl %edi, %edi
> > jmp *.L4(,%rdi,8)
> > .section .rodata
> > .L4:
> > .quad bar0
> > .quad bar1
> > .quad bar2
> > .quad bar3
> > .quad bar4
> > .text
> > .L1:
> > ret
> > .cfi_endproc
> >
> > After the basic block reordering pass, jump tables look like:
> >
> > (jump_table_data 16 15 17 (addr_vec:DI [
> > (label_ref:DI 18)
> > (label_ref:DI 22)
> > (label_ref:DI 26)
> > (label_ref:DI 30)
> > (label_ref:DI 34)
> > ]))
> > ...
> > (code_label 30 17 31 4 5 (nil) [1 uses])
> > (note 31 30 32 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> > (call_insn/j 32 31 33 4 (call (mem:QI (symbol_ref:DI ("bar3") [flags 0x41]
> > <function_decl 0x7f21be3c0e00 bar3>) [0 bar3 S1 A8])
> > (const_int 0 [0])) "j.c":15:13 1469 {sibcall_di}
> > (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar3") [flags 0x41]
> > <function_decl 0x7f21be3c0e00 bar3>)
> > (nil))
> > (nil))
> >
> > If a jump table entry points to a target basic block that contains only
> > a direct sibcall, change the entry to point to the sibcall target and
> > decrement the use count of the target basic block's entry label. Unless
> > the target basic block must be kept for JUMP_LABEL of a conditional
> > tail call, delete it once its entry label use count reaches 0.
> >
> > Update final_scan_insn_1 to skip a label if its use count is 0 and to
> > support symbol references in jump tables. Update create_trace_edges to
> > skip symbol references in jump tables.
> >
> > H.J. Lu (2):
> > x86: Add a pass to fold tail call
> > x86: Fold sibcall targets into jump table
> >
> > gcc/config/i386/i386-features.cc | 274 +++++++++++++++++++++
> > gcc/config/i386/i386-passes.def | 1 +
> > gcc/config/i386/i386-protos.h | 3 +
> > gcc/config/i386/i386.cc | 12 +
> > gcc/config/i386/i386.md | 57 ++++-
> > gcc/config/i386/predicates.md | 4 +
> > gcc/dwarf2cfi.cc | 7 +-
> > gcc/final.cc | 26 +-
> > gcc/testsuite/gcc.target/i386/pr14721-1a.c | 54 ++++
> > gcc/testsuite/gcc.target/i386/pr14721-1b.c | 37 +++
> > gcc/testsuite/gcc.target/i386/pr14721-1c.c | 37 +++
> > gcc/testsuite/gcc.target/i386/pr14721-2a.c | 58 +++++
> > gcc/testsuite/gcc.target/i386/pr14721-2b.c | 41 +++
> > gcc/testsuite/gcc.target/i386/pr14721-2c.c | 43 ++++
> > gcc/testsuite/gcc.target/i386/pr14721-3a.c | 56 +++++
> > gcc/testsuite/gcc.target/i386/pr14721-3b.c | 40 +++
> > gcc/testsuite/gcc.target/i386/pr14721-3c.c | 39 +++
> > gcc/testsuite/gcc.target/i386/pr47253-1a.c | 24 ++
> > gcc/testsuite/gcc.target/i386/pr47253-1b.c | 17 ++
> > gcc/testsuite/gcc.target/i386/pr47253-2a.c | 27 ++
> > gcc/testsuite/gcc.target/i386/pr47253-2b.c | 17 ++
> > gcc/testsuite/gcc.target/i386/pr47253-3a.c | 32 +++
> > gcc/testsuite/gcc.target/i386/pr47253-3b.c | 20 ++
> > gcc/testsuite/gcc.target/i386/pr47253-3c.c | 20 ++
> > gcc/testsuite/gcc.target/i386/pr47253-4a.c | 26 ++
> > gcc/testsuite/gcc.target/i386/pr47253-4b.c | 18 ++
> > gcc/testsuite/gcc.target/i386/pr47253-5.c | 15 ++
> > gcc/testsuite/gcc.target/i386/pr47253-6.c | 15 ++
> > gcc/testsuite/gcc.target/i386/pr47253-7a.c | 52 ++++
> > gcc/testsuite/gcc.target/i386/pr47253-7b.c | 36 +++
> > 30 files changed, 1097 insertions(+), 11 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1c.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2c.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3c.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-1a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-1b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-2a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-2b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3c.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-4a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-4b.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-5.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-6.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-7a.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-7b.c
> >
> > --
> > 2.48.1
> >
--
H.J.