On Wed, Jan 24, 2024 at 7:36 PM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > Changes in v3:
> >
> > 1. Rebase against commit 02e68389494
> > 2. Don't add call_no_callee_saved_registers to machine_function since
> > all callee-saved registers are properly clobbered by callee with
> > no_callee_saved_registers attribute.
> >
> The patch LGTM. It should be low risk since there's already a
> no_caller_saved_registers attribute; the patch just extends the same
> approach to no_callee_saved_registers.
> So if there are no objections (or concerns) in the next couple of days,
> I'm OK with the patch going into GCC 14 and being backported.

I am checking it in.

Thanks.

H.J.
> > Changes in v2:
> >
> > 1. Rebase against commit f9df00340e3
> > 2. Don't add redundant clobbered_registers check in ix86_expand_call.
> >
> > In some cases, there is no need to save callee-saved registers:
> >
> > 1. If a noreturn function neither throws nor supports exceptions, it can
> > skip saving callee-saved registers.
> >
> > 2. When an interrupt handler is implemented by an assembly stub which does:
> >
> >   1. Save all registers.
> >   2. Call a C function.
> >   3. Restore all registers.
> >   4. Return from interrupt.
> >
> > it is completely unnecessary to save and restore any registers in the C
> > function called by the assembly stub, even if they would normally be
> > callee-saved.
> >
> > This patch set adds the no_callee_saved_registers function attribute, which
> > is complementary to the no_caller_saved_registers function attribute, to
> > classify x86 backend call-saved register handling into three types:
> >
> > 1. Default call-saved registers.
> > 2. No caller-saved registers with no_caller_saved_registers attribute.
> > 3. No callee-saved registers with no_callee_saved_registers attribute.
> >
> > Functions of the no callee-saved registers type won't save callee-saved
> > registers. If a noreturn function neither throws nor supports exceptions,
> > it is classified as the no callee-saved registers type.
> >
> > With these changes, __libc_start_main in glibc 2.39, which is a noreturn
> > function, is changed from
> >
> > __libc_start_main:
> >         endbr64
> >         push   %r15
> >         push   %r14
> >         mov    %rcx,%r14
> >         push   %r13
> >         push   %r12
> >         push   %rbp
> >         mov    %esi,%ebp
> >         push   %rbx
> >         mov    %rdx,%rbx
> >         sub    $0x28,%rsp
> >         mov    %rdi,(%rsp)
> >         mov    %fs:0x28,%rax
> >         mov    %rax,0x18(%rsp)
> >         xor    %eax,%eax
> >         test   %r9,%r9
> >
> > to
> >
> > __libc_start_main:
> >         endbr64
> >         sub    $0x28,%rsp
> >         mov    %esi,%ebp
> >         mov    %rdx,%rbx
> >         mov    %rcx,%r14
> >         mov    %rdi,(%rsp)
> >         mov    %fs:0x28,%rax
> >         mov    %rax,0x18(%rsp)
> >         xor    %eax,%eax
> >         test   %r9,%r9
> >
> > In Linux kernel 6.7.0 on x86-64, do_exit is changed from
> >
> > do_exit:
> >         endbr64
> >         call   <do_exit+0x9>
> >         push   %r15
> >         push   %r14
> >         push   %r13
> >         push   %r12
> >         mov    %rdi,%r12
> >         push   %rbp
> >         push   %rbx
> >         mov    %gs:0x0,%rbx
> >         sub    $0x28,%rsp
> >         mov    %gs:0x28,%rax
> >         mov    %rax,0x20(%rsp)
> >         xor    %eax,%eax
> >         call   *0x0(%rip)        # <do_exit+0x39>
> >         test   $0x2,%ah
> >         je     <do_exit+0x8d3>
> >
> > to
> >
> > do_exit:
> >         endbr64
> >         call   <do_exit+0x9>
> >         sub    $0x28,%rsp
> >         mov    %rdi,%r12
> >         mov    %gs:0x28,%rax
> >         mov    %rax,0x20(%rsp)
> >         xor    %eax,%eax
> >         mov    %gs:0x0,%rbx
> >         call   *0x0(%rip)        # <do_exit+0x2f>
> >         test   $0x2,%ah
> >         je     <do_exit+0x8c9>
> >
> > I compared GCC master branch bootstrap and test times on a slow machine
> > with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
> > with the backported patch. The performance data isn't precise since the
> > measurements were done on different days with different GCC sources under
> > different 6.6 kernel versions.
> >
> > GCC master branch build time in seconds:
> >
> > before              after               improvement
> > 30043.75user        30013.16user        0%
> > 1274.85system       1243.72system       2.4%
> >
> > GCC master branch test time in seconds (new tests added):
> >
> > before              after               improvement
> > 216035.90user       216547.51user       0%
> > 27365.51system      26658.54system      2.6%
> >
> > Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
> > Systems perform normally.
> >
> >
> > H.J. Lu (2):
> >   x86: Add no_callee_saved_registers function attribute
> >   x86: Don't save callee-saved registers in noreturn functions
> >
> >  gcc/config/i386/i386-expand.cc                | 52 +++++++++++++---
> >  gcc/config/i386/i386-options.cc               | 61 +++++++++++++++----
> >  gcc/config/i386/i386.cc                       | 57 +++++++++++++----
> >  gcc/config/i386/i386.h                        | 16 ++++-
> >  gcc/doc/extend.texi                           |  8 +++
> >  .../gcc.dg/torture/no-callee-saved-run-1a.c   | 23 +++++++
> >  .../gcc.dg/torture/no-callee-saved-run-1b.c   | 59 ++++++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-1.c       | 30 +++++++++
> >  .../gcc.target/i386/no-callee-saved-10.c      | 46 ++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-11.c      | 11 ++++
> >  .../gcc.target/i386/no-callee-saved-12.c      | 10 +++
> >  .../gcc.target/i386/no-callee-saved-13.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-14.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-15.c      | 17 ++++++
> >  .../gcc.target/i386/no-callee-saved-16.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-17.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-18.c      | 51 ++++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-2.c       | 30 +++++++++
> >  .../gcc.target/i386/no-callee-saved-3.c       |  8 +++
> >  .../gcc.target/i386/no-callee-saved-4.c       |  8 +++
> >  .../gcc.target/i386/no-callee-saved-5.c       | 11 ++++
> >  .../gcc.target/i386/no-callee-saved-6.c       | 12 ++++
> >  .../gcc.target/i386/no-callee-saved-7.c       | 49 +++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-8.c       | 50 +++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-9.c       | 49 +++++++++++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-1.c     | 26 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-2.c     | 18 ++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-3.c     | 19 ++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-4.c     | 18 ++++++
> >  .../gcc.target/i386/stack-check-17.c          | 19 +++---
> >  30 files changed, 775 insertions(+), 47 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
> >  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c
> >
> > --
> > 2.43.0
> >
>
>
> --
> BR,
> Hongtao

--
H.J.