> On Tue, Dec 06, 2022 at 05:46:11PM -0000, Andrii Nakryiko wrote:
> 
> Depends on how large functions are.  If you have lots of small functions,
> it can be more than 50% of instructions you get the frame pointer wrong.

You might be technically correct, but if some application is spending 50% of 
the time entering and exiting functions, then it's not doing a ton of useful 
work efficiently anyways and should be rethought and improved. And profiling 
data will point this out very clearly.

Let's just not use these convoluted unrealistic corner cases as a generalized 
claims about how frame pointers are useless, ok?

> 
> 
> You haven't looked carefully enough.
> 

Oh, I didn't claim I studied glibc's assembly code very carefully for sure. I 
did a very similar grepping to yours and found just the same **very few** 
functions that do use %rbp.

> find . -name \*.S | xargs grep -l %rbp | xargs grep -L movq.*%rsp.*%rbp
> ./sysdeps/x86_64/fpu/multiarch/svml_s_tanhf16_core_avx512.S
> ./sysdeps/x86_64/fpu/multiarch/svml_s_tanhf8_core_avx2.S
> ./sysdeps/x86_64/fpu/multiarch/svml_s_atanhf16_core_avx512.S
> ./sysdeps/x86_64/fpu/multiarch/svml_s_atanhf8_core_avx2.S
> ./sysdeps/x86_64/fpu/svml_d_sincos2_core.S
> ./sysdeps/x86_64/fpu/svml_s_sincosf4_core.S
> ./sysdeps/x86_64/addmul_1.S
> ./sysdeps/x86_64/__longjmp.S
> ./sysdeps/x86_64/setjmp.S
> ./sysdeps/x86_64/_mcount.S
> ./sysdeps/unix/sysv/linux/x86_64/____longjmp_chk.S
> ./sysdeps/unix/sysv/linux/x86_64/setcontext.S
> ./sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> ./sysdeps/unix/sysv/linux/x86_64/getcontext.S

Out of all of these, we have addmul_1.S as seemingly generic. But grepping 
around glibc sources I could find any place that's actually calling into 
__mpn_addmul_1, so there must be some more magic happening. I'll believe you 
that it's used somewhere in printf family of functions, though I certainly 
didn't see any mention of __mpn_addmul_1 in llvm-objdump and llvm-readelf 
outputs for a sample program that is just doing printf or snprintf.

But all that aside. You've found 14 files that do use %rbp and again are using 
this as a point that frame pointers are useless.

And I'm telling you that **in practice**, in real world applications, they are 
very much useful and are relied upon, even if in some circumstances we fail to 
get frame pointers. I don't understand how one can honestly can use this line 
of argument to say that frame pointers are useless.

And by enabling frame pointers even wider we make them even more useful.

> 
> E.g. mpn_addmul_1 is used by printf, which certainly shows up a lot in
> normal workloads.  The above cases mean the bp register contains total
> garbage if one tries to use it for backtrace purposes.  And obviously
> glibc isn't the only library with inline assembly out there...
> As I said, you can't rely on frame pointers making sense, with unwind info

See my replies. Yes, frame pointers might be broken in some rare situations, 
yes we don't have 100% success rate capturing stack traces precisely because 
there is embedded asm that clobbers %rbp, or binaries and libraries are build 
with -fomit-frame-pointers. We are asking to minimize the latter.

And for the former, it's on a case-by-case basis to try to mitigate this, but 
ultimately no one expect that stack traces will be captured with 100% success 
rate. We want this rate to be as high as possible though. And arguments like 
yours are not helping, but also misleading people that are not familiar with 
the matters at hand, but trust people like you just because you are "a 
compiler/toolchain engineer with considerable experience", while it's clear 
that compiler/toolchain engineers would rather spend time arguing about petty 
low-probability counter-examples than be helpful and see the problem in a wider 
context.

> such as in DWARF you know that in this case rbp is used as a frame pointer
> and in this case it is not, use some other register or this offset from
> stack pointer etc.
> 
> AFAIK systemtap already uses DWARF unwinder in the kernel and as I said, for
> the profiling purposes one can use only a small subset of that for the vast
> majority of cases.

This does not work for any of multitude of BPF-based tools. And even with perf, 
that supports very expensive DWARF-based unwinding, this doesn't work very well 
in practice. There is evidence even from people commenting in this thread. And 
all this was explained at length in https://pagure.io/fesco/issue/2817.

> 
>       Jakub
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue

Reply via email to