https://sourceware.org/bugzilla/show_bug.cgi?id=29015

            Bug ID: 29015
           Summary: On Intel Skylake the call tree is incorrect
           Product: binutils
           Version: 2.39 (HEAD)
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: gprofng
          Assignee: vladimir.mezentsev at oracle dot com
          Reporter: ruud.vanderpas at oracle dot com
  Target Milestone: ---

Created attachment 14046
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14046&action=edit
This directory contains everything needed to reproduce the problem.

The call tree output is not correct for this example I ran on an Intel Skylake
based system.

The code has been parallelized using Pthreads and we should see function
"start_thread" in the call tree. It is not there though and this looks like an
issue related to stack unwind.

This is the output I get:

Functions Call Tree. Metric: Attributed Total CPU Time

Attr.      Name
Total
CPU sec.
4.827      +-<Total>
4.712        +-collector_root
4.712        |  +-driver_mxv
4.712        |    +-mxv_core
0.116        +-__libc_start_main
0.116          +-main
0.106            +-init_data
0.050            |  +-drand48
0.039            |    +-erand48_r
0.014            |      +-__drand48_iterate
0.010            +-allocate_data
0.010              +-malloc
0.010                +-_int_malloc
0.003                  +-sysmalloc
0.002                    +-__default_morecore
0.002                      +-sbrk
0.002                        +-brk

I used gcc 10 and did not enable any optimizations, but I also see this problem
if I use -O for example.

On an older Intel Haswell based system, I do see start_thread in the call tree.

The attachment has everything needed to reproduce the problem. The code is in
directory "src" and can be built with "make". On purpose I left my objects and
the binary in, as well as the experiment directory.

There is a run.sh script that was used to show the problem. Sample output of
this script is in run.res.

-- 
You are receiving this mail because:
You are on the CC list for the bug.