https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69004
--- Comment #24 from Martin Liška <marxin at gcc dot gnu.org> ---

Well, the first observation is that while we're dumping gcov files (triggered from _exit in the main thread), there are still many running threads that have not been joined before the gcov_dump function is called:

(gdb) info threads
  Id   Target Id                                         Frame
* 1    Thread 0x7fe87770d780 (LWP 15866) "t-engine"     0x00007fe8755f9fe0 in _exit () from /lib64/libc.so.6
  2    Thread 0x7fe86f159700 (LWP 15889) "threaded-ml"  0x00007fe875620a1d in poll () from /lib64/libc.so.6
  3    Thread 0x7fe87770b700 (LWP 15895) "t-engine"     0x00007fe875b07087 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x7fe86a140700 (LWP 15899) "SDLTimer"     0x00007fe875b09653 in do_futex_wait () from /lib64/libpthread.so.0
  5    Thread 0x7fe854bbc700 (LWP 16067) "t-engine"     0x00007fe875b0709f in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6    Thread 0x7fe8541b6700 (LWP 16069) "save"         0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  7    Thread 0x7fe8539b5700 (LWP 16070) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  8    Thread 0x7fe8531b4700 (LWP 16071) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  9    Thread 0x7fe8529b3700 (LWP 16072) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  10   Thread 0x7fe8521b2700 (LWP 16073) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  11   Thread 0x7fe8519b1700 (LWP 16074) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  12   Thread 0x7fe8511b0700 (LWP 16075) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  13   Thread 0x7fe8509af700 (LWP 16076) "particles"    0x00007fe875b09487 in do_futex_wait.constprop () from /lib64/libpthread.so.0
  14   Thread 0x7fe833816700 (LWP 16082) "profile"      0x00007fe875b0a7ed in nanosleep () from /lib64/libpthread.so.0

What's happening is already described here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68080#c3

Is it doable to join all threads before the main one exits?

I can imagine that keeping the profile counters in thread-local storage would make sense, but we would still need a place to merge the data. IMHO that's only possible by invoking

If that's problematic in general, we'll have to implement TLS counters that are merged whenever a thread is joined. That's still not doable with detached threads.

Thoughts?
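
To make the TLS-counter idea concrete, here is a minimal sketch of per-thread counters that are merged into a shared total before each thread finishes. It is not libgcov code: tls_counter, merged_total, merge_tls_counter and run_workers are hypothetical names chosen for illustration, and the merge is done explicitly at the end of the thread function rather than by any runtime hook. It assumes a C11 compiler with _Thread_local and POSIX threads (build with -pthread).

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical illustration only; none of these names exist in libgcov. */
static _Thread_local uint64_t tls_counter;   /* per-thread, updated without locking */
static uint64_t merged_total;                /* process-wide merge target */
static pthread_mutex_t merge_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker_arg { int iterations; };

/* Fold the calling thread's private counter into the shared total. */
static void merge_tls_counter(void)
{
    pthread_mutex_lock(&merge_lock);
    merged_total += tls_counter;
    pthread_mutex_unlock(&merge_lock);
    tls_counter = 0;
}

static void *worker(void *p)
{
    const struct worker_arg *arg = p;

    for (int i = 0; i < arg->iterations; i++)
        tls_counter++;           /* stands in for an instrumented edge counter */

    merge_tls_counter();         /* merge before the thread goes away */
    return NULL;
}

/* Start nthreads workers, join them all, and return the merged total. */
uint64_t run_workers(int nthreads, int iterations)
{
    enum { MAX_THREADS = 64 };
    pthread_t tids[MAX_THREADS];
    struct worker_arg arg = { iterations };
    int started = 0;
    uint64_t total;

    assert(nthreads >= 0 && nthreads <= MAX_THREADS);

    pthread_mutex_lock(&merge_lock);
    merged_total = 0;
    pthread_mutex_unlock(&merge_lock);

    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;

    /* Only after every join is the merged total known to be complete. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_lock(&merge_lock);
    total = merged_total;
    pthread_mutex_unlock(&merge_lock);
    return total;
}
```

The total is only reliable here because every worker is joined before it is read. A detached thread that is still running when the main thread calls _exit would never reach its merge step, which is exactly the case that remains open.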