https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org

--- Comment #21 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I'd say using the (default unless -pthread is used) -fprofile-update=single is
wrong for the kernel, it can't work correctly in multi-threaded case which is
the case of kernel.
In the -fprofile-update=single (as opposed to -fprofile-update=atomic) the
updates to the counters aren't atomic and the arrays aren't marked volatile or
something similar, it is really meant for single threaded coverage.

Anyway, before ivopts we have:
  pretmp_93 = __gcov0.prep_compound_page[7];

  <bb 9> [local count: 955630225]:
  # i_66 = PHI <i_17(26), 1(8)>
  # prephitmp_92 = PHI <PROF_edge_counter_46(26), pretmp_93(8)>
  i.144_1 = (unsigned int) i_66;
  _2 = i.144_1 * 40;
  p_15 = page_12(D) + _2;
  p_15->D.13727.D.13672.mapping = 1024B;
  MEM[(volatile long unsigned int *)p_15 + 4B] ={v} _159;
  i_17 = i_66 + 1;
  PROF_edge_counter_46 = prephitmp_92 + 1;
  __gcov0.prep_compound_page[7] = PROF_edge_counter_46;
  if (nr_pages_11 > i_17)
    goto <bb 26>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 26> [local count: 850510901]:
  goto <bb 9>; [100.00%]

which given the non-volatile non-atomically updated arrays is to be expected,
instead of re-reading __gcov0.prep_compound_page[7] in every iteration it just
reads it once and stores in each iteration, which is possible because another
thread changing it concurrently would mean a data race anyway.
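To make the effect of the dump above concrete, here is a minimal C sketch of the same shape: one load of the counter before the loop, then a store of the cached value in every iteration. It is not the kernel code or the real gcov data layout; counter_plain, counter_atomic, the demo_* functions and the "+100" interference are all hypothetical names and values picked for illustration. The concurrent update is simulated inside a single thread, since a real race on the plain counter would be undefined behaviour and would not reproduce reliably. The atomic variant only approximates the kind of update -fprofile-update=atomic asks for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for one edge counter; not the real gcov layout. */
static int64_t counter_plain;           /* plain object, as with =single */
static _Atomic int64_t counter_atomic;  /* atomic object, for comparison */

/* Same shape as the GIMPLE: load once (pretmp_93), then store the cached
   value each iteration without re-reading the counter. */
int64_t demo_single_update(int nr_pages)
{
    counter_plain = 0;
    int64_t cached = counter_plain;     /* the only load of the counter */
    for (int i = 1; i < nr_pages; i++) {
        if (i == 2)
            counter_plain += 100;       /* simulated update from elsewhere */
        cached = cached + 1;            /* PROF_edge_counter_46 */
        counter_plain = cached;         /* overwrites the simulated update */
    }
    return counter_plain;
}

/* Each increment is an atomic read-modify-write, so the simulated
   update from elsewhere is preserved. */
int64_t demo_atomic_update(int nr_pages)
{
    atomic_store(&counter_atomic, 0);
    for (int i = 1; i < nr_pages; i++) {
        if (i == 2)
            atomic_fetch_add(&counter_atomic, 100);
        atomic_fetch_add(&counter_atomic, 1);
    }
    return atomic_load(&counter_atomic);
}

/* Usage: with nr_pages == 4 the loop body runs three times. */
void run_demo(void)
{
    assert(demo_single_update(4) == 3);    /* the +100 is lost */
    assert(demo_atomic_update(4) == 103);  /* the +100 survives */
}
```

In the plain variant the store of the cached value silently discards whatever was written to the counter in between, which is the behaviour the single-threaded mode permits the optimizer to produce.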