https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110082
            Bug ID: 110082
           Summary: Coverage analysis vs. offloading compilation
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: openacc, openmp, wrong-code
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: hubicka at gcc dot gnu.org, jakub at gcc dot gnu.org,
                    marxin at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---
            Target: amdgcn-amdhsa, nvptx-none

(Via a customer report) we've determined that offloading compilation fails in combination with '-fprofile-arcs' (as implied by '--coverage', coverage analysis):

    <built-in>: error: variable ‘__gcov0.main._omp_fn.0’ has been referenced in offloaded code but hasn’t been marked to be included in the offloaded code
    lto1: fatal error: errors during merging of translation units
    compilation terminated.
    nvptx mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
    [...]

Per my quick look, during early host compilation, via 'gcc/tree-profile.cc:gimple_gen_edge_profiler', via 'pass_ipa_tree_profile', as visible in '*.069i.profile', '__atomic_fetch_add_8 (&__gcov0.main._omp_fn.0[0], 1, 0);' etc. statements are added. (Or, different statements in case that the target cannot "utilize atomic update operations", or 'PROFILE_UPDATE_SINGLE' is used.) That's part of the IPA passes, so before offloading compilation split-off. (... as per the error, evidently).

As we've got no mechanism implemented currently to move any device-side coverage data from the device execution back to the host, and integrate it with the host-side coverage data, we propose to not do coverage analysis for offloading compilation.

My idea is to abstract the "increment the edge execution count" operations into some new GIMPLE/IFN code (?), and then later, once the offloading code has been split off, lower it to the current form (host-side), or no-op (device-side).
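
As a rough illustration of that scheme, here is a toy model in plain C. It is not GCC internals: 'edge_count_op', 'compile_side', and 'lower_edge_count_op' are hypothetical names, and the host/device decision is modelled as a run-time branch, whereas the real lowering would presumably happen in a compiler pass after the offloading split-off.

```c
#include <stdint.h>

/* Hypothetical: which compilation the abstract operation ends up in
   once the offloading code has been split off.  */
enum compile_side { HOST_SIDE, DEVICE_SIDE };

/* Hypothetical stand-in for the abstract "increment the edge execution
   count" operation that early instrumentation would emit instead of a
   direct counter update.  */
struct edge_count_op {
    int64_t *counter;   /* stands in for something like &__gcov0.fn[i] */
};

/* Toy "lowering": host-side becomes the atomic increment used today,
   device-side becomes a no-op, so the device code never references the
   host-only counter.  */
void lower_edge_count_op(const struct edge_count_op *op,
                         enum compile_side side)
{
    if (side == DEVICE_SIDE)
        return;                       /* device-side: drop the update */

    /* Host-side: same effect as '__atomic_fetch_add_8 (ptr, 1, 0)';
       memory order 0 is __ATOMIC_RELAXED.  */
    __atomic_fetch_add(op->counter, 1, __ATOMIC_RELAXED);
}
```

The point of the indirection is that the counter reference only materializes on the host side, after the split, so nothing host-only leaks into the offloaded code.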
I'd appreciate a quick review of whether that approach makes sense.

---

We've seen and dealt with a few such issues already over the past decade, but more certainly exist for other scenarios where GCC "early" (before the offloading code split-off) does code transformations that device compilation may choke on, and that thus have to be handled explicitly. A full review of GCC "early" transformations doesn't seem feasible, so we shall continue addressing these incrementally, as encountered. For another example (that I shall soon be working on), see <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544#c9>.
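
For reference, the kind of source that is assumed to run into the coverage failure described above is simply an offloaded region compiled with instrumentation enabled. The report does not include the customer's test case, so the following is only a guess at its shape; 'sum_on_device' is a hypothetical name. GCC names outlined regions after the enclosing function, so the '__gcov0.main._omp_fn.0' in the error suggests the original region sat directly in 'main'.

```c
/* Hypothetical reproducer shape: one OpenMP 'target' region.  GCC
   outlines the region into a separate function
   ('sum_on_device._omp_fn.0' here), which is what gets compiled for
   the offload device.  */
int sum_on_device(int n)
{
    int sum = 0;

    /* Without -fopenmp the pragma is ignored and the loop simply runs
       on the host.  */
    #pragma omp target map(tofrom: sum)
    for (int i = 0; i < n; i++)
        sum += i;

    return sum;
}
```

Building something like this with '-fopenmp --coverage' on a GCC configured for nvptx-none or amdgcn-amdhsa offloading is assumed to be enough to hit the error; that has not been verified here.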