[Bug middle-end/119833] New: Clarify which semantics offloading compilation does (not) inherit from using the LTO infrastructure

tschwinge at gcc dot gnu.org via Gcc-bugs Wed, 16 Apr 2025 03:34:10 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119833


            Bug ID: 119833
           Summary: Clarify which semantics offloading compilation does
                    (not) inherit from using the LTO infrastructure
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: openacc, openmp
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org,
                    rguenth at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

+++ This bug was initially created as a clone of Bug #117010 +++

(In reply to myself from bug 117010, comment #4)
> Jakub, Richi, C++/offloading question.  For the small test case posted here,
> for 'V<0>::V()' I see in the '-O0' x86_64 host code:
> 
>             .section        .text._ZN1VILi0EEC2Ev,"axG",@progbits,_ZN1VILi0EEC5Ev,comdat
>             .align 2
>             .weak   _ZN1VILi0EEC2Ev
>             .type   _ZN1VILi0EEC2Ev, @function
>     _ZN1VILi0EEC2Ev:
>             [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .weak   _ZN1VILi0EEC1Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
> 
> That is, weak definitions of '_ZN1VILi0EEC2Ev' and its alias
> '_ZN1VILi0EEC1Ev' (which gets called from 'foo').
> 
> Likewise, I see weak definitions, if compiling such code for GCN target:
> 
>         .section        .text._ZN1VILi0EEC2Ev,"axG",@progbits,_ZN1VILi0EEC5Ev,comdat
>         .align  4
>         .weak   _ZN1VILi0EEC2Ev
>         .type   _ZN1VILi0EEC2Ev,@function
> _ZN1VILi0EEC2Ev:
> [...]
>         .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>         .weak   _ZN1VILi0EEC1Ev
>         .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
> 
> ..., so that appears consistent.
> 
> For nvptx target (with '-malias'), I see:
> 
>     .weak .func _ZN1VILi0EEC1Ev (.param.u64 %in_ar0)
>     {
>     [...]
>     }
> 
> That is, it directly emits the (used) '_ZN1VILi0EEC1Ev' constructor, instead
> of emitting '_ZN1VILi0EEC2Ev' and then aliasing the former to the latter. 

(See bug 117010, comment #9 for the difference between the x86_64 and GCN
targets on the one hand and the nvptx target on the other.)
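
For readers without access to bug 117010, here is a minimal sketch of the
kind of code that yields these symbols.  This is a hypothetical reduction,
not the test case posted there: only the names 'V' and 'foo' are taken from
the symbols quoted above, while the 'value' member and the 'declare target'
markup are assumptions made for illustration.

```cpp
// Hypothetical reduction; the actual test case is attached to bug 117010.
template <int N>
struct V
{
  int value;              // assumed member, only here to give 'foo' a result
  V () : value (N) {}     // defined in-class, thus implicitly inline
};

// Assumption: the function is marked for offloading like this.  Without
// '-fopenmp' the pragmas are ignored and only host code is generated.
#pragma omp declare target
int foo ()
{
  V<0> v;                 // calls the complete object constructor of 'V<0>'
  return v.value;
}
#pragma omp end declare target
```

Compiling such a file with '-O0 -S' for the host should show 'V<0>::V()' as
a weak definition in a comdat section, as in the first listing above.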

> Now, the observation/question: compiling this code for offloading (as
> originally reported), I see for GCN offloading:
> 
>             .text
>     [...]
>             .type   _ZN1VILi0EEC2Ev,@function
>     _ZN1VILi0EEC2Ev:
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
> 
> That is, '_ZN1VILi0EEC2Ev' and its alias '_ZN1VILi0EEC1Ev' are now strong
> instead of weak definitions.

Similarly for nvptx offloading:

    .func _ZN1VILi0EEC2Ev (.param.u64 %in_ar0)
    {
    [...]

... is then non-'.weak'.
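
For reference, the mangled names in these listings can be decoded with the
Itanium C++ ABI demangler.  A minimal sketch, assuming a GCC or Clang host
toolchain that provides '<cxxabi.h>'; the helper name 'demangle' is made up
for illustration:

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <string>
#include <cxxabi.h>   // abi::__cxa_demangle

// Return the demangled form of an Itanium C++ ABI symbol name, or the
// unchanged input if it cannot be demangled.  ('demangle' is a hypothetical
// helper, not part of any library.)
std::string demangle (const char *mangled)
{
  assert (mangled != nullptr);
  int status = 0;
  // With a null output buffer, the result is allocated with 'malloc'.
  char *buf = abi::__cxa_demangle (mangled, nullptr, nullptr, &status);
  if (status != 0 || buf == nullptr)
    return mangled;
  std::string result (buf);
  std::free (buf);
  return result;
}
```

Both '_ZN1VILi0EEC1Ev' (complete object constructor, 'C1') and
'_ZN1VILi0EEC2Ev' (base object constructor, 'C2') demangle to 'V<0>::V()';
'_Z3foov' is 'foo()'.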

> Is this expected, or unexpected, and
> potentially problematic?

(In reply to myself from bug 117010, comment #6)
> [Looking for an explanation of] why "weak" and "comdat" get lost in the
> GCN offloading path.  GCN (ELF) does support all these things (to the best
> of my knowledge).  (Let's ignore nvptx for the moment.)  I'll thus analyze
> offload stream-out, stream-in, etc.

(In reply to myself from bug 117010, comment #7)
> First observation: the same (per my understanding) happens with LTO: compile
> this code, still at '-O0' with '-foffload=disable' but with '-flto', and see
> the x86_64 '[...].ltrans0.ltrans.s' file:
> 
>             .text
>     [...]
>             .type   _ZN1VILi0EEC2Ev, @function
>     _ZN1VILi0EEC2Ev:
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
> 
> Could this be due to whole-program optimization, enabled by LTO?  (But
> '-O0'?)

(In reply to myself from bug 117010, comment #8)
> Well, indeed.  Offloading code generation uses the LTO machinery, including
> the 'lto1' front end, and thus has 'gcc/common.opt:in_lto_p' set to 'true':
> 
>     ; True if this is the lto front end.  This is used to disable gimple
>     ; generation and lowering passes that are normally run on the output
>     ; of a front end.  These passes must be bypassed for lto since they
>     ; have already been done before the gimple was written.
>     Variable
>     bool in_lto_p = false
> 
> The "weak", "comdat" transformations are described at the high level in
> 'gcc/doc/lto.texi':
> 
>     The whole program mode assumptions are slightly more complex in
>     C++, where inline functions in headers are put into @emph{COMDAT}
>     sections.  COMDAT function and variables can be defined by
>     multiple object files and their bodies are unified at link-time
>     and dynamic link-time.  COMDAT functions are changed to local only
>     when their address is not taken and thus un-sharing them with a
>     library is not harmful.  [...]
> 
> If I force-disable 'pass_ipa_whole_program_visibility':
> 
>     --- gcc/ipa-visibility.cc
>     +++ gcc/ipa-visibility.cc
>     @@ -993,4 +993,7 @@ public:
>        unsigned int execute (function *) final override
>          {
>     +#ifdef ACCEL_COMPILER
>     +      return 0;
>     +#endif
>            return whole_program_function_and_variable_visibility ();
>          }
> 
> ..., then we get the expected 'diff' for GCN offloading compilation's
> '[...].xamdgcn-amdhsa.mkoffload.082i.whole-program' (and similar for nvptx
> offloading compilation's '[...].xnvptx-none.mkoffload.082i.whole-program'):
> 
>     -Marking local functions: __ct_comp /2 __ct_base /1
>     [...]
>     @@ -49,22 +40,24 @@
>      _ZN1VILi0EEC1Ev/2 (__ct_comp )
>        Type: function definition analyzed alias
>     -  Visibility: semantic_interposition prevailing_def_ironly
>     +  Visibility: externally_visible semantic_interposition public weak comdat comdat_group:_ZN1VILi0EEC5Ev one_only
>     +  Same comdat group as: _ZN1VILi0EEC2Ev/1
>        References: _ZN1VILi0EEC2Ev/1 (alias) 
>        Referring: 
>        Read from file: pr117010-1_.o
>     -  Availability: local
>     +  Availability: available
>        Unit id: 1
>     -  Function flags: local
>     +  Function flags:
>        Called by: _Z3foov/3 
>        Calls: 
>      _ZN1VILi0EEC2Ev/1 (__ct_base )
>        Type: function definition analyzed
>     -  Visibility: semantic_interposition no_reorder prevailing_def_ironly
>     +  Visibility: externally_visible semantic_interposition no_reorder public weak comdat comdat_group:_ZN1VILi0EEC5Ev one_only
>     +  Same comdat group as: _ZN1VILi0EEC1Ev/2
>        References: 
>        Referring: _ZN1VILi0EEC1Ev/2 (alias) 
>        Read from file: pr117010-1_.o
>     -  Availability: local
>     +  Availability: available
>        Unit id: 1
>     -  Function flags: local
>     +  Function flags:
>        Called by: 
>        Calls: 
> 
> ..., and we get the expected 'diff' for the GCN offloading code,
> '[...].xamdgcn-amdhsa.mkoffload.2.s' (and similar for the nvptx offloading
> code, 'pr117010-1_.xnvptx-none.mkoffload.s'):
> 
>     +       .section        .text._ZN1VILi0EEC2Ev,"axG",@progbits,_ZN1VILi0EEC5Ev,comdat
>             .align  2
>     +       .weak   _ZN1VILi0EEC2Ev
>             .type   _ZN1VILi0EEC2Ev,@function
>     [...]
>             .size   _ZN1VILi0EEC2Ev, .-_ZN1VILi0EEC2Ev
>     +       .weak   _ZN1VILi0EEC1Ev
>             .set    _ZN1VILi0EEC1Ev,_ZN1VILi0EEC2Ev
> 
> Now, so much for the mechanics.  What this means semantically, that is,
> whether 'in_lto_p' should actually be set for offloading compilation, and
> whether all the transformations/optimizations guarded by 'in_lto_p' are
> generally applicable to offloading compilation, I/we still have to spend
> more thought on.

That shall be the topic of this new PR here.
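
To make the question concrete, the rule quoted from 'gcc/doc/lto.texi' can
be written down as a toy model.  This is only a sketch of the documented
behavior under simplifying assumptions, not GCC's implementation: the
'Symbol' type, its fields, and 'localize_comdat' are hypothetical names,
and the real pass considers more than is modeled here.

```cpp
#include <string>

// Toy model only: hypothetical names, not GCC's symbol table API.
struct Symbol
{
  std::string name;
  bool comdat;          // placed in a COMDAT group
  bool weak;            // emitted with '.weak'
  bool address_taken;   // the function's address escapes
  bool local;           // treated as local to the link unit
};

// Model of the quoted rule: in whole-program mode, a COMDAT function whose
// address is not taken is changed to local, which drops "weak" and
// "comdat".  Otherwise the symbol is returned unchanged.
Symbol localize_comdat (Symbol sym, bool whole_program)
{
  if (whole_program && sym.comdat && !sym.address_taken)
    {
      sym.comdat = false;
      sym.weak = false;
      sym.local = true;
    }
  return sym;
}
```

In this model, offloading compilation currently behaves like
'whole_program == true', and the experimental 'ACCEL_COMPILER' patch above
corresponds to passing 'false'.  Which of the two is semantically right for
offloading is exactly what is open.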
