https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102534

            Bug ID: 102534
           Summary: RFE epilog is not reliably a statement
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: debug
          Assignee: unassigned at gcc dot gnu.org
          Reporter: woodard at redhat dot com
  Target Milestone: ---

Created attachment 51523
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51523&action=edit
demonstration program

Given a program like this:

     1  #include <stdio.h>
     2   
     3  static void do_print(char *s)
     4  {
     5     printf("%s", s);
     6  }
     7   
     8  int main(int argc, char *argv[])
     9  {
    10     int i = 0;
    11     for (;;) {
    12        do_print(argv[i]);
    13        i++;
    14        if (argv[i] == NULL) {
    15           do_print("\n");
    16           return 0;
    17        }
    18        do_print(", ");
    19     }
    20     #ifdef BAD_MATT_CODE
    21     //no longer used
    22     return -1;
    23     }
    24     #endif
    25  }
    26   
    27  /**
    28  * Just a comment taking a few lines of code
    29  * What do you call cheese that isn't yours?
    30  * Nacho cheese.
    31  **/
    32  int unused_variable;
    33   
    34  void unused_function()
    35  {
    36     printf("I'm not called anywhere\n");
    37  }

When you try to set a breakpoint on the closing brace of a function, it skips
to the beginning of the next function in the file:

$ gdb a.out 
GNU gdb (GDB) Fedora 10.2-3.fc34
Reading symbols from a.out...
(gdb) break 6
Breakpoint 1 at 0x401060: file line-range.c, line 9.

That is the start of main(), which is not what was intended. I would assert
that “break 6” means “break before you leave the context of do_print()”, in
other words, at the epilog of the function.

On the other hand, “b 5” works as expected:

(gdb) break 5
Breakpoint 2 at 0x401070: /home/ben/Shared/test/line-ranges/line-range.c:5. (3 locations)
(gdb) info break
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000401060 in main at line-range.c:9
2       breakpoint     keep y   <MULTIPLE>         
2.1                         y   0x0000000000401070 in do_print at line-range.c:5
2.2                         y   0x0000000000401081 in do_print at line-range.c:5
2.3                         y   0x000000000040109a in do_print at line-range.c:5

In that particular case the function ends up being inlined, and the "ret"
instruction where the epilog's is_stmt entry would be has been eliminated. We
believe the line table should still mark the first instruction after the code
from the inlined function, where there would have been a ret.

It isn't just inline functions that are affected.

(gdb) break 25
Breakpoint 1 at 0x4011a0: file line-range.c, line 35.

Once again this is the first instruction of the function defined after the one
where the epilog for main should be. main even has a ret instruction:

(gdb) disassemble main
Dump of assembler code for function main:
   0x0000000000401060 <+0>:     push   %rbx
   0x0000000000401061 <+1>:     mov    %rsi,%rbx
   0x0000000000401064 <+4>:     jmp    0x401081 <main+33>
   0x0000000000401066 <+6>:     nopw   %cs:0x0(%rax,%rax,1)
   0x0000000000401070 <+16>:    mov    $0x402013,%esi
   0x0000000000401075 <+21>:    mov    $0x402010,%edi
   0x000000000040107a <+26>:    xor    %eax,%eax
   0x000000000040107c <+28>:    call   0x401050 <printf@plt>
   0x0000000000401081 <+33>:    mov    (%rbx),%rsi
   0x0000000000401084 <+36>:    xor    %eax,%eax
   0x0000000000401086 <+38>:    mov    $0x402010,%edi
   0x000000000040108b <+43>:    add    $0x8,%rbx
   0x000000000040108f <+47>:    call   0x401050 <printf@plt>
   0x0000000000401094 <+52>:    cmpq   $0x0,(%rbx)
   0x0000000000401098 <+56>:    jne    0x401070 <main+16>
   0x000000000040109a <+58>:    mov    $0xa,%edi
   0x000000000040109f <+63>:    call   0x401030 <putchar@plt>
   0x00000000004010a4 <+68>:    xor    %eax,%eax
   0x00000000004010a6 <+70>:    pop    %rbx
   0x00000000004010a7 <+71>:    ret    
End of assembler dump.

Line 25 even has linemap entries:

$ readelf --debug-dump=decodedline a.out | egrep ^File\|25
File name                            Line number    Starting address    View    Stmt
line-range.c                                  25            0x4010a4       2
line-range.c                                  25            0x4010a8        

But the problem seems to be that none of the linemap entries for line 25 is
marked is_stmt. We believe that an is_stmt entry for line 25 should point at
0x4010a7, the ret instruction.
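
To make the effect of the missing is_stmt flag concrete, here is a minimal
sketch in C of how a debugger might resolve "break <line>" against a decoded
line table. It is a simplified model that matches the behavior seen in the
session above, not gdb's actual implementation. The names struct line_row and
resolve_breakpoint are hypothetical, and the assumption that line 35 carries
an is_stmt row at 0x4011a0 is inferred from the gdb output rather than from
the readelf dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a decoded line table (simplified; real DWARF rows carry more). */
struct line_row {
    unsigned line;   /* source line number */
    uint64_t addr;   /* starting address of the row */
    bool is_stmt;    /* marked as a recommended breakpoint location? */
};

/*
 * Hypothetical helper: only is_stmt rows are breakpoint candidates. If the
 * requested line has none, fall forward to the nearest later line that does.
 * Returns 0 when no candidate exists at or after the requested line.
 */
static uint64_t resolve_breakpoint(const struct line_row *rows, size_t n,
                                   unsigned want, unsigned *resolved_line)
{
    const struct line_row *best = NULL;

    assert(rows != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!rows[i].is_stmt || rows[i].line < want)
            continue;   /* not a candidate */
        if (best == NULL || rows[i].line < best->line ||
            (rows[i].line == best->line && rows[i].addr < best->addr))
            best = &rows[i];   /* closest line, then lowest address */
    }
    if (best == NULL)
        return 0;
    if (resolved_line != NULL)
        *resolved_line = best->line;
    return best->addr;
}
```

Fed the two non-is_stmt rows for line 25 (0x4010a4 and 0x4010a8) plus an
is_stmt row for line 35 at 0x4011a0, this model resolves "break 25" to
0x4011a0 on line 35. Adding an is_stmt row for line 25 at 0x4010a7 makes the
same request resolve to the ret instruction instead.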

Putting the is_stmt for the closing brace of a function on the ret instruction
of a normal extern function is easy, but we would like all the other
complicating situations to be handled as well, some of which include:

- inline functions
- void functions
- multiple returns from a function
- functions which optimize into being empty
- external functions that are not used but could be called from another
  function in a different CU. However, that means that they could be dropped
  when compiling with LTO.

Several of these complications are demonstrated in the attached program.
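
For readers without the attachment, the cases listed above can be sketched in
a few lines of C. This is a hypothetical illustration, not the attached
program. All function names are invented for the example, and which functions
actually get inlined or emptied depends on the compiler and optimization
level.

```c
#include <assert.h>
#include <stddef.h>

/* Inline candidate: static and small, so its ret may disappear entirely. */
static int twice(int x)
{
    return 2 * x;
}

/* Void function: no return statement shares a line with the closing brace. */
static void bump(int *counter)
{
    assert(counter != NULL);
    (*counter)++;
}

/* Multiple returns: several exits, but only one closing brace. */
int classify(int x)
{
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;
    return 1;
}

/* Likely optimizes into an empty body, leaving little more than a ret. */
void do_nothing(int x)
{
    (void)x;
}

/* External and unused in this CU: must be kept unless LTO proves it dead. */
int unused_but_extern(int x)
{
    int n = 0;

    bump(&n);
    return twice(x) + n;
}
```

Compiling something like this with -O2 -g and asking for a breakpoint on each
closing brace is one way to check which of these cases end up with an is_stmt
entry.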

We have found that this works a bit better for C++ than for C, because C++
frequently has destructor code for local variables that is executed in the
epilog of the function, and the statement that code is tied to is the closing
brace of the function.
