[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

Richard Biener changed:

           What    |Removed                     |Added
             Status|WAITING                     |RESOLVED
         Resolution|---                         |MOVED

--- Comment #28 from Richard Biener ---
Fixed in BFD.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #27 from Trass3r ---
(In reply to rguent...@suse.de from comment #25)
> I fear this is the libbfd dwarf reader simply not coping with
> DW_AT_abstract_origin in other CUs, being confused as to which
> abbrev section it needs to look into (probably using that of
> the referring CU instead of the referred to one).
> so I think this is a BFD bug and to be filed in sourceware bugzilla
> (probably enough to have an actual binary with this kind of DWARF
> as testcase).
>
> Disclaimer: I didn't actually see whether my guess above is true
> (but it looks so obvious ;))

https://sourceware.org/bugzilla/show_bug.cgi?id=24623
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

Jakub Jelinek changed:

           What    |Removed                     |Added
   Target Milestone|9.2                         |9.3

--- Comment #26 from Jakub Jelinek ---
GCC 9.2 has been released.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #25 from rguenther at suse dot de ---
On Sun, 26 May 2019, hoganmeier at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441
>
> --- Comment #22 from krux ---
> I can also reproduce it without any linker script, simplified code:
>
> int serial3_available() {}
> struct HardwareSerial3 {
>   int available() { serial3_available(); }
> };
> HardwareSerial3 Serial3;
>
> void yield() { serial3_available(); }
> int main()
> {
>   yield();
> }
>
> $ g++-9 -c -fno-exceptions -fno-rtti -flto -g -O2 main.cpp
> $ g++-9 -o firmware.elf -g -O2 main.o
> $ nm -ClS --radix=d --size-sort firmware.elf
> 4496 0001 T __libc_csu_fininm: DWARF error: could not
> find abbrev number 8

I fear this is the libbfd dwarf reader simply not coping with
DW_AT_abstract_origin in other CUs, being confused as to which
abbrev section it needs to look into (probably using that of
the referring CU instead of the referred to one).

We have

  Compilation Unit @ offset 0xce:
   Length:        0x7a (32-bit)
   Version:       4
   Abbrev Offset: 0x64
   Pointer Size:  8
 <0>: Abbrev Number: 1 (DW_TAG_compile_unit)
    DW_AT_producer    : (indirect string, offset: 0x1ac): GNU GIMPLE 9.1.0 -mtune=generic -march=x86-64 -g -O2 -O2 -fno-openmp -fno-openacc -fno-pie -fltrans

with 5 abbrev entries referring to the DIE:

  Compilation Unit @ offset 0x14c:
   Length:        0x83 (32-bit)
   Version:       4
   Abbrev Offset: 0xb9
   Pointer Size:  8
 <0><157>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <158>   DW_AT_producer    : (indirect string, offset: 0x26f): GNU C++14 9.1.0 -mtune=generic -march=x86-64 -g -O2 -fno-exceptions -fno-rtti -flto
...
 <1><1aa>: Abbrev Number: 8 (DW_TAG_subprogram)
    <1ab>   DW_AT_external    : 1
    <1ab>   DW_AT_name        : (indirect string, offset: 0x2cc): main
    <1af>   DW_AT_decl_file   : 1
    <1b0>   DW_AT_decl_line   : 8
    <1b1>   DW_AT_decl_column : 5
    <1b2>   DW_AT_type        : <0x191>

so I think this is a BFD bug and to be filed in sourceware bugzilla
(probably enough to have an actual binary with this kind of DWARF
as testcase).

Disclaimer: I didn't actually see whether my guess above is true
(but it looks so obvious ;))
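The guess in comment #25 can be made concrete with a minimal sketch. It is not BFD code: `CompUnit`, `find_containing_cu` and `abbrev_offset_for_target` are hypothetical names, and the model assumes the reference is a section-relative offset into .debug_info. The point it illustrates is only that the abbrev table used to decode a referenced DIE should be the one of the CU that contains that DIE, not the one of the CU holding the reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one compilation unit in .debug_info.
struct CompUnit {
    std::uint64_t start;          // offset of the unit header in .debug_info
    std::uint64_t end;            // one past the last byte of the unit
    std::uint64_t abbrev_offset;  // start of this unit's table in .debug_abbrev
};

// Return the unit whose byte range contains the given section-relative
// DIE offset, or nullptr if no unit covers it.
inline const CompUnit* find_containing_cu(const std::vector<CompUnit>& units,
                                          std::uint64_t die_offset) {
    for (std::size_t i = 0; i < units.size(); ++i) {
        if (die_offset >= units[i].start && die_offset < units[i].end)
            return &units[i];
    }
    return nullptr;
}

// Pick the abbrev table for decoding the DIE a cross-CU reference points at.
// The table comes from the unit that owns the target DIE; the referring
// unit's table is deliberately not consulted. Returns false if the target
// offset lies outside every known unit.
inline bool abbrev_offset_for_target(const std::vector<CompUnit>& units,
                                     std::uint64_t target_die_offset,
                                     std::uint64_t& abbrev_offset) {
    const CompUnit* owner = find_containing_cu(units, target_die_offset);
    if (owner == nullptr)
        return false;
    abbrev_offset = owner->abbrev_offset;
    return true;
}
```

With the two units from the dump above (0xce..0x14c using abbrev offset 0x64, and 0x14c..0x1d3 using 0xb9, where each end is the unit offset plus the length plus the 4-byte length field), a reference to the DIE at 0x1aa selects abbrev offset 0xb9. Looking up abbrev number 8 in the table at 0x64 instead would be consistent with the "could not find abbrev number 8" message, if the guess is right.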
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #24 from krux ---
objdump -dCrS also prints these errors.
It definitely fails to find the entry for main which is present according to
objdump --dwarf:

 <1>: Abbrev Number: 8 (DW_TAG_subprogram)
    DW_AT_external    : 1
    DW_AT_name        : (indirect string, offset: 0x1ab): main
    DW_AT_decl_file   : 1
    DW_AT_decl_line   : 8
    DW_AT_decl_column : 5
    DW_AT_type        : <0xc3>
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #23 from krux ---
But it's so fragile: touch any part of the code and the error disappears.
For example, change serial3_available to void and you also get an additional
symbol:

4160 0003 T main main.cpp:8
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #22 from krux ---
I can also reproduce it without any linker script, simplified code:

int serial3_available() {}
struct HardwareSerial3 {
  int available() { serial3_available(); }
};
HardwareSerial3 Serial3;

void yield() { serial3_available(); }
int main()
{
  yield();
}

$ g++-9 -c -fno-exceptions -fno-rtti -flto -g -O2 main.cpp
$ g++-9 -o firmware.elf -g -O2 main.o
$ nm -ClS --radix=d --size-sort firmware.elf
4496 0001 T __libc_csu_fininm: DWARF error: could not find abbrev number 8
00016424 0001 b completed.7374
8192 0004 R _IO_stdin_used
4160 0043 T _start
4400 0093 T __libc_csu_init

But other tools are fine in this case:

$ llvm-dwarfdump-8 --verify firmware.elf
No errors.
$ gdb firmware.elf
Reading symbols from firmware.elf...
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

Iain Sandoe changed:

           What    |Removed                     |Added
             Status|NEW                         |WAITING

--- Comment #21 from Iain Sandoe ---
(In reply to krux from comment #20)
> Thanks your patch worked!
>
> Just fyi: llvm-dwarfdump doesn't understand call-site info:
> https://bugs.llvm.org/show_bug.cgi?id=41846
> Not sure if it's relevant.

So, what's the status here?

I think

* we conclude that the revision bisected was not actually responsible for the
  underlying issue (but caused it to be exposed).

* llvm-dwarfdump output isn't reliable as an indication of the problem
  - do you have dwarfdump?
  - can you use objdump -W ?

* we can now -save-temps for the case(s) of interest.

* I'm assuming that the problem is not "fixed"? (have you verified that on
  current trunk?)
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #20 from krux ---
Thanks, your patch worked!

Just fyi: llvm-dwarfdump doesn't understand call-site info:
https://bugs.llvm.org/show_bug.cgi?id=41846
Not sure if it's relevant.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #19 from Iain Sandoe ---
(In reply to krux from comment #18)
> (In reply to Iain Sandoe from comment #14)
> > current trunk (27), manual regeneration of the
> > firmware.elf.ltrans0.ltrans.o ->
> >
> > (it's kinda frustrating that one can't see the link line, more tweaks are
> > still needed to help debug LTO
>
> Tell me about it. The first time I tried -save-temps I expected
> firmware.elf.ltrans0.o to be compiled from firmware.elf.ltrans0.s of course.
> But it's not, -v shows the .s file is compiled to
> firmware.elf.ltrans0.ltrans.o and I still don't really know what the other
> one is.
> Some commandlines seem to be missing (and it's hard to find them in the
> verbose output, maybe some color could help) in the verbose output and the
> temporary files are gone already.

That's been fixed for collect2, and I'm testing (right now) a patch to fix it
for the linker plugin.

For the record there's a somewhat magical incantation that should work even
now:

-Wl,-plugin-opt=-debug

(but the patch under test is to enable -save-temps to do the Usual Thing).
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #18 from krux ---
(In reply to Iain Sandoe from comment #14)
> current trunk (27), manual regeneration of the
> firmware.elf.ltrans0.ltrans.o ->
>
> (it's kinda frustrating that one can't see the link line, more tweaks are
> still needed to help debug LTO

Tell me about it. The first time I tried -save-temps I expected
firmware.elf.ltrans0.o to be compiled from firmware.elf.ltrans0.s of course.
But it's not, -v shows the .s file is compiled to
firmware.elf.ltrans0.ltrans.o and I still don't really know what the other one
is.
Some commandlines seem to be missing (and it's hard to find them in the verbose
output, maybe some color could help) in the verbose output and the temporary
files are gone already.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #17 from Iain Sandoe ---
Created attachment 46348
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46348&action=edit
binaries for test

here is the output from trunk at 27

For some reason the plugin isn't getting the "-Wl,-debug" flag, so I hacked it
to save the intermediates - that's a separate issue (and I should add
-save-temps there too) - it does work for the collect2 case.

Actually, after a quick scan of the files, they seem sensible - but nm still
barfs on the linked output.

JFTR, I still can't see a difference between pre-linked and the tot ..
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #16 from Richard Biener ---
(In reply to Iain Sandoe from comment #14)
> (In reply to Iain Sandoe from comment #13)
> > (In reply to Iain Sandoe from comment #12)
>
> current trunk (27), manual regeneration of the
> firmware.elf.ltrans0.ltrans.o ->
>
> (it's kinda frustrating that one can't see the link line, more tweaks are
> still needed to help debug LTO - but putting -Wl,--verbose indicates that
> the right number of files are presented to, and load correctly)
>
> iains@gcc122:~/gcc-trunk/C$ ../../llvm-710-build/bin/llvm-dwarfdump --verify
> firmware.elf.ltrans0.ltrans.o
> Verifying firmware.elf.ltrans0.ltrans.o: file format ELF64-x86-64
> Verifying .debug_abbrev...
> Verifying .debug_info Unit Header Chain...
> Verifying .debug_info references...
> error: invalid DIE reference 0x001d. Offset is in between DIEs:
>
> 0x0038: DW_TAG_subprogram
>           DW_AT_abstract_origin (0x001d)
>           DW_AT_low_pc (0x)
>           DW_AT_high_pc (0x)
>           DW_AT_frame_base (DW_OP_call_frame_cfa)
>           DW_AT_GNU_all_call_sites (true)
>
>
> error: invalid DIE reference 0x003b. Offset is in between DIEs:
>
> 0x004f: DW_TAG_variable
>           DW_AT_abstract_origin (0x003b)
>           DW_AT_location (DW_OP_addr 0x0)
>
>
> Errors detected.

Btw, this is quite natural if you inspect an object file! There are
relocations for these "zero" values. And the final linked object looks
fine in this regard.

So throwing llvm-dwarfdump on an object file is a user error (or rather
llvm-dwarfdump doesn't understand there can be relocations in
DW_AT_abstract_origin for example).
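A rough sketch of why a reference field in an unlinked object can look bogus: the bytes stored in the section are only a placeholder until the linker applies the relocation attached to that field. The code below is an illustration under stated assumptions, not what GCC, ld or llvm-dwarfdump actually do. `Reloc32`, `read_u32` and `apply_reloc32` are hypothetical names, and the model assumes a RELA-style 32-bit relocation whose result is simply symbol value plus addend.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified RELA-style relocation against a 32-bit field.
struct Reloc32 {
    std::size_t field_offset;    // position of the field inside the section
    std::uint32_t symbol_value;  // final value of the label the field refers to
    std::int32_t addend;         // offset relative to that label
};

// Read a 32-bit field in host byte order; yields 0 if the field would not
// fit inside the section.
inline std::uint32_t read_u32(const std::vector<std::uint8_t>& section,
                              std::size_t offset) {
    std::uint32_t value = 0;
    if (section.size() >= sizeof value && offset <= section.size() - sizeof value)
        std::memcpy(&value, section.data() + offset, sizeof value);
    return value;
}

// Overwrite the placeholder with symbol value + addend, as a linker would.
// Returns false (and changes nothing) if the field lies outside the section.
inline bool apply_reloc32(std::vector<std::uint8_t>& section, const Reloc32& r) {
    const std::uint32_t value = r.symbol_value + static_cast<std::uint32_t>(r.addend);
    if (section.size() < sizeof value || r.field_offset > section.size() - sizeof value)
        return false;
    std::memcpy(section.data() + r.field_offset, &value, sizeof value);
    return true;
}
```

A reader that calls only `read_u32` on the unlinked section sees the placeholder and may report an invalid reference; after `apply_reloc32` the same field holds the resolved offset. Any concrete numbers fed to these helpers are illustrative and not taken from the attached binaries.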
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #15 from Iain Sandoe ---
this repeats for the compiler build from r267372, confirming some latent issue.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #14 from Iain Sandoe ---
(In reply to Iain Sandoe from comment #13)
> (In reply to Iain Sandoe from comment #12)

current trunk (27), manual regeneration of the
firmware.elf.ltrans0.ltrans.o ->

(it's kinda frustrating that one can't see the link line, more tweaks are
still needed to help debug LTO - but putting -Wl,--verbose indicates that the
right number of files are presented to, and load correctly)

iains@gcc122:~/gcc-trunk/C$ ../../llvm-710-build/bin/llvm-dwarfdump --verify firmware.elf.ltrans0.ltrans.o
Verifying firmware.elf.ltrans0.ltrans.o: file format ELF64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
Verifying .debug_info references...
error: invalid DIE reference 0x001d. Offset is in between DIEs:

0x0038: DW_TAG_subprogram
          DW_AT_abstract_origin (0x001d)
          DW_AT_low_pc (0x)
          DW_AT_high_pc (0x)
          DW_AT_frame_base (DW_OP_call_frame_cfa)
          DW_AT_GNU_all_call_sites (true)

error: invalid DIE reference 0x003b. Offset is in between DIEs:

0x004f: DW_TAG_variable
          DW_AT_abstract_origin (0x003b)
          DW_AT_location (DW_OP_addr 0x0)

Errors detected.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #13 from Iain Sandoe ---
(In reply to Iain Sandoe from comment #12)
> (In reply to rguent...@suse.de from comment #11)
> > On Mon, 13 May 2019, iains at gcc dot gnu.org wrote:
> >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441
> > >
> > > --- Comment #10 from Iain Sandoe ---
> > > (In reply to Iain Sandoe from comment #9)
> > > > this is on the rev *before* the change, using llvm-dwarfdump from the
> > > > llvm-7
> > > > branch:
> > > >
> > > > iains@gcc122:~/gcc-trunk/A$ ../../llvm-710-build/bin/llvm-dwarfdump
> > > > --verify
> > > > firmware.elf
> > > > Verifying firmware.elf: file format ELF64-x86-64
> > > > Verifying .debug_abbrev...
> > > > Verifying .debug_info Unit Header Chain...
> > > > Verifying .debug_info references...
> > > > error: invalid DIE reference 0x. Offset is in between DIEs:
> > >
> > > so probably a missing pointer?
> >
> > It looks like an unresolved relocation - those are to be resolved
> > from $label + offset where $label is defined in one of the early
> > debug units. Maybe we miss one early debug file in the link?
>
> It doesn't seem so:
> ../lto-a/bin/g++ -mtune=generic -march=x86-64 -r -nostdlib -o
> /tmp/ccXK3OSgdebugobj /tmp/ccuTtXKldebugobjtem /tmp/ccjVk9Cqdebugobjtem
> /tmp/ccodTovvdebugobjtem
>
> and they contain debug_info for the three object files.

Which verifies (according to llvm-dwarfdump) and looks sane on readelf -wi.

The -r output file is deleted for the prelink case, so can't easily check that.
(since the problem exists before and after the change, perhaps I can find the
rev that improves the save-temps)
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #12 from Iain Sandoe ---
(In reply to rguent...@suse.de from comment #11)
> On Mon, 13 May 2019, iains at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441
> >
> > --- Comment #10 from Iain Sandoe ---
> > (In reply to Iain Sandoe from comment #9)
> > > this is on the rev *before* the change, using llvm-dwarfdump from the
> > > llvm-7
> > > branch:
> > >
> > > iains@gcc122:~/gcc-trunk/A$ ../../llvm-710-build/bin/llvm-dwarfdump
> > > --verify
> > > firmware.elf
> > > Verifying firmware.elf: file format ELF64-x86-64
> > > Verifying .debug_abbrev...
> > > Verifying .debug_info Unit Header Chain...
> > > Verifying .debug_info references...
> > > error: invalid DIE reference 0x. Offset is in between DIEs:
> >
> > so probably a missing pointer?
>
> It looks like an unresolved relocation - those are to be resolved
> from $label + offset where $label is defined in one of the early
> debug units. Maybe we miss one early debug file in the link?

It doesn't seem so:
../lto-a/bin/g++ -mtune=generic -march=x86-64 -r -nostdlib -o
/tmp/ccXK3OSgdebugobj /tmp/ccuTtXKldebugobjtem /tmp/ccjVk9Cqdebugobjtem
/tmp/ccodTovvdebugobjtem

and they contain debug_info for the three object files.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #11 from rguenther at suse dot de ---
On Mon, 13 May 2019, iains at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441
>
> --- Comment #10 from Iain Sandoe ---
> (In reply to Iain Sandoe from comment #9)
> > this is on the rev *before* the change, using llvm-dwarfdump from the llvm-7
> > branch:
> >
> > iains@gcc122:~/gcc-trunk/A$ ../../llvm-710-build/bin/llvm-dwarfdump --verify
> > firmware.elf
> > Verifying firmware.elf: file format ELF64-x86-64
> > Verifying .debug_abbrev...
> > Verifying .debug_info Unit Header Chain...
> > Verifying .debug_info references...
> > error: invalid DIE reference 0x. Offset is in between DIEs:
>
> so probably a missing pointer?

It looks like an unresolved relocation - those are to be resolved
from $label + offset where $label is defined in one of the early
debug units. Maybe we miss one early debug file in the link?
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #10 from Iain Sandoe ---
(In reply to Iain Sandoe from comment #9)
> this is on the rev *before* the change, using llvm-dwarfdump from the llvm-7
> branch:
>
> iains@gcc122:~/gcc-trunk/A$ ../../llvm-710-build/bin/llvm-dwarfdump --verify
> firmware.elf
> Verifying firmware.elf: file format ELF64-x86-64
> Verifying .debug_abbrev...
> Verifying .debug_info Unit Header Chain...
> Verifying .debug_info references...
> error: invalid DIE reference 0x. Offset is in between DIEs:

so probably a missing pointer?
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #9 from Iain Sandoe ---
this is on the rev *before* the change, using llvm-dwarfdump from the llvm-7
branch:

iains@gcc122:~/gcc-trunk/A$ ../../llvm-710-build/bin/llvm-dwarfdump --verify firmware.elf
Verifying firmware.elf: file format ELF64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
Verifying .debug_info references...
error: invalid DIE reference 0x. Offset is in between DIEs:

0x0029: DW_TAG_imported_unit
          DW_AT_import (0x)

0x002e: DW_TAG_imported_unit
          DW_AT_import (0x)

0x0033: DW_TAG_imported_unit
          DW_AT_import (0x)

0x0038: DW_TAG_subprogram
          DW_AT_abstract_origin (0x)
          DW_AT_low_pc (0x004003b0)
          DW_AT_high_pc (0x004003b0)
          DW_AT_frame_base (DW_OP_call_frame_cfa)
          DW_AT_GNU_all_call_sites (true)

0x004f: DW_TAG_variable
          DW_AT_abstract_origin (0x)
          DW_AT_location (DW_OP_addr 0x0)

Errors detected.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #8 from Richard Biener ---
nm -l fw.elf also complains

nm: BFD (GNU Binutils; devel:gcc / openSUSE_Leap_42.3) 2.31.1.20180828-334
assertion fail ../../bfd/dwarf2.c:3750
nm: BFD (GNU Binutils; devel:gcc / openSUSE_Leap_42.3) 2.31.1.20180828-334
assertion fail ../../bfd/dwarf2.c:3750

iff trunk is still the same as 2.31.1 then this is

static bfd_boolean
comp_unit_hash_info (struct dwarf2_debug *stash,
                     struct comp_unit *unit,
                     struct info_hash_table *funcinfo_hash_table,
                     struct info_hash_table *varinfo_hash_table)
{
...
  BFD_ASSERT (!unit->cached);

where it might be confused about abstract origins crossing CU boundaries.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #7 from Richard Biener ---
Btw, I can reproduce the nm error when linking w/o the linker script. But
readelf is happy about the dwarf. I'm not sure what the llvm dwarf linter
complains about with

error: DIE address ranges are not contained in its parent's ranges:

but it doesn't complain about the abbrev issue nm complains about.

nm bug?
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #6 from Iain Sandoe ---
(In reply to Richard Biener from comment #5)
> Before the bisection the linker script probably managed to "fix" the debug
> info
> but the issue was latent. Without the linker script it works fine for me.
> With the script I get
>
> /usr/bin/ld: fw.elf: not enough room for program headers, try linking with -N
> /usr/bin/ld: final link failed: bad value
> collect2: error: ld returned 1 exit status
>
> when I add -nostdlib it works fine again.
>
> So - can't really reproduce.

built pre-rev and rev.

I get the same result as Richi: with the script, "not enough room for program
headers".

Removing the script, the link succeeds in both cases - and the error below is
present before and after the revision mentioned (so I concur that the revision
is simply exposing a latent issue).

-
iains@gcc122:~/gcc-trunk/A$ nm -ClS --radix=d --size-sort firmware.elf
06295576 0001 b completed.7325nm: BFD (GNU Binutils for Debian) 2.28 internal error, aborting at ../../bfd/dwarf2.c:2505 in find_abstract_instance_name

nm: Please report this bug.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

Richard Biener changed:

           What    |Removed                     |Added
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener ---
Before the bisection the linker script probably managed to "fix" the debug info
but the issue was latent. Without the linker script it works fine for me.
With the script I get

/usr/bin/ld: fw.elf: not enough room for program headers, try linking with -N
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

when I add -nostdlib it works fine again.

So - can't really reproduce.
[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

Martin Liška changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
      Known to work|                            |8.3.0
           Keywords|needs-bisection             |
   Last reconfirmed|                            |2019-05-13
                 CC|                            |iains at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1
            Summary|[9 regression] corrupt      |[9/10 Regression] corrupt
                   |debug info with LTO         |debug info with LTO
   Target Milestone|---                         |9.2
      Known to fail|                            |10.0, 9.1.0

--- Comment #4 from Martin Liška ---
Confirmed on x86_64, started with r267373.