[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #22 from Iain Sandoe ---
OK, so this has been biting me some more. It might be another case where Darwin has thrown up a more general problem.

What's happening is that, where functions end up zero-sized, an FDE is still being emitted. We get, for DWARF FDEs:

        .globl foo
foo:
LFBxxx
LFExxx

and for .cfi_:

        .globl foo
foo:
LFBxxx
        .cfi_startproc
        .cfi_endproc
LFExxx

Both produce FDEs with 0 PC ranges. This upsets ld64.

1. GCC - it seems a waste of binary file space to emit FDEs with 0 PC range, since they can neither be the site of an exception, nor can they participate in unwinding; however, it might be rather intrusive for the current phase to fix that - if it's not causing any other port problems.

* I haven't thought about it much harder than that - any reason anyone can see for wanting to emit an FDE with 0 PC range?

2. ld64 - should, perhaps, be more defensive, and discard 0-length FDEs when pulling in object files. I've patched my version to do this and am testing it - will post a revised version when it's done.

Meanwhile, are there any other thoughts from folks on the best way forward?
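To illustrate option 2 above, here is a minimal sketch of the "discard 0-length FDEs" idea in C. It is not ld64's code: the `fde_range` struct and the `drop_empty_fdes` function are hypothetical names, and a real linker would work on parsed `__eh_frame` records rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified FDE record: only the PC range matters here. */
struct fde_range {
    uint64_t start;  /* first address covered by the FDE */
    uint64_t size;   /* number of bytes covered; 0 means an empty PC range */
};

/* Compact the array in place, keeping only FDEs that cover at least one
   byte.  Returns the number of entries kept; relative order is preserved. */
static size_t drop_empty_fdes(struct fde_range *fdes, size_t count)
{
    assert(fdes != NULL || count == 0);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (fdes[i].size != 0)
            fdes[kept++] = fdes[i];
    }
    return kept;
}
```

Filtering at input time would leave at most one FDE per start address in the case described here, since the zero-range entry is the one that coincides with its neighbour.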
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #23 from mrs at gcc dot gnu.org --- On the platform, external symbols are defined to be 1 or more bytes. 0 is not one or more. Once that is fixed, then the problem goes away. If you want to have Apple update their abi for future systems to include zero byte objects, you will have to ask them to change their abi.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #24 from Iain Sandoe ---
(In reply to m...@gcc.gnu.org from comment #23)
> On the platform, external symbols are defined to be 1 or more bytes. 0 is
> not one or more. Once that is fixed, then the problem goes away. If you
> want to have Apple update their abi for future systems to include zero byte
> objects, you will have to ask them to change their abi.

Well, I'm very aware of the constraint that has been applied to our output to date (having implemented some of the "fixes" for it). However, it was my understanding that the constraint was one of the tools; i.e. ld64's ability to determine an unambiguous 'atom'. I can't find anything in the written ABI or assembler documentation that makes such a statement (although we accept that "what the other tools do" is the effective ABI).

It appears that (recently) ld64 [and the assembler] have been modified to support symbol aliases. Thus the constraint you mention has been amended/modified; newer versions of ld64 are not complaining about the 0-sized functions (or coincident symbols), only the 0-sized FDE.

It would be quite a useful step forward to support symbol aliases - since the absence has been a source of difficulty for us - but let's not get side-tracked from the actual problem.

1. FWIW, there is code in i386.c [12410 - 12438] that is supposed to ensure that functions on Darwin contain at least a NOP. However, it clearly isn't working in these cases.

2. The issue of whether Darwin can have 0-sized functions is actually a separate one from whether GCC should emit FDEs for 0-sized functions (since other platforms can clearly support them).
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #25 from Iain Sandoe ---
Created attachment 37324
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37324&action=edit
Avoid empty function bodies V1

So... The #if TARGET_MACHO code in ix86_output_function_epilogue () is supposed to prevent trailing labels on Darwin functions (because that creates another problem if those are used in relocations). However, the code doesn't work for multiple reasons - not least of which is that ix86_output_function_epilogue() is called before the last function labels are emitted.

Ironically, if it was working - it would have suppressed the current bug since we typically get:

        .globl foo
foo:
LFBxxx
<=== ix86_output_function_epilogue() is called here.
LFExxx

and, in theory, the trailing LFBxxx should have fired the output of a nop. However, the presence of the barrier seems to undo this.

Given that the ifdef-d code cannot do what it intends (it would need to be called later), it might as well be removed.

We can, however, detect empty function bodies at this point and emit some instruction to avoid the circumstance. At present, there doesn't seem to be any legitimate case where an empty function body could be validly executed.

Note that the usual reason for function bodies to be completely empty is when the code in the function is made unreachable (with __builtin_unreachable()). The GCC manual says that reaching such code is UB, so we are free to do whatever seems most useful. In this case making the function one insn long and making that insn "hlt" seems useful - so that if such a function is actually called it does something that will provide a hint to the user.

Bootstrapped on trunk (and 5.3); testing as and when. Folks, please comment/try it out.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #26 from Uroš Bizjak ---
+  /* If we don't find any, we've got an empty function body; i.e.
+     completely empty - without a return or branch.  Reaching an
+     empty function body means UB.  Let's trap it.  */
+  if (insn == NULL)
+    fputs ("\thlt\n", file);

You probably want to use the ud2 instruction here.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #27 from Iain Sandoe ---
(In reply to Uroš Bizjak from comment #26)
> +  /* If we don't find any, we've got an empty function body; i.e.
> +     completely empty - without a return or branch.  Reaching an
> +     empty function body means UB.  Let's trap it.  */
> +  if (insn == NULL)
> +    fputs ("\thlt\n", file);
>
> Probably you want to use ud2 instruction here.

Yeah, hlt is a little drastic ;-)
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438

Chris Green changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |greenc at fnal dot gov

--- Comment #28 from Chris Green ---
For those people still running into this (the problem still exists with GCC 6.2), the following workaround will do the job on OS X / Mac OS: simply add this definition to your compile commands:

  -D__builtin_unreachable=__builtin_trap
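A small sketch of what that definition does at the source level, assuming GCC or a compatible compiler that provides both builtins. The `#define` stands in for the command-line `-D`; `flip_bit` is a hypothetical helper added for illustration, and `foo` mirrors the reproducer from comment #21.

```c
#include <assert.h>

/* Same effect as passing -D__builtin_unreachable=__builtin_trap on the
   command line: every use of the builtin now emits a trap instruction. */
#define __builtin_unreachable __builtin_trap

/* As in the reproducer: with the macro in place the body is no longer
   empty, because __builtin_trap () expands to a real instruction. */
void foo (void)
{
  __builtin_unreachable ();
}

/* Hypothetical helper: both valid inputs return from the switch, so the
   trailing call only marks a path that should never be taken. */
static int flip_bit (unsigned bit)
{
  assert (bit <= 1u);
  switch (bit)
    {
    case 0: return 1;
    case 1: return 0;
    }
  __builtin_unreachable ();
}
```

The trade-off is that the optimizer loses the "this cannot happen" hint, so some code may be slightly larger or slower, but a wrongly reached path traps instead of falling through into whatever follows.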
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #1 from Andrew Pinski --- This seems better reported to Apple than here as it is Apple's provided ld that is crashing.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438

Jack Howarth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |howarth at nitro dot med.uc.edu

--- Comment #2 from Jack Howarth ---
Are these failures limited to 'make bootstrap-lean' on your machines? What happens if you just use 'make' without arguments?
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #3 from Jack Howarth --- The trigger for this bug is the use of --disable-checking. The linker crash doesn't occur when --enable-checking=release or --enable-checking=yes is passed to configure instead.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #4 from Dara Hazeghi --- Aha! I will try using plain make and leaving checking alone. I don't suppose this is documented anywhere? As to reporting the bug to Apple, is this in fact a linker bug, as opposed to a bad-code-generation bug?
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #5 from Dara Hazeghi ---
(In reply to Dara Hazeghi from comment #4)
> Aha! I will try using plain make and leaving checking alone. I don't
> suppose this is documented anywhere?

make (not bootstrap) with --enable-checking=release does work. I'll try again with bootstrap-lean to verify whether checking is the sole cause of the failure.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #6 from Jack Howarth --- I've opened radar://14005298, "linker crash when building FSF gcc with --disable-checking" with a standalone test case of the failing linkage of cc1.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #7 from Dara Hazeghi ---
(In reply to Jack Howarth from comment #6)
> I've opened radar://14005298, "linker crash when building FSF gcc with
> --disable-checking" with a standalone test case of the failing linkage of
> cc1.

Thanks a bunch! make bootstrap-lean works fine with --enable-checking=release, so the checking is definitely the cause here.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #8 from Jack Howarth ---
The darwin linker developer's analysis of the failing linkage of cc1 is below...

The assertion is about the file libbackend.a(varasm.o). There are overlapping FDEs. If you run dwarfdump in verify mode, it will complain about it too:

[/tmp/newlinkerbug/lib]> dwarfdump --eh-frame --verify varasm.o
--
File: varasm.o (x86_64)
--
Verifying EH Frame...
error: FDE row for address 0x5900 is not in the FDE address range.
0x20e0: FDE
        length: 0x001c
   CIE_pointer: 0x
    start_addr: 0x5900 __Z24default_no_named_sectionPKcjP9tree_node
    range_size: 0x (end_addr = 0x5900)
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
  Instructions: 0x5900: CFA=rsp+8 rip=[rsp]

1 errors found in EH frame for varasm.o (x86_64).

Dumping the whole file, there is an FDE for a zero length function, so two FDEs have the same function start address:

0x20e0: FDE
        length: 0x001c
   CIE_pointer: 0x
    start_addr: 0x5900 __Z24default_no_named_sectionPKcjP9tree_node
    range_size: 0x (end_addr = 0x5900)
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
                DW_CFA_nop
  Instructions: 0x5900: CFA=rsp+8 rip=[rsp]

0x2100: FDE
        length: 0x006c
   CIE_pointer: 0x
    start_addr: 0x5900 __Z24default_no_named_sectionPKcjP9tree_node
    range_size: 0x0154 (end_addr = 0x5a54)
  Instructions: 0x5900: CFA=rsp+8 rip=[rsp]
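The condition the linker trips over (two FDEs sharing a start address) can be sketched as a simple check in C. This is an illustration only, not what dwarfdump or ld64 actually implement; `fde_entry` and `find_coincident_fde` are hypothetical names, and the sample values in the usage note are taken from the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one FDE as printed by dwarfdump. */
struct fde_entry {
    uint64_t start_addr;  /* start_addr field */
    uint64_t range_size;  /* range_size field; 0 for an empty function */
};

/* Return the index of the first FDE that starts at the same address as an
   earlier entry, or -1 if all start addresses are distinct.  A quadratic
   scan keeps the sketch free of any assumption about ordering. */
static long find_coincident_fde(const struct fde_entry *fdes, size_t count)
{
    assert(fdes != NULL || count == 0);
    for (size_t i = 1; i < count; i++)
        for (size_t j = 0; j < i; j++)
            if (fdes[j].start_addr == fdes[i].start_addr)
                return (long)i;
    return -1;
}
```

Fed the two entries from varasm.o, `{0x5900, 0}` and `{0x5900, 0x154}`, the function reports index 1: the zero-range FDE and the real one both claim address 0x5900.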
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #9 from Mike Stump ---
If you can attach the .s file for varasm.c that does result in the crash, that would be good. If this is a regression, identifying the change that broke it would be handy. Thanks.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #10 from Dara Hazeghi --- Created attachment 30211 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30211&action=edit varasm.s.gz varasm.s resulting in the crash
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438

mrs at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mrs at gcc dot gnu.org

--- Comment #11 from mrs at gcc dot gnu.org ---
Curious, varasm.s has:

__Z24default_no_named_sectionPKcjP9tree_node:
LFB588:
        nop     # Required to be here, or the pair must be removed.
LFE588:

well, except for the nop. If I add the nop, we get a non-zero size and it works; if the nop is missing, zero size, and it fails.

So, now the question is, what broke this: a tools upgrade on the OS, or an update on the gcc trunk? If gcc, which update? If a tools update on the OS, then I think we need to add code to dwarf to find and remove the trivial bits.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #12 from mrs at gcc dot gnu.org ---
Ok, new theory. Does this patch fix it for you:

Index: varasm.c
===
--- varasm.c (revision 199270)
+++ varasm.c (working copy)
@@ -6052,7 +6052,7 @@ default_no_named_section (const char *na
 {
   /* Some object formats don't support named sections at all.  The
      front-end should already have flagged this as an error.  */
-  gcc_unreachable ();
+  gcc_assert (0);
 }
 
 #ifndef TLS_SECTION_ASM_FLAG
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #13 from Dara Hazeghi ---
(In reply to m...@gcc.gnu.org from comment #12)
> Ok, new theory. Does this patch fix it for you:

Thanks for the patch. Just tried bootstrapping with it and checking disabled, and the same assertion still triggers.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #14 from mrs at gcc dot gnu.org ---
Thanks, how about this one?

Index: target.def
===
--- target.def (revision 199270)
+++ target.def (working copy)
@@ -225,7 +225,7 @@ DEFHOOK
 (named_section,
  "",
  void, (const char *name, unsigned int flags, tree decl),
- default_no_named_section)
+ 0)
 
 /* Return preferred text (sub)section for function DECL.
    Main purpose of this function is to separate cold, normal and hot
Index: varasm.c
===
--- varasm.c (revision 199270)
+++ varasm.c (working copy)
@@ -6042,19 +6042,6 @@ have_global_bss_p (void)
   return bss_noswitch_section || targetm.have_switchable_bss_sections;
 }
 
-/* Output assembly to switch to section NAME with attribute FLAGS.
-   Four variants for common object file formats.  */
-
-void
-default_no_named_section (const char *name ATTRIBUTE_UNUSED,
-                          unsigned int flags ATTRIBUTE_UNUSED,
-                          tree decl ATTRIBUTE_UNUSED)
-{
-  /* Some object formats don't support named sections at all.  The
-     front-end should already have flagged this as an error.  */
-  gcc_unreachable ();
-}
-
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #15 from Dara Hazeghi ---
(In reply to m...@gcc.gnu.org from comment #14)
> Thanks, how about this one?

Seems to be the same - assert in the same spot. Shall I upload the varasm.s produced with the second patch?
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #16 from mrs at gcc dot gnu.org ---
Yes please. Also, if you can, run:

  dwarfdump --eh-frame --verify file.o

on all the .o files and see if there are any more lurking in there. Any that fail verification will need to be fixed, one way or another.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #17 from Dara Hazeghi ---
(In reply to m...@gcc.gnu.org from comment #16)
> Yes please. If you can run:
>
> dwarfdump --eh-frame --verify file.o
>
> on all the .o files and see if there are any more lurking in there. Any
> that fail verification will need to be fixed, one way, or another.

From gcc/ I see the following:

1 errors found in EH frame for dfp.o (x86_64).
1 errors found in EH frame for gengtype-state.o (x86_64).
1 errors found in EH frame for hooks.o (x86_64).
3 errors found in EH frame for i386.o (x86_64).
3 errors found in EH frame for insn-output.o (x86_64).
2 errors found in EH frame for langhooks.o (x86_64).
1 errors found in EH frame for sched-deps.o (x86_64).
9 errors found in EH frame for targhooks.o (x86_64).
1 errors found in EH frame for tree-profile.o (x86_64).
1 errors found in EH frame for tree-ssa-loop-im.o (x86_64).
2 errors found in EH frame for tree.o (x86_64).
1 errors found in EH frame for var-tracking.o (x86_64).

Shall I upload the object code or the assembly code?
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #18 from Jack Howarth ---
Do we have any idea why this problem is latent with --enable-checking=release and --enable-checking=yes but is triggered by --disable-checking?
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #19 from mrs at gcc dot gnu.org --- I'll build my own tree, thanks. I was hoping that it was a singular issue and we'd be done with it.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438

Dominique d'Humieres changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-10-12
     Ever confirmed|0                           |1

--- Comment #20 from Dominique d'Humieres ---
Still present at revision 203491 and the patch in comment #14 does not help.
[Bug bootstrap/57438] bootstrap fails on x86_64 darwin in stage2 linking cc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 --- Comment #21 from Iain Sandoe ---
(In reply to Dominique d'Humieres from comment #20)
> Still present at revision 203491 and the patch in comment #14 does not help.

Trivial reproducer:

=
__attribute__((noinline))
void foo (void)
{
  __builtin_unreachable();
}

int main (int ac, char *av[])
{
  foo ();
  return 0;
}
=

As Mike surmises, this is another case where we emit code that does not comply with the "atom model" that ld64 (and lld) uses.

foo() and main() both end up empty for -O > 0.

Mike: any thoughts on this? Seems you were intending to take a look.

(It also breaks bootstrapping llvm with GCC in Release mode.)