[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 --- Comment #18 from Andreas K. Huettel --- https://gitweb.gentoo.org/fork/binutils-gdb.git/log/?h=gentoo/binutils-2.40 -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 Andreas K. Huettel changed: What|Removed |Added CC||dilfridge at gentoo dot org --- Comment #17 from Andreas K. Huettel --- (In reply to Michael Matz from comment #16) > Yes, the hacky patch is fine for you to use, it won't introduce further > problems > (knock on wood! :) ). Michael: I just added after some tests that patch to Gentoo testing (i.e. Gentoo binutils-2.40-r5). It'll get a lot of usage now. If nothing pops up over the next 2-3 weeks, I could bring the upstream 2.40 backports branch to the same level, if that's OK with you. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 --- Comment #16 from Michael Matz --- Yes, the hacky patch is fine for you to use, it won't introduce further problems (knock on wood! :) ). -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 --- Comment #15 from Sam James --- Thank you! Do you think the hacky patch (not the one committed) would be good enough for us to backport downstream for 2.40? -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 Michael Matz changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #14 from Michael Matz --- Fixed in master. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 --- Comment #13 from cvs-commit at gcc dot gnu.org --- The master branch has been updated by Michael Matz : https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=670c91c0c5eb3fb16d42fe5b2d48a33c64e3ec52 commit 670c91c0c5eb3fb16d42fe5b2d48a33c64e3ec52 Author: Michael Matz Date: Tue Apr 25 17:10:05 2023 +0200 Fix PR30358, performance with --sort-section since af31506c we only use the binary tree when section sorting is required. While its unbalanced and hence can degrade to a linear list it should otherwise have been equivalent to the old code relying on insertion sort. Unfortunately it was not. The old code directly used lang_add_section to populate the sorted list, the new code first populates the tree and only then does lang_add_section on the sorted result. In the testcase we have very many linkonce section groups, and hence lang_add_section won't actually insert anything for most of them. That limited the to-be-sorted list length previously. The tree-sorting code OTOH first created a tree of all candidates sections, including those that wouldn't be inserted by lang_add_section, hence increasing the size of the sorting problem. In the testcase the chain length went from about 1500 to 106000, and in the degenerated case (as in the testcase) that goes in quadratically. This splits out most of the early-out code from lang_add_section to its own function and uses the latter to avoid inserting into the tree. This refactoring slightly changes the order of early-out tests (the ones based on section flags is now done last, and only in lang_add_section). The new function is not a pure predicate: it can give warnings and it might change output_section, like the old early-out code did. I have also added a skip-warning case in the first discard case, whose non-existence seemed to have been an oversight. PR 30358 * ldlang.c (wont_add_section_p): Split out from ... (lang_add_section): ... here. (output_section_callback_sort): Use wont_add_section_p to not always add sections to the sort tree. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 --- Comment #12 from Sam James --- Not tested the patch yet, but Matz, I'd like to say thanks for the insightful explanation. Really appreciated! -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 --- Comment #11 from Michael Matz --- The old (insertion-sort) code only added something to the output section list if the considered section wasn't already either discarded or linked (by being part of a group for instance). That is done by lang_add_section. This output section list is the sorted container into which further insertions are done (and hence its length is the one determining performance). The binary search tree code always adds all candidates to the tree (which in our unlucky case here mostly degrades to a linked list), and only then goes through that tree to linearize it to a list which doesn't contain the discarded or already linked input sections (group members). So, the intermediate tree contains things that aren't ultimately going to be output, blowing performance down the drain. Otherwise the old insertion-sort code and the now always used tree-based code are equivalent here. But the example contains _many_ groups, and all of them have a .debug_macro section (and only that!) and it exists in many input files, so the difference is a search-chain length of about 1500 in the insertion sort case and about 106000 in the binary tree case, and that factor goes in quadraticly into performance (as said, the search tree is degraded in the example). So, the solution is obvious: don't add something to the tree if it won't be added to the linearized list later. That requires some refactoring. A hacky un-refactored patch for the above is below (relative to master). It restores performance. diff --git a/ld/ldlang.c b/ld/ldlang.c index 4bec280b9df..7e0a9db51e3 100644 --- a/ld/ldlang.c +++ b/ld/ldlang.c @@ -687,6 +687,19 @@ output_section_callback_sort (lang_wild_statement_type *ptr, if (unique_section_p (section, os)) return; + /* Don't add sections to the tree when we already know that + lang_add_section won't do anything with it. */ + if (section->output_section != NULL) +{ + if (!link_info.non_contiguous_regions) + return; + + /* SECTION has already been handled in a special way +(eg. LINK_ONCE): skip it. */ + if (bfd_is_abs_section (section->output_section)) + return; +} + node = (lang_section_bst_type *) xmalloc (sizeof (lang_section_bst_type)); node->left = 0; node->right = 0; -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 Michael Matz changed: What|Removed |Added Assignee|unassigned at sourceware dot org |matz at suse dot de Status|NEW |ASSIGNED --- Comment #10 from Michael Matz --- Yeah, that patch wouldn't help, it's in a different area. Thanks for the complete testcase. I'm going to have a look here. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/30358] bfd very slow (> 4 minutes) to link busybox with -Wl,--sort-section,alignment (regression in binutils-2.40) since af31506c31a59a6edbb13498d6075fa704b801cd
https://sourceware.org/bugzilla/show_bug.cgi?id=30358 Sam James changed: What|Removed |Added Summary|bfd very slow (> 4 minutes) |bfd very slow (> 4 minutes) |to link busybox with|to link busybox with |-Wl,--sort-section,alignmen |-Wl,--sort-section,alignmen |t (regression in|t (regression in |binutils-2.40) |binutils-2.40) since ||af31506c31a59a6edbb13498d60 ||75fa704b801cd -- You are receiving this mail because: You are on the CC list for the bug.