On Sat, 2020-08-01 at 15:26 -0700, Kevin Fenzi wrote:
> On Sat, Aug 01, 2020 at 02:03:40PM -0600, Jeff Law wrote:
> > On Sat, 2020-08-01 at 12:12 +0200, Kevin Kofler wrote:
> > > Hi,
> > >
> > > seeing the amount of fallout from LTO, I really think that this
> > > feature ought to be dropped from F33, and evaluated carefully for
> > > F34 (i.e., can it be done without breaking the build of or
> > > miscompiling a large part of the distribution, once the bugs such
> > > as the ld bug discussed in this thread are fixed, or is it just
> > > unsafe to enable by default to begin with?). I.e., revert it for
> > > F33 for sure, then decide whether to retry it for F34 or can it
> > > permanently.
> >
> > Most of the fallout has been Nick pushing through binutils builds
> > that are broken. Seriously, there's been at least 4 builds pushed
> > through that kept bringing back the *same* problem.
> >
> > And just to be clear, this has been 6+ months of behind the scenes
> > work to find and identify issues, fix broken packages, put global
> > mitigations of broken crap in place, opt-out packages that do things
> > that are fundamentally incompatible with LTO, etc. In fact it was
> > that behind-the-scenes work that pushed this feature from F32 to F33
> > as it just wasn't ready to go in F32.
> >
> > I think the chances of a serious mis-compilation of large parts of
> > the distribution are small. The one mis-compilation we know about
> > was a latent linker bug that just happened to be triggered by LTO,
> > and for that particular bug we know how to identify any packages
> > that might have been broken.
> >
> > Frankly, there's been more fallout from infrastructure breakage and
> > cmake issues than anything. I went through the first ~1000 failures
> > proactively looking for things that were potentially LTO related and
> > fixing the half-dozen or so I found, but by far the s390 
> > infrastructure and cmake changes have caused more failures than
> > anything.
> >
> > As has always been the case, I'm here to address any problems that
> > arise and use my 30 years of experience with GCC development as well
> > as distribution mass rebuilds to make informed decisions about the
> > best course of action for any particular problem.
>
> Yeah, the s390x failures were annoying. I have several ideas to make
> things more robust that hopefully we can do before next mass rebuild:
>
> * move the cache host from a z/vm instance to a kvm one.
> * We have the kvm ones oversubscribed on cpus, so I'd like to drop all
>   of them from 4 cpus to 3.
> * We might play with the weight on them so koji doesn't run as many
>   jobs at a time as it does now.
> * Make sure ci/koji-simple-ci/koschei isn't doing any long running
>   builds when the mass rebuild starts. A gcc or libreoffice build can
>   take up a builder for a long time.
> * Run the mass rebuild with --fail-fast so if something fails on some
>   other arch, it never even needs to run on s390x.
>
> Anyhow, the mass rebuild is over and tagged in. Rawhide compose is
> running and should hopefully finish later today.
>
> The second pass took failures from 4162 to 2833, so that helped
> a lot: https://kojipkgs.fedoraproject.org/mass-rebuild/f33-failures.html

Cool. Thanks for the update. I'll start scanning through those. It looks
like some of the cmake things are getting addressed, which I'm sure helps
too.

Jeff