Re: Fedora mass rebuild 2018
Paul Howarth wrote:
> On Fri, 23 Feb 2018 15:04:44 +
> Tom Hughes wrote:
>
>> On 23/02/18 14:33, Paul Howarth wrote:
>>> On Thu, 22 Feb 2018 18:49:02 +0100
>>> Marek Polacek wrote:
>>>> proftpd: timeouts in tests, but in koji it's fine
>>>
>>> I get this too. If I build with mock --old-chroot then it works
>>> fine.
>>
>> So the obvious difference is that mock with --old-chroot has
>> networking enabled while the default (ie --new-chroot) does not.
>>
>> Does --enable-network also make it work?
>
> Yes, it does.
>
>> That doesn't explain koji though, as that has networking disabled.

As you likely know, koji doesn't run the current mock. It's got
mock-1.3.4, which used the old-style chroot rather than systemd-nspawn.
I'm not sure how exactly networking is disabled on the builders.

> Indeed. It looks like it's the combination of using nspawn and no
> network that breaks it.

A fedpkg mockbuild of proftpd works with nspawn if you add the
'--private-network' option in site-defaults.cfg, i.e.:

    config_opts['nspawn_args'] = ['--private-network']

Without that option, the way networking is disabled can cause problems
like this.

If you want to use the old-style chroot with fedpkg mockbuild, it can be
set in site-defaults.cfg via:

    config_opts['use_nspawn'] = False

Working out the issues with mock's use of systemd-nspawn would be ideal.
I left off looking at it by setting the '--private-network' option, but
I didn't feel that I had enough evidence that it was the only/best
solution to propose it as a patch (particularly because it was used in a
previous iteration of the changes in the issue below).

As I mentioned in a previous reply, more discussion of this can be found
in mock issue #113:

https://github.com/rpm-software-management/mock/issues/113

I remember thinking part of the issue seemed to be that the solution
currently implemented in mock sets the default route to 127.0.0.1 but
copies /etc/resolv.conf from the host system -- regardless of the
use_host_resolv setting, because systemd-nspawn overrides mock and
copies the system /etc/resolv.conf to the container.

If someone has some time to poke around with this and perhaps open a new
mock issue, that would be great.

-- 
Todd
~~
Wisdom has two parts: (1) having a lot to say and (2) not saying it.

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Re: Fedora mass rebuild 2018
On Fri, 23 Feb 2018 15:04:44 +
Tom Hughes wrote:

> On 23/02/18 14:33, Paul Howarth wrote:
> > On Thu, 22 Feb 2018 18:49:02 +0100
> > Marek Polacek wrote:
> >> proftpd: timeouts in tests, but in koji it's fine
> >
> > I get this too. If I build with mock --old-chroot then it works
> > fine.
>
> So the obvious difference is that mock with --old-chroot has
> networking enabled while the default (ie --new-chroot) does not.
>
> Does --enable-network also make it work?

Yes, it does.

> That doesn't explain koji though, as that has networking disabled.

Indeed. It looks like it's the combination of using nspawn and no
network that breaks it.

Paul.
Re: Fedora mass rebuild 2018
On Fri, Feb 23, 2018 at 06:33:58PM +0100, Marek Polacek wrote:
> On Fri, Feb 23, 2018 at 04:09:26PM +, Richard W.M. Jones wrote:
> > On Thu, Feb 22, 2018 at 06:49:02PM +0100, Marek Polacek wrote:
> > > libguestfs-1.37.35-2.fc28.src.rpm
> > > I'm not sure about these failures, but they don't seem to be GCC bugs.
> >
> > Is there a log file?
>
> The one I have says
> libvirt: XML-RPC error : Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory
> libguestfs: error: could not connect to libvirt (URI = qemu:///session): Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory [code=38 int1=2]
> libguestfs: trace: launch = -1 (error)
> libguestfs: trace: close
> libguestfs: closing guestfs handle 0x5646f5406a40 (state 0)
> libguestfs: command: run: rm
> libguestfs: command: run: \ -rf /builddir/build/BUILD/libguestfs-1.37.35/tmp/libguestfsfTNsVu
> make: *** [Makefile:2906: quickcheck] Error 1
>
> so it's probably another networking issue in mock.

This is actually a bug in libvirt which happens randomly and
infrequently.

In any case we've since fixed all known problems with libguestfs and
GCC 8 so there's no need to try again.

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
Re: Fedora mass rebuild 2018
On Fri, Feb 23, 2018 at 04:09:26PM +, Richard W.M. Jones wrote:
> On Thu, Feb 22, 2018 at 06:49:02PM +0100, Marek Polacek wrote:
> > libguestfs-1.37.35-2.fc28.src.rpm
> > I'm not sure about these failures, but they don't seem to be GCC bugs.
>
> Is there a log file?

The one I have says

libvirt: XML-RPC error : Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory
libguestfs: error: could not connect to libvirt (URI = qemu:///session): Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory [code=38 int1=2]
libguestfs: trace: launch = -1 (error)
libguestfs: trace: close
libguestfs: closing guestfs handle 0x5646f5406a40 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /builddir/build/BUILD/libguestfs-1.37.35/tmp/libguestfsfTNsVu
make: *** [Makefile:2906: quickcheck] Error 1

so it's probably another networking issue in mock.

Marek
Re: Fedora mass rebuild 2018
On Thu, Feb 22, 2018 at 06:49:02PM +0100, Marek Polacek wrote:
> libguestfs-1.37.35-2.fc28.src.rpm
> I'm not sure about these failures, but they don't seem to be GCC bugs.

Is there a log file?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
Re: Fedora mass rebuild 2018
Tom Hughes wrote:
> On 23/02/18 14:33, Paul Howarth wrote:
>> On Thu, 22 Feb 2018 18:49:02 +0100
>> Marek Polacek wrote:
>>> proftpd: timeouts in tests, but in koji it's fine
>>
>> I get this too. If I build with mock --old-chroot then it works fine.
>
> So the obvious difference is that mock with --old-chroot has networking
> enabled while the default (ie --new-chroot) does not.
>
> Does --enable-network also make it work?
>
> That doesn't explain koji though, as that has networking disabled.

I don't know if this is the same issue, but I noticed that in local mock
builds of git, some tests were very slow. The tests that were slow were
suffering from socket timeouts while making calls to getaddrinfo (to get
the local hostname, for determining the default git user identity).

I added --private-network to the nspawn_args in the mock config, which
eliminated the 5 second timeouts for each of these calls.

I made a comment to remind myself about it later:

# NOTE (tmz): This should still be the default, I think. Without it, network
# calls timeout rather than fail quickly. The typical default socket timeout is
# 5 seconds. A simple call like 'git var GIT_AUTHOR_IDENT' saw 4 socket
# timeouts and takes 20 seconds to return. With this option, it takes
# milliseconds.
#
# See https://github.com/rpm-software-management/mock/issues/113 where this was
# partially dealt with. Unfortunately, the solution there sets a default route
# to 127.0.0.1 but copies /etc/resolv.conf from the host system -- regardless of
# the use_host_resolv setting because systemd-nspawn overrides mock and copies
# the system /etc/resolv.conf to the container. There seems to be no option
# other than --private-network to avoid this.

I haven't had time to look at it further to see if there are any patches
worth submitting to improve the situation (in mock and/or
systemd-nspawn).

-- 
Todd
~~
Men have become the tools of their tools.
    -- Henry David Thoreau (1817-1862)
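The symptom described in the message above (name lookups that time out instead of failing quickly) can be checked directly with a small timing script. What follows is a minimal Python sketch and not something posted in the thread: the helper name time_hostname_lookup is made up for illustration, and it only approximates what git does when it resolves the local hostname.

```python
import socket
import time


def time_hostname_lookup(hostname=None):
    """Time one getaddrinfo() call and return (elapsed_seconds, resolved).

    Hypothetical helper for illustration only. It defaults to the local
    hostname, which is roughly what git looks up when it guesses a
    default user identity.
    """
    if hostname is None:
        hostname = socket.gethostname()
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, None)
        resolved = True
    except OSError:
        # socket.gaierror is a subclass of OSError. A fast failure is
        # harmless; a slow one points at resolver timeouts.
        resolved = False
    elapsed = time.monotonic() - start
    return elapsed, resolved


if __name__ == "__main__":
    elapsed, resolved = time_hostname_lookup()
    status = "resolved" if resolved else "failed"
    print(f"getaddrinfo {status} after {elapsed:.3f}s")
```

Run inside the build chroot (for example from a mock --shell session), a result of several seconds per call would be consistent with the timeouts described above, while a result in the millisecond range would match the behaviour reported with --private-network.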
Re: Fedora mass rebuild 2018
On 23/02/18 14:33, Paul Howarth wrote:
> On Thu, 22 Feb 2018 18:49:02 +0100
> Marek Polacek wrote:
>> proftpd: timeouts in tests, but in koji it's fine
>
> I get this too. If I build with mock --old-chroot then it works fine.

So the obvious difference is that mock with --old-chroot has networking
enabled while the default (ie --new-chroot) does not.

Does --enable-network also make it work?

That doesn't explain koji though, as that has networking disabled.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://compton.nu/
Re: Fedora mass rebuild 2018
On Thu, 22 Feb 2018 18:49:02 +0100
Marek Polacek wrote:
> proftpd: timeouts in tests, but in koji it's fine

I get this too. If I build with mock --old-chroot then it works fine.

Paul.
Re: Fedora mass rebuild 2018
2018-02-22 21:47 GMT+01:00 Josh Boyer:
> On Thu, Feb 22, 2018 at 12:49 PM, Marek Polacek wrote:
>> As many of you know, every year we (the GCC team) rebuild all the Fedora
>> packages with the upcoming GCC, so as to reveal as many bugs as possible
>> before we release the new version. As in the previous years, it is
>> performed on x86_64 only; we unfortunately lack the resources to deal
>> with other arches. Ideally we'd conclude this mass rebuild *before* the
>> new GCC has gotten into the buildroots; alas, this wasn't the case this
>> year.

Thank you for your great job.

>
> josh
>
>>
>> Let's get down to the nitty-gritty:
...
>> jalv-1.6.0-3.fc27.src.rpm

I tried a mockbuild in current rawhide (jalv-1.6.0-4.fc28.src.rpm) and
it was successful.

...
>> lv2-x42-plugins-0.3.0-0.3.20170428.fc27.src.rpm

I've patched this package
(lv2-x42-plugins-0.3.0-0.5.20170428.fc29.src.rpm) and I've sent PRs
upstream.

Ciao
Guido
fas account: tartina
Re: Fedora mass rebuild 2018
On Thu, Feb 22, 2018 at 12:49 PM, Marek Polacek wrote:
> As many of you know, every year we (the GCC team) rebuild all the Fedora
> packages with the upcoming GCC, so as to reveal as many bugs as possible
> before we release the new version. As in the previous years, it is
> performed on x86_64 only; we unfortunately lack the resources to deal
> with other arches. Ideally we'd conclude this mass rebuild *before* the
> new GCC has gotten into the buildroots; alas, this wasn't the case this
> year.
>
> I downloaded all Fedora packages on Jan 19, which should give you a
> sense of how long it takes to process all of this. There were 20892
> packages overall (last year Fedora had 18811 packages). Using
> koji-is-noarch.py I removed all noarch packages from that, so that we
> only build archful packages to save time. That left me with 9329
> packages to build. Of that, 8358 built fine with the new GCC (mostly
> gcc-8.0.1-0.3.fc28.x86_64.rpm but I also used a newer version from
> rawhide). The packages that failed to build with GCC 8 I rebuilt with
> GCC 7; if they failed with GCC 7, I took them off the list. The rest had
> to be analyzed; it was around 300 packages this time. (Last year it was
> ~198 packages. So more work this year.)
>
> I found several GCC bugs, most of which have already been fixed
> (actually all but PR84231). Fortran ABI has changed in GCC 8 (as it did
> in GCC 7). A nasty bug has been discovered in the new empty classes ABI
> code: we will need another mass rebuild to find out which packages are
> affected. There have been a few bugs in code that deals with optimizing
> strlen; hopefully we won't find more of them.
>
> A lot of churn has been caused by changes in the C++ compiler.
> Previously GCC was fairly forgiving about broken template code and would
> check almost nothing until the template was instantiated. In more recent
> releases parts of the template which don't depend on the template
> arguments get checked earlier. The standard says such code is
> ill-formed, but that no diagnostic is required, i.e. the compiler is
> allowed to diagnose the problem, but not required to (because it could
> be expensive or difficult to check for some compilers). So the code was
> always broken, but now GCC tells you about it. I encourage the packagers
> to fix these bugs.
>
> With every release GCC gains new warnings which cause build failures in
> packages that use -Werror (not going to start a flame war about this
> here). This year the main offender was probably -Wformat-truncation. Due
> to time constraints I wasn't able to check every warning and decide if
> it's warranted or a false positive.
>
> As usual, there will be a "porting to" document to ease the transition
> to the new GCC. We already have
> https://gcc.gnu.org/gcc-8/porting_to.html, even though this document is
> still in flux.
>
> What follows is my analysis of what went wrong for the people who want
> to get an overview of the details. Note that my understanding of
> packages other than gcc is very limited, so it's entirely possible that
> I miscategorized some problems. Since during the time I was analyzing
> failures GCC 8 made it to the buildroots, it was no longer possible to
> use buildroots that differ from each other only in the GCC version, so
> some of the failures are caused by a new version of boost, glibc, etc.
>
> Thanks to Jakub Jelinek for promptly fixing bugs I reported (or caused
> :)) and Jonathan Wakely for his great help with anything related to C++.

Thanks for both the detailed summary and the overall great amount of
work your team does to help ensure we continue to be "first" when it
comes to toolchain in Fedora. I know this isn't easy by any means, but
reducing and analyzing the failures to a relatively small number really
reduces the burden on the overall Fedora packager community.

Keep up the excellent work!

josh

>
> Let's get down to the nitty-gritty:
>
> apr-1.6.3-1.fc28.src.rpm
> undefined behavior -- signed overflow:
>     for (off = 1; off < LONG_MAX && off > 0; off *= 2) {
>         apr_strfsize(off, buf);
>         apr_strfsize(off + 1, buf);
>         apr_strfsize(off - 1, buf);
>     }
> where off is of type long int.
>
> mozjs38-38.8.0-8.fc28.src.rpm
> undefined behavior; building with -fsanitize=undefined:
>
> # cd /builddir/build/BUILD/mozilla-esr38/js/src/tests && /builddir/build/BUILD/mozilla-esr38/js/src/js/src/shell/js -f shell.js -f js1_5/shell.js -f js1_5/Regress/shell.js -f js1_5/Regress/regress-360969-06.js; cd -
> BUGNUMBER: 360969
> STATUS: 2^17: global function
> /builddir/build/BUILD/mozilla-esr38/js/src/gc/Marking.cpp:669:10: runtime error: load of misaligned address 0x7f8be8070c9a for type 'void *', which requires 8 byte alignment
> 0x7f8be8070c9a: note: pointer points here
>  00 00 48 b9 30 d8 e5 e2 8b 7f 00 00 48 8b 49
Fedora mass rebuild 2018
As many of you know, every year we (the GCC team) rebuild all the Fedora
packages with the upcoming GCC, so as to reveal as many bugs as possible
before we release the new version. As in the previous years, it is
performed on x86_64 only; we unfortunately lack the resources to deal
with other arches. Ideally we'd conclude this mass rebuild *before* the
new GCC has gotten into the buildroots; alas, this wasn't the case this
year.

I downloaded all Fedora packages on Jan 19, which should give you a
sense of how long it takes to process all of this. There were 20892
packages overall (last year Fedora had 18811 packages). Using
koji-is-noarch.py I removed all noarch packages from that, so that we
only build archful packages to save time. That left me with 9329
packages to build. Of that, 8358 built fine with the new GCC (mostly
gcc-8.0.1-0.3.fc28.x86_64.rpm but I also used a newer version from
rawhide). The packages that failed to build with GCC 8 I rebuilt with
GCC 7; if they failed with GCC 7, I took them off the list. The rest had
to be analyzed; it was around 300 packages this time. (Last year it was
~198 packages. So more work this year.)

I found several GCC bugs, most of which have already been fixed
(actually all but PR84231). Fortran ABI has changed in GCC 8 (as it did
in GCC 7). A nasty bug has been discovered in the new empty classes ABI
code: we will need another mass rebuild to find out which packages are
affected. There have been a few bugs in code that deals with optimizing
strlen; hopefully we won't find more of them.

A lot of churn has been caused by changes in the C++ compiler.
Previously GCC was fairly forgiving about broken template code and would
check almost nothing until the template was instantiated. In more recent
releases parts of the template which don't depend on the template
arguments get checked earlier. The standard says such code is
ill-formed, but that no diagnostic is required, i.e. the compiler is
allowed to diagnose the problem, but not required to (because it could
be expensive or difficult to check for some compilers). So the code was
always broken, but now GCC tells you about it. I encourage the packagers
to fix these bugs.

With every release GCC gains new warnings which cause build failures in
packages that use -Werror (not going to start a flame war about this
here). This year the main offender was probably -Wformat-truncation. Due
to time constraints I wasn't able to check every warning and decide if
it's warranted or a false positive.

As usual, there will be a "porting to" document to ease the transition
to the new GCC. We already have
https://gcc.gnu.org/gcc-8/porting_to.html, even though this document is
still in flux.

What follows is my analysis of what went wrong for the people who want
to get an overview of the details. Note that my understanding of
packages other than gcc is very limited, so it's entirely possible that
I miscategorized some problems. Since during the time I was analyzing
failures GCC 8 made it to the buildroots, it was no longer possible to
use buildroots that differ from each other only in the GCC version, so
some of the failures are caused by a new version of boost, glibc, etc.

Thanks to Jakub Jelinek for promptly fixing bugs I reported (or caused
:)) and Jonathan Wakely for his great help with anything related to C++.

Let's get down to the nitty-gritty:

apr-1.6.3-1.fc28.src.rpm
undefined behavior -- signed overflow:
    for (off = 1; off < LONG_MAX && off > 0; off *= 2) {
        apr_strfsize(off, buf);
        apr_strfsize(off + 1, buf);
        apr_strfsize(off - 1, buf);
    }
where off is of type long int.
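To see where the apr loop above runs into trouble, the doubling can be modelled with Python's arbitrary-precision integers. This is only an illustration of the arithmetic, assuming a 64-bit long (as on x86_64, the only arch covered by this rebuild); LONG_MAX and the helper name doubling_values are defined here for the sketch and are not taken from apr.

```python
LONG_MAX = 2**63 - 1  # assumption: 64-bit long, as on x86_64


def doubling_values(limit=LONG_MAX):
    """Return the values of `off` the loop body runs with, plus the value
    that the final `off *= 2` would have to produce."""
    off = 1
    seen = []
    while 0 < off < limit:
        seen.append(off)
        off *= 2  # Python ints grow as needed; a C long cannot
    return seen, off


seen, next_off = doubling_values()
print(len(seen))            # 63 iterations: 2**0 through 2**62
print(seen[-1] == 2**62)    # True: last value the body runs with
print(next_off > LONG_MAX)  # True: 2**63 does not fit in a signed long
```

The last multiplication has to produce 2**63, which is one more than LONG_MAX, so in C it is a signed overflow. The `off > 0` test in the original loop can only become false after that overflow has already happened, which is why a compiler is free to assume it never does.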
mozjs38-38.8.0-8.fc28.src.rpm
undefined behavior; building with -fsanitize=undefined:

# cd /builddir/build/BUILD/mozilla-esr38/js/src/tests && /builddir/build/BUILD/mozilla-esr38/js/src/js/src/shell/js -f shell.js -f js1_5/shell.js -f js1_5/Regress/shell.js -f js1_5/Regress/regress-360969-06.js; cd -
BUGNUMBER: 360969
STATUS: 2^17: global function
/builddir/build/BUILD/mozilla-esr38/js/src/gc/Marking.cpp:669:10: runtime error: load of misaligned address 0x7f8be8070c9a for type 'void *', which requires 8 byte alignment
0x7f8be8070c9a: note: pointer points here
 00 00 48 b9 30 d8 e5 e2 8b 7f 00 00 48 8b 49 70 48 8d 49 68 41 52 41 51 41 50 57 56 52 51 50 48
 ^
/builddir/build/BUILD/mozilla-esr38/js/src/gc/Marking.cpp:671:13: runtime error: load of misaligned address 0x7f8be8070c9a for type 'void *', which requires 8 byte alignment
0x7f8be8070c9a: note: pointer points here
 00 00 48 b9 30 d8 e5 e2 8b 7f 00 00 48 8b 49 70 48 8d 49 68 41 52 41 51 41 50 57 56 52 51 50 48
 ^
...and so on. This changed with https://gcc.gnu.org/r255387. With
-fno-delete-null-pointer-checks this passes.

libomxil-bellagio-0.9.3-15.fc27.src.rpm
libX11-1.6.5-5.fc28.src.rpm
error: