Re: Fedora mass rebuild 2018

2018-02-24 Thread Todd Zullinger
Paul Howarth wrote:
> On Fri, 23 Feb 2018 15:04:44 +
> Tom Hughes  wrote:
> 
>> On 23/02/18 14:33, Paul Howarth wrote:
>>> On Thu, 22 Feb 2018 18:49:02 +0100
>>> Marek Polacek  wrote:  
>>>> proftpd: timeouts in tests, but in koji it's fine
>>> 
>>> I get this too. If I build with mock --old-chroot then it works
>>> fine.  
>> 
>> So the obvious difference is that mock with --old-chroot has
>> networking enabled while the default (ie --new-chroot) does not.
>> 
>> Does --enable-network also make it work?
> 
> Yes, it does.
> 
>> That doesn't explain koji though, as that has networking disabled.

As you likely know, koji doesn't run the current mock.  It
has mock-1.3.4, which uses the old-style chroot rather than
systemd-nspawn.  I'm not sure exactly how networking is
disabled on the builders.

> Indeed. It looks like it's the combination of using nspawn and no
> network that breaks it.

A fedpkg mockbuild of proftpd works with nspawn if you add
the '--private-network' option in site-defaults.cfg, i.e.:

  config_opts['nspawn_args'] = ['--private-network']

Without that option, the way networking is disabled can cause
problems like this.

If you want to use the old-style chroot with fedpkg
mockbuild, it can be set in site-defaults.cfg via:

  config_opts['use_nspawn'] = False

Working out the issues with mock's use of systemd-nspawn
would be ideal.  I stopped looking at it once I had set the
'--private-network' option, but I didn't feel that I had
enough evidence that it was the only/best solution to propose
it as a patch (particularly because it was used in a
previous iteration of the changes in the issue below).

As I mentioned in a previous reply, more discussion of this
can be found in mock issue #113:

  https://github.com/rpm-software-management/mock/issues/113

I remember thinking part of the issue seemed to be that the
solution currently implemented in mock sets the default
route to 127.0.0.1 but copies /etc/resolv.conf from the host
system -- regardless of the use_host_resolv setting because
systemd-nspawn overrides mock and copies the system
/etc/resolv.conf to the container.
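
To check whether a chroot is in that state, something like the
following could be used.  It is only a diagnostic sketch: the function
names are made up for illustration, the sample text uses a
documentation address (192.0.2.53), and the idea that non-loopback
nameservers are the ones that hang when the default route is 127.0.0.1
is the hypothesis above, not something confirmed in the mock issue.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def non_loopback(servers):
    """Return the servers that would need a real route to be reachable."""
    result = []
    for server in servers:
        try:
            # Strip an IPv6 zone id such as "%eth0" before parsing.
            addr = ipaddress.ip_address(server.split("%", 1)[0])
        except ValueError:
            continue  # not an IP address; ignore it in this sketch
        if not addr.is_loopback:
            result.append(server)
    return result


# Hypothetical resolv.conf content, as it might be copied from a host.
SAMPLE = """\
# copied from the host
nameserver 192.0.2.53
nameserver 127.0.0.1
"""

print(non_loopback(nameservers(SAMPLE)))  # prints ['192.0.2.53']
```

Inside the chroot the same functions could be fed the contents of
/etc/resolv.conf instead of SAMPLE.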

If someone has some time to poke around with this and
perhaps open a new mock issue, that would be great.

-- 
Todd
~~
Wisdom has two parts: (1) having a lot to say and (2) not saying it.



___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fedora mass rebuild 2018

2018-02-24 Thread Paul Howarth
On Fri, 23 Feb 2018 15:04:44 +
Tom Hughes  wrote:

> On 23/02/18 14:33, Paul Howarth wrote:
> > On Thu, 22 Feb 2018 18:49:02 +0100
> > Marek Polacek  wrote:  
> >>  proftpd: timeouts in tests, but in koji it's fine  
> > 
> > I get this too. If I build with mock --old-chroot then it works
> > fine.  
> 
> So the obvious difference is that mock with --old-chroot has
> networking enabled while the default (ie --new-chroot) does not.
> 
> Does --enable-network also make it work?

Yes, it does.

> That doesn't explain koji though, as that has networking disabled.

Indeed. It looks like it's the combination of using nspawn and no
network that breaks it.

Paul.


Re: Fedora mass rebuild 2018

2018-02-24 Thread Richard W.M. Jones
On Fri, Feb 23, 2018 at 06:33:58PM +0100, Marek Polacek wrote:
> On Fri, Feb 23, 2018 at 04:09:26PM +, Richard W.M. Jones wrote:
> > On Thu, Feb 22, 2018 at 06:49:02PM +0100, Marek Polacek wrote:
> > > libguestfs-1.37.35-2.fc28.src.rpm
> > > I'm not sure about these failures, but they don't seem to be GCC bugs.
> > 
> > Is there a log file?
> 
> The one I have says
> libvirt: XML-RPC error : Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory
> libguestfs: error: could not connect to libvirt (URI = qemu:///session): Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory [code=38 int1=2]
> libguestfs: trace: launch = -1 (error)
> libguestfs: trace: close
> libguestfs: closing guestfs handle 0x5646f5406a40 (state 0)
> libguestfs: command: run: rm
> libguestfs: command: run: \ -rf /builddir/build/BUILD/libguestfs-1.37.35/tmp/libguestfsfTNsVu
> make: *** [Makefile:2906: quickcheck] Error 1
> 
> so it's probably another networking issue in mock.

This is actually a bug in libvirt which happens randomly and
infrequently.

In any case we've since fixed all known problems with libguestfs and
GCC 8 so there's no need to try again.

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/


Re: Fedora mass rebuild 2018

2018-02-23 Thread Marek Polacek
On Fri, Feb 23, 2018 at 04:09:26PM +, Richard W.M. Jones wrote:
> On Thu, Feb 22, 2018 at 06:49:02PM +0100, Marek Polacek wrote:
> > libguestfs-1.37.35-2.fc28.src.rpm
> > I'm not sure about these failures, but they don't seem to be GCC bugs.
> 
> Is there a log file?

The one I have says
libvirt: XML-RPC error : Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory
libguestfs: error: could not connect to libvirt (URI = qemu:///session): Failed to connect socket to '/builddir/.cache/libvirt/libvirt-sock': No such file or directory [code=38 int1=2]
libguestfs: trace: launch = -1 (error)
libguestfs: trace: close
libguestfs: closing guestfs handle 0x5646f5406a40 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /builddir/build/BUILD/libguestfs-1.37.35/tmp/libguestfsfTNsVu
make: *** [Makefile:2906: quickcheck] Error 1

so it's probably another networking issue in mock.

Marek


Re: Fedora mass rebuild 2018

2018-02-23 Thread Richard W.M. Jones
On Thu, Feb 22, 2018 at 06:49:02PM +0100, Marek Polacek wrote:
> libguestfs-1.37.35-2.fc28.src.rpm
> I'm not sure about these failures, but they don't seem to be GCC bugs.

Is there a log file?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


Re: Fedora mass rebuild 2018

2018-02-23 Thread Todd Zullinger
Tom Hughes wrote:
> On 23/02/18 14:33, Paul Howarth wrote:
>> On Thu, 22 Feb 2018 18:49:02 +0100
>> Marek Polacek  wrote:
>>>  proftpd: timeouts in tests, but in koji it's fine
>> 
>> I get this too. If I build with mock --old-chroot then it works fine.
> 
> So the obvious difference is that mock with --old-chroot has networking
> enabled while the default (ie --new-chroot) does not.
> 
> Does --enable-network also make it work?
> 
> That doesn't explain koji though, as that has networking disabled.

I don't know if this is the same issue, but I noticed that
in local mock builds of git, some tests were very slow.  The
tests that were slow were suffering from socket timeouts
while making calls to getaddrinfo (to get the local
hostname, for determining the default git user identity).

I added --private-network to the nspawn_args in the mock
config which eliminated the 5 second timeouts for each of
these calls.
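
A rough way to reproduce the measurement outside of git is to time a
single hostname lookup.  This is a sketch, not what git does
internally; time_hostname_lookup is a hypothetical helper name, and the
only library calls used are socket.gethostname and socket.getaddrinfo.

```python
import socket
import time


def time_hostname_lookup(hostname=None):
    """Resolve the hostname once; return (seconds elapsed, error or None)."""
    if hostname is None:
        hostname = socket.gethostname()
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, None)
        error = None
    except socket.gaierror as exc:
        # A quick failure is fine; a slow one is the symptom described above.
        error = exc
    return time.monotonic() - start, error


if __name__ == "__main__":
    elapsed, error = time_hostname_lookup()
    print(f"lookup took {elapsed:.3f}s, error: {error}")
```

Run inside the chroot with and without --private-network in
nspawn_args, a lookup that takes seconds rather than milliseconds
would point at the same timeouts.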

I made a comment to remind myself about it later:

# NOTE (tmz): This should still be the default, I think.  Without it, network
# calls time out rather than fail quickly.  The typical default socket timeout is
# 5 seconds.  A simple call like 'git var GIT_AUTHOR_IDENT' saw 4 socket
# timeouts and took 20 seconds to return.  With this option, it takes
# milliseconds.
#
# See https://github.com/rpm-software-management/mock/issues/113 where this was
# partially dealt with.  Unfortunately, the solution there sets a default route
# to 127.0.0.1 but copies /etc/resolv.conf from the host system -- regardless of
# the use_host_resolv setting because systemd-nspawn overrides mock and copies
# the system /etc/resolv.conf to the container.  There seems to be no option
# other than --private-network to avoid this.

I haven't had time to look at it further to see if there are
any patches worth submitting to improve the situation (in
mock and/or systemd-nspawn).

-- 
Todd
~~
Men have become the tools of their tools.
-- Henry David Thoreau (1817-1862)





Re: Fedora mass rebuild 2018

2018-02-23 Thread Tom Hughes

On 23/02/18 14:33, Paul Howarth wrote:
> On Thu, 22 Feb 2018 18:49:02 +0100
> Marek Polacek  wrote:
>>  proftpd: timeouts in tests, but in koji it's fine
>
> I get this too. If I build with mock --old-chroot then it works fine.

So the obvious difference is that mock with --old-chroot has networking
enabled while the default (ie --new-chroot) does not.

Does --enable-network also make it work?

That doesn't explain koji though, as that has networking disabled.

Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/


Re: Fedora mass rebuild 2018

2018-02-23 Thread Paul Howarth
On Thu, 22 Feb 2018 18:49:02 +0100
Marek Polacek  wrote:
> proftpd: timeouts in tests, but in koji it's fine

I get this too. If I build with mock --old-chroot then it works fine.

Paul.


Re: Fedora mass rebuild 2018

2018-02-23 Thread Guido Aulisi
2018-02-22 21:47 GMT+01:00 Josh Boyer :
> On Thu, Feb 22, 2018 at 12:49 PM, Marek Polacek  wrote:
>> As many of you know, every year we (the GCC team) rebuild all the Fedora
>> packages with the upcoming GCC, so as to reveal as many bugs as possible before
>> we release the new version.  As in previous years, it is performed on x86_64
>> only; we unfortunately lack the resources to deal with other arches.
>> Ideally we'd conclude this mass rebuild *before* the new GCC has gotten into
>> the buildroots; alas, this wasn't the case this year.

Thank you for your great work.

>
> josh
>
>>
>> Let's get down to the nitty-gritty:
...
>> jalv-1.6.0-3.fc27.src.rpm
I tried a mockbuild in current rawhide (jalv-1.6.0-4.fc28.src.rpm) and
it was successful.
...
>> lv2-x42-plugins-0.3.0-0.3.20170428.fc27.src.rpm
I've patched this package
(lv2-x42-plugins-0.3.0-0.5.20170428.fc29.src.rpm) and I've sent PRs
upstream

Ciao
Guido
fas account: tartina


Re: Fedora mass rebuild 2018

2018-02-22 Thread Josh Boyer
On Thu, Feb 22, 2018 at 12:49 PM, Marek Polacek  wrote:
> As many of you know, every year we (the GCC team) rebuild all the Fedora
> packages with the upcoming GCC, so as to reveal as many bugs as possible before
> we release the new version.  As in previous years, it is performed on x86_64
> only; we unfortunately lack the resources to deal with other arches.
> Ideally we'd conclude this mass rebuild *before* the new GCC has gotten into
> the buildroots; alas, this wasn't the case this year.
>
> I downloaded all Fedora packages on Jan 19, which should give you a sense of
> how long it takes to process all of this.
> There were 20892 packages overall (last year Fedora had 18811 packages).  Using
> koji-is-noarch.py I removed all noarch packages from that, so that we only
> build archful packages to save time.  That left me with 9329 packages to build.
> Of that, 8358 built fine with the new GCC (mostly gcc-8.0.1-0.3.fc28.x86_64.rpm
> but I also used a newer version from rawhide).  The packages that failed to
> build with GCC 8 I rebuilt with GCC 7; if they failed with GCC 7, I took them
> off the list.  The rest had to be analyzed; it was around 300 packages this
> time.  (Last year it was ~198 packages.  So more work this year.)
>
> I found several GCC bugs, most of which have already been fixed (actually all
> but PR84231).  Fortran ABI has changed in GCC 8 (as it did in GCC 7).  A nasty
> bug has been discovered in the new empty classes ABI code: we will need another
> mass rebuild to find out which packages are affected.  There have been a few
> bugs in code that deals with optimizing strlen; hopefully we won't find more of
> them.
>
> A lot of churn has been caused by changes in the C++ compiler.  Previously
> GCC was fairly forgiving about broken template code and would check almost
> nothing until the template was instantiated.  In more recent releases parts of
> the template which don't depend on the template arguments get checked earlier.
> The standard says such code is ill-formed, but that no diagnostic is required,
> i.e. the compiler is allowed to diagnose the problem, but not required to
> (because it could be expensive or difficult to check for some compilers).  So
> the code was always broken, but now GCC tells you about it.  I encourage the
> packagers to fix these bugs.
>
> With every release GCC gains new warnings which cause build failures in
> packages that use -Werror (not going to start a flame war about this here).
> This year the main offender was probably -Wformat-truncation.  Due to time
> constraints I wasn't able to check every warning and decide if it's warranted
> or a false positive.
>
> As usual, there will be a "porting to" document to ease the transition to the
> new GCC.  We already have https://gcc.gnu.org/gcc-8/porting_to.html, even though
> this document is still in flux.
>
> What follows is my analysis of what went wrong for the people who want to get
> an overview of the details.  Note that my understanding of packages other than
> gcc is very limited, so it's entirely possible that I miscategorized some
> problems.  Since GCC 8 made it to the buildroots while I was analyzing
> failures, it was no longer possible to use buildroots that differed only in
> the GCC version, so some of the failures are caused by a new version of
> boost, glibc, etc.
>
> Thanks to Jakub Jelinek for promptly fixing bugs I reported (or caused :)) and
> Jonathan Wakely for his great help with anything related to C++.

Thanks for both the detailed summary and the overall great amount of
work your team does to help ensure we continue to be "first" when it
comes to toolchain in Fedora.  I know this isn't easy by any means,
but reducing and analyzing the failures to a relatively small number
really reduces the burden on the overall Fedora packager community.
Keep up the excellent work!

josh


Fedora mass rebuild 2018

2018-02-22 Thread Marek Polacek
As many of you know, every year we (the GCC team) rebuild all the Fedora
packages with the upcoming GCC, so as to reveal as many bugs as possible before
we release the new version.  As in previous years, it is performed on x86_64
only; we unfortunately lack the resources to deal with other arches.
Ideally we'd conclude this mass rebuild *before* the new GCC has gotten into
the buildroots; alas, this wasn't the case this year.

I downloaded all Fedora packages on Jan 19, which should give you a sense of
how long it takes to process all of this.
There were 20892 packages overall (last year Fedora had 18811 packages).  Using
koji-is-noarch.py I removed all noarch packages from that, so that we only
build archful packages to save time.  That left me with 9329 packages to build.
Of that, 8358 built fine with the new GCC (mostly gcc-8.0.1-0.3.fc28.x86_64.rpm
but I also used a newer version from rawhide).  The packages that failed to
build with GCC 8 I rebuilt with GCC 7; if they failed with GCC 7, I took them
off the list.  The rest had to be analyzed; it was around 300 packages this
time.  (Last year it was ~198 packages.  So more work this year.)
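
Spelled out, the numbers above work out as follows.  This is just a
quick sketch of the arithmetic: the variable names are illustrative,
and since "around 300" is approximate, so is the count of packages
that also failed with GCC 7 and were taken off the list.

```python
total_2018 = 20892   # all Fedora packages downloaded on Jan 19
total_2017 = 18811   # last year's total
archful = 9329       # left after dropping noarch packages
built_ok = 8358      # built fine with GCC 8
analyzed = 300       # "around 300", so values derived from it are approximate

noarch = total_2018 - archful               # 11563 noarch packages skipped
failed_gcc8 = archful - built_ok            # 971 failed with GCC 8
also_failed_gcc7 = failed_gcc8 - analyzed   # roughly 671 taken off the list

growth = (total_2018 - total_2017) / total_2017
success_rate = built_ok / archful

print(f"noarch packages skipped: {noarch}")
print(f"failed with GCC 8:       {failed_gcc8}")
print(f"dropped (GCC 7 too):     ~{also_failed_gcc7}")
print(f"package growth:          {growth:.1%}")
print(f"GCC 8 success rate:      {success_rate:.1%}")
```

So roughly nine out of ten archful packages built with GCC 8 without
any attention, and about a third of the remaining failures needed
analysis.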

I found several GCC bugs, most of which have already been fixed (actually all
but PR84231).  Fortran ABI has changed in GCC 8 (as it did in GCC 7).  A nasty
bug has been discovered in the new empty classes ABI code: we will need another
mass rebuild to find out which packages are affected.  There have been a few
bugs in code that deals with optimizing strlen; hopefully we won't find more of
them.

A lot of churn has been caused by changes in the C++ compiler.  Previously
GCC was fairly forgiving about broken template code and would check almost
nothing until the template was instantiated.  In more recent releases parts of
the template which don't depend on the template arguments get checked earlier.
The standard says such code is ill-formed, but that no diagnostic is required,
i.e. the compiler is allowed to diagnose the problem, but not required to
(because it could be expensive or difficult to check for some compilers).  So
the code was always broken, but now GCC tells you about it.  I encourage the
packagers to fix these bugs.

With every release GCC gains new warnings which cause build failures in
packages that use -Werror (not going to start a flame war about this here).
This year the main offender was probably -Wformat-truncation.  Due to time
constraints I wasn't able to check every warning and decide if it's warranted
or a false positive.

As usual, there will be a "porting to" document to ease the transition to the
new GCC.  We already have https://gcc.gnu.org/gcc-8/porting_to.html, even though
this document is still in flux.

What follows is my analysis of what went wrong for the people who want to get
an overview of the details.  Note that my understanding of packages other than
gcc is very limited, so it's entirely possible that I miscategorized some
problems.  Since GCC 8 made it to the buildroots while I was analyzing
failures, it was no longer possible to use buildroots that differed only in
the GCC version, so some of the failures are caused by a new version of
boost, glibc, etc.

Thanks to Jakub Jelinek for promptly fixing bugs I reported (or caused :)) and
Jonathan Wakely for his great help with anything related to C++.

Let's get down to the nitty-gritty:



apr-1.6.3-1.fc28.src.rpm
undefined behavior -- signed overflow:
    for (off = 1; off < LONG_MAX && off > 0; off *= 2) {
        apr_strfsize(off, buf);
        apr_strfsize(off + 1, buf);
        apr_strfsize(off - 1, buf);
    }
where off is of type long int.
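
To see where the loop oversteps, here is a small sketch in Python,
whose integers are arbitrary-precision, so the doubling can be observed
without actually overflowing.  It assumes a 64-bit long (as on x86_64
Linux); the helper names and the safe loop shape are illustrative, not
taken from apr.

```python
LONG_MAX = 2**63 - 1  # assumed: 64-bit long int, as on x86_64 Linux


def first_overflowing_step():
    """Return the value of off whose doubling no longer fits in a long."""
    off = 1
    while off < LONG_MAX and off > 0:
        doubled = off * 2
        if doubled > LONG_MAX:
            return off  # in C, `off *= 2` here is signed overflow
        off = doubled
    return None


def safe_offsets():
    """Yield the same powers of two, stopping before the doubling overflows."""
    off = 1
    while True:
        yield off
        if off > LONG_MAX // 2:
            break  # doubling would exceed LONG_MAX, so stop here
        off *= 2


print(first_overflowing_step() == 2**62)  # prints True
print(len(list(safe_offsets())))          # prints 63
```

The original loop relies on the overflowed value turning negative to
fail the `off > 0` test.  Because signed overflow is undefined, the
compiler may assume it never happens and treat that test as always
true; checking against LONG_MAX / 2 before multiplying avoids depending
on it.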

mozjs38-38.8.0-8.fc28.src.rpm
undefined behavior; building with -fsanitize=undefined:

# cd /builddir/build/BUILD/mozilla-esr38/js/src/tests && /builddir/build/BUILD/mozilla-esr38/js/src/js/src/shell/js -f shell.js -f js1_5/shell.js -f js1_5/Regress/shell.js -f js1_5/Regress/regress-360969-06.js; cd -
BUGNUMBER: 360969
STATUS: 2^17: global function
/builddir/build/BUILD/mozilla-esr38/js/src/gc/Marking.cpp:669:10: runtime error: load of misaligned address 0x7f8be8070c9a for type 'void *', which requires 8 byte alignment
0x7f8be8070c9a: note: pointer points here
00 00  48 b9 30 d8 e5 e2 8b 7f  00 00 48 8b 49 70 48 8d  49 68 41 52 41 51 41 50  57 56 52 51 50 48
  ^
/builddir/build/BUILD/mozilla-esr38/js/src/gc/Marking.cpp:671:13: runtime error: load of misaligned address 0x7f8be8070c9a for type 'void *', which requires 8 byte alignment
0x7f8be8070c9a: note: pointer points here
00 00  48 b9 30 d8 e5 e2 8b 7f  00 00 48 8b 49 70 48 8d  49 68 41 52 41 51 41 50  57 56 52 51 50 48
  ^
...and so on.
This changed with https://gcc.gnu.org/r255387.
With -fno-delete-null-pointer-checks this passes.

libomxil-bellagio-0.9.3-15.fc27.src.rpm
libX11-1.6.5-5.fc28.src.rpm
error: