On Sunday 08 May 2016 23:06:02 Ben Cooksley wrote:
> On Sun, May 8, 2016 at 2:44 AM, David Faure <fa...@kde.org> wrote:
> > kdewebkit just failed with "Broken pipe" (the TCP error you mentioned)
> > (and kxmlrpcclient failed again with an anongit error). This is like
> > playing whack-a-mole...
>
> Yeah :( Fortunately the Broken Pipe error is the least common one.
>
> > I thought TCP was more robust than that. Would it help to increase some
> > TCP-related timeout somewhere?
>
> TCP should definitely be more reliable, I agree.
> I suspect the root cause of the Broken Pipe issue will be the same as
> the Temporary failure in name resolution error.
>
> The /etc/hosts fix should be deployed shortly - the images are rebuilding now.
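For anyone who wants to check a builder image by hand, here is a minimal sketch of testing whether a hostname is pinned in a hosts file. It assumes the usual hosts format (address, then one or more names, `#` for comments). The function names `hosts_entries` and `is_pinned` are made up for illustration, and the address in the sample is a documentation placeholder, not the real address of build.kde.org.

```python
def hosts_entries(text):
    """Map each hostname found in hosts-file text to its first listed address."""
    entries = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name; hostnames are case-insensitive.
            entries.setdefault(name.lower(), address)
    return entries


def is_pinned(hostname, path="/etc/hosts"):
    """Return True if hostname has an entry in the hosts file at path."""
    with open(path, encoding="utf-8") as handle:
        return hostname.lower() in hosts_entries(handle.read())


# Placeholder content: 192.0.2.10 is a documentation address (assumption).
SAMPLE = """
127.0.0.1   localhost
192.0.2.10  build.kde.org   # pinned to avoid DNS lookups
"""

if __name__ == "__main__":
    print(hosts_entries(SAMPLE).get("build.kde.org"))
```

Running `is_pinned("build.kde.org")` inside a container would then show whether the rebuilt image actually carries the entry.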
kmediaplayer job #63 failed with
  ssh: Could not resolve hostname build.kde.org: Temporary failure in name resolution
at 12:56 yesterday (CI system time).
Is build.kde.org missing from /etc/hosts?

> The only thing I can think of at the moment is some kind of traffic
> storm on the network bridge which disrupts arp or something similar at
> that level when one or more containers start/stop in a short amount of
> time. This could very well be Docker itself determining which IP / MAC
> addresses it can use for the newly starting container - with
> connections being broken and data lost when it steps on one that is in
> use. I do seem to recall having the issue, albeit to a lesser extent
> with the KVM setup as well. We definitely didn't have it with the LXC
> containers though, but those all had public IP addresses of some form
> or another (one was Public IPv6 only, with NAT IPv4)
>
> The current setup (using one machine as an example, they're all
> identical except for the IP ranges used):
>
> - Normal Linux bridges, set up using Debian's /etc/network/interfaces
> and bridge utilities.
> - Host takes 10.150.85.1/25 (br0) and 10.150.81.129/25 (br1)
>
> - Docker containers are allocated the rest of the 10.150.85.1/25 IP
> block, and are connected to the corresponding bridge (br0)
> - Windows virtual machines are allocated static IP addresses in the
> 10.150.85.129/25 block, on the corresponding bridge (br1)
>
> - VPN connection is established using OpenVPN, with the OpenVPN server
> routing 10.150.85.0/24 to the VPN client. Only traffic within the
> 10.150.85.0/16 subnet will be sent over the VPN. This is done to
> permit secure communication with the Docker management daemons, and to
> permit easy+secure access to the Windows VMs.
>
> - Public network access is handled on the host (not the VPN server) using NAT.

I'm afraid I'm not enough of a network sysadmin to be able to find out
what might be wrong in this setup, if anything.
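As a sanity check on the addressing only, here is a small sketch using Python's standard `ipaddress` module. It assumes the two bridges are the two halves of 10.150.85.0/24, which is my reading of the description above and not something confirmed in the thread. The helper name `bridge_for` is made up for illustration.

```python
import ipaddress

# Assumption: br0 and br1 are the lower and upper /25 of the routed /24.
block = ipaddress.ip_network("10.150.85.0/24")
lower, upper = block.subnets(prefixlen_diff=1)  # 10.150.85.0/25, 10.150.85.128/25


def bridge_for(address):
    """Return the bridge whose /25 contains address, or None if neither does."""
    ip = ipaddress.ip_address(address)
    if ip in lower:
        return "br0"
    if ip in upper:
        return "br1"
    return None


# "10.150.85.0/16" has host bits set, so it is normalised with strict=False.
vpn_range = ipaddress.ip_network("10.150.85.0/16", strict=False)

if __name__ == "__main__":
    for addr in ("10.150.85.1", "10.150.85.129", "10.150.81.129"):
        print(addr, bridge_for(addr), ipaddress.ip_address(addr) in vpn_range)
    print(vpn_range)
```

Under that assumption 10.150.85.1 lands on br0 and 10.150.85.129 on br1, while 10.150.81.129 as written falls outside the /24 (though still inside the /16 sent over the VPN). Whether that address is a typo or intentional cannot be told from the mail alone, and none of this says anything about the ARP or bridge behaviour itself.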
> I've bumped the limits on each anongit node so hopefully that will solve it.
> The limit was a bit on the conservative side anyway.
>
> If Jenkins is making that number of Git connections at one moment....
> I'd be quite surprised.

I think I saw more anongit errors yesterday, but I didn't write them down.
Let's see.

> Indeed. The main reason for performing many builds at once is to
> ensure the small projects don't get blocked up when big items (like
> Qt, PIM and Calligra) do a build.
> They've all been known to tie up a builder for more than an hour per
> build and have led to a large pile of other builds blocking up behind
> them (which I've received complaints about as well)

I know, but this was a smaller problem than false positives IMHO :-)

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5