I recently added three builders (builder208, builder209 and builder210) to the regression pool. The network on these new builders did not come up after reboot because the network startup was looking for a non-existent Ethernet card, eth0, and so failed. I'll reconnect them and update here once I fix the issue today.
Sorry for the inconvenience.

On Tue, Jun 4, 2019 at 7:07 PM Yaniv Kaul <yk...@redhat.com> wrote:

> What was the result of this investigation? I suspect I'm seeing the same
> issue on builder209[1].
> Y.
>
> [1] https://build.gluster.org/job/centos7-regression/6302/consoleFull
>
> On Fri, Apr 5, 2019 at 5:40 PM Michael Scherer <msche...@redhat.com>
> wrote:
>
>> On Friday, 5 April 2019 at 16:55 +0530, Nithya Balachandran wrote:
>> > On Fri, 5 Apr 2019 at 12:16, Michael Scherer <msche...@redhat.com>
>> > wrote:
>> >
>> > > On Thursday, 4 April 2019 at 18:24 +0200, Michael Scherer wrote:
>> > > > On Thursday, 4 April 2019 at 19:10 +0300, Yaniv Kaul wrote:
>> > > > > I'm not convinced this is solved. Just had what I believe is a
>> > > > > similar failure:
>> > > > >
>> > > > > 00:12:02.532 A dependency job for rpc-statd.service failed. See 'journalctl -xe' for details.
>> > > > > 00:12:02.532 mount.nfs: rpc.statd is not running but is required for remote locking.
>> > > > > 00:12:02.532 mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
>> > > > > 00:12:02.532 mount.nfs: an incorrect mount option was specified
>> > > > >
>> > > > > (of course, it can always be my patch!)
>> > > > >
>> > > > > https://build.gluster.org/job/centos7-regression/5384/console
>> > > >
>> > > > Same issue, different builder (206). I will check them all, as
>> > > > the issue is more widespread than I expected (or it popped up
>> > > > since the last time I checked).
>> > >
>> > > Deepshika did notice that the issue came back on one server
>> > > (builder202) after a reboot, so the rpcbind issue is not related to
>> > > the network initscript one, and the RCA continues.
>> > >
>> > > We are looking for another workaround involving fiddling with the
>> > > socket (until we find why it uses ipv6 at boot, but not after,
>> > > when ipv6 is disabled).
>> > >
>> >
>> > Could this be relevant?
>> > https://access.redhat.com/solutions/2798411
>>
>> Good catch.
>>
>> So, we already do that; Nigel took care of it (after 2 days of
>> research). But I didn't know the exact symptoms, and decided to
>> double-check just in case.
>>
>> And... there is no sysctl.conf in the initrd. Running dracut -v -f does
>> not change anything.
>>
>> Running "dracut -v -f -H" takes care of that (and this fixes the
>> problem), but:
>> - our ansible script already runs that
>> - -H is hostonly, which is already the default on EL7 according to the
>>   doc.
>>
>> However, if dracut-config-generic is installed, dracut doesn't build a
>> hostonly initrd, and so does not include the sysctl.conf file (which
>> breaks rpcbind, which breaks the test suite).
>>
>> And for some reason, it is installed in the image in ec2 (likely by
>> default), but not by default on the builders.
>>
>> So what happens is that after a kernel upgrade, dracut rebuilds a
>> generic initrd instead of a hostonly one, which breaks things. And the
>> kernel was likely upgraded recently (upgrades happen nightly, for some
>> value of "night"), so we didn't see that earlier, nor with a fresh
>> system.
>>
>>
>> So now, we have several solutions:
>> - be explicit on using hostonly in dracut, so this doesn't happen again
>>   (or not for this reason)
>>
>> - disable ipv6 in rpcbind in a cleaner way (to be tested)
>>
>> - get the test suite to work with ipv6
>>
>> In the long term, I also want to monitor the processes, but for that, I
>> need a VPN between the nagios server and ec2, and that project got
>> blocked by several issues (like EC2 not supporting ecdsa keys, and we
>> use that for ansible, so we have to come back to RSA for fully automated
>> deployment, and openvpn requires certificates, so I need a newer
>> python openssl for doing what I want, and RHEL 7 is too old, etc, etc).
>>
>> As the weekend approaches for me, I just rebuilt the initrd for the
>> time being. I guess forcing hostonly is the safest fix for now, but
>> this will be for Monday.
>> --
>> Michael Scherer
>> Sysadmin, Community Infrastructure and Platform, OSAS
>>
>>
>> _______________________________________________
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> gluster-de...@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
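
For anyone checking a builder by hand, the initrd diagnosis quoted above (no sysctl.conf in the initrd when a generic image gets built) could be sketched roughly as below. This is only a sketch under assumptions: the function name, the initramfs path and the drop-in file name are hypothetical and not taken from our ansible scripts, and the commented commands have not been run on the builders.

```shell
#!/bin/sh
# Minimal sketch: succeed only if an initramfs file listing read from stdin
# contains etc/sysctl.conf. The function name is hypothetical, and it assumes
# the path is the last field on its line, as in an `lsinitrd` listing.
initrd_listing_has_sysctl() {
    grep -Eq '(^|[[:space:]])etc/sysctl\.conf$'
}

# Possible use on a builder (not run here; the image path is an assumption
# based on the usual EL7 layout):
#   rpm -q dracut-config-generic
#   lsinitrd "/boot/initramfs-$(uname -r).img" | initrd_listing_has_sysctl ||
#       echo "sysctl.conf is missing from the initrd"
#
# One way to be explicit about hostonly, assuming dracut reads this drop-in
# after the one shipped by dracut-config-generic (file name is hypothetical):
#   echo 'hostonly="yes"' > /etc/dracut.conf.d/99-hostonly.conf
#   dracut -v -f
```

The check is kept as a pure filter on the listing so it can be tried against saved lsinitrd output before touching a live builder.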
_______________________________________________
Gluster-infra mailing list
Gluster-infra@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-infra