I'm not convinced this is solved. Just had what I believe is a similar failure:
*00:12:02.532* A dependency job for rpc-statd.service failed. See 'journalctl -xe' for details.
*00:12:02.532* mount.nfs: rpc.statd is not running but is required for remote locking.
*00:12:02.532* mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
*00:12:02.532* mount.nfs: an incorrect mount option was specified

(of course, it can always be my patch!)

https://build.gluster.org/job/centos7-regression/5384/console

On Thu, Apr 4, 2019 at 6:56 PM Atin Mukherjee <amukh...@redhat.com> wrote:

> Thanks misc. I have always seen a pattern that on a reattempt (recheck
> centos) the same builder is picked up many times, even though it's
> promised to pick up the builders in a round robin manner.
>
> On Thu, Apr 4, 2019 at 7:24 PM Michael Scherer <msche...@redhat.com>
> wrote:
>
>> On Thursday, 4 April 2019 at 15:19 +0200, Michael Scherer wrote:
>> > On Thursday, 4 April 2019 at 13:53 +0200, Michael Scherer wrote:
>> > > On Thursday, 4 April 2019 at 16:13 +0530, Atin Mukherjee wrote:
>> > > > Based on what I have seen, any multi-node test case will fail,
>> > > > and the above one is picked first from that group. If I am
>> > > > correct, none of the code fixes will go through the regression
>> > > > until this is fixed. I suspect it to be an infra issue again.
>> > > > If we look at
>> > > > https://review.gluster.org/#/c/glusterfs/+/22501/ &
>> > > > https://build.gluster.org/job/centos7-regression/5382/ peer
>> > > > handshaking is stuck as 127.1.1.1 is unable to receive a
>> > > > response back. Did we end up having firewall and other n/w
>> > > > settings screwed up? The test never fails locally.
>> > >
>> > > The firewall didn't change, and since the start it has had the
>> > > line "-A INPUT -i lo -j ACCEPT", so all traffic on the localhost
>> > > interface works.
>> > > (I am not even sure that netfilter does anything meaningful on
>> > > the loopback interface, but maybe I am wrong, and I am not keen
>> > > on looking at kernel code for that.)
>> > >
>> > > Ping seems to work fine as well, so we can exclude a routing
>> > > issue.
>> > >
>> > > Maybe we should look at the socket: does it listen to a specific
>> > > address or not?
>> >
>> > So, I looked at the first 20 failures, removed all those not
>> > related to rebal-all-nodes-migrate.t, and saw that all were run on
>> > builder203, which was freshly reinstalled. As Deepshika noticed
>> > today, this one had an issue with ipv6, the 2nd issue we were
>> > tracking.
>> >
>> > In summary, the rpcbind.socket systemd unit listens on ipv6 despite
>> > ipv6 being disabled, and the fix is to reload systemd. We have so
>> > far no idea why it happens, but suspect this might be related to
>> > the network issue we did identify, as that happens only after a
>> > reboot, which happens only if a build is cancelled/crashed/aborted.
>> >
>> > I applied the workaround on builder203, so if the culprit is that
>> > specific issue, I guess that's fixed.
>> >
>> > I started a test to see how it goes:
>> > https://build.gluster.org/job/centos7-regression/5383/
>>
>> The test did just pass, so I would assume the problem was local to
>> builder203. Not sure why it was always selected, except because this
>> was the only one that failed, so was always up for getting new jobs.
>>
>> Maybe we should increase the number of builders so this doesn't
>> happen, as I guess the other builders were busy at that time?
>>
>> --
>> Michael Scherer
>> Sysadmin, Community Infrastructure and Platform, OSAS
>>
>>
>> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
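
For the question raised in the thread about whether the socket listens on a specific address, a quick probe along these lines might help when comparing builders. This is only a sketch, not what the infra team actually ran: the helper names `can_connect` and `probe_loopback` are made up for illustration, and it assumes rpcbind is reachable over TCP on its standard port 111.

```python
import socket


def can_connect(host, port, family, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds for this address family."""
    try:
        with socket.socket(family, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect((host, port))
            return True
    except OSError:
        # Covers refused connections, timeouts, and hosts where the
        # address family itself is unavailable (e.g. ipv6 disabled).
        return False


def probe_loopback(port, timeout=1.0):
    """Check the given TCP port on both the IPv4 and IPv6 loopback addresses."""
    return {
        "ipv4": can_connect("127.0.0.1", port, socket.AF_INET, timeout),
        "ipv6": can_connect("::1", port, socket.AF_INET6, timeout),
    }


if __name__ == "__main__":
    # 111 is the standard rpcbind port; adjust if a builder differs.
    print(probe_loopback(111))
```

A result that differs between a healthy builder and a failing one would at least point at the address family, though it does not by itself explain why the systemd unit ended up in that state.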