Re: [Gluster-infra] [Gluster-devel] RPM build failures post-mortem
This is all sorted now. I've restarted Jenkins so it's English rather than French :) -- nigelb ___ Gluster-infra mailing list Gluster-infra@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-infra
Re: [Gluster-infra] [Gluster-devel] RPM build failures post-mortem
On Friday, 17 March 2017 at 08:41 -0400, Jeff Darcy wrote:
> > After the restart, our Jenkins server has accidentally had a fr_FR locale.
>
> I know this was probably frustrating for you (and possibly others as well),
> but I have to admit it gave me a good chuckle.
>
> "Of course I'm French! Why else would I have this outrageous French
> error message?"
>
> Key takeaway is that internationalization isn't always good for you. Some of
> you might also recall when we discovered that putting LC_COLLATE=C in
> run-tests.sh boosted performance.

Well, the problem is not internationalization. The problem is that old initscripts (at least the Jenkins one) leak environment variables.

(Yes, that's a pitch for systemd.)

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS
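The environment leak described above can be illustrated with a minimal sketch. It assumes a launcher written in Python that starts a daemon as a child process; `sanitized_env` and the sample variables are hypothetical names for illustration, not part of the Jenkins initscript or of systemd. The idea is only that the child gets an explicit locale instead of inheriting whatever the administrator's shell had set.

```python
import os


def sanitized_env(env=None, locale_name="C"):
    """Return a copy of env with every locale variable replaced by locale_name.

    Hypothetical helper: drops LANG, LANGUAGE and all LC_* variables that the
    caller's shell may have leaked, then sets LANG and LC_ALL explicitly.
    """
    source = os.environ if env is None else env
    cleaned = {
        key: value
        for key, value in source.items()
        if key not in ("LANG", "LANGUAGE") and not key.startswith("LC_")
    }
    cleaned["LANG"] = locale_name
    cleaned["LC_ALL"] = locale_name
    return cleaned


# Example of an environment inherited from a French-locale login shell.
leaky = {"PATH": "/usr/bin", "LANG": "fr_FR.UTF-8", "LC_MESSAGES": "fr_FR.UTF-8"}
print(sanitized_env(leaky))
```

The result could then be passed as the `env` argument of `subprocess.run` so the child process never sees the caller's locale.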
Re: [Gluster-infra] [Gluster-devel] Infra outage today
On Tuesday, 14 March 2017 at 21:21 +0100, Michael Scherer wrote:
> On Tuesday, 14 March 2017 at 18:01 +0530, Nigel Babu wrote:
> > All servers are now shut down in preparation for the move. We will have
> > services restored (hopefully) by the end of the EDT working day today.
> > Michael or I will post updates when we have one.
>
> So the servers have moved, and they have been plugged in and up since
> 19h30 CET (so 18h30 UTC). I was on the train back home, and I was informed
> that "there is some disk with red blinking light". I'll spare you the
> suspense: that's Supermicro's way of saying "this is a spare drive", while
> anybody would think "this disk is broken".
>
> So no issue with the move.
>
> Now on to the issues after the move:
>
> - formicary's Ethernet connection didn't come back on boot. I will have to
>   investigate, but the server is not in production for now, so it is not
>   urgent.

So it turns out that using onboot=no in the configuration of that interface means it will not be started on boot. The more you know...

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS
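A small check along these lines could catch that kind of setting before a reboot. This is only a sketch: it assumes the interface is configured in a KEY=value file in the style of the Red Hat ifcfg files, and both `onboot_setting` and the sample contents are hypothetical, not taken from the actual server.

```python
def onboot_setting(config_text):
    """Return True or False for an explicit onboot setting, or None if absent.

    Assumes simple KEY=value lines; the key is matched case-insensitively and
    only the value "yes" counts as enabled.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and lines without an assignment
        key, _, value = line.partition("=")
        if key.strip().lower() == "onboot":
            return value.strip().strip("\"'").lower() == "yes"
    return None


# Hypothetical interface configuration for illustration.
sample = """DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=no
"""
print(onboot_setting(sample))
```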
Re: [Gluster-infra] [Gluster-devel] RPM build failures post-mortem
> After the restart, our Jenkins server has accidentally had a fr_FR locale.

I know this was probably frustrating for you (and possibly others as well), but I have to admit it gave me a good chuckle.

"Of course I'm French! Why else would I have this outrageous French error message?"

Key takeaway is that internationalization isn't always good for you. Some of you might also recall when we discovered that putting LC_COLLATE=C in run-tests.sh boosted performance.
[Gluster-infra] [Bug 1433310] Regression tests getting permission errors
https://bugzilla.redhat.com/show_bug.cgi?id=1433310

--- Comment #3 from Nigel Babu ---
Both machines I had to step in on had run out of space. There was a 15G glusterd log file. Is there a new test that's creating a lot of volumes or generating a lot of log entries?

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=ZLP8lr3D7m&a=cc_unsubscribe
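One way to spot runaway log files like the one mentioned in this comment is to walk a log directory and report anything above a size threshold. The sketch below is an illustration only; `find_large_files` is a hypothetical helper and is not part of the Gluster test infrastructure.

```python
import os


def find_large_files(root, min_bytes):
    """Yield (path, size_in_bytes) for files under root of at least min_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # the file vanished or cannot be read; skip it
            if size >= min_bytes:
                yield path, size
```

For example, iterating over `find_large_files("/var/log/glusterfs", 1024 ** 3)` would list files of 1 GiB or more, assuming the logs live in that directory on the machine in question.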
[Gluster-infra] [Bug 1433310] Regression tests getting permission errors
https://bugzilla.redhat.com/show_bug.cgi?id=1433310

--- Comment #2 from Jeff Darcy ---
I was having problems with jobs for 16905 hanging, but it seems like a bit of an unlikely coincidence that those would exactly account for the multiple machines observing this issue. Also, it's still not clear how that cause would lead to that effect. Until we figure out how they're connected, should we perhaps be more liberal about rebooting machines automatically after failed/aborted runs?
[Gluster-infra] [Bug 1433310] Regression tests getting permission errors
https://bugzilla.redhat.com/show_bug.cgi?id=1433310

Nigel Babu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nig...@redhat.com

--- Comment #1 from Nigel Babu ---
Oops, I'm not sure what went wrong here. There was a long-running prove command. I killed that and rebooted the machine. Hopefully, that'll get it back up and running. I've now done a retrigger, which has been assigned to other machines.
[Gluster-infra] [Bug 1433310] New: Regression tests getting permission errors
https://bugzilla.redhat.com/show_bug.cgi?id=1433310

            Bug ID: 1433310
           Summary: Regression tests getting permission errors
           Product: GlusterFS
           Version: mainline
         Component: project-infrastructure
          Assignee: b...@gluster.org
          Reporter: jda...@redhat.com
                CC: b...@gluster.org, gluster-infra@gluster.org

Every centos6-regression run since 3:15pm Jenkins time yesterday has been failing in the same way, across multiple slaves. Latest example:

https://build.gluster.org/job/centos6-regression/3656/console

04:47:38 Triggered by Gerrit: https://review.gluster.org/16903
04:47:38 Construction à distance sur slave1.cloud.gluster.org (rackspace_regression_2gb) in workspace /home/jenkins/root/workspace/centos6-regression
04:47:39  > git rev-parse --is-inside-work-tree # timeout=10
04:47:39 Fetching changes from the remote Git repository
04:47:39  > git config remote.origin.url git://review.gluster.org/glusterfs.git # timeout=10
04:47:39 ERROR: Error fetching remote repo 'origin'
04:47:39 hudson.plugins.git.GitException: Failed to fetch from git://review.gluster.org/glusterfs.git
04:47:39     at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:806)
04:47:39     at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1066)
04:47:39     at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1097)
04:47:39     at hudson.scm.SCM.checkout(SCM.java:485)
04:47:39     at hudson.model.AbstractProject.checkout(AbstractProject.java:1269)
04:47:39     at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
04:47:39     at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
04:47:39     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
04:47:39     at hudson.model.Run.execute(Run.java:1738)
04:47:39     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
04:47:39     at hudson.model.ResourceController.execute(ResourceController.java:98)
04:47:39     at hudson.model.Executor.run(Executor.java:410)
04:47:39 Caused by: hudson.plugins.git.GitException: Command "git config remote.origin.url git://review.gluster.org/glusterfs.git" returned status code 4:
04:47:39 stdout:
04:47:39 stderr: error: failed to write new configuration file .git/config.lock
04:47:39
04:47:39     at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1784)

It's amusing that some of the messages (especially for aborted tasks) are in French, but I don't think that's the real issue here. There seems to be some kind of permission problem affecting all machines.
[Gluster-infra] [Bug 1431969] pull gluster-block v0.1.1 into d.g.o
https://bugzilla.redhat.com/show_bug.cgi?id=1431969

Nigel Babu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
         Resolution|---                         |CURRENTRELEASE
        Last Closed|2017-03-15 06:01:34         |2017-03-17 05:13:15

--- Comment #11 from Nigel Babu ---
This has been done. I've confirmed that upgrades from 0.1-2 to 0.1.1 work from the d.g.o page. All seems to be working well.
[Gluster-infra] [Bug 1431969] pull gluster-block v0.1.1 into d.g.o
https://bugzilla.redhat.com/show_bug.cgi?id=1431969

--- Comment #10 from Prasanna Kumar Kalever ---
Copr link to the package: https://copr.fedorainfracloud.org/coprs/pkalever/gluster-block/build/527601/
[Gluster-infra] [Bug 1431969] pull gluster-block v0.1.1 into d.g.o
https://bugzilla.redhat.com/show_bug.cgi?id=1431969

Prasanna Kumar Kalever changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |NEW
         Resolution|CURRENTRELEASE              |---
            Summary|pull gluster-block v0.1-2   |pull gluster-block v0.1.1
                   |into d.g.o                  |into d.g.o
           Keywords|                            |Reopened

--- Comment #9 from Prasanna Kumar Kalever ---
I must apologize for the wrong package name. Thanks to Kaleb, Niels, and Nigel for correcting me.

In "0.1-2", the "-2" is the revision. We increment it every time we build a package, or when a build fails because of a spec issue and we have to modify the spec.

Since there are code changes, this release should go out with the 0.1.1 tag.

So now we have to change 0.1-2 to 0.1.1. Nigel, can I get your support here?
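The distinction between version and revision in this comment can be sketched as a simple split at the last hyphen. This is an illustration under that one assumption, not the logic rpm itself uses to parse or compare package versions; `split_version_release` is a hypothetical name.

```python
def split_version_release(package_version):
    """Split 'VERSION-RELEASE' at the last hyphen.

    Returns (version, release); release is None when there is no hyphen.
    """
    version, separator, release = package_version.rpartition("-")
    if not separator:
        return package_version, None
    return version, release


print(split_version_release("0.1-2"))
print(split_version_release("0.1.1"))
```

Under this reading, "0.1-2" is version 0.1 at its second build, while "0.1.1" is a new upstream version with no revision stated.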
[Gluster-infra] RPM build failures post-mortem
Hello folks,

You may have noticed that the RPM jobs have failed quite often since the migration. This should finally be fixed now. Here's a quick post-mortem of what happened.

After the restart, our Jenkins server accidentally ended up with a fr_FR locale. This means some of the logs are now in French. In the past, we've had a permission problem with the RPMS folder after a build. The folder is usually owned by the root user and root group. I fixed this by triggering a cleanup after the job. The trigger would listen for "Building remotely" in the log and then do a `chown` on the workspace.

Since we started Jenkins in the fr_FR locale, that line is now in French, so Jenkins doesn't see "Building remotely" in the log. There's now "Construction à distance" instead. This means the chown isn't triggered and we error out with a Java traceback that's just a permission error. The Jenkins user can't delete the workspace to clean it up.

The easy solution is to restart Jenkins, but in the interest of doing this without downtime, I've solved this with ACLs, which create less of a mess. This shouldn't be a problem from now on, and I can take out the hacky post-commit scripts.

Apologies for the large number of errors this week. If you recheck your job, the RPM jobs should now come back green.

-- 
nigelb
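The failure mode in this post-mortem, a trigger keyed on a localized log message, can be reproduced in a few lines. The sketch below is an illustration only: `should_trigger_cleanup` is a hypothetical stand-in for the Jenkins trigger, and the English log line is an assumed rendering of the message, modelled on the French line quoted in bug 1433310.

```python
# Marker strings quoted in the post-mortem.
ENGLISH_MARKER = "Building remotely"
FRENCH_MARKER = "Construction à distance"


def should_trigger_cleanup(log_lines, markers=(ENGLISH_MARKER,)):
    """Return True if any log line contains one of the marker strings."""
    return any(marker in line for line in log_lines for marker in markers)


# Assumed English form of the console line, for illustration.
english_log = ["Building remotely on slave1.cloud.gluster.org in workspace /home/jenkins/root/workspace/centos6-regression"]
# French form as it appeared in the failing job's console output.
french_log = ["Construction à distance sur slave1.cloud.gluster.org (rackspace_regression_2gb) in workspace /home/jenkins/root/workspace/centos6-regression"]

print(should_trigger_cleanup(english_log))
print(should_trigger_cleanup(french_log))
print(should_trigger_cleanup(french_log, markers=(ENGLISH_MARKER, FRENCH_MARKER)))
```

With only the English marker, the French log never fires the cleanup, which matches the skipped chown described above. Adding more markers would only paper over the problem; a mechanism that does not depend on log text, such as the ACLs mentioned in the mail, avoids it.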