Re: [Gluster-infra] [Gluster-devel] RPM build failures post-mortem

2017-03-17 Thread Nigel Babu
This is all sorted now. I've restarted Jenkins so it's English rather than
French :)


--
nigelb


___
Gluster-infra mailing list
Gluster-infra@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-infra

Re: [Gluster-infra] [Gluster-devel] RPM build failures post-mortem

2017-03-17 Thread Michael Scherer
On Friday 17 March 2017 at 08:41 -0400, Jeff Darcy wrote:
> > After the restart, our Jenkins server has accidentally had a fr_FR locale.
> 
> I know this was probably frustrating for you (and possibly others as well), 
> but I have to admit it gave me a good chuckle.
> 
> 
> "Of course I'm French!  Why else would I have this outrageous French 
> error message?"
> 
> 
> Key takeaway is that internationalization isn't always good for you.  Some of 
> you might also recall when we discovered that putting LC_COLLATE=C in 
> run-tests.sh boosted performance.

Well, the problem is not internationalization. The problem is that old
initscripts (at least the Jenkins one) leak environment variables.

(yes, that's a pitch for systemd)
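
To illustrate the leak, here is a minimal sketch of the general idea: start a
child process with an explicitly built environment instead of whatever the
admin's shell happened to export. The helper name `build_service_env`, the
kept variables, and the `en_US.UTF-8` value are assumptions for illustration,
not what the Jenkins initscript or the infra team actually uses. It assumes a
POSIX host.

```python
import subprocess
import sys


def build_service_env(inherited, keep=("PATH", "HOME"), lang="en_US.UTF-8"):
    """Copy only the listed variables and pin the locale explicitly."""
    env = {name: inherited[name] for name in keep if name in inherited}
    env["LANG"] = lang
    return env


# Simulated admin shell whose French locale would otherwise leak through.
admin_shell_env = {
    "PATH": "/usr/bin:/bin",
    "HOME": "/root",
    "LANG": "fr_FR.UTF-8",
    "LC_ALL": "fr_FR.UTF-8",
}
service_env = build_service_env(admin_shell_env)

# The child only sees what was passed in `env`, not the caller's locale.
child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('LANG'), os.environ.get('LC_ALL'))"],
    env=service_env, capture_output=True, text=True, check=True,
)
print(child.stdout.strip())
```

The point of the allow-list is that anything not named explicitly (here
`LC_ALL`) never reaches the service, whoever restarts it.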

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS





Re: [Gluster-infra] [Gluster-devel] Infra outage today

2017-03-17 Thread Michael Scherer
On Tuesday 14 March 2017 at 21:21 +0100, Michael Scherer wrote:
> On Tuesday 14 March 2017 at 18:01 +0530, Nigel Babu wrote:
> > All servers are now shut down in preparation for the move. We will have
> > services restored (hopefully) by the end of the EDT working day today.
> > Michael or I will post updates when we have them.
> 
> So the servers have moved, and they have been plugged in and up since
> 19h30 CET (so 18h30 UTC). I was on the train back home when I was informed
> that "there is some disk with red blinking light". I'll spare you the
> suspense: that's Supermicro's way of saying "this is a spare drive", while
> anybody would think "this disk is broken".
> 
> So no issue on the move.
> 
> Now on the issues after the move:
> 
> - formicary's Ethernet connection didn't come back on boot. I will have to
> investigate, but the server is not in production for now, so it is not
> urgent.

So it turns out that using onboot=no in the configuration of that
interface means it will not be started on boot. The more you know...
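
A quick check for this kind of surprise can be sketched as follows. It assumes
the interface configuration is a simple KEY=value file (as in Red Hat style
ifcfg files); the function name and the sample contents are hypothetical and
only for illustration.

```python
def interfaces_not_started_on_boot(configs):
    """Return names of interfaces whose config sets ONBOOT to 'no'.

    `configs` maps an interface name to the text of its KEY=value file.
    """
    skipped = []
    for name, text in configs.items():
        settings = {}
        for line in text.splitlines():
            line = line.strip()
            # Ignore blank lines, comments, and anything that is not KEY=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key.strip().upper()] = value.strip().strip("\"'").lower()
        if settings.get("ONBOOT") == "no":
            skipped.append(name)
    return sorted(skipped)


# Hypothetical file contents, not taken from the actual server.
example = {
    "eth0": "DEVICE=eth0\nBOOTPROTO=dhcp\nONBOOT=yes\n",
    "eth1": "DEVICE=eth1\nBOOTPROTO=dhcp\nonboot=no\n",
}
print(interfaces_not_started_on_boot(example))  # ['eth1']
```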

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS





Re: [Gluster-infra] [Gluster-devel] RPM build failures post-mortem

2017-03-17 Thread Jeff Darcy
> After the restart, our Jenkins server has accidentally had a fr_FR locale.

I know this was probably frustrating for you (and possibly others as well), but 
I have to admit it gave me a good chuckle.


"Of course I'm French!  Why else would I have this outrageous French error 
message?"


Key takeaway is that internationalization isn't always good for you.  Some of 
you might also recall when we discovered that putting LC_COLLATE=C in 
run-tests.sh boosted performance.


[Gluster-infra] [Bug 1433310] Regression tests getting permission errors

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1433310



--- Comment #3 from Nigel Babu  ---
The two machines I had to step in on had both run out of space. There was a
15G glusterd log file. Is there a new test that's creating a lot of volumes or
generating a lot of log entries?
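
For anyone who wants to spot this sort of runaway log before the disk fills
up, a minimal sketch: walk a directory and list files above a size threshold.
The `/var/log/glusterfs` path and the 1 GiB threshold are assumptions for
illustration; adjust them to the machine in question.

```python
import os


def find_large_files(root, min_bytes):
    """Walk `root` and return (size, path) pairs >= `min_bytes`, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    # Assumed log directory and threshold, for illustration only.
    for size, path in find_large_files("/var/log/glusterfs", 1024 ** 3):
        print(f"{size / 1024 ** 3:.1f} GiB  {path}")
```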

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug 
https://bugzilla.redhat.com/token.cgi?t=ZLP8lr3D7m&a=cc_unsubscribe


[Gluster-infra] [Bug 1433310] Regression tests getting permission errors

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1433310



--- Comment #2 from Jeff Darcy  ---
I was having problems with jobs for 16905 hanging, but it seems like a bit of
an unlikely coincidence that those would exactly account for the multiple
machines observing this issue.  Also, it's still not clear how that cause would
lead to that effect.  Until we figure out how they're connected, should we
perhaps be more liberal about rebooting machines automatically after
failed/aborted runs?



[Gluster-infra] [Bug 1433310] Regression tests getting permission errors

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1433310

Nigel Babu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nig...@redhat.com



--- Comment #1 from Nigel Babu  ---
Oops, I'm not sure what went wrong here. There was a long-running prove
command. I killed that and rebooted the machine. Hopefully, that'll get it back
up and running. I've now retriggered the job, and it's been assigned to other
machines.



[Gluster-infra] [Bug 1433310] New: Regression tests getting permission errors

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1433310

            Bug ID: 1433310
           Summary: Regression tests getting permission errors
           Product: GlusterFS
           Version: mainline
         Component: project-infrastructure
          Assignee: b...@gluster.org
          Reporter: jda...@redhat.com
                CC: b...@gluster.org, gluster-infra@gluster.org



Every centos6-regression run since 3:15pm Jenkins time yesterday has been
failing in the same way, across multiple slaves.  Latest example:

  https://build.gluster.org/job/centos6-regression/3656/console

04:47:38 Triggered by Gerrit: https://review.gluster.org/16903
04:47:38 Construction à distance sur slave1.cloud.gluster.org (rackspace_regression_2gb) in workspace /home/jenkins/root/workspace/centos6-regression
04:47:39  > git rev-parse --is-inside-work-tree # timeout=10
04:47:39 Fetching changes from the remote Git repository
04:47:39  > git config remote.origin.url git://review.gluster.org/glusterfs.git # timeout=10
04:47:39 ERROR: Error fetching remote repo 'origin'
04:47:39 hudson.plugins.git.GitException: Failed to fetch from git://review.gluster.org/glusterfs.git
04:47:39 at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:806)
04:47:39 at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1066)
04:47:39 at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1097)
04:47:39 at hudson.scm.SCM.checkout(SCM.java:485)
04:47:39 at hudson.model.AbstractProject.checkout(AbstractProject.java:1269)
04:47:39 at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
04:47:39 at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
04:47:39 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
04:47:39 at hudson.model.Run.execute(Run.java:1738)
04:47:39 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
04:47:39 at hudson.model.ResourceController.execute(ResourceController.java:98)
04:47:39 at hudson.model.Executor.run(Executor.java:410)
04:47:39 Caused by: hudson.plugins.git.GitException: Command "git config remote.origin.url git://review.gluster.org/glusterfs.git" returned status code 4:
04:47:39 stdout: 
04:47:39 stderr: error: failed to write new configuration file .git/config.lock
04:47:39 
04:47:39 at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1784)

It's amusing that some of the messages (especially for aborted tasks) are in
French, but I don't think that's the real issue here.  There seems to be some
kind of permission problem affecting all machines.
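
One way to narrow down a permission problem like this is to list everything in
the workspace that is not owned by the user running the job. This is only a
sketch: the function name is hypothetical, the workspace path is the one from
the console log above, and it assumes a POSIX host and that it is run as the
Jenkins user.

```python
import os


def paths_not_owned_by(root, uid):
    """Return paths under `root` (including `root`) whose owner is not `uid`."""
    foreign = []
    if os.lstat(root).st_uid != uid:
        foreign.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != uid:
                foreign.append(path)
    return sorted(foreign)


if __name__ == "__main__":
    # Workspace path as it appears in the console log.
    workspace = "/home/jenkins/root/workspace/centos6-regression"
    if os.path.isdir(workspace):
        for path in paths_not_owned_by(workspace, os.getuid()):
            print(path)
```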


[Gluster-infra] [Bug 1431969] pull gluster-block v0.1.1 into d.g.o

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1431969

Nigel Babu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
         Resolution|---                         |CURRENTRELEASE
        Last Closed|2017-03-15 06:01:34         |2017-03-17 05:13:15



--- Comment #11 from Nigel Babu  ---
This has been done. I've confirmed that upgrades from 0.1-2 to 0.1.1 work from
the d.g.o page. All seems to be working well.



[Gluster-infra] [Bug 1431969] pull gluster-block v0.1.1 into d.g.o

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1431969



--- Comment #10 from Prasanna Kumar Kalever  ---
copr link to package
https://copr.fedorainfracloud.org/coprs/pkalever/gluster-block/build/527601/



[Gluster-infra] [Bug 1431969] pull gluster-block v0.1.1 into d.g.o

2017-03-17 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1431969

Prasanna Kumar Kalever  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |NEW
         Resolution|CURRENTRELEASE              |---
            Summary|pull gluster-block v0.1-2   |pull gluster-block v0.1.1
                   |into d.g.o                  |into d.g.o
           Keywords|                            |Reopened



--- Comment #9 from Prasanna Kumar Kalever  ---
I must apologize for the wrong package name.

Thanks to Kaleb, Niels, and Nigel for correcting me.

In "0.1-2", the "-2" is the revision. Every time we build a package we
increment it; likewise, if a build fails due to a spec issue, we modify the
spec and increment this suffix.

Since there are code changes, this release should go out with the 0.1.1 tag.

So now we have to change 0.1-2 to 0.1.1. Nigel, can I get your support here?
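
To make the distinction concrete, here is a small sketch that splits a
"VERSION-RELEASE" string the way the comment above describes it. The function
name is hypothetical and this is not how rpm itself parses package names; it
only illustrates which part is the version and which is the build revision.

```python
def split_version_release(package_version):
    """Split 'VERSION-RELEASE' into its parts; release is None if absent."""
    version, sep, release = package_version.rpartition("-")
    if not sep:
        # No dash at all: the whole string is the upstream version.
        return package_version, None
    return version, release


print(split_version_release("0.1-2"))   # ('0.1', '2')
print(split_version_release("0.1.1"))   # ('0.1.1', None)
```

So "0.1-2" is the second build of upstream 0.1, while "0.1.1" is a new
upstream version, which is what a release with code changes needs.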



[Gluster-infra] RPM build failures post-mortem

2017-03-17 Thread Nigel Babu
Hello folks,

You may have noticed that the RPM jobs have failed quite often since the
migration. This should finally be fixed now. Here's a quick post-mortem of what
happened.

After the restart, our Jenkins server has accidentally had a fr_FR locale. This
means some of the logs are now in French. In the past, we've had a permission
problem with the RPMS folder after a build: it is usually owned by the root
user and root group. I fixed this by triggering a cleanup after the job. The
trigger would listen for "Building remotely" in the log and then do a `chown`
on the workspace. Since we started Jenkins in the fr_FR locale, that line is
now in French, so Jenkins doesn't see "Building remotely" in the log. There's
now "Construction à distance" instead. This means the chown isn't triggered and
we error out with a Java traceback that's just a permission error: the Jenkins
user can't delete the workspace to clean it up.
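
The failure mode is easy to reproduce in a few lines. This is a sketch, not the
actual Jenkins trigger: the function name is hypothetical, and the sample log
lines are made up around the two marker strings quoted above.

```python
ENGLISH_MARKER = "Building remotely"
FRENCH_MARKER = "Construction à distance"


def should_run_cleanup(console_log, markers=(ENGLISH_MARKER,)):
    """Return True if any marker string appears in the console log."""
    return any(marker in console_log for marker in markers)


# Illustrative log lines; only the marker strings come from the thread.
english_log = "04:47:38 Building remotely on slave1.cloud.gluster.org"
french_log = "04:47:38 Construction à distance sur slave1.cloud.gluster.org"

# With an English-only marker, the French log never fires the cleanup.
print(should_run_cleanup(english_log))  # True
print(should_run_cleanup(french_log))   # False
# Listing both translations works, but stays fragile.
print(should_run_cleanup(french_log, (ENGLISH_MARKER, FRENCH_MARKER)))  # True
```

Any trigger that matches on translated, human-readable log text has this
weakness, which is why fixing the permissions themselves is more robust.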

The easy solution is to restart Jenkins, but in the interest of doing this
without downtime, I've solved this with ACLs, which create less of a mess.
This shouldn't be a problem from now on, and I can take out the hacky
post-commit scripts.

Apologies for the large number of errors this week. If you recheck your job,
you should get a green now for rpm jobs.


--
nigelb

