[Gluster-devel] Removing myself as maintainer

2019-03-10 Thread Nigel Babu
Hello folks, This change has gone through, but I wanted to let folks here know as well. I'm removing myself as maintainer from everything to reflect that I will no longer be the primary point of contact for any of the components I used to own. However, I will still be around and contributing

Re: [Gluster-devel] Jenkins switched over to new builders for regression

2019-02-08 Thread Nigel Babu
in the logs. On Fri, Feb 8, 2019 at 7:49 AM Nigel Babu wrote: > Hello, > > We've reached the halfway mark in the migration and half our builders > today are now running on AWS. I've turned off the RAX builders and have > them try to be online only if the AWS builders cannot ha

[Gluster-devel] Jenkins switched over to new builders for regression

2019-02-07 Thread Nigel Babu
Hello, We've reached the halfway mark in the migration and half our builders today are now running on AWS. I've turned off the RAX builders and have them try to be online only if the AWS builders cannot handle the number of jobs running at any given point. The new builders are named

[Gluster-devel] Regression logs issue

2019-02-07 Thread Nigel Babu
Hello folks, In the last week, if you have had a regression job that failed, you will not find a log for it. This is due to a mistake I made while deleting code. Rather than deleting only the code for the push to an internal HTTP server, I also deleted a line which handled the log creation. Apologies

[Gluster-devel] Tests for the GCS stack using the k8s framework

2019-01-03 Thread Nigel Babu
Hello, Deepshikha and I have been working on understanding and using the k8s framework for testing the GCS stack. With the help of the folks from sig-storage, we've managed to write a sample test that needs to be run against an already setup k8s gluster with GCS installed on top[1]. This is a

[Gluster-devel] Infra Update for Nov and Dec

2018-12-19 Thread Nigel Babu
Hello folks, The infra team has not been sending regular updates recently because we’ve been caught up in several different pieces of work that were running into longer than 2 week sprint cycles. This is a summary of what we’ve done so far since the last update. * The bugzilla updates are done

[Gluster-devel] Short review.gluster.org outage in the next 15 mins

2018-11-05 Thread Nigel Babu
Hello folks, Going to restart gerrit on review.gluster.org for a quick config change in the next 15 mins. Estimate outage of 5 mins. I'll update this thread when we're back online -- nigelb ___ Gluster-devel mailing list Gluster-devel@gluster.org

[Gluster-devel] Centos CI automation Retrospective

2018-11-02 Thread Nigel Babu
Hello folks, On Monday, I merged in the changes that allowed all the jobs in Centos CI to be handled in an automated fashion. In the past, it depended on Infra team members to review, merge, and apply the changes on Centos CI. I've now changed that so that the individual job owners can do their

[Gluster-devel] Gluster Infra Update

2018-10-18 Thread Nigel Babu
Hello folks, Here's the update from the last 2 weeks from the Infra team. * Created an architecture document for Automated Upgrade Testing. This is now done and is undergoing reviews. It is scheduled to be published on the devel list as soon as we have a decent PoC. * Finished part of the

[Gluster-devel] Infra Update for the last 2 weeks

2018-10-03 Thread Nigel Babu
Hello folks, I meant to send this out on Monday, but it's been a busy few days. * The infra pieces of distributed regression are now complete. A big shout out to Deepshikha for driving this and Ramky for his help in getting this to completion. * The GD2 containers and CSI container builds work now.

Re: [Gluster-devel] gluster-ansible: status of the project

2018-09-30 Thread Nigel Babu
On Sun, Sep 30, 2018 at 6:45 PM Sachidananda URS wrote: > > > On Sun, Sep 30, 2018 at 12:56 PM, Yaniv Kaul wrote: > >> >> >> On Fri, Sep 28, 2018 at 2:33 PM Sachidananda URS wrote: >> >>> Hi, >>> >>> gluster-ansible project is aimed at automating the deployment and >>> maintenance of GlusterFS

[Gluster-devel] Unplanned Jenkins maintenance

2018-09-28 Thread Nigel Babu
Hello folks, I did a quick unplanned Jenkins maintenance today to upgrade 3 plugins with security issues in them. This is now complete. There was a brief period where we did not start new jobs until Jenkins restarted. There should have been no interruption of existing jobs or any jobs canceled.

Re: [Gluster-devel] [Gluster-infra] Freebsd builder upgrade to 10.4, maybe 11

2018-09-11 Thread Nigel Babu
On Tue, Sep 11, 2018 at 7:06 PM Michael Scherer wrote: > And... rescue mode is not working. So the server is down until > Rackspace fix it. > > Can someone disable the freebsd smoke test, as I think our 2nd builder > is not yet building fine ? > Disabled. Please do not merge any JJB review

Re: [Gluster-devel] Proposal to change Gerrit -> Bugzilla updates

2018-09-11 Thread Nigel Babu
On Mon, Sep 10, 2018 at 7:08 PM Shyam Ranganathan wrote: > My assumption here is that for each patch that mentions a BZ, an > additional tracker would be added to the tracker list, right? > Correct. > > Further assumption (as I have not used trackers before) is that this > would reduce noise

[Gluster-devel] Moving Jenkins alerts to a new list

2018-09-10 Thread Nigel Babu
Hello, In an effort to make the devel list and maintainer lists more noise free, I'm going to move all the Jenkins related alerts to a new list. This does not apply to the alert sent out for new releases. This is part of a longer-term plan to monitor build failures in Centos CI and the nightly

[Gluster-devel] Proposal to change Gerrit -> Bugzilla updates

2018-09-10 Thread Nigel Babu
Hello folks, We now have review.gluster.org as an external tracker on Bugzilla. Our current automation when there is a bugzilla attached to a patch is as follows: 1. When a new patchset has "Fixes: bz#1234" or "Updates: bz#1234", we will post a comment to the bug with a link to the patch and
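The commit-message convention described in this message can be sketched in Python. This is only an illustration, not the actual Gerrit-to-Bugzilla automation: the function name `find_bug_refs` and the regular expression are assumptions; only the `Fixes: bz#1234` and `Updates: bz#1234` forms come from the message.

```python
import re

# Hypothetical pattern: a line consisting of "Fixes: bz#<id>" or "Updates: bz#<id>".
BUG_REF = re.compile(r"^(Fixes|Updates):\s*bz#(\d+)\s*$", re.MULTILINE)


def find_bug_refs(commit_message):
    """Return (keyword, bug_id) pairs found in a commit message."""
    return [(keyword, int(bug_id)) for keyword, bug_id in BUG_REF.findall(commit_message)]
```

A caller could then decide, per keyword, whether the bug should only receive a comment or also change state; that policy is outside this sketch.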

Re: [Gluster-devel] Unplanned Jenkins Restart

2018-08-24 Thread Nigel Babu
Oops, big note: Centos Regression jobs may have ended up canceled. Please retry them. On Fri, Aug 24, 2018 at 9:31 PM Nigel Babu wrote: > Hello, > > We've had to do an unplanned Jenkins restart. Jenkins was overloaded and > not responding to any requests. There was a backlog of o

[Gluster-devel] Unplanned Jenkins Restart

2018-08-24 Thread Nigel Babu
Hello, We've had to do an unplanned Jenkins restart. Jenkins was overloaded and not responding to any requests. There was a backlog of over 100 jobs as well. The restart seems to have fixed things up. More details in bug: https://bugzilla.redhat.com/show_bug.cgi?id=1622173 -- nigelb

[Gluster-devel] Urgent Gerrit reboot today

2018-08-23 Thread Nigel Babu
Hello folks, We're going to do an urgent reboot of the Gerrit server in the next 1h or so. For some reason, hot-adding RAM on this machine isn't working, so we're going to do a reboot to get this working. This is needed to prevent the OOM Kill problems we've been running into since last night.

[Gluster-devel] Fwd: [Ci-users] Maintenance Window 22-Aug-2018 12:00PM UTC

2018-08-21 Thread Nigel Babu
Heads up: Centos CI will be undergoing maintenance tomorrow. -- Forwarded message - From: Brian Stinson Date: Tue, Aug 21, 2018 at 1:58 AM Subject: [Ci-users] Maintenance Window 22-Aug-2018 12:00PM UTC To: Hi All, Due to some pending OS updates we will be rebooting machines

Re: [Gluster-devel] Access to Docker Hub Gluster organization

2018-08-14 Thread Nigel Babu
On Tue, Aug 14, 2018 at 5:52 PM Humble Chirammal wrote: > > > On Tue, Aug 14, 2018 at 2:09 PM, Nigel Babu wrote: > >> Hello folks, >> >> Do we know who's the admin of the Gluster organization on Docker hub? I'd >> like to be added to the org so I can set

Re: [Gluster-devel] Master branch is closed

2018-08-13 Thread Nigel Babu
Oops, I apparently forgot to send out a note. Master has been open since ~7 am IST. On Mon, Aug 13, 2018 at 4:25 PM Atin Mukherjee wrote: > Nigel, > > Now that master branch is reopened, can you please revoke the commit access > restrictions? > > On Mon, 6 Aug 2018 at 09:12,

[Gluster-devel] ASAN Builds!

2018-08-10 Thread Nigel Babu
Hello folks, Thanks to Niels, we now have ASAN builds compiling and a flag for getting it to work locally. The patch[1] is not merged yet, but I can trigger runs off the patch for now. The first run is off[2] [1]: https://review.gluster.org/c/glusterfs/+/20589/2 [2]:

[Gluster-devel] Python components and test coverage

2018-08-10 Thread Nigel Babu
Hello folks, We're currently in a transition to python3. Right now, there's a bug in one piece of this transition code. I saw Nithya run into this yesterday. The challenge here is, none of our testing for python2/python3 transition catches this bug. Both Pylint and the ast-based testing that

[Gluster-devel] Gerrit Upgrade Retrospective

2018-08-10 Thread Nigel Babu
Hello folks, This is a quick retrospective we (the Infra team) did for the Gerrit upgrade from 2 days ago. ## Went Well * We had a full backup to fall back to. We had to fall back on this. * We had a good 4h window so we had time to make mistakes and recover from them. * We had a good number of

[Gluster-devel] Clang failures update

2018-08-10 Thread Nigel Babu
Hello folks, Based on Yaniv's feedback, I've removed deadcode.DeadStores checker. We are left with 161 failures. I'm going to move this to 140 as a target for now. The job will continue to be yellow and we need to fix at least 21 failures by 31 Aug. That's about 7 issues per week to fix. If
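The arithmetic behind the "about 7 issues per week" figure can be checked with a short Python sketch. The failure counts and the 31 Aug deadline come from the message; counting the weeks from the message date (10 Aug 2018) is an assumption.

```python
from datetime import date

current_failures = 161  # failures left after removing deadcode.DeadStores
target = 140            # interim target named in the message
to_fix = current_failures - target

# Assumption: the clock starts on the day the message was sent.
weeks_left = (date(2018, 8, 31) - date(2018, 8, 10)).days / 7
per_week = to_fix / weeks_left

print(to_fix, weeks_left, per_week)
```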

Re: [Gluster-devel] Spurious smoke failure in build rpms

2018-08-09 Thread Nigel Babu
Infra issue. Please file a bug. On Thu, Aug 9, 2018 at 3:57 PM Pranith Kumar Karampuri wrote: > https://build.gluster.org/job/devrpm-el7/10441/console > > *10:12:42* Wrote: >

[Gluster-devel] Post-upgrade issues

2018-08-08 Thread Nigel Babu
Hello folks, We have two post-upgrade issues 1. Jenkins jobs are failing because git clones fail. This is now fixed. 2. git.gluster.org shows no repos at the moment. I'm currently debugging this. -- nigelb

Re: [Gluster-devel] [Gluster-infra] Fwd: Gerrit downtime on Aug 8, 2016

2018-08-08 Thread Nigel Babu
On Wed, Aug 8, 2018 at 4:59 PM Yaniv Kaul wrote: > > Nice, thanks! > I'm trying out the new UI. Needs getting used to, I guess. > Have we upgraded to NoteDB? > Yep! Account information is now completely in NoteDB and not in ReviewDB (which is backed by PostgreSQL for us) anymore.

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Nigel Babu
On Wed, Aug 8, 2018 at 2:00 PM Ravishankar N wrote: > > On 08/08/2018 05:07 AM, Shyam Ranganathan wrote: > > 5) Current test failures > > We still have the following tests failing and some without any RCA or > > attention, (If something is incorrect, write back). > > > >

[Gluster-devel] Fwd: Gerrit downtime on Aug 8, 2016

2018-08-07 Thread Nigel Babu
Reminder, this upgrade is tomorrow. -- Forwarded message - From: Nigel Babu Date: Fri, Jul 27, 2018 at 5:28 PM Subject: Gerrit downtime on Aug 8, 2016 To: gluster-devel Cc: gluster-infra , < automated-test...@gluster.org> Hello, It's been a while since we upgraded Gerr

[Gluster-devel] New Coverity Scan

2018-08-06 Thread Nigel Babu
Hello folks, We've run a new Coverity run that was entirely automated. Current split of Coverity issues: High: 132 Medium: 241 Low: 83 Total: 456 We will be pushing a nightly build into scan.coverity.com via Jenkins. So, you should be able to see updates to these numbers as you merge in fixes.

[Gluster-devel] Master branch is closed

2018-08-05 Thread Nigel Babu
Hello folks, Master branch is now closed. Only a few people have commit access now and it's to be exclusively used to merge fixes to make master stable again. -- nigelb

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Nigel Babu
On Thu, Aug 2, 2018 at 5:12 PM Kotresh Hiremath Ravishankar < khire...@redhat.com> wrote: > Don't know, something to do with perf xlators I suppose. It's not > reproduced on my local system with brick-mux enabled as well. But it's > happening on Xavi's system. > > Xavi, > Could you try with the

Re: [Gluster-devel] FreeBSD smoke test may fail for older changes, rebase needed

2018-08-02 Thread Nigel Babu
> That is fine with me. It is prepared for GlusterFS 5, so nothing needs > to be done for that. Only for 4.1 and 3.12 FreeBSD needs to be disabled > from the smoke job(s). > > I could not find the repo that contains the smoke job, otherwise I would > have tried to send a PR. > > Niels > For

[Gluster-devel] bug-1432542-mpx-restart-crash.t failures

2018-08-01 Thread Nigel Babu
Hi Shyam, Amar and I sat down to debug this failure[1] this morning. There was a bit of fun looking at the logs. It looked like the test restarted itself. The first log entry is at 16:20:03. This test has a timeout of 400 seconds which is around 16:26:43. However, if you account for the fact
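The timeout deadline quoted in this message can be verified with a few lines of Python. The start time and the 400-second timeout are taken from the message; the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# First log entry and per-test timeout, as stated in the message.
first_log_entry = datetime.strptime("16:20:03", "%H:%M:%S")
timeout = timedelta(seconds=400)

deadline = first_log_entry + timeout
deadline_text = deadline.strftime("%H:%M:%S")
print(deadline_text)
```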

Re: [Gluster-devel] FreeBSD smoke test may fail for older changes, rebase needed

2018-07-30 Thread Nigel Babu
> > The outcome is to get existing maintained release branches building and > working on FreeBSD, would that be correct? > > If so I think we can use the cherry-picked version, the changes seem > mostly straightforward, and it is possibly easier to maintain. > > Although, I have to ask, what is

Re: [Gluster-devel] Gerrit downtime on Aug 8, 2016

2018-07-28 Thread Nigel Babu
will get overwritten by Ansible tonight :) On Fri, Jul 27, 2018 at 5:28 PM Nigel Babu wrote: > Hello, > > It's been a while since we upgraded Gerrit. We plan to do a full upgrade > and move to 2.15.3. Among other changes, this brings in the new PolyGerrit > interface which brings signi

Re: [Gluster-devel] [Gluster-infra] [automated-testing] Gerrit downtime on Aug 8, 2016

2018-07-27 Thread Nigel Babu
e: > >> The staging URL seems to be missing from the note >> >> On Fri, Jul 27, 2018 at 5:28 PM, Nigel Babu wrote: >> > Hello, >> > >> > It's been a while since we upgraded Gerrit. We plan to do a full >> upgrade and >> > move to 2.15.3.

[Gluster-devel] Gerrit downtime on Aug 8, 2016

2018-07-27 Thread Nigel Babu
Hello, It's been a while since we upgraded Gerrit. We plan to do a full upgrade and move to 2.15.3. Among other changes, this brings in the new PolyGerrit interface which brings significant frontend changes. You can take a look at how this would look on the staging site[1]. ## Outage Window 0330

Re: [Gluster-devel] Release 5: Master branch health report (Week of 23rd July)

2018-07-25 Thread Nigel Babu
Replies inline On Thu, Jul 26, 2018 at 1:48 AM Shyam Ranganathan wrote: > On 07/24/2018 03:28 PM, Shyam Ranganathan wrote: > > On 07/24/2018 03:12 PM, Shyam Ranganathan wrote: > >> 1) master branch health checks (weekly, till branching) > >> - Expect every Monday a status update on various

Re: [Gluster-devel] Github teams/repo cleanup

2018-07-25 Thread Nigel Babu
On Wed, Jul 25, 2018 at 6:51 PM Niels de Vos wrote: > We had someone working on starting/stopping Jenkins slaves in Rackspace > on-demand. He since has left Red Hat and I do not think the infra team > had a great interest in this either (with the move out of Rackspace). > > It can be deleted

Re: [Gluster-devel] Github teams/repo cleanup

2018-07-25 Thread Nigel Babu
> So while cleaning things up, I wonder if we can remove this one: > https://github.com/gluster/jenkins-ssh-slaves-plugin > > We have just a fork, lagging from upstream and I am sure we do not use > it. > Safe to delete. We're not using it for sure. > > The same goes for: >

Re: [Gluster-devel] Github teams/repo cleanup

2018-07-25 Thread Nigel Babu
I think our team structure on Github has become unruly. I prefer that we use teams only when we can demonstrate that there is a strong need. At the moment, the gluster-maintainers and the glusterd2 projects have teams that have a strong need. If any other repo has a strong need for teams, please

[Gluster-devel] Postmortem for Jenkins Outage on 20/07/18

2018-07-20 Thread Nigel Babu
Hello folks, I had to take down Jenkins for some time today. The server ran out of space and was silently ignoring Gerrit requests for new jobs. If you think one of your jobs needed a smoke or regression run and it wasn't triggered, this is the root cause. Please retrigger your jobs. ## Summary

[Gluster-devel] [FYI] GitHub connectivity issues

2018-07-20 Thread Nigel Babu
Hello folks, Our infra also runs in the same network, so if you notice issues, they're most likely related to the same network issues. -- Forwarded message - From: Fabian Arrotin Date: Fri, Jul 20, 2018 at 12:49 PM Subject: [Ci-users] [FYI] GitHub connectivity issue To:

[Gluster-devel] Re-thinking gluster regression logging

2018-07-02 Thread Nigel Babu
Hello folks, Deepshikha is working on getting the distributed-regression testing into production. This is a good time to discuss how we log our regression. We tend to go with the approach of "get as many logs as possible" and then we try to make sense of it when something fails. In a setup

[Gluster-devel] Fwd: Clang-format: Update

2018-06-28 Thread Nigel Babu
Hello folks, A while ago we talked about using clang-format for our codebase[1]. We started doing several pieces of this work asynchronously. Here's an update on the current state of affairs: * Team agrees on a style and a config file representing the style. This has been happening

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-06-25 Thread Nigel Babu
On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi wrote: > > > There are currently a few known issues: >> * Not collecting the entire logs (/var/log/glusterfs) from servers. >> > > If I look at the activities involved with regression failures, this can > wait. > Well, we can't debug the current

[Gluster-devel] Fedora builds and rawhide builds

2018-06-19 Thread Nigel Babu
Hello, We ran into a problem where builds for F28 and above will not build on CentOS7 chroots. We caught this when F28 was rawhide but deemed it not yet important enough to fix, however, recent developments have forced us to make the switch. Our Fedora builds will also switch to using F28. We

[Gluster-devel] Running regressions with GD2

2018-06-01 Thread Nigel Babu
Hello, We're nearly at the 4.1 release. I think now is the time to decide when to flip the switch to default to GD2 server for all regressions or a nightly GD2 run against the current regression. Can someone help with what tasks need to be done for this to be accomplished and how the CI team can help?

[Gluster-devel] Cleaning up artifacts.ci.centos.org/gluster

2018-05-27 Thread Nigel Babu
Hello folks, I'd like to propose that we clean up artifacts.ci.centos.org/gluster. Here's my proposal: 1. Nightly folder will only have rpms from pre-release versions. That is, I'll be deleting everything that's not 4.1 or 4.2. 2. Releases that are no longer actively supported will be deleted.

[Gluster-devel] Reminder OUTAGE Today 0800 EDT / 1200 UTC / 1730 IST

2018-05-14 Thread Nigel Babu
Hello, This is a reminder that we have an outage today during the community cage outage window. The switches and routers will be getting updated and rebooted. This will cause an outage for a short period of time. -- nigelb

Re: [Gluster-devel] Builds failing regularly in various jobs (Jenkins, Smoke, other)

2018-05-14 Thread Nigel Babu
Merging this in and deploying on builders based on Kotresh's +1 to unblock builds and merges. On Mon, May 14, 2018 at 9:49 AM, Nigel Babu <nig...@redhat.com> wrote: > This is because of a new warning by liblvm2app. I have a hacky fix to the > compilation process to get rid of the war

Re: [Gluster-devel] Builds failing regularly in various jobs (Jenkins, Smoke, other)

2018-05-13 Thread Nigel Babu
This is because of a new warning by liblvm2app. I have a hacky fix to the compilation process to get rid of the warning. Please review: https://github.com/gluster/glusterfs-patch-acceptance-tests/pull/130 However, this will soon become more than just a warning. We should either fix this or

Re: [Gluster-devel] Coding Standard: Automation

2018-04-23 Thread Nigel Babu
I hope I've made the changes that Jeff recommended in the first comment correctly[1]. Xavi, I've not pulled in any of your suggestions yet, because I figured you'd want to see the output and send suggestions. Please send pull requests to the .clang-format file (and only that file) for anything

Re: [Gluster-devel] trash.t failure

2018-04-17 Thread Nigel Babu
I've reverted the original patch entirely. Our policy is to either mark the test as bad or revert the entire patch. This seems to have caused multiple failures in the test system, so I've reverted the entire patch. Please re-land the patch with any fixes as a fresh review. On Wed, Apr 18, 2018 at

[Gluster-devel] Regression with brick multiplex on demand

2018-04-17 Thread Nigel Babu
Hello folks, In the past if you had a patch that was fixing a brick multiplex failure, you couldn't test whether it actually fixed brick multiplex failures easily. You had two options: * Create a new review where you turn on brick multiplex via the code and also apply your patch. Mark a -1 for

[Gluster-devel] Unplanned Jenkins restart

2018-04-16 Thread Nigel Babu
Hello folks, I've just restarted Jenkins for a security update to a plugin. There was one running centos-regression job that I had to cancel. -- nigelb

[Gluster-devel] EC Stripe cache jobs

2018-04-11 Thread Nigel Babu
Hello, We have a job that tries to turn on stripe cache and run EC tests. It looks like we recently made the decision to turn on stripe cache by default. Is this job needed anymore? It fails at the moment due to a merge conflict. -- nigelb

[Gluster-devel] Jenkins upgrade today

2018-04-10 Thread Nigel Babu
Hello folks, There's a Jenkins security fix scheduled to be released today. This will most likely happen in the morning EDT. The Jenkins team has not specified a time. When we're ready for an upgrade, I'll cancel all running jobs and re-trigger them at the end of the upgrade. The downtime should

Re: [Gluster-devel] Announcing Softserve- serve yourself a VM

2018-03-20 Thread Nigel Babu
a machine is expired, one has to configure the machine and all > other stuff from the beginning. > > Thanks, > Sanju > > On Tue, Mar 13, 2018 at 12:37 PM, Nigel Babu <nig...@redhat.com> wrote: > >> >> We’ve enabled certain limits for this application: >>

Re: [Gluster-devel] Accessing tarball of core fails with FORBIDDEN

2018-03-19 Thread Nigel Babu
As is the practice for any infra problems, please file a bug: https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=project-infrastructure On Mon, Mar 19, 2018 at 5:58 PM, Raghavendra Gowdappa wrote: > Hi Nigel, > > I am not able to download the archive of core from:

[Gluster-devel] Branching out Gluster docs

2018-03-17 Thread Nigel Babu
Hello folks, Our docs need a significant facelift. Nithya has suggested that we branch out the current docs into a branch called version-3 (or some such, please let's not bikeshed about the name) and have the master branch track 4.x series. We will significantly change the documentation for

[Gluster-devel] gluster-ant is now admin on synced repos

2018-03-15 Thread Nigel Babu
Hello, If there's a repo that's synced from Gerrit to Github, gluster-ant is now admin on those repos. This is so that when issues are closed via commit message, they are closed by the right user (the bot) rather than the Infra person who set that repo up. As always, please file a bug if you

Re: [Gluster-devel] ./tests/basic/mount-nfs-auth.t spews out warnings

2018-03-14 Thread Nigel Babu
When the test works it takes less than 60 seconds. If it needs more than 200 seconds, that means there's an actual issue. On Wed, Mar 14, 2018 at 10:16 AM, Raghavendra Gowdappa wrote: > All, > > I was trying to debug a regression failure [1]. When I ran test locally on > my

Re: [Gluster-devel] Announcing Softserve- serve yourself a VM

2018-03-13 Thread Nigel Babu
> We’ve enabled certain limits for this application: >> >> 1. A maximum of 5 VMs can be allocated at a time across all users. Once 5 machines are allocated, users have to wait until a slot is available. >> 2. Users get the requested machines for a maximum of 4 hours.

[Gluster-devel] Please help test Gerrit 2.14

2018-03-04 Thread Nigel Babu
Hello, It's that time again. We need to move up a Gerrit release. Staging has now been upgraded to the latest version. Please help test it and give us feedback on any issues you notice: https://gerrit-stage.rht.gluster.org/ -- nigelb

Re: [Gluster-devel] [Gluster-infra] Continuous tests failure on Fedora RPM builds

2018-03-02 Thread Nigel Babu
This is now fixed. Shyam found the root cause. After a mock upgrade, mock would wait for user confirmation that DNF wasn't installed on the system. Given this was a centos machine, DNF wasn't readily available. I set the config option dnf_warning=False and that fixed the failures. All previously

Re: [Gluster-devel] tests/bugs/rpc/bug-921072.t - fails almost all the times in mainline

2018-02-20 Thread Nigel Babu
Aha. Thanks Mohit. That was infra. Sorry about that. The first line in /etc/hosts said ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 Once I removed it, the tests started running faster. I'll update my patch to remove this particular test from timeout fix On

Re: [Gluster-devel] tests/bugs/rpc/bug-921072.t - fails almost all the times in mainline

2018-02-20 Thread Nigel Babu
The immediate cause of this failure is that we merged the timeout patch which gives each test 200 seconds to finish. This test and another one take over 200 seconds on regression nodes. I have a patch up to change the timeout https://review.gluster.org/#/c/19605/1 However,

[Gluster-devel] Infra machines update

2018-02-19 Thread Nigel Babu
Hello folks, We're all out of Centos 6 nodes from today. I've just deleted the last of them. We now run exclusively on Centos 7 nodes. We've not received any negative feedback about plans to move NetBSD, so I've disabled and removed all the NetBSD jobs and nodes as well. -- nigelb

Re: [Gluster-devel] Jenkins Issues this weekend and how we're solving them

2018-02-19 Thread Nigel Babu
On Mon, Feb 19, 2018 at 5:58 PM, Nithya Balachandran <nbala...@redhat.com> wrote: > > > On 19 February 2018 at 13:12, Atin Mukherjee <amukh...@redhat.com> wrote: > >> >> >> On Mon, Feb 19, 2018 at 8:53 AM, Nigel Babu <nig...@redhat.com> wrote: >>

[Gluster-devel] Jenkins Issues this weekend and how we're solving them

2018-02-18 Thread Nigel Babu
Hello, As you all most likely know, we store the tarball of the binaries and core if there's a core during regression. Occasionally, we've introduced a bug in Gluster and this tar can take up a lot of space. This has happened recently with brick multiplex tests. The build-install tar takes up

Re: [Gluster-devel] run-tests-in-vagrant

2018-02-15 Thread Nigel Babu
So we have a job that's unmaintained and unwatched. If nobody steps up to own it in the next 2 weeks, I'll be deleting this job. On Wed, Feb 14, 2018 at 4:49 PM, Niels de Vos <nde...@redhat.com> wrote: > On Wed, Feb 14, 2018 at 11:15:23AM +0530, Nigel Babu wrote: > > Hello, > >

Re: [Gluster-devel] build.gluster.org in shutdown mode

2018-02-14 Thread Nigel Babu
This upgrade is now complete and we're now running the latest version of Jenkins. On Thu, Feb 15, 2018 at 9:53 AM, Nigel Babu <nig...@redhat.com> wrote: > Hello, > > I've just placed Jenkins in shutdown mode. No new jobs will be started for > about an hour from now. I intend

[Gluster-devel] build.gluster.org in shutdown mode

2018-02-14 Thread Nigel Babu
Hello, I've just placed Jenkins in shutdown mode. No new jobs will be started for about an hour from now. I intend to upgrade Jenkins to pull in the latest security fixes. -- nigelb

[Gluster-devel] run-tests-in-vagrant

2018-02-13 Thread Nigel Babu
Hello, Centos CI has a run-tests-in-vagrant job. Do we still need this? It still runs master and 3.8. I don't see this job adding much value at this point given we only look at results that are on build.gluster.org. I'd like to use the extra capacity for other tests that will run on

[Gluster-devel] Migration of Centos CI jobs to it's own repo

2018-02-12 Thread Nigel Babu
Hello folks, I'm trying to make the glusterfs-patch-acceptance-tests repo lighter by really only having code that's needed to run regressions and build for gluster. The Centos CI jobs therefore need to move to their own repo. As a first step, I've created a new repo[1] for centos ci jobs. The

Re: [Gluster-devel] Replacing Centos 6 nodes with Centos 7

2018-02-01 Thread Nigel Babu
specific test cases and the SSD disks. I'm going to add one more Centos 7 machine to the pool today. On Thu, Feb 1, 2018 at 9:26 AM, Nigel Babu <nig...@redhat.com> wrote: > Hello folks, > > Today, I'm putting the first Centos 7 node in our regression pool. > > slave28.cloud.gl

[Gluster-devel] Replacing Centos 6 nodes with Centos 7

2018-01-31 Thread Nigel Babu
Hello folks, Today, I'm putting the first Centos 7 node in our regression pool. slave28.cloud.gluster.org -> Shutdown and removed builder100.cloud.gluster.org -> New Centos7 node (we'll be starting from 100 upwards) If this run goes well, we'll be replacing the nodes one by one with Centos 7.

[Gluster-devel] Planned Outage: supercolony.gluster.org on 2018-02-21

2018-01-31 Thread Nigel Babu
Hello folks, We're going to be resizing the supercolony.gluster.org on our cloud provider. This will definitely lead to a small outage for 5 mins. In the event that something goes wrong in this process, we're taking a 2-hour window for this outage. Date: Feb 21 Server: supercolony.gluster.org

Re: [Gluster-devel] Rawhide RPM builds failing

2018-01-24 Thread Nigel Babu
More details: https://build.gluster.org/job/rpm-rawhide/1182/ On Wed, Jan 24, 2018 at 2:03 PM, Niels de Vos <nde...@redhat.com> wrote: > On Wed, Jan 24, 2018 at 09:14:51AM +0530, Nigel Babu wrote: > > Hello folks, > > > > Our rawhide rpm builds seem to be failing with

[Gluster-devel] Rawhide RPM builds failing

2018-01-23 Thread Nigel Babu
Hello folks, Our rawhide rpm builds seem to be failing with what looks like a specfile issue. It's worth looking into this now before F28 is released in May. -- nigelb

Re: [Gluster-devel] Infra-related Regression Failures and What We're Doing

2018-01-22 Thread Nigel Babu
Update: All the nodes that had problems with geo-rep are now fixed. Waiting on the patch to be merged before we switch over to Centos 7. If things go well, we'll replace nodes one by one as soon as we have one green on Centos 7. On Mon, Jan 22, 2018 at 12:21 PM, Nigel Babu <nig...@redhat.

[Gluster-devel] Infra-related Regression Failures and What We're Doing

2018-01-21 Thread Nigel Babu
Hello folks, As you may have noticed, we've had a lot of centos6-regression failures lately. The geo-replication failures are the new ones which particularly concern me. These failures have nothing to do with the test. The tests are exposing a problem in our infrastructure that we've carried

[Gluster-devel] Please file a bug if you take a machine offline

2018-01-10 Thread Nigel Babu
Hello folks, If you take a machine offline, please file a bug so that the machine can be debugged and returned to the pool. -- nigelb

[Gluster-devel] Recent regression failures

2018-01-10 Thread Nigel Babu
Hello folks, We may have been a little too quick to blame the Jenkins failures on Meltdown yesterday. In any case, we've opened a ticket with our provider and they're looking into the failures. I've looked at the last 90 failures to get a comprehensive number on the failures. Total Jobs: 90

[Gluster-devel] Moving Regressions to Centos 7

2017-12-20 Thread Nigel Babu
Hello folks, We've been using Centos 6 for our regressions for a long time. I believe it's time that we moved to Centos 7. Staying on Centos 6 is causing us minor issues. For example, tests run fine on the regression boxes but don't work on local machines or vice versa. Moving up gives us the ability to use newer

[Gluster-devel] Emergency Jenkins Restart

2017-12-13 Thread Nigel Babu
Hello folks, I'm going to be restarting Jenkins for an important security update. Any running jobs will be canceled and retriggered. -- nigelb ___ Gluster-devel mailing list Gluster-devel@gluster.org

[Gluster-devel] Permission change for +2 votes on review.gluster.org

2017-12-07 Thread Nigel Babu
Hello folks, We talked about this last week at the maintainers' meeting. We're going to restrict +2 votes to people who can also submit the patch. This makes sure that patches have actual maintainers giving +2. Everyone else will be able to give a +1. If this affects your project/component's

Re: [Gluster-devel] Need help figuring out the reason for test failure

2017-11-27 Thread Nigel Babu
Pranith, Our logging has changed slightly. Please read my email titled "Changes in handling logs from (centos) regressions and smoke" to gluster-devel and gluster-infra. On Tue, Nov 28, 2017 at 8:06 AM, Pranith Kumar Karampuri < pkara...@redhat.com> wrote: > One of my

[Gluster-devel] Tests failing on Centos 7

2017-11-27 Thread Nigel Babu
Hello folks, I have an update on chunking. There's good news and bad. The first bit is that we have a chunked regression job now. It splits the run into 10 chunks that are run in parallel. This chunking is quite simple at the moment and doesn't try to be very smart. The intelligence steps will come in

[Gluster-devel] Changes in handling logs from (centos) regressions and smoke

2017-11-20 Thread Nigel Babu
Hello folks, We're making some changes in how we handle logs from Centos regression and smoke tests. Instead of having them available via HTTP access to the node itself, they will be available via the Jenkins job as artifacts. For example: Smoke job:

[Gluster-devel] Unplanned Jenkins restart

2017-11-19 Thread Nigel Babu
I noticed that Jenkins wasn't loading up this morning. Further debugging showed a Java heap size problem. I tried to debug it, but eventually just restarted Jenkins. This means any running job or any job triggered was stopped. Please re-trigger your jobs. -- nigelb

[Gluster-devel] Unplanned Jenkins restart this morning

2017-11-08 Thread Nigel Babu
Hello folks, I had to do a quick Jenkins upgrade and restart this morning for an urgent security fix. A few of our periodic jobs were cancelled; I'll re-trigger them now. -- nigelb ___ Gluster-devel mailing list Gluster-devel@gluster.org

Re: [Gluster-devel] Change #18681 has broken build on master

2017-11-08 Thread Nigel Babu
I can try to explain what happened. For instance, here's a git tree. Each letter represents a commit. A -> B -> C -> D -> E -> F (F is the HEAD of master. green builds) Change X branched off at B A -> B -> X (green builds) Change Y branched off at D A -> B -> C -> D -> Y (green builds) Now

Re: [Gluster-devel] Change #18681 has broken build on master

2017-11-07 Thread Nigel Babu
Landed: https://github.com/gluster/glusterfs/commit/c3d7974e2be68f0fac8f54c9557d0f868e6be6c8 Please rebase your patches and re-trigger. On Tue, Nov 7, 2017 at 5:23 PM, Nigel Babu <nig...@redhat.com> wrote: > Rafi has a fix[1]. I'm going to make it skip regressions and land it &

Re: [Gluster-devel] Change #18681 has broken build on master

2017-11-07 Thread Nigel Babu
Rafi has a fix[1]. I'm going to make it skip regressions and land it directly. https://review.gluster.org/#/c/18680/ On Tue, Nov 7, 2017 at 4:42 PM, Raghavendra Gowdappa wrote: > Please check [1]. > > Build on master branch on my laptop failed too: > > [raghu@unused

[Gluster-devel] Unplanned Gerrit Outage yesterday

2017-11-02 Thread Nigel Babu
Hello folks, Yesterday, we had an unplanned Gerrit outage. We have now determined that the machine rebooted for some reason. Michael is continuing to debug what led to this issue. At this point, Gerrit does not start automatically when the VM restarts. We are currently testing a
