Re: [Gluster-infra] [Bug 1793955] /tests/basic/distribute/rebal-all-nodes-migrate.t is failing in CI

2020-01-29 Thread Sankarshan Mukhopadhyay
On Wed, 29 Jan 2020 at 23:32,  wrote:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1793955
>
> yati padia  changed:
>
>What|Removed |Added
> 
>  Status|NEW |CLOSED
>  Resolution|--- |NOTABUG
> Last Closed||2020-01-29 18:02:12

Alright. I'm not sure that this was resolved in any satisfactory way.
There are seemingly a few more instances where it fails, and if this is
not an issue with the structural/logging changes, then it needs more
work to deduce what causes it to happen.
___
Gluster-infra mailing list
Gluster-infra@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-infra


[Gluster-infra] Follow-up note. Github migration planning meeting 23-Jan-2020

2020-01-23 Thread Sankarshan Mukhopadhyay
Deepshikha, thank you for calling the meeting to get this activity going.

Putting a note to the list as a summary of the items discussed. Please
add/modify as needed for completeness.

+ Michael summarized the key issues to consider. These include (a) the
actual project repositories which need to be migrated, (b) the workflow
and model to be adopted by the project once Github is the main
repository (and not a mirror), (c) rights and permissions on the Github
team/organization to enable doing specific things, and (d) the
integration work required (scripting, additional tooling and/or
monitoring) to enable the switch

+ Amar, Michael and Yaniv discussed the idea of migrating the
glusterfs repository for a trial run before switching over based on an
iterative learning discussion

+ During the meeting it was agreed that the three topics to work on
leading up to the start of the trial phase are (1) a documented
workflow to handle development within Github (a blocker item), (2)
discovery of any existing knowledge/experience from a similar migration
(not a blocker item) and (3) an estimation of the scripts or tooling
required

In preparation for the next meeting in two weeks' time (06-Feb,
tentatively), the following items should be reasonably ready for
discussion. sankarshan will set up the next and subsequent meetings.

+ a proposed workflow, including the patch merge model. Amar had an
earlier proposal and he'll review whether it needs a refresh in light
of best practices from other projects such as Ansible

+ an enhanced understanding of dates for the start of the trial run in
the context of the GA dates for Gluster 8. At this meeting the idea was
to do the trial before the Gluster 8 release in order to have enough
time to work through any issues

+ Michael and Deepshikha will also endeavor to work through the topic
to discover any additional tasks/sub-tasks and risks.

/s

++ Invite for subsequent meetings ++

Topic: Gluster community Github migration prep meeting
Time: Feb 6, 2020 15:00 Mumbai, Kolkata, New Delhi
Every 2 weeks on Thu, until Apr 30, 2020, 7 occurrence(s)
Feb 6, 2020 15:00
Feb 20, 2020 15:00
Mar 5, 2020 15:00
Mar 19, 2020 15:00
Apr 2, 2020 15:00
Apr 16, 2020 15:00
Apr 30, 2020 15:00
Please download and import the following iCalendar (.ics) files to
your calendar system.
Weekly: 
https://zoom.us/meeting/tJUtdOGtrTwrc5fFvCPhiQcvaL9jU1shNA/ics?icsToken=98tyKuCprjgiH9eSsV39Z7ItOav5bM_2kHIbrIVLvy_tChFQdALab-h3Y6F3PvmB

Join Zoom Meeting
https://zoom.us/j/910385371

Meeting ID: 910 385 371

One tap mobile
+16699009128,,910385371# US (San Jose)
+16465588656,,910385371# US (New York)

Dial by your location
+1 669 900 9128 US (San Jose)
+1 646 558 8656 US (New York)
Meeting ID: 910 385 371
Find your local number: https://zoom.us/u/a9X9UtuHA


Re: [Gluster-infra] [automated-testing] Gluster Build system is down for glusto-tests

2020-01-22 Thread Sankarshan Mukhopadhyay
Do Deepshikha and Michael know about this already?

On Wed, 22 Jan 2020 at 14:48, Kshithij Iyer  wrote:

> Hello,
> All the jobs in the Gluster Build System for glusto-tests are failing.
> Could anyone please look into it?
> https://build.gluster.org/job/glusto-tests-lint/4820/console
>
> Thanks,
>
> Kshithij Iyer
>
> Associate Quality Engineer
>
> Red Hat 
>
> M: +91-9998132289 IM: kiyer
>

Re: [Gluster-infra] Is Mailman/list subscription broken?

2020-01-21 Thread Sankarshan Mukhopadhyay
On Tue, 21 Jan 2020 at 17:56, Michael Scherer  wrote:
>
> Le mardi 21 janvier 2020 à 08:34 +0530, Sankarshan Mukhopadhyay a
> écrit :
> > I was attempting to help someone subscribe to the lists and we see the
> > following on the browser page load. The specific list (form) page
> > loads, but upon entering the data it throws this up:
> >
> > This page isn’t working
> > lists.gluster.org didn’t send any data.
> > ERR_EMPTY_RESPONSE
>
> More information (like the IP, or provider, and the time this
> occurred) could help to narrow down the problem and check the logs.
>
>
> Due to spam-related reasons in the past, we have some protection systems
> that could have been triggered and that would create the same kind of
> error; they are supposed to display a more meaningful message, though.

The ISP I'm using is Wish Net and my (public) IP as seen is
223.223.149.197. It is likely to be an ISP blacklist, because
switching over to a VPN-based connection works well enough.

Re: [Gluster-infra] Is Mailman/list subscription broken?

2020-01-20 Thread Sankarshan Mukhopadhyay
I was using Chrome when I faced this issue. Apparently, this happens
"often with Chrome". That said, I continue to face this when using
Firefox or Safari, so it does not seem to be localized to a specific
browser.

On Tue, 21 Jan 2020 at 08:34, Sankarshan Mukhopadhyay
 wrote:
>
> I was attempting to help someone subscribe to the lists and we see the
> following on the browser page load. The specific list (form) page
> loads, but upon entering the data it throws this up:
>
> This page isn’t working
> lists.gluster.org didn’t send any data.
> ERR_EMPTY_RESPONSE
>
> /s

[Gluster-infra] Is Mailman/list subscription broken?

2020-01-20 Thread Sankarshan Mukhopadhyay
I was attempting to help someone subscribe to the lists and we see the
following on the browser page load. The specific list (form) page
loads, but upon entering the data it throws this up:

This page isn’t working
lists.gluster.org didn’t send any data.
ERR_EMPTY_RESPONSE

/s

Re: [Gluster-infra] [Gluster-devel] Infra & Tests: Few requests

2020-01-03 Thread Sankarshan Mukhopadhyay
On Fri, Jan 3, 2020 at 7:37 PM Amar Tumballi  wrote:
>
> Hi Team,
>
> First thing first - Happy 2020 !! Hope this year will be great for all of us 
> :-)
>
> A few requests to begin the new year!
>
> 1. Let's please move all the Fedora builders to F31.
>- There can be some warnings with F31, so we can start in 'skip' mode
> and, once fixed, enable it to vote.
>

Probably needs a ticket/issue/bug (I often forget which method is used
to track tasks)

> 2. Failures in smoke due to devrpm-el.
>- I was not able to get the info just by looking at console logs and
> other things. It is not the previous glitch of the 'Build root locked by
> another process' error.
>- It would be great to get this resolved, so we can merge some good
> patches.
>
> 3. Random failures in centos-regression.
>- Again, I am not sure if someone is looking into this.
>- I have noticed tests like
> './tests/basic/distribute/rebal-all-nodes-migrate.t' failing on a few
> machines.

Some of the random issues have been a bit difficult to nail down.


Re: [Gluster-infra] New workflow proposal for glusterfs repo

2019-07-09 Thread Sankarshan Mukhopadhyay
Following up - what is the agreed upon plan here?

On Tue, Jun 25, 2019 at 11:20 AM Amar Tumballi Suryanarayan
 wrote:
>
> Adding gluster-devel ML.
>
> The only concern with my earlier proposal was not making regression runs
> wait for reviews, but having them triggered automatically after a
> successful smoke.
>
> The ask was to put the burden on machines rather than on developers,
> which I agree with to start. Let's watch the expenses due to this change
> for a month once it gets implemented, and then take stock of the
> situation. For now, let's remove one more piece of extra work for
> developers, i.e., marking the Verified flag.
>
> On Tue, Jun 25, 2019 at 11:01 AM Sankarshan Mukhopadhyay 
>  wrote:
>>
>> Amar, can you bring about an agreement/decision on this so that we can
>> make progress?
>>
>
> So, My take is:
>
> Let's make serialized smoke + regression a reality. It may add to the
> overall time, but if there are failures, this has the potential to reduce
> overall machine usage... for a successful patch, the extra few minutes at
> present don't hurt, as our average review time is around a week.
>
>
>>
>> On Tue, Jun 25, 2019 at 10:55 AM Deepshikha Khandelwal
>>  wrote:
>> >
>> >
>> >
>> > On Mon, Jun 24, 2019 at 5:30 PM Sankarshan Mukhopadhyay 
>> >  wrote:
>> >>
>> >> Checking back on this - do we need more voices or, amendments to
>> >> Amar's original proposal before we scope the implementation?
>> >>
>> >> I read Amar's proposal as desiring an outcome where the journey of a
>> >> valid/good patch through the test flows is fast and efficient.
>
>
> Absolutely! This is critical for us to be an inclusive community.
>
>>
>> >>
>> >> On Wed, Jun 12, 2019 at 11:58 PM Raghavendra Talur  
>> >> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Jun 12, 2019, 1:56 PM Atin Mukherjee  
>> >> > wrote:
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Wed, 12 Jun 2019 at 18:04, Amar Tumballi Suryanarayan 
>> >> >>  wrote:
>> >> >>>
>> >> >>>
>> >> >>> Few bullet points:
>> >> >>>
>> >> >>> * Let the smoke job run sequentially for the checks below and, if
>> >> >>> successful, in parallel for the others.
>> >> >>>   - Sequential:
>> >> >>>   -- clang-format check
>> >> >>>   -- compare-bugzilla-version-git-branch
>> >> >>>   -- bugzilla-post
>> >> >>>   -- comment-on-issue
>> >> >>>   -- fedora-smoke (mainly don't want warning).
>> >> >>
>> >> >>
>> >> >> +1
>> >> >>
>> >> >>>   - Parallel
>> >> >>>-- all devrpm jobs
>> >> >>>-- 32bit smoke
>> >> >>>-- freebsd-smoke
>> >> >>>-- smoke
>> >> >>>-- strfmt_errors
>> >> >>>-- python-lint, and shellcheck.
>> >> >>
>> >> >>
>> >> >> I’m sure there must be a reason, but I would like to know why they
>> >> >> need to run in parallel. Can’t we have them run sequentially, to get
>> >> >> resource-utilisation benefits similar to the above? Or are all these
>> >> >> individual jobs so time-consuming that running them sequentially
>> >> >> would make the overall smoke job take much longer?
>
>
> Most of these are doing the same thing: make dist, make install, make
> rpms, but on different arches and with different flags. To start with, we
> can do these sequentially as well. That way, the infra team needn't worry
> about a mix of parallel and sequential jobs.
>
>>
>> >> >>
>> >> >>>
>> >> >>> * Remove the Verified flag. There is no point in one more button
>> >> >>> which users need to click; in any case, the CentOS regression is
>> >> >>> considered the 'Verification'.
>> >> >
>> >> >
>> >> > The requirement of verified flag by patch owner for regression to run 
>> >> > was added because the number of Jenkins machines we had were few and 
>> >> > patches being uploaded were many.
>> >>
>> >> However, do we consider that the situation has now improved enough to
>> >> make the change Amar asks for?

Re: [Gluster-infra] New workflow proposal for glusterfs repo

2019-06-24 Thread Sankarshan Mukhopadhyay
Amar, can you bring about an agreement/decision on this so that we can
make progress?

On Tue, Jun 25, 2019 at 10:55 AM Deepshikha Khandelwal
 wrote:
>
>
>
> On Mon, Jun 24, 2019 at 5:30 PM Sankarshan Mukhopadhyay 
>  wrote:
>>
>> Checking back on this - do we need more voices or, amendments to
>> Amar's original proposal before we scope the implementation?
>>
>> I read Amar's proposal as desiring an outcome where the journey of a
>> valid/good patch through the test flows is fast and efficient.
>>
>> On Wed, Jun 12, 2019 at 11:58 PM Raghavendra Talur  wrote:
>> >
>> >
>> >
>> > On Wed, Jun 12, 2019, 1:56 PM Atin Mukherjee  wrote:
>> >>
>> >>
>> >>
>> >> On Wed, 12 Jun 2019 at 18:04, Amar Tumballi Suryanarayan 
>> >>  wrote:
>> >>>
>> >>>
>> >>> Few bullet points:
>> >>>
>> >>> * Let the smoke job run sequentially for the checks below and, if
>> >>> successful, in parallel for the others.
>> >>>   - Sequential:
>> >>>   -- clang-format check
>> >>>   -- compare-bugzilla-version-git-branch
>> >>>   -- bugzilla-post
>> >>>   -- comment-on-issue
>> >>>   -- fedora-smoke (mainly don't want warning).
>> >>
>> >>
>> >> +1
>> >>
>> >>>   - Parallel
>> >>>-- all devrpm jobs
>> >>>-- 32bit smoke
>> >>>-- freebsd-smoke
>> >>>-- smoke
>> >>>-- strfmt_errors
>> >>>-- python-lint, and shellcheck.
>> >>
>> >>
>> >> I’m sure there must be a reason, but I would like to know why they
>> >> need to run in parallel. Can’t we have them run sequentially, to get
>> >> resource-utilisation benefits similar to the above? Or are all these
>> >> individual jobs so time-consuming that running them sequentially
>> >> would make the overall smoke job take much longer?
>> >>
>> >>>
>> >>> * Remove the Verified flag. There is no point in one more button
>> >>> which users need to click; in any case, the CentOS regression is
>> >>> considered the 'Verification'.
>> >
>> >
>> > The requirement of verified flag by patch owner for regression to run was 
>> > added because the number of Jenkins machines we had were few and patches 
>> > being uploaded were many.
>>
>> However, do we consider that the situation has now improved enough to
>> make the change Amar asks for?
>>
>> >
>> >>>
>> >>> * In the normal flow, let the CentOS regression, which currently runs
>> >>> after the 'Verified' vote, be triggered on the first successful +1
>> >>> review vote.
>> >>
>> >>
>> >> Some reviewers/maintainers (including me) would like to see the
>> >> regression vote before putting a +1/+2 on most patches, unless they
>> >> are straightforward ones. So although this reduces the burden of one
>> >> extra click for the patch owner, it introduces the same burden on the
>> >> reviewers who would like to check the regression vote. IMHO, I don’t
>> >> see much benefit in implementing this.
>> >
>> >
>> > Agree with Atin here. Burden should be on machines before people. 
>> > Reviewers prefer to look at patches that have passed regression.
>> >
>> > In github heketi, we have configured regression to run on all patches that 
>> > are submitted by heketi developer group. If such configuration is possible 
>> > in gerrit+Jenkins, we should definitely do it that way.
>> >
>> > For patches that are submitted by someone outside of the developer group, 
>> > a maintainer should verify that the patch doesn't do anything harmful and 
>> > mark the regression to run.
>> >
>>
>> Deepshikha, is the above change feasible in the summation of Amar's proposal?
>
> Yes, I'm planning to implement the regression & flag related changes 
> initially if everyone agrees.
>>
>>
>> >>>
>> >>> * For patches pushed to the system just to 'validate' behavior, to
>> >>> run sample tests, or as WIP patches, continue to support the 'recheck
>> >>> centos' comment message, so we can run without any vote. Let it not
>> >>> be the norm.
>> >>>
>> >>>
>> >>> With this, I see that we can reduce smoke failures and use 90% fewer
>> >>> resources for a patch which would fail smoke anyway (i.e., 95% of the
>> >>> smoke failures would be caught in the first 10% of the resources and
>> >>> time).
>> >>>
>> >>> We can also reduce the number of regression runs, as review becomes
>> >>> mandatory before regression.
>> >>>
>> >>> These are just suggestions, happy to discuss more on these.



-- 
sankarshan mukhopadhyay
<https://about.me/sankarshan.mukhopadhyay>

Re: [Gluster-infra] New workflow proposal for glusterfs repo

2019-06-24 Thread Sankarshan Mukhopadhyay
Checking back on this - do we need more voices or, amendments to
Amar's original proposal before we scope the implementation?

I read Amar's proposal as desiring an outcome where the journey of a
valid/good patch through the test flows is fast and efficient.

On Wed, Jun 12, 2019 at 11:58 PM Raghavendra Talur  wrote:
>
>
>
> On Wed, Jun 12, 2019, 1:56 PM Atin Mukherjee  wrote:
>>
>>
>>
>> On Wed, 12 Jun 2019 at 18:04, Amar Tumballi Suryanarayan 
>>  wrote:
>>>
>>>
>>> Few bullet points:
>>>
>>> * Let the smoke job run sequentially for the checks below and, if
>>> successful, in parallel for the others.
>>>   - Sequential:
>>>   -- clang-format check
>>>   -- compare-bugzilla-version-git-branch
>>>   -- bugzilla-post
>>>   -- comment-on-issue
>>>   -- fedora-smoke (mainly don't want warning).
>>
>>
>> +1
>>
>>>   - Parallel
>>>-- all devrpm jobs
>>>-- 32bit smoke
>>>-- freebsd-smoke
>>>-- smoke
>>>-- strfmt_errors
>>>-- python-lint, and shellcheck.
>>
>>
>> I’m sure there must be a reason, but I would like to know why they need
>> to run in parallel. Can’t we have them run sequentially, to get
>> resource-utilisation benefits similar to the above? Or are all these
>> individual jobs so time-consuming that running them sequentially would
>> make the overall smoke job take much longer?
>>
>>>
>>> * Remove the Verified flag. There is no point in one more button which
>>> users need to click; in any case, the CentOS regression is considered
>>> the 'Verification'.
>
>
> The requirement of verified flag by patch owner for regression to run was 
> added because the number of Jenkins machines we had were few and patches 
> being uploaded were many.

However, do we consider that the situation has now improved enough to
make the change Amar asks for?

>
>>>
>>> * In the normal flow, let the CentOS regression, which currently runs
>>> after the 'Verified' vote, be triggered on the first successful +1
>>> review vote.
>>
>>
>> Some reviewers/maintainers (including me) would like to see the
>> regression vote before putting a +1/+2 on most patches, unless they are
>> straightforward ones. So although this reduces the burden of one extra
>> click for the patch owner, it introduces the same burden on the
>> reviewers who would like to check the regression vote. IMHO, I don’t see
>> much benefit in implementing this.
>
>
> Agree with Atin here. Burden should be on machines before people. Reviewers 
> prefer to look at patches that have passed regression.
>
> In github heketi, we have configured regression to run on all patches that 
> are submitted by heketi developer group. If such configuration is possible in 
> gerrit+Jenkins, we should definitely do it that way.
>
> For patches that are submitted by someone outside of the developer group, a 
> maintainer should verify that the patch doesn't do anything harmful and mark 
> the regression to run.
>

Deepshikha, is the above change feasible in the summation of Amar's proposal?

>>>
>>> * For patches pushed to the system just to 'validate' behavior, to run
>>> sample tests, or as WIP patches, continue to support the 'recheck
>>> centos' comment message, so we can run without any vote. Let it not be
>>> the norm.
>>>
>>>
>>> With this, I see that we can reduce smoke failures and use 90% fewer
>>> resources for a patch which would fail smoke anyway (i.e., 95% of the
>>> smoke failures would be caught in the first 10% of the resources and
>>> time).
>>>
>>> We can also reduce the number of regression runs, as review becomes
>>> mandatory before regression.
>>>
>>> These are just suggestions, happy to discuss more on these.
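The sequential-then-parallel ordering proposed in this thread can be sketched as a small driver script. The job names are taken from the proposal itself; `run_job` is a hypothetical stand-in for whatever actually triggers a Jenkins job, so this is an illustration of the control flow, not the real CI configuration:

```shell
#!/bin/sh
# Sketch of the proposed smoke ordering: cheap gating checks run one
# after another and abort on the first failure; only once they all
# pass do the heavier build jobs fan out in parallel.
set -e
run_job() { echo "running $1"; }   # hypothetical stand-in for a CI trigger

# Sequential gate: any failure here stops everything early.
for job in clang-format compare-bugzilla-version-git-branch \
           bugzilla-post comment-on-issue fedora-smoke; do
    run_job "$job"
done

# Parallel fan-out once the gate has passed.
for job in devrpm 32bit-smoke freebsd-smoke smoke strfmt_errors \
           python-lint shellcheck; do
    run_job "$job" &
done
wait
echo "all smoke jobs done"
```

The point of the shape is the one made in the thread: a patch that would fail a cheap check never consumes the expensive parallel stage.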

Re: [Gluster-infra] Do we have a monitoring system on our builders?

2019-04-27 Thread Sankarshan Mukhopadhyay
On Sun, Apr 28, 2019 at 12:49 AM Yaniv Kaul  wrote:
>
> I'd like to see what is our status.


is what we have at this point

> Just had CI failures[1] because builder26.int.rht.gluster.org is not 
> available, apparently.
>
> TIA,
> Y.
>
> [1] https://build.gluster.org/job/devrpm-el7/15846/console


[Gluster-infra] Product: GlusterFS Component: project-infrastructure RHBZs in NEW and ASSIGNED

2019-04-21 Thread Sankarshan Mukhopadhyay
 will list RHBZs in these two statuses. Please
review the ASSIGNED ones first to check whether work on some of them is
actually complete and the RHBZ state is stale.


Re: [Gluster-infra] [automated-testing] What is the current state of the Glusto test framework in upstream?

2019-03-13 Thread Sankarshan Mukhopadhyay
On Wed, Mar 13, 2019 at 3:03 PM Yaniv Kaul  wrote:
> On Wed, Mar 13, 2019, 3:53 AM Sankarshan Mukhopadhyay 
>  wrote:
>>
>> What I am essentially looking to understand is whether there are
>> regular Glusto runs and whether the tests receive refreshes. However,
>> if there is no available Glusto service running upstream - that is a
>> whole new conversation.
>
>
> I'm* still trying to get it running properly on my simple Vagrant+Ansible
> setup[1].
> Right now I'm installing Gluster + Glusto + creating bricks, a pool and a
> volume in ~3m on my laptop.
>

This is good. I think my original question was to the maintainer(s) of
Glusto, along with the individuals involved in the automated-testing
part of Gluster, to understand the challenges in deploying this for the
project.

> Once I do get it fully working, we'll get to make it work faster, clean
> it up and see how we can get code coverage.
>
> Unless there's an alternative to the whole framework that I'm not aware of?

I haven't read anything to this effect on any list.

> Surely for most of the positive paths, we can (and perhaps should) use
> the Gluster Ansible modules.
> Y.
>
> [1] https://github.com/mykaul/vg
> * with an intern's help.


[Gluster-infra] What is the current state of the Glusto test framework in upstream?

2019-03-12 Thread Sankarshan Mukhopadhyay
What I am essentially looking to understand is whether there are
regular Glusto runs and whether the tests receive refreshes. However,
if there is no available Glusto service running upstream - that is a
whole new conversation.


Re: [Gluster-infra] [Gluster-devel] 8/10 AWS jenkins builders disconnected

2019-03-06 Thread Sankarshan Mukhopadhyay
On Wed, Mar 6, 2019 at 8:47 PM Michael Scherer  wrote:
>
> Le mercredi 06 mars 2019 à 17:53 +0530, Sankarshan Mukhopadhyay a
> écrit :
> > On Wed, Mar 6, 2019 at 5:38 PM Deepshikha Khandelwal
> >  wrote:
> > >
> > > Hello,
> > >
> > > Today while debugging the centos7-regression failed builds I saw that
> > > most of the builders did not pass the instance status check on AWS
> > > and were unreachable.
> > >
> > > Misc investigated this and came to know about the patch[1] which
> > > seems to break the builders one after the other. They all ran the
> > > regression test for this specific change before going offline.
> > > We suspect that this change does result in an infinite loop of
> > > processes, as we did not see any trace of error in the system logs.
> > >
> > > We did reboot all those builders and they all seem to be running
> > > fine now.
> > >
> >
> > The question though is - what to do about the patch, if the patch
> > itself is the root cause? Is this assigned to anyone to look into?
>
> We also pondered whether we should protect the builders from that kind
> of issue. But since:
> - we are not sure that the hypothesis is right
> - any protection based on "limit the number of processes" would surely
> sooner or later block legitimate tests, and would require adjustment
> (and likely investigation)
>
> we didn't choose to follow that road for now.
>

This is a good topic though. Is there any logical way to fence off the
builders from noisy neighbors?
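One common fencing mechanism of the kind asked about here is a per-session process cap, which embodies exactly the trade-off Michael describes: a limit low enough to stop a runaway test will sooner or later block a legitimate heavy one. A minimal sketch (the value is purely illustrative, and a real builder would more likely set this via PAM limits or systemd's `TasksMax` rather than in a script):

```shell
#!/bin/sh
# Illustrative guard against runaway process creation on a builder:
# cap how many processes the test session may have at once.
# (1024 is an arbitrary example value; a real limit needs tuning
# against what legitimate regression runs actually spawn.)
(
    ulimit -u 1024          # soft cap on user processes for this subshell
    echo "process limit now: $(ulimit -u)"
)
```

Because the cap applies per user session, a fork bomb in one test run hits the limit instead of starving the whole builder, at the cost of occasionally needing the adjustment work the thread mentions.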

Re: [Gluster-infra] Upgrade of the infra

2018-10-31 Thread Sankarshan Mukhopadhyay
On Wed, Oct 31, 2018 at 9:04 PM Michael Scherer  wrote:
>
> Hi,
>
> so F29 is out, and RHEL 7.6 too (so Centos 7.6 should be out soon). And
> Freebsd 12 is supposed to arrive in 1 month.
>
>
> On the F29 side, I have already upgraded most of the infra, without
> service interruption thanks to proper HA. It was so fast that even Nagios
> didn't ping me.
>

Thank you for the note and for taking care of this in a planned manner!

> What is left is the few Fedora builders (we need to decide if/when we
> want to upgrade).
>
> On the EL side, we will likely upgrade and reboot as soon as CentOS is
> out. It will imply some downtime (neither Gerrit nor Jenkins is
> replicated), so we will have to coordinate that.
>
> On the FreeBSD side, we will likely upgrade, and switch the builder to
> non-voting, unless I manage to find the time to install a new builder.
>
>
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
>



Re: [Gluster-infra] [Gluster-devel] Weekly Untriaged Bugs

2018-09-30 Thread Sankarshan Mukhopadhyay
On Mon, Oct 1, 2018 at 7:15 AM  wrote:
>


> https://bugzilla.redhat.com/1625501 / project-infrastructure: gd2 smoke tests 
> fail with cannot create directory ‘/var/lib/glusterd’: Permission denied
> https://bugzilla.redhat.com/1626453 / project-infrastructure: 
> glusterd2-containers nightly job failing
> https://bugzilla.redhat.com/1627624 / project-infrastructure: Run gd2-smoke 
> only after smoke passes
> https://bugzilla.redhat.com/1631390 / project-infrastructure: Run smoke and 
> regression on a patch only after passing clang-format job
> https://bugzilla.redhat.com/1633497 / project-infrastructure: Unable to 
> subscribe to upstream mailing list.

Some of these have ongoing commentary but are in NEW; perhaps they
should be in ASSIGNED.


Re: [Gluster-infra] Clang-format change postmortem (12th Sept, 2018)

2018-09-13 Thread Sankarshan Mukhopadhyay
On Wed, 12 Sep 2018 at 19:56, Amar Tumballi  wrote:

> People Involved
>
>- Nigel
>- Amar
>
> <https://hackmd.io/_zKtoPXUT7mDAyV6OxzZRQ?both#Timeline-of-events-in-IST>Timeline
> of events (in IST)
>
> 1725 - Nigel merges Amar’s patch with the .clang-format file
> 1727 - Nigel lands the .clang-format changes to master as gluster-ant.
> (Smoke jobs pass at this point.)
> 1746 - Amar notices that some files are missing in the clang patch.
> 1752 - Nigel lands a new patch with the missing files (all .c files)
> 1811 - Amar notices compilation issues after landing the .c changes
> because it modifies files with the pattern -tmpl.c. Amar starts working on
> a fix.
> 1839 - Nigel notices that the Jenkins job for clang-format doesn’t fail
> when it’s supposed to fail, and goes to fix it.
> 1855 - Clang-format Jenkins job is now fixed.
> 1906 - Amar’s fixes are merged with manual votes for Smoke and Centos
> Regression from the Infra team. At this point, the builds were passing, but
> we had voting issues
> <https://hackmd.io/_zKtoPXUT7mDAyV6OxzZRQ?both#What-Went-Wrong>What Went
> Wrong
>
>- We staged the changes on Github on Sept 10th. Given the size of the
>changes, we missed that the command used to make them only caught
> .h files and not .c files. The following is the command in question.
>find . -path ./contrib -prune -o -name '*.c' -or -name '*.h' -print |
>xargs clang-format -i
>- With the changes that we landed, we did run into build bugs 1
><https://review.gluster.org/#/c/21130/>, 2
><https://review.gluster.org/#/c/21128/> and fixed them. However, we
>did not verify that all the files were in fact modified or sync up on the
>find command.
>- We had a general framework of agreement on the steps to do but we
>looked at it as a code change rather than an infrastructure change. There
>wasn’t a well-defined go/no-go checklist.
>
>
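The likely root cause in the quoted command is `find` operator precedence: the implicit AND binds tighter than `-o`, so `-print` attaches only to the final `-name '*.h'` test, and once an explicit `-print` appears anywhere, matched `*.c` files are no longer printed implicitly. A small reproduction with the fix (the file names and scratch directory are illustrative):

```shell
#!/bin/sh
# Reproduce the precedence bug and its fix in a scratch directory.
set -e
dir=$(mktemp -d)
mkdir -p "$dir/contrib"
touch "$dir/a.c" "$dir/b.h" "$dir/contrib/skip.c"
cd "$dir"

# Buggy form: prints only ./b.h, because -print binds to '*.h' alone;
# a.c matches but is never printed.
echo "buggy:"
find . -path ./contrib -prune -o -name '*.c' -or -name '*.h' -print

# Fixed form: parentheses group the two -name tests so -print applies
# to both, while contrib/ stays pruned.
echo "fixed:"
find . -path ./contrib -prune -o \( -name '*.c' -o -name '*.h' \) -print
```

Verifying the command against a tiny fixture like this is the kind of pre-flight check the "go/no-go checklist" recommendation below would have caught.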
Thank you for writing this up and sharing. The “infrastructure piece” part
is a great realisation.

Can we start to use “retrospective” instead of post mortem?

>
>- In the middle of this, we had a freebsd-builder enabled that made
>the smoke job for the final fix not vote. This needed a manual vote from
>the Infra team.
>
> <https://hackmd.io/_zKtoPXUT7mDAyV6OxzZRQ?both#What-Went-Well>What Went
> Well
>
>- We did reasonably good planning to find potential issues and in
>fact, did find some potential issues.
>- Nigel and Amar were on hand and available to fix any potential
>issues that popped up
>- The changes landed at the end of a working day for India the day
>before a public holiday. While there was impact, it was much less than a
>similar change performed at working hours.
>
> <https://hackmd.io/_zKtoPXUT7mDAyV6OxzZRQ?both#Future-recommendations>Future
> recommendations
>
>- Template files need to be caught by the clang-format job correctly
>so that they are not checked for formatting. Or the name of the file needs
>to be changed so that they don’t end with .c or .h.
>- In the future, high impact changes need a good process which has at
>least an acceptance criteria, go/no-go checklist, and a rollback procedure.
>- This work is currently incomplete and the bug3
><https://bugzilla.redhat.com/show_bug.cgi?id=1564149#c39> tracks the
>remaining action items.
>
> -----
>
> Thanks Nigel for the postmortem report.
>
>
> --
> Amar Tumballi (amarts)
> ___
> Gluster-infra mailing list
> Gluster-infra@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-infra


Re: [Gluster-infra] Postmortem for gluster jenkins disk full outage on the 15th of August

2018-08-15 Thread Sankarshan Mukhopadhyay
Thank you for (a) addressing the issue and (b) this write up

Does the -infra team have a way to monitor disk space usage?
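A minimal proactive check of the kind asked about here might look like the following. The threshold and mount point are illustrative examples; a real deployment would hang this off the existing Nagios setup (or Prometheus node metrics) rather than a cron job that merely echoes:

```shell
#!/bin/sh
# Illustrative disk-usage watchdog: warn before a partition fills up.
# THRESHOLD and MOUNT are arbitrary example values, not the real
# Jenkins configuration.
THRESHOLD=80
MOUNT=/
# df -P gives stable POSIX output; field 5 is "Use%" on the data row.
usage=$(df -P "$MOUNT" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
if [ "$usage" -ge "$THRESHOLD" ]; then
    echo "WARNING: $MOUNT is ${usage}% full"
else
    echo "OK: $MOUNT is ${usage}% full"
fi
```

The postmortem's own "what went bad" item (not being proactive enough) is exactly what an alert at, say, 80% instead of 100% would address.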

On Wed, Aug 15, 2018 at 2:40 PM Michael Scherer  wrote:
>
> Hi folks,
>
> So the Gluster Jenkins disk was full today (because outages do not respect
> public holidays in India (Independence Day) and France (Assumption));
> here is the post mortem for your reading pleasure.
>
> Date: 15/08/2018
>
> Service affected:
>   Jenkins for Gluster (jenkins-el7.rht.gluster.org)
>
> Impact:
>
>   No jenkins job could be triggered.
>
> Root cause:
>
>   A full disk, mainly because we got new jobs and more patches, i.e.,
> regular growth.
>
> Resolution:
>
>   Increased the disk by 30G, and investigating if cleanup could be
>   improved. This did require a reboot.
>
>
> Involved people:
> - misc
> - nigel
>
> Lessons learned
> - What went well:
>   - we had a documented process for that, good enough to be used by
> a tired admin.
>
> - What went bad:
>   - we weren't proactive enough to see this before it caused an outage
>   - the 15th of August is a holiday in both France and India. Technically,
> none of the infra team should have been up.
>
> - When we were lucky
>   - It was a day off in India, so few people were affected, except
> folks who continue to work on days off
>   - Misc decided to go to work while being in Brno to take days off
> later
>
>
> Timeline (in UTC)
>
> - 05:58 Amar posts a mail to say "smoke job fail" on gluster-infra:
> https://lists.gluster.org/pipermail/gluster-infra/2018-August/004795.html
>
> - 06:23 Nigel pings Misc on Telegram to deal with it, since Nigel is
> away from his laptop for Independence Day celebrations.
>
> - 06:24 Misc does not hear the ding, since he is asleep.
>
> - 06:55 Sankarshan opens a bug on it:
> https://bugzilla.redhat.com/show_bug.cgi?id=1616160
>
> - 06:56 Misc does not see the email, since he is still asleep.
>
> - 07:13 Misc wakes up, sees a blinking light on the phone, and ponders
> closing his eyes again. He looks at it, and starts to swear.
>
> - 07:14 Investigation reveals that the Jenkins partition is full (100%). A
> quick investigation does not yield any particular issues. The Jenkins
> jobs are taking up space, and that's it.
>
> - 07:19 After discussion with Nigel, it is decided to increase the size
> of the partition. Misc takes a look at it and tries to increase it, without
> any luck. The server is rebooted in case that's what was needed. Still not
> enough.
>
> - 07:25 Misc takes a quick shower to wake himself up. The warm embrace of
> the water makes him remember that documentation for this process exists:
>
> https://gluster-infra-docs.readthedocs.io/procedures/resize_vm_partition.html
>
> - 07:30 Following the documentation, we discover that the hypervisor
> is now out of space for future increases. Looking at that will be done
> after the post mortem.
>
> - 07:37 Jenkins is restarted, with more space, and seems to work
> OK.
>
> - 07:38 Misc rushes to his hotel breakfast, which closes at 10.
>
> - 09:09 The post mortem is finished and sent.
>
>
> Action items:
> - (misc) see what can be done for myrmicinae (the hypervisor where
> jenkins is running) since there is no more space.
>
> Potential improvements to make:
> - we still need to have monitoring in place
> - we need to move munin into the internal LAN so we can look at the
> graphs for Jenkins
> - documentation regarding resizing could be clearer, notably on the
> volume resizing part
>
>
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
>





Re: [Gluster-infra] Looks like glusterfs's smoke job is not running for the patches posted

2018-08-15 Thread Sankarshan Mukhopadhyay
On Wed, Aug 15, 2018 at 11:28 AM Amar Tumballi  wrote:
>
> Not sure why even when I did 'recheck smoke' the job didn't get triggered.
>

Now at <https://bugzilla.redhat.com/show_bug.cgi?id=1616160>



Re: [Gluster-infra] Setting up machines from softserve in under 5 mins

2018-08-13 Thread Sankarshan Mukhopadhyay
On Mon, Aug 13, 2018 at 3:38 PM Nigel Babu  wrote:
>
> Hello folks,
>
> A while ago, Deepshikha did the work to make loaning a machine and running
> your regressions on it faster. I've tested it a few times today to confirm it
> works as expected. In the past, Softserve[1] machines would be a clean CentOS
> 7 image. Now, we have an image with all the dependencies installed and
> *almost* set up to run regressions. It just needs a few steps run on it, and
> we have a simplified playbook that will run *just* those steps. This brings
> down the time to set up a machine from around 30 mins to less than 5 mins. The
> instructions[2] are on the Softserve wiki for now, but will move to the site
> itself in the future.
>

This is neat! Good to see the continuing work on the softserve system.

> Please let us know if you face troubles by filing a bug.[3]
> [1]: https://softserve.gluster.org/
> [2]: 
> https://github.com/gluster/softserve/wiki/Running-Regressions-on-loaned-Softserve-instances
> [3]: 
> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=project-infrastructure
>
> --
> nigelb





Re: [Gluster-infra] [automated-testing] Gerrit downtime on Aug 8, 2016

2018-07-27 Thread Sankarshan Mukhopadhyay
The staging URL seems to be missing from the note

On Fri, Jul 27, 2018 at 5:28 PM, Nigel Babu  wrote:
> Hello,
>
> It's been a while since we upgraded Gerrit. We plan to do a full upgrade and
> move to 2.15.3. Among other changes, this brings in the new PolyGerrit
> interface which brings significant frontend changes. You can take a look at
> how this would look on the staging site[1].
>
> ## Outage Window
> 0330 EDT to 0730 EDT
> 0730 UTC to 1130 UTC
> 1300 IST to 1700 IST
>
> The actual time needed for the upgrade is about an hour, but we want to
> keep a larger window open to roll back in the event of any problems during
> the upgrade.
>
> --
> nigelb



Re: [Gluster-infra] [Gluster-devel] Weekly Untriaged Bugs

2018-07-08 Thread Sankarshan Mukhopadhyay
On Mon, Jul 9, 2018 at 7:15 AM,   wrote:

> https://bugzilla.redhat.com/1591276 / project-infrastructure: Automated way 
> to run coverity runs on project every week.
> https://bugzilla.redhat.com/1592736 / project-infrastructure: gerrit-stage 
> SSL cert expired
> https://bugzilla.redhat.com/1597543 / project-infrastructure: Install python3 
> on all the internal machines.
> https://bugzilla.redhat.com/1594857 / project-infrastructure: Make smoke runs 
> detect test cases added to patch
> https://bugzilla.redhat.com/1597731 / project-infrastructure: need 
> 'shellcheck' in smoke.
> https://bugzilla.redhat.com/1593414 / project-infrastructure: Retire/close 
> 3.10 and 4.0 gerrit dashboard
> https://bugzilla.redhat.com/1598326 / project-infrastructure: Setup CI for 
> gluster-block

At this point, isn't Bhumika setting it up for the (sub)project? Or is
this a tracker to federate it?

> https://bugzilla.redhat.com/1597351 / project-infrastructure: Smoke should 
> fail on commits against closed github issues
> https://bugzilla.redhat.com/1598328 / project-infrastructure: Unable to 
> subscribe to automated-testing mailing list

At least this last one seems to have active conversations; is it
expected to be in NEW?



Re: [Gluster-infra] [Gluster-devel] Announcing Softserve- serve yourself a VM

2018-03-20 Thread Sankarshan Mukhopadhyay
On Tue, Mar 20, 2018 at 1:57 PM, Sanju Rakonde <srako...@redhat.com> wrote:

> I have a suggestion here. It would be good if we had an option to request
> an extension of the VM duration, automatically activated after 3 hours of
> VM usage. If somebody is using the VM after 3 hours and they feel they need
> it for 2 more hours, they can request to extend the duration by 1 more
> hour. It will save engineering time, since if a machine has expired, one
> has to configure the machine and everything else from the beginning.
>

As Nigel states later on, please file an issue. I fear that very soon
we will require an enhancement to that issue to handle how many times
an extension can be sought. Supplementing it would be another
enhancement around how often a particular individual can request an
extension on a particular machine instance within the space of a work
week. And thus, at some point, a simple machine instance allocation
system would need to be backed by a decision-making system.

This was supposed to be an easy-to-use infrastructure, not one where
the Infra team members chase down individuals who have not freed up
machines for as long as 5 weeks (I am not going to publicly cite this
instance).

That being said, I think it is worth a moment to think about what is
happening with Softserve.

The system is intended to provide a set of finite metered resources
available to developers in order to debug issues. The number of
available machine instances are not high (at this point) and the
resource allocation during this initial period has been designed to be
this way so as to arrive at a better understanding of usage patterns.

A 4-hour allocation is half of a working day. If we do think that this
is not enough, we also want to think deeply about the following: (a)
all our issues require more than a day’s worth of debugging; (b)
perhaps we are unable to allocate a dedicated 4-hour block to one
topic, i.e. there are distractions; (c) there will not be enough
machine instances to make available to a backlog of developers waiting
to start off with their work.

Any one or, a combination of the above possibilities are larger
challenges which need to be addressed. I understand why the
very-short-term approach of “just extend the darned time slice and be
done” is conducive to getting things done. I’d rather ask ourselves -
why do we have all issues that consume multiple days and what can we
do to improve that. Because I am convinced that it surely isn’t
enjoyable to always budget for multiple days when something comes up
to debug, even if it is a race condition (as Atin mentions in his
response).





Re: [Gluster-infra] [Gluster-devel] Jenkins Issues this weekend and how we're solving them

2018-02-19 Thread Sankarshan Mukhopadhyay
On Mon, Feb 19, 2018 at 5:58 PM, Nithya Balachandran
 wrote:
>
>
> On 19 February 2018 at 13:12, Atin Mukherjee  wrote:
>>
>>
>>
>> On Mon, Feb 19, 2018 at 8:53 AM, Nigel Babu  wrote:
>>>
>>> Hello,
>>>
>>> As you all most likely know, we store the tarball of the binaries and
>>> core if there's a core during regression. Occasionally, we've introduced a
>>> bug in Gluster and this tar can take up a lot of space. This has happened
>>> recently with brick multiplex tests. The build-install tar takes up 25G,
>>> causing the machine to run out of space and continuously fail.
>>
>>
>> AFAIK, we don't have a .t file in the upstream regression suites where
>> hundreds of volumes are created. With that scale and brick multiplexing
>> enabled, I can understand the core will be quite heavily loaded and may
>> consume this much of a crazy amount of space. FWIW, can we first try to
>> figure out which test was causing this crash and see if running a gcore
>> after certain steps in the tests leaves us with a similarly sized core
>> file? IOW, have we actually seen such a huge core file generated earlier?
>> If not, what changed because of which we've started seeing this is
>> something to be investigated.
>
>
> We also need to check if this is only the core file that is causing the
> increase in size or whether there is something else that is taking up a lot
> of space.
>>
>>
>>>
>>>
>>> I've made some changes this morning. Right after we create the tarball,
>>> we'll delete all files in /archive that are greater than 1G. Please be aware
>>> that this means all large files including the newly created tarball will be
>>> deleted. You will have to work with the traceback on the Jenkins job.
>>
>>
>> We'd really need to first investigate the average size of core file we
>> can get when a system is running with brick multiplexing and ongoing
>> I/O. Without that, immediately deleting the core files > 1G will cause
>> trouble for the developers in debugging genuine crashes, as the
>> traceback alone may not be sufficient.
>>

I'd like to echo what Nithya writes: instead of treating this
incident as an outlier, we might want to do further analysis. If this
had happened on a production system, there would be blood.
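(The cleanup Nigel describes earlier in the thread, deleting everything in /archive larger than 1G right after the tarball is created, could be sketched roughly as follows. The directory and size limit come from the mail; the script itself is illustrative, not the actual Jenkins job.)

```python
import os

ONE_GIB = 1 << 30  # the 1G limit from the mail, in bytes

def purge_large_files(root, limit=ONE_GIB):
    """Delete regular files under `root` larger than `limit` bytes.

    Returns the deleted paths so the job log records what was removed.
    """
    deleted = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > limit:
                    os.remove(path)
                    deleted.append(path)
            except OSError:
                continue  # file vanished between listing and stat
    return deleted
```

On the builders themselves, the shell equivalent would be something like `find /archive -type f -size +1G -delete`.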


[Gluster-infra] What does the outcome of a build-test-deploy pipeline look like? [was: Re: [Gluster-users] Gluster Summit BOF - Testing]

2017-11-08 Thread Sankarshan Mukhopadhyay
I wanted to latch on to this thread to seek additional detail about
the build-test-deploy pipeline. There have been a number of
conversations, and some headway and planning, around the topic. What does
the team visualize as the outcome of the pipeline? How far away is the
project from delivering that outcome at a PR/nightly level?

The context to these questions are a need to ascertain when the
project would be able to transparently and publicly provide
information to the user community about the stability and state of a
build; including the coverage of features included in that particular
release as well as bugs which have been validated to be fixed.

The other part is of course around how to build more community around
this important part of the project and not have a limited number of
individuals (and an even fewer number of domain experts) try to solve
this. As we see participation in the community from organizations
other than RHT, how can they be involved in the process of improving,
extending, federating the usage of the test framework and tests?

On Wed, Nov 8, 2017 at 6:36 AM, Jonathan Holloway <jhollo...@redhat.com> wrote:
> Hi all,
>
> We had a BoF about Upstream Testing and increasing coverage.
>
> Discussion included:
>  - More docs on using the gluster-specific libraries.
>  - Templates, examples, and testcase scripts with common functionality as a 
> jumping off point to create a new test script.
>  - Reduce the number of systems required by existing libraries (but scale as 
> needed). e.g., two instead of eight.
>  - Providing scripts, etc. for leveraging Docker, Vagrant, virsh, etc. to 
> easily create test environment on laptops, workstations, servers.
>  - Access to logs for Jenkins tests.
>  - Access to systems for live debugging.
>  - What do we test? Maybe need to create upstream test plans.
>  - Discussion here on gluster-users and updates in testing section of 
> community meeting agenda.
>
> Since returning from Gluster Summit, some of these are already being worked 
> on. :-)
>
> Thank you to all the birds of a feather that participated in the discussion!!!
> Sweta, did I miss anything in that list?
>
> Cheers,
> Jonathan




Re: [Gluster-infra] Quarterly Infra Updates

2017-10-31 Thread Sankarshan Mukhopadhyay
On Mon, Oct 30, 2017 at 9:44 PM, Nigel Babu <nig...@redhat.com> wrote:

> Plans for this quarter:
> * Gerrit upgrade to 2.13.9 [DONE!]
> * Jenkins OS upgrade to CentOS 7
> * Get statedumps from aborted regression runs
> * Build Debian packages via Jenkins

How have we been building these until now?

> * Tweak Gerrit permissions
> * Finish the master pipeline job
> * Create the release pipeline job

What are the outcomes to result from these?

> * Get Glusto debugging setups.
> * Get CentOS regressions split into 10 chunks to reduce time.
>
> Michael and I will be working together from the Brno office this week. We
> want to get done with the Gerrit upgrade (DONE, yay!) and the Jenkins OS
> upgrade. The Jenkins upgrade will happen on the 1st of Nov as it's a holiday
> in India. We'll be taking a full 8-hour window to finish everything out. The
> Jenkins OS upgrade will clear out SSH access to everyone. If you have files
> in your home directory on the Jenkins server that you'd like to preserve,
> please do so now.

This might require a separate outage alert type email across the list(s).



Re: [Gluster-infra] Request for right permission levels on gluster-block github project

2017-03-20 Thread Sankarshan Mukhopadhyay
On Mon, Mar 20, 2017 at 2:14 PM, Nigel Babu <nig...@redhat.com> wrote:
> Please file a bug.
>

I see this very frequently: what can be done to ensure that filing an
RHBZ is the first step of a request?

> On Mon, Mar 20, 2017 at 1:42 PM, Prasanna Kalever <pkale...@redhat.com>
> wrote:
>>
>> Hi Infra-team,
>>
>>
>> As we choose github/issues as the board for bugs and other upstream
>> release tracking, I request for permissions on
>> https://github.com/gluster/gluster-block for activities like
>> assigning, creating labels on issues and tagging a release and other
>> set of access that a maintainer should get.
>>
>> Can someone please help with it ?
>>



Re: [Gluster-infra] Zuul?

2016-09-02 Thread Sankarshan Mukhopadhyay
On Fri, Sep 2, 2016 at 4:06 PM, Nigel Babu <nig...@redhat.com> wrote:
> 4. Zuul will run regressions in the order that patches received Code Review
>    +2. If they pass regression, Zuul will merge them into the branch
>    requested. The documentation[1] has a good visualization that will help.
> 5. If the regression fails, we can still do a retry. Zuul will retry the job
>    on top of the existing patch queue rather than in isolation.

Do we need to enhance/modify the existing regression tests as a
pre-requisite to adopting Zuul? It seems to me that this is a critical
topic which needs to be discussed prior to Zuul.
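(To make points 4 and 5 of the quoted proposal concrete, here is a toy model of a speculative gate queue. It is purely illustrative and not Zuul's implementation; the function names and the pass/fail oracle are invented for the sketch.)

```python
def gate(queue, passes_regression):
    """Toy model of a gate: merge +2 changes in order, only if they pass.

    `queue` holds change IDs in the order they received Code Review +2.
    Each change is tested on top of the state left by the changes merged
    ahead of it; a failing change is evicted and the rest are retried
    without it, instead of being tested in isolation.
    """
    merged = []
    for change in queue:
        # `merged` is the speculative branch state the change is tested on.
        if passes_regression(change, merged):
            merged.append(change)
        # On failure the change is simply dropped; the changes behind it
        # are retried on the queue without it (the "retry" in point 5).
    return merged

# Example: change "B" breaks regressions, "A" and "C" pass.
result = gate(["A", "B", "C"], lambda change, base: change != "B")
# result == ["A", "C"]: "C" still merges, retried without "B".
```

The point of the model is the ordering guarantee: nothing lands on the branch without having passed regressions against exactly the state it will be merged onto, which is what prevents the master breakage discussed in this thread.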






Re: [Gluster-infra] Zuul?

2016-09-02 Thread Sankarshan Mukhopadhyay
On Fri, Sep 2, 2016 at 3:27 PM, Kaushal M <kshlms...@gmail.com> wrote:
> I'd brought up Zuul a long while back. The opinion then was that,
> while a gatekeeper is nice, we didn't want to maintain anymore infra
> over what we had at the time. We tried to make Jenkins itself do the
> work, which hasn't succeeded as well as we hoped.
>
> With you being dedicated to maintain the infra, this will be a nice
> time to revisit/investigate Zuul again.

I'd propose that concerns of maintenance/administration be separated
from the value accrued by this move. This approach worked out well
during the JJB task.

So, a question for Nigel - when you propose Zuul - what is the flow
and benefits that you see being available to the project? Have you
previously worked with Zuul or, can cite situations where adoption of
Zuul has helped?


> On Fri, Sep 2, 2016 at 3:01 PM, Nigel Babu <nig...@redhat.com> wrote:
>> Hello,
>>
>> We've had master break twice this week because of when we run
>> regressions and how we merge. I think it's time we officially thought of
>> moving regressions to a gate controlled by Zuul. And Zuul will do the
>> merge onto the correct branch.
>>
>> This is me throwing the idea about to hear any negative thoughts, before I do
>> further investigation. What does everyone think about this?
>>
>> Note: I've purposefully not CC'd gluster-devel here because I'd rather go to
>> the full developer team with a proper plan.
>>
>> --
>> nigelb





Re: [Gluster-infra] Idea: Failure Trends

2016-08-16 Thread Sankarshan Mukhopadhyay
On Wed, Aug 17, 2016 at 6:33 AM, Nigel Babu <nig...@redhat.com> wrote:
> Instead, I propose a small website that'll track failures week-on-week and
> trends of various failures over time. If a particular failure has increased in
> number over one week, we'll know something has gone wrong.
>
> What does everyone think of this before I actually go and write some code?

Sounds like a jolly good thing to have.
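(A minimal version of the week-on-week trend idea could be sketched like this. The input format and the data are assumptions for illustration, not the tool Nigel eventually wrote.)

```python
from collections import Counter

def regressions(this_week, last_week):
    """Return tests whose failure count grew week-on-week.

    Maps test name to a (last_week, this_week) pair of counts, so a
    report page can show both the trend and the current number.
    """
    return {
        test: (last_week.get(test, 0), count)
        for test, count in this_week.items()
        if count > last_week.get(test, 0)
    }

# Failure counts per test for two consecutive weeks (invented data).
last = Counter({"tests/bugs/bug-1234.t": 1})
this = Counter({"tests/bugs/bug-1234.t": 4, "tests/basic/mount.t": 1})
suspects = regressions(this, last)
# suspects == {"tests/bugs/bug-1234.t": (1, 4), "tests/basic/mount.t": (0, 1)}
```

Anything that shows up in `suspects` for several weeks running is exactly the "something has gone wrong" signal the proposal describes.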




Re: [Gluster-infra] All rpm jobs are now in jenkins job builder

2016-06-29 Thread Sankarshan Mukhopadhyay
The detail in this response is much appreciated. Thank you.

On Wed, Jun 29, 2016 at 5:25 PM, Kaleb S. KEITHLEY <kkeit...@redhat.com> wrote:
> For Fedora/RHEL/CentOS rpms:
>
>  + People file bugs, e.g. against Fedora/glusterfs or
> GlusterFS/packaging, and we fix them.
>
>  + Some of us  occasionally do package reviews for new packages and/or
> closely follow the Fedora packaging guidelines and when we see something
> that ought to be done in the glusterfs packaging we fix it.
>
>  + Every once in a while I run rpmlint and address the things it finds.
>
>  + And of course we get feedback from downstream packaging.
>
>  + Finally, I  keep the Fedora dist-git .spec and our upstream .spec in
> sync.
>
> For SuSE RPMs I use a .spec file based on the one that SuSE uses/used
> for their distribution's bundled packages.
>
> For Debian/Ubuntu debs I use packaging bits provided by Louis Zuckerman
> (irc nick: semiosis) that he developed with, I believe, the help of
> Patrick Matthaei, the Debian packager who builds Debian's bundled
> packages. Resyncing with Patrick's packaging bits is on my list of
> things to do in my copious spare time. In the mean time people
> occasionally report issues with the debs and I fix them.






Re: [Gluster-infra] All rpm jobs are now in jenkins job builder

2016-06-28 Thread Sankarshan Mukhopadhyay
On Tue, Jun 28, 2016 at 9:45 PM, Niels de Vos <nde...@redhat.com> wrote:
> Coincidentally I've asked Humble about the option to provide a container
> (and maybe VM) image through the CentOS Storage SIG. Just as with the
> packages, we should try to utilize the integration with different
> distributions.

I agree. Container (and even VM) images are build-time artifacts
which we should produce and make available in a regular manner.

I have a follow-up question on the production of these artifacts:
when do we check whether the RPMs or the images produced are sane?
For example, that the RPMs are packaged well and as per specifications
...






Re: [Gluster-infra] All rpm jobs are now in jenkins job builder

2016-06-28 Thread Sankarshan Mukhopadhyay
On Tue, Jun 28, 2016 at 5:54 PM, Nigel Babu <nig...@redhat.com> wrote:
> All the rpm jobs are now in Jenkins job builder format. I've disabled the
> old jobs and replaced them with new ones:
>
> glusterfs-rpm -> rpm-fedora
> glusterfs-rpm-el6 -> rpm-el6
> glusterfs-devrpm -> devrpm-fedora
> glusterfs-devrpm-el6 -> devrpm-el6
> glusterfs-devrpm-el7 -> devrpm-el7

I wonder if it would be worthwhile to have a job which produces a
container image for a build. The availability of such a build would be
useful in generating a bit more interest around the container(ized)
Gluster story. I recollect that Humble was maintaining the build
script and such, which led to a container image (ready for pushing
into the registry).





Re: [Gluster-infra] [Gluster-Maintainers] Gluster community and external test infra

2016-02-10 Thread Sankarshan Mukhopadhyay
On Wed, Feb 10, 2016 at 6:45 PM, Kaushal M  wrote:
> On Wed, Feb 10, 2016 at 6:29 PM, Raghavendra Talur  wrote:
>> Hi,
>>
>> I read about openstack community and how they allow external test infra to
>> report back on the main CI.
>>
>> Read more here:
>> https://www.mirantis.com/blog/setting-external-openstack-testing-system-part-1/
>>

It would be relevant to highlight "In short, an external testing
platform enables third parties to run tests — ostensibly against an
OpenStack environment that is configured with that third party’s
drivers or hardware — and report the results of those tests on the
code review of a proposed patch."

The 3rd party integration and scenario testing path works for a
project like OpenStack because there are (i) such drivers from third
parties which need to be merged into upstream/mainline after sanity
checks and (ii) the distributions/flavors of OpenStack would need
testing as well.

>> I have set up a Jenkins server in the Red Hat BLR lab which has 27 slaves.
>> Also, the Jenkins server is intelligent enough to distribute tests across 27
>> nodes for the same patch set, so that the run completes in ~30 mins.
>>
>>
>> Now onto the questions:
>> 1. Would the Gluster community accept a +1 from this Jenkins, if we follow
>> a similar policy to what OpenStack does with external test infra?
>> 2. Would Red Hat allow such a thing if it is promised that the trigger won't
>> happen from the Gerrit server but manually from within the Red Hat VPN?
>>

This harness could also reside in any world-available service
instances, if they are available and maintained. It would be
appropriate to check with the specific teams dealing with instances of
internal hosted services reaching out to externally available ones
before arriving at a decision.

>> NOTE: This mail is sent to maintain...@gluster.org keeping in view the
>> security concerns, before asking the opinion of a wider audience. Also,
>> don't reply back with any Red Hat confidential information.

> +gluster-infra

Perhaps Michael or, anyone else could provide a preliminary guidance.