Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Wed, Oct 3, 2018 at 9:28 PM Pranith Kumar Karampuri 
wrote:

>
>
> On Wed, Oct 3, 2018 at 9:04 PM Shyam Ranganathan 
> wrote:
>
>> On 10/03/2018 11:32 AM, Pranith Kumar Karampuri wrote:
>> >
>> >
>> > On Wed, Oct 3, 2018 at 8:50 PM Shyam Ranganathan wrote:
>> >
>> > On 10/03/2018 11:16 AM, Pranith Kumar Karampuri wrote:
>> > > Once we have distributed tests running, such that overall
>> > regression
>> > > time is reduced, we can possibly tackle removing retries for
>> > tests, and
>> > > then getting to a more stringent recheck process/tooling. The
>> > reason
>> > > being, we now run to completion and that takes quite a bit of
>> > time, so
>> > > at this juncture removing retry is not practical, but we
>> > should get
>> > > there (soon?).
>> > >
>> > >
>> > > I agree with you about removing retry. I didn't understand why recheck
>> > > nudging developers has to be postponed until distributed regression
>> > > tests come into the picture. My thinking is that it is more important to
>> > > have it when tests take longer.
>> >
>> > Above is only retry specific, not recheck specific, as in "we can
>> > possibly tackle removing retries for tests"
>> >
>> > But also reiterating this is orthogonal to the lock down needs
>> discussed
>> > here.
>> >
>> >
>> > As per my understanding, the reason why lock down is happening is that no
>> > one makes any noise about the failures that they are facing as and when
>> > they happen, and it doesn't get conveyed on gluster-devel. So is there
>> > any reason why you think it is orthogonal, considering it is contributing
>> > directly to the problem that we are discussing on this thread?
>>
>> Taking steps to ensure quality is maintained is going to reduce
>> instances of lock down, hence orthogonal.
>>
>
> The purpose of my responses has been to prevent a lock down, because I
> believe the existing process of locking down the complete branch doesn't
> change developers' behavior to prevent a lock down. The process seems to
> reinforce it. Hence I raised it on this thread, because it is contributing
> to the problem. It doesn't look like the discussion reached a logical end. I
> still don't know what new actions we are taking to prevent lock down from
> happening. What are they?
>

So far the responses from Atin/Nigel/Shyam have been helpful in shaping my
understanding of the problem, and I have documented an automated way to
solve it at https://bugzilla.redhat.com/show_bug.cgi?id=1635826


>
>
>>
>> >
>> > --
>> > Pranith
>>
>
>
> --
> Pranith
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Wed, Oct 3, 2018 at 9:04 PM Shyam Ranganathan 
wrote:

> On 10/03/2018 11:32 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > On Wed, Oct 3, 2018 at 8:50 PM Shyam Ranganathan wrote:
> >
> > On 10/03/2018 11:16 AM, Pranith Kumar Karampuri wrote:
> > > Once we have distributed tests running, such that overall
> > regression
> > > time is reduced, we can possibly tackle removing retries for
> > tests, and
> > > then getting to a more stringent recheck process/tooling. The
> > reason
> > > being, we now run to completion and that takes quite a bit of
> > time, so
> > > at this juncture removing retry is not practical, but we
> > should get
> > > there (soon?).
> > >
> > >
> > > I agree with you about removing retry. I didn't understand why recheck
> > > nudging developers has to be postponed until distributed regression
> > > tests come into the picture. My thinking is that it is more important to
> > > have it when tests take longer.
> >
> > Above is only retry specific, not recheck specific, as in "we can
> > possibly tackle removing retries for tests"
> >
> > But also reiterating this is orthogonal to the lock down needs
> discussed
> > here.
> >
> >
> > As per my understanding, the reason why lock down is happening is that no
> > one makes any noise about the failures that they are facing as and when
> > they happen, and it doesn't get conveyed on gluster-devel. So is there
> > any reason why you think it is orthogonal, considering it is contributing
> > directly to the problem that we are discussing on this thread?
>
> Taking steps to ensure quality is maintained is going to reduce
> instances of lock down, hence orthogonal.
>

The purpose of my responses has been to prevent a lock down, because I
believe the existing process of locking down the complete branch doesn't
change developers' behavior to prevent a lock down. The process seems to
reinforce it. Hence I raised it on this thread, because it is contributing
to the problem. It doesn't look like the discussion reached a logical end. I
still don't know what new actions we are taking to prevent lock down from
happening. What are they?


>
> >
> > --
> > Pranith
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Shyam Ranganathan
On 10/03/2018 11:32 AM, Pranith Kumar Karampuri wrote:
> 
> 
> On Wed, Oct 3, 2018 at 8:50 PM Shyam Ranganathan wrote:
> 
> On 10/03/2018 11:16 AM, Pranith Kumar Karampuri wrote:
> >     Once we have distributed tests running, such that overall
> regression
> >     time is reduced, we can possibly tackle removing retries for
> tests, and
> >     then getting to a more stringent recheck process/tooling. The
> reason
> >     being, we now run to completion and that takes quite a bit of
> time, so
> >     at this juncture removing retry is not practical, but we
> should get
> >     there (soon?).
> >
> >
> > I agree with you about removing retry. I didn't understand why recheck
> > nudging developers has to be postponed until distributed regression
> > tests come into the picture. My thinking is that it is more important to
> > have it when tests take longer.
> 
> Above is only retry specific, not recheck specific, as in "we can
> possibly tackle removing retries for tests"
> 
> But also reiterating this is orthogonal to the lock down needs discussed
> here.
> 
> 
> As per my understanding, the reason why lock down is happening is that no
> one makes any noise about the failures that they are facing as and when
> they happen, and it doesn't get conveyed on gluster-devel. So is there
> any reason why you think it is orthogonal, considering it is contributing
> directly to the problem that we are discussing on this thread?

Taking steps to ensure quality is maintained is going to reduce
instances of lock down, hence orthogonal.

> 
> -- 
> Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Wed, Oct 3, 2018 at 8:50 PM Shyam Ranganathan 
wrote:

> On 10/03/2018 11:16 AM, Pranith Kumar Karampuri wrote:
> > Once we have distributed tests running, such that overall regression
> > time is reduced, we can possibly tackle removing retries for tests,
> and
> > then getting to a more stringent recheck process/tooling. The reason
> > being, we now run to completion and that takes quite a bit of time,
> so
> > at this juncture removing retry is not practical, but we should get
> > there (soon?).
> >
> >
> > I agree with you about removing retry. I didn't understand why recheck
> > nudging developers has to be postponed until distributed regression
> > tests come into the picture. My thinking is that it is more important to
> > have it when tests take longer.
>
> Above is only retry specific, not recheck specific, as in "we can
> possibly tackle removing retries for tests"
>
> But also reiterating this is orthogonal to the lock down needs discussed
> here.
>

As per my understanding, the reason why lock down is happening is that no
one makes any noise about the failures that they are facing as and when they
happen, and it doesn't get conveyed on gluster-devel. So is there any
reason why you think it is orthogonal, considering it is contributing
directly to the problem that we are discussing on this thread?

-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Shyam Ranganathan
On 10/03/2018 11:16 AM, Pranith Kumar Karampuri wrote:
> Once we have distributed tests running, such that overall regression
> time is reduced, we can possibly tackle removing retries for tests, and
> then getting to a more stringent recheck process/tooling. The reason
> being, we now run to completion and that takes quite a bit of time, so
> at this juncture removing retry is not practical, but we should get
> there (soon?).
> 
> 
> > I agree with you about removing retry. I didn't understand why recheck
> > nudging developers has to be postponed until distributed regression
> > tests come into the picture. My thinking is that it is more important to
> > have it when tests take longer.

Above is only retry specific, not recheck specific, as in "we can
possibly tackle removing retries for tests"

But also reiterating this is orthogonal to the lock down needs discussed
here.
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Wed, Oct 3, 2018 at 7:02 PM Shyam Ranganathan 
wrote:

> On 10/03/2018 05:36 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > On Thu, Sep 27, 2018 at 8:18 PM Shyam Ranganathan wrote:
> >
> > On 09/27/2018 10:05 AM, Atin Mukherjee wrote:
> > > Now does this mean we block commit rights for component Y
> till
> > > we have the root cause?
> > >
> > >
> > > It was a way of making it someone's priority. If you have
> another
> > > way to make it someone's priority that is better than this,
> please
> > > suggest and we can have a discussion around it and agree on it
> > :-).
> > >
> > >
> > > This is what I can think of:
> > >
> > > 1. Component peers/maintainers take a first triage of the test
> > failure.
> > > Do the initial debugging and (a) point to the component which needs
> > > further debugging or (b) seek for help at gluster-devel ML for
> > > additional insight for identifying the problem and narrowing down
> to a
> > > component.
> > > 2. If it’s (1 a) then we already know the component and the owner.
> If
> > > it’s (2 b) at this juncture, it’s all maintainers responsibility to
> > > ensure the email is well understood and based on the available
> details
> > > the ownership is picked up by respective maintainers. It might be
> also
> > > needed that multiple maintainers might have to be involved and
> this is
> > > why I focus on this as a group effort than individual one.
> >
> > In my thinking, acting as a group here is better than making it a
> > sub-groups/individuals responsibility. Which has been put forth by
> Atin
> > (IMO) well. Thus, keep the merge rights out for all (of course some
> > still need to have it), and get the situation addressed is better.
> >
> >
> > In my experience, it has been rather difficult for developers without
> > domain expertise to solve the problem (at least on the components I am
> > maintaining), so the reality is that not everyone may be able to solve
> > the issues on all the components where the problem is observed. Maybe
> > you mean we need more participation when you say we need to act as a
> > group; with that assumption, one way to make that happen is to change
> > the workflow around 'recheck centos'. In my thinking, following the tools
> > shouldn't lead to less participation on gluster-devel, where developers
> > can just do recheck-centos until the test passes and be done. So maybe
> > tooling should encourage participation. Maybe something like 'recheck
> > centos '. This is just an idea; thoughts are welcome.
>
> I agree, any recheck should have enough reason behind it to state why
> the recheck is being attempted, and what the failures were, which are
> deemed spurious or otherwise to require a recheck.
>
> The manner of enforcing the same is not present yet, and is possibly an
> orthogonal discussion to the one here.
>
> The recheck stringency (and I would add even the retry a test if it
> fails once should be removed), will aid in getting to less frequent
> breakage in nightly, as more effort is put into correcting the tests or
> fixing the code around the same.
>
> Once we have distributed tests running, such that overall regression
> time is reduced, we can possibly tackle removing retries for tests, and
> then getting to a more stringent recheck process/tooling. The reason
> being, we now run to completion and that takes quite a bit of time, so
> at this juncture removing retry is not practical, but we should get
> there (soon?).
>

I agree with you about removing retry. I didn't understand why recheck
nudging developers has to be postponed until distributed regression tests
come into the picture. My thinking is that it is more important to have it
when tests take longer.

>
>
> >
> >
> >
> >
> >
> > --
> > Pranith
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Shyam Ranganathan
On 10/03/2018 05:36 AM, Pranith Kumar Karampuri wrote:
> 
> 
> On Thu, Sep 27, 2018 at 8:18 PM Shyam Ranganathan wrote:
> 
> On 09/27/2018 10:05 AM, Atin Mukherjee wrote:
> >         Now does this mean we block commit rights for component Y till
> >         we have the root cause?
> >
> >
> >     It was a way of making it someone's priority. If you have another
> >     way to make it someone's priority that is better than this, please
> >     suggest and we can have a discussion around it and agree on it
> :-).
> >
> >
> > This is what I can think of:
> >
> > 1. Component peers/maintainers take a first triage of the test
> failure.
> > Do the initial debugging and (a) point to the component which needs
> > further debugging or (b) seek for help at gluster-devel ML for
> > additional insight for identifying the problem and narrowing down to a
> > component. 
> > 2. If it’s (1 a) then we already know the component and the owner. If
> > it’s (2 b) at this juncture, it’s all maintainers responsibility to
> > ensure the email is well understood and based on the available details
> > the ownership is picked up by respective maintainers. It might be also
> > needed that multiple maintainers might have to be involved and this is
> > why I focus on this as a group effort than individual one.
> 
> In my thinking, acting as a group here is better than making it a
> sub-groups/individuals responsibility. Which has been put forth by Atin
> (IMO) well. Thus, keep the merge rights out for all (of course some
> still need to have it), and get the situation addressed is better.
> 
> 
> In my experience, it has been rather difficult for developers without
> domain expertise to solve the problem (at least on the components I am
> maintaining), so the reality is that not everyone may be able to solve
> the issues on all the components where the problem is observed. Maybe
> you mean we need more participation when you say we need to act as a
> group; with that assumption, one way to make that happen is to change
> the workflow around 'recheck centos'. In my thinking, following the tools
> shouldn't lead to less participation on gluster-devel, where developers
> can just do recheck-centos until the test passes and be done. So maybe
> tooling should encourage participation. Maybe something like 'recheck
> centos '. This is just an idea; thoughts are welcome.

I agree, any recheck should have enough reason behind it to state why
the recheck is being attempted, and what the failures were, which are
deemed spurious or otherwise to require a recheck.

The manner of enforcing the same is not present yet, and is possibly an
orthogonal discussion to the one here.

The recheck stringency (and, I would add, removing the retry of a test when
it fails once) will aid in getting to less frequent breakage in nightly
runs, as more effort is put into correcting the tests or fixing the code
around the same.

Once we have distributed tests running, such that overall regression
time is reduced, we can possibly tackle removing retries for tests, and
then getting to a more stringent recheck process/tooling. The reason
being, we now run to completion and that takes quite a bit of time, so
at this juncture removing retry is not practical, but we should get
there (soon?).
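
For readers following along, the "retry" under discussion is the regression
harness behaviour of re-running a failed test once before declaring the run
failed. The snippet below is only an illustrative model of why that hides
flaky tests; it is not the actual run-tests.sh logic, and the function names
and failure rate are made up for the example.

#!/usr/bin/env python3
"""Toy model of the retry-on-failure behaviour discussed above.

This is NOT the real Gluster run-tests.sh logic; it only illustrates how a
retry hides flaky tests: a test that fails once and then passes is reported
as a success, so nobody is nudged to investigate the flakiness.
"""
import random


def run_with_retry(test, retries=1):
    """Run a test callable, re-running it up to `retries` times on failure."""
    for attempt in range(retries + 1):
        if test():
            if attempt > 0:
                print("  note: passed only on retry %d (flakiness hidden)" % attempt)
            return True
    return False


if __name__ == "__main__":
    random.seed(7)

    def flaky_test():
        # Stand-in for a .t file that fails roughly 40% of the time.
        return random.random() > 0.4

    print("with retry:   ", run_with_retry(flaky_test, retries=1))
    print("without retry:", run_with_retry(flaky_test, retries=0))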

>  
> 
> 
> 
> 
> -- 
> Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Fri, Sep 28, 2018 at 9:56 AM Amar Tumballi  wrote:

> Top posting as I am not trying to answer any individual points!
>
> It is my wish that we don't get into lock down state! But, there may be
> times when it is needed! My take is, we will go with an approach which
> works for majority of the cases, and when we get to it 1-2 times, lets do
> another retrospective of events happened during the time when there was a
> lock-down, and then improve further. Planning too much for future won't get
> us any value at this time. We have bi-weekly maintainer meetings, where we
> can propose changes, and get to solutions. None of this is written in
> stone, so lets move on :-)
>

I think the discussion has been productive so far. There are some good
suggestions. So let us come to some conclusion and move ahead.


>
> -Amar
>
>
> On Thu, Sep 27, 2018 at 8:18 PM Shyam Ranganathan 
> wrote:
>
>> On 09/27/2018 10:05 AM, Atin Mukherjee wrote:
>> > Now does this mean we block commit rights for component Y till
>> > we have the root cause?
>> >
>> >
>> > It was a way of making it someone's priority. If you have another
>> > way to make it someone's priority that is better than this, please
>> > suggest and we can have a discussion around it and agree on it :-).
>> >
>> >
>> > This is what I can think of:
>> >
>> > 1. Component peers/maintainers take a first triage of the test failure.
>> > Do the initial debugging and (a) point to the component which needs
>> > further debugging or (b) seek for help at gluster-devel ML for
>> > additional insight for identifying the problem and narrowing down to a
>> > component.
>> > 2. If it’s (1 a) then we already know the component and the owner. If
>> > it’s (2 b) at this juncture, it’s all maintainers responsibility to
>> > ensure the email is well understood and based on the available details
>> > the ownership is picked up by respective maintainers. It might be also
>> > needed that multiple maintainers might have to be involved and this is
>> > why I focus on this as a group effort than individual one.
>>
>> In my thinking, acting as a group here is better than making it a
>> sub-groups/individuals responsibility. Which has been put forth by Atin
>> (IMO) well. Thus, keep the merge rights out for all (of course some
>> still need to have it), and get the situation addressed is better.
>>
>
>
> --
> Amar Tumballi (amarts)
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Thu, Sep 27, 2018 at 8:18 PM Shyam Ranganathan 
wrote:

> On 09/27/2018 10:05 AM, Atin Mukherjee wrote:
> > Now does this mean we block commit rights for component Y till
> > we have the root cause?
> >
> >
> > It was a way of making it someone's priority. If you have another
> > way to make it someone's priority that is better than this, please
> > suggest and we can have a discussion around it and agree on it :-).
> >
> >
> > This is what I can think of:
> >
> > 1. Component peers/maintainers take a first triage of the test failure.
> > Do the initial debugging and (a) point to the component which needs
> > further debugging or (b) seek for help at gluster-devel ML for
> > additional insight for identifying the problem and narrowing down to a
> > component.
> > 2. If it’s (1 a) then we already know the component and the owner. If
> > it’s (2 b) at this juncture, it’s all maintainers responsibility to
> > ensure the email is well understood and based on the available details
> > the ownership is picked up by respective maintainers. It might be also
> > needed that multiple maintainers might have to be involved and this is
> > why I focus on this as a group effort than individual one.
>
> In my thinking, acting as a group here is better than making it a
> sub-groups/individuals responsibility. Which has been put forth by Atin
> (IMO) well. Thus, keep the merge rights out for all (of course some
> still need to have it), and get the situation addressed is better.
>

In my experience, it has been rather difficult for developers without
domain expertise to solve the problem (at least on the components I am
maintaining), so the reality is that not everyone may be able to solve the
issues on all the components where the problem is observed. Maybe you mean
we need more participation when you say we need to act as a group; with
that assumption, one way to make that happen is to change the workflow
around 'recheck centos'. In my thinking, following the tools shouldn't lead
to less participation on gluster-devel, where developers can just do
recheck-centos until the test passes and be done. So maybe tooling should
encourage participation. Maybe something like 'recheck centos '. This is
just an idea; thoughts are welcome.
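
To make the idea concrete, here is a minimal sketch of what a gate on
'recheck centos' comments could look like. Everything in it is an assumption
made for illustration (the trigger phrase, the regular expression, the
minimum reason length, and the example test path); it is not the existing
Gerrit/Jenkins trigger, only one possible shape for tooling that refuses a
recheck unless the developer states which failure they saw and why a retry
is justified, so the reason can also be surfaced on gluster-devel.

#!/usr/bin/env python3
"""Hypothetical gatekeeper for 'recheck centos' comments (illustration only)."""
import re

# Assumed comment format: "recheck centos <reason>". The real trigger phrase
# and enforcement point (Gerrit hook vs. Jenkins job) are open questions.
RECHECK_RE = re.compile(r"^recheck\s+centos\s+(?P<reason>.+)$", re.IGNORECASE)


def recheck_reason(comment):
    """Return the stated reason if the comment is an acceptable recheck request."""
    match = RECHECK_RE.match(comment.strip())
    if not match:
        return None
    reason = match.group("reason").strip()
    # Push back on placeholder reasons so a bare retry still gets questioned.
    return reason if len(reason) >= 10 else None


if __name__ == "__main__":
    for comment in (
        "recheck centos",                       # no reason given: rejected
        "recheck centos flaky",                 # too little detail: rejected
        "recheck centos tests/bugs/replicate/bug-1234.t failed, looks spurious",
    ):
        reason = recheck_reason(comment)
        if reason:
            print("trigger regression; forward reason to gluster-devel: %r" % reason)
        else:
            print("reject %r: please state which test failed and why" % comment)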


>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-10-03 Thread Pranith Kumar Karampuri
On Thu, Sep 27, 2018 at 7:35 PM Atin Mukherjee  wrote:

>
>
> On Thu, 27 Sep 2018 at 18:27, Pranith Kumar Karampuri 
> wrote:
>
>>
>>
>> On Thu, Sep 27, 2018 at 5:27 PM Atin Mukherjee 
>> wrote:
>>
>>> tests/bugs//xxx.t failing can’t always mean there’s a bug
>>> in component Y.
>>>
>>
>> I agree.
>>
>>
>>> It could be anywhere till we root cause the problem.
>>>
>>
>> Someone needs to step in to find out what the root cause is. I agree
>> that for a component like glusterd, bugs in other components can easily
>> lead to failures. How do we make sure that someone takes a look at it?
>>
>>
>>> Now does this mean we block commit rights for component Y till we have
>>> the root cause?
>>>
>>
>> It was a way of making it someone's priority. If you have another way to
>> make it someone's priority that is better than this, please suggest and we
>> can have a discussion around it and agree on it :-).
>>
>
> This is what I can think of:
>
> 1. Component peers/maintainers take a first triage of the test failure. Do
> the initial debugging and (a) point to the component which needs further
> debugging or (b) seek for help at gluster-devel ML for additional insight
> for identifying the problem and narrowing down to a component.
>
> 2. If it’s (1 a) then we already know the component and the owner. If it’s
> (2 b) at this juncture, it’s all maintainers responsibility to ensure the
> email is well understood and based on the available details the ownership
> is picked up by respective maintainers. It might be also needed that
> multiple maintainers might have to be involved and this is why I focus on
> this as a group effort than individual one.
>

I like this approach. One question here regarding (1 a): do you guys think
the 'recheck centos' workflow is preventing folks from posting failures on
gluster-devel? I understand there is fstat for people to get data about
which tests are failing a lot. But considering it didn't lead to the
behavior where the failures are addressed on a timely basis, should we
change it? I generally see more activity on gluster-devel/users and GitHub
from the community when issues are posted. What do you guys think?


>
>
>>
>>
>>> That doesn’t make much sense right? This is one of the reasons in such
>>> case we need to work as a group, figure out the problem and fix it, till
>>> then locking down the entire repo for further commits look a better option
>>> (IMHO).
>>>
>>
>> Let us dig deeper into what happens when we work as a group, in general
>> it will be one person who will take the lead and get help. Is there a way
>> to find that person without locking down whole master? If there is, we may
>> never have to get to a place where we lock down master completely. We may
>> not even have to lock down components. Suggestions are welcome.
>>
>>
>>> On Thu, 27 Sep 2018 at 14:04, Nigel Babu  wrote:
>>>
 We know maintainers of the components which are leading to repeated
> failures in that component and we just need to do the same thing we did to
> remove commit access for the maintainer of the component instead of all of
> the people. So in that sense it is not good faith and can be enforced.
>

 Pranith, I believe the difference of opinion is because you're looking
 at this problem in terms of "who" rather than "what". We do not care about
 *who* broke master. Removing commit access from a component owner doesn't
 stop someone else from landing a patch will create a failure in the same
 component or even a different component. We cannot stop patches from
 landing because it touches a specific component. And even if we could, our
 components are not entirely independent of each other. There could still be
 failures. This is a common scenario and it happened the last time we had to
 close master. Let me further re-emphasize our goals:

 * When master is broken, every team member's energy needs to be focused
 on getting master to green. Who broke the build isn't a concern as much as
 *the build is broken*. This is not a situation to punish specific people.
 * If we allow other commits to land, we run the risk of someone else
 breaking master with a different patch. Now we have two failures to debug
 and fix.

>>> --
>>> - Atin (atinm)
>>>
>>
>>
>> --
>> Pranith
>>
> --
> - Atin (atinm)
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Amar Tumballi
Top posting as I am not trying to answer any individual points!

It is my wish that we don't get into a lock down state! But there may be
times when it is needed! My take is, we will go with an approach which
works for the majority of cases, and once we have been through it 1-2 times,
let's do another retrospective of the events that happened during the
lock-down, and then improve further. Planning too much for the future won't
get us any value at this time. We have bi-weekly maintainer meetings, where
we can propose changes and get to solutions. None of this is written in
stone, so let's move on :-)

-Amar


On Thu, Sep 27, 2018 at 8:18 PM Shyam Ranganathan 
wrote:

> On 09/27/2018 10:05 AM, Atin Mukherjee wrote:
> > Now does this mean we block commit rights for component Y till
> > we have the root cause?
> >
> >
> > It was a way of making it someone's priority. If you have another
> > way to make it someone's priority that is better than this, please
> > suggest and we can have a discussion around it and agree on it :-).
> >
> >
> > This is what I can think of:
> >
> > 1. Component peers/maintainers take a first triage of the test failure.
> > Do the initial debugging and (a) point to the component which needs
> > further debugging or (b) seek for help at gluster-devel ML for
> > additional insight for identifying the problem and narrowing down to a
> > component.
> > 2. If it’s (1 a) then we already know the component and the owner. If
> > it’s (2 b) at this juncture, it’s all maintainers responsibility to
> > ensure the email is well understood and based on the available details
> > the ownership is picked up by respective maintainers. It might be also
> > needed that multiple maintainers might have to be involved and this is
> > why I focus on this as a group effort than individual one.
>
> In my thinking, acting as a group here is better than making it a
> sub-groups/individuals responsibility. Which has been put forth by Atin
> (IMO) well. Thus, keep the merge rights out for all (of course some
> still need to have it), and get the situation addressed is better.
>


-- 
Amar Tumballi (amarts)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Shyam Ranganathan
On 09/27/2018 10:05 AM, Atin Mukherjee wrote:
> Now does this mean we block commit rights for component Y till
> we have the root cause? 
> 
> 
> It was a way of making it someone's priority. If you have another
> way to make it someone's priority that is better than this, please
> suggest and we can have a discussion around it and agree on it :-).
> 
> 
> This is what I can think of:
> 
> 1. Component peers/maintainers take a first triage of the test failure.
> Do the initial debugging and (a) point to the component which needs
> further debugging or (b) seek for help at gluster-devel ML for
> additional insight for identifying the problem and narrowing down to a
> component. 
> 2. If it’s (1 a) then we already know the component and the owner. If
> it’s (2 b) at this juncture, it’s all maintainers responsibility to
> ensure the email is well understood and based on the available details
> the ownership is picked up by respective maintainers. It might be also
> needed that multiple maintainers might have to be involved and this is
> why I focus on this as a group effort than individual one.

In my thinking, acting as a group here is better than making it a
sub-groups/individuals responsibility. Which has been put forth by Atin
(IMO) well. Thus, keep the merge rights out for all (of course some
still need to have it), and get the situation addressed is better.
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Atin Mukherjee
On Thu, 27 Sep 2018 at 18:27, Pranith Kumar Karampuri 
wrote:

>
>
> On Thu, Sep 27, 2018 at 5:27 PM Atin Mukherjee 
> wrote:
>
>> tests/bugs//xxx.t failing can’t always mean there’s a bug in
>> component Y.
>>
>
> I agree.
>
>
>> It could be anywhere till we root cause the problem.
>>
>
> Someone needs to step in to find out what the root cause is. I agree that
> for a component like glusterd, bugs in other components can easily lead to
> failures. How do we make sure that someone takes a look at it?
>
>
>> Now does this mean we block commit rights for component Y till we have
>> the root cause?
>>
>
> It was a way of making it someone's priority. If you have another way to
> make it someone's priority that is better than this, please suggest and we
> can have a discussion around it and agree on it :-).
>

This is what I can think of:

1. Component peers/maintainers do a first triage of the test failure: do
the initial debugging and (a) point to the component which needs further
debugging, or (b) seek help on the gluster-devel ML for additional insight
to identify the problem and narrow it down to a component.
2. If it’s (1 a), then we already know the component and the owner. If it’s
(1 b), at this juncture it is all maintainers’ responsibility to ensure the
email is well understood and, based on the available details, the ownership
is picked up by the respective maintainers. It might also be that multiple
maintainers have to be involved, and this is why I see this as a group
effort rather than an individual one.


>
>
>> That doesn’t make much sense right? This is one of the reasons in such
>> case we need to work as a group, figure out the problem and fix it, till
>> then locking down the entire repo for further commits look a better option
>> (IMHO).
>>
>
> Let us dig deeper into what happens when we work as a group, in general it
> will be one person who will take the lead and get help. Is there a way to
> find that person without locking down whole master? If there is, we may
> never have to get to a place where we lock down master completely. We may
> not even have to lock down components. Suggestions are welcome.
>
>
>> On Thu, 27 Sep 2018 at 14:04, Nigel Babu  wrote:
>>
>>> We know maintainers of the components which are leading to repeated
 failures in that component and we just need to do the same thing we did to
 remove commit access for the maintainer of the component instead of all of
 the people. So in that sense it is not good faith and can be enforced.

>>>
>>> Pranith, I believe the difference of opinion is because you're looking
>>> at this problem in terms of "who" rather than "what". We do not care about
>>> *who* broke master. Removing commit access from a component owner doesn't
>>> stop someone else from landing a patch will create a failure in the same
>>> component or even a different component. We cannot stop patches from
>>> landing because it touches a specific component. And even if we could, our
>>> components are not entirely independent of each other. There could still be
>>> failures. This is a common scenario and it happened the last time we had to
>>> close master. Let me further re-emphasize our goals:
>>>
>>> * When master is broken, every team member's energy needs to be focused
>>> on getting master to green. Who broke the build isn't a concern as much as
>>> *the build is broken*. This is not a situation to punish specific people.
>>> * If we allow other commits to land, we run the risk of someone else
>>> breaking master with a different patch. Now we have two failures to debug
>>> and fix.
>>>
>> --
>> - Atin (atinm)
>>
>
>
> --
> Pranith
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Pranith Kumar Karampuri
On Thu, Sep 27, 2018 at 5:27 PM Atin Mukherjee  wrote:

> tests/bugs//xxx.t failing can’t always mean there’s a bug in
> component Y.
>

I agree.


> It could be anywhere till we root cause the problem.
>

Someone needs to step in to find out what the root cause is. I agree that
for a component like glusterd, bugs in other components can easily lead to
failures. How do we make sure that someone takes a look at it?


> Now does this mean we block commit rights for component Y till we have the
> root cause?
>

It was a way of making it someone's priority. If you have another way to
make it someone's priority that is better than this, please suggest and we
can have a discussion around it and agree on it :-).


> That doesn’t make much sense right? This is one of the reasons in such
> case we need to work as a group, figure out the problem and fix it, till
> then locking down the entire repo for further commits look a better option
> (IMHO).
>

Let us dig deeper into what happens when we work as a group: in general, it
will be one person who takes the lead and gets help. Is there a way to
find that person without locking down the whole master branch? If there is,
we may never have to get to a place where we lock down master completely. We
may not even have to lock down components. Suggestions are welcome.


> On Thu, 27 Sep 2018 at 14:04, Nigel Babu  wrote:
>
>> We know maintainers of the components which are leading to repeated
>>> failures in that component and we just need to do the same thing we did to
>>> remove commit access for the maintainer of the component instead of all of
>>> the people. So in that sense it is not good faith and can be enforced.
>>>
>>
>> Pranith, I believe the difference of opinion is because you're looking at
>> this problem in terms of "who" rather than "what". We do not care about
>> *who* broke master. Removing commit access from a component owner doesn't
>> stop someone else from landing a patch will create a failure in the same
>> component or even a different component. We cannot stop patches from
>> landing because it touches a specific component. And even if we could, our
>> components are not entirely independent of each other. There could still be
>> failures. This is a common scenario and it happened the last time we had to
>> close master. Let me further re-emphasize our goals:
>>
>> * When master is broken, every team member's energy needs to be focused
>> on getting master to green. Who broke the build isn't a concern as much as
>> *the build is broken*. This is not a situation to punish specific people.
>> * If we allow other commits to land, we run the risk of someone else
>> breaking master with a different patch. Now we have two failures to debug
>> and fix.
>>
> --
> - Atin (atinm)
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Nigel Babu
>
> We know maintainers of the components which are leading to repeated
> failures in that component and we just need to do the same thing we did to
> remove commit access for the maintainer of the component instead of all of
> the people. So in that sense it is not good faith and can be enforced.
>

Pranith, I believe the difference of opinion is because you're looking at
this problem in terms of "who" rather than "what". We do not care about
*who* broke master. Removing commit access from a component owner doesn't
stop someone else from landing a patch that will create a failure in the
same component or even a different component. We cannot stop patches from
landing because they touch a specific component. And even if we could, our
components are not entirely independent of each other. There could still be
failures. This is a common scenario and it happened the last time we had to
close master. Let me further re-emphasize our goals:

* When master is broken, every team member's energy needs to be focused on
getting master to green. Who broke the build isn't a concern as much as
*the build is broken*. This is not a situation to punish specific people.
* If we allow other commits to land, we run the risk of someone else
breaking master with a different patch. Now we have two failures to debug
and fix.
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Pranith Kumar Karampuri
On Wed, Sep 26, 2018 at 8:14 PM Shyam Ranganathan 
wrote:

> This was discussed in the maintainers meeting (see notes [1]), and the
> conclusion is as follows,
>

I had to leave early that day due to a conflicting meeting. Comments below.


>
> - Merge lock down would be across the code base, and not component
> specific. As component level decision goes into more 'good faith'
> category and requires more tooling to avoid the same.
>

We know the maintainers of the components which are leading to repeated
failures, and we just need to do the same thing we did to remove commit
access, but for the maintainer of the component instead of all of the
people. So in that sense it is not good faith and can be enforced.


>
> - Merge lock down would get closer to when repeated failures are
> noticed, than as it stands now (looking for failures across) as we
> strengthen the code base
>
> In all testing health maintained at always GREEN is where we want to
> reach over time and take a step back to correct any anomalies when we
> detect the same to retain the said health.
>
> Shyam
>
> [1] Maintainer meeting notes:
> https://lists.gluster.org/pipermail/maintainers/2018-September/005054.html
> (see Round table section)
> On 09/03/2018 01:47 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > On Wed, Aug 22, 2018 at 5:54 PM Shyam Ranganathan wrote:
> >
> > On 08/18/2018 12:45 AM, Pranith Kumar Karampuri wrote:
> > >
> > >
> > > On Tue, Aug 14, 2018 at 5:29 PM Shyam Ranganathan wrote:
> > >
> > > On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
> > > >
> > > >
> > > > On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan wrote:
> > > > Maintainers,
> > > >
> > > > The following thread talks about a merge during a merge
> > > lockdown, with
> > > > differing view points (this mail is not to discuss the
> view
> > > points).
> > > >
> > > > The root of the problem is that we leave the current
> process
> > > to good
> > > > faith. If we have a simple rule that we will not merge
> > > anything during a
> > > > lock down period, this confusion and any future
> > repetitions of
> > > the same
> > > > would not occur.
> > > >
> > > > I propose that we follow the simpler rule, and would
> > like to hear
> > > > thoughts around this.
> > > >
> > > > This also means that in the future, we may not need to
> > remove
> > > commit
> > > > access for other maintainers, as we do *not* follow a
> good
> > > faith policy,
> > > > and instead depend on being able to revert and announce
> > on the
> > > threads
> > > > why we do so.
> > > >
> > > >
> > > > I think it is a good opportunity to establish guidelines and
> > > process so
> > > > that we don't end up in this state in future where one needs
> > to lock
> > > > down the branch to make it stable. From that p.o.v.
> > discussion on this
> > > > thread about establishing a process for lock down probably
> > doesn't add
> > > > much value. My personal opinion for this instance at least
> > is that
> > > it is
> > > > good that it was locked down. I tend to forget things and not
> > > having the
> > > > access to commit helped follow the process automatically :-).
> > >
> > > The intention is that master and release branches are always
> > maintained
> > > in good working order. This involves, tests and related checks
> > passing
> > > *always*.
> > >
> > > When this situation is breached, correcting it immediately is
> > better
> > > than letting it build up, as that would entail longer times
> > and more
> > > people to fix things up.
> > >
> > > In an ideal world, if nightly runs fail, the next thing done
> > would be to
> > > examine patches that were added between the 2 runs, and see if
> > they are
> > > the cause for failure, and back them out.
> > >
> > > Hence calling to backout patches is something that would
> > happen more
> > > regularly in the future if things are breaking.
> > >
> > >
> > > I'm with you till here.
> > >
> > >
> > >
> > > Lock down may happen if 2 consecutive nightly 

Re: [Gluster-Maintainers] Lock down period merge process

2018-09-26 Thread Shyam Ranganathan
This was discussed in the maintainers meeting (see notes [1]), and the
conclusion is as follows,

- Merge lock down would be across the code base, and not component
specific, as a component-level decision falls more into the 'good faith'
category and requires more tooling to avoid the same.

- Merge lock down would be invoked closer to when repeated failures are
noticed, rather than as it stands now (looking for failures across), as we
strengthen the code base.

In all, testing health maintained at always GREEN is where we want to reach
over time, and we take a step back to correct any anomalies when we detect
them, so as to retain the said health.
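
As one possible way to notice "repeated failures" mechanically rather than
by someone eyeballing the nightly results, a sketch along the following
lines could poll the CI for the last couple of completed nightly runs and
propose a lock down when both failed. The Jenkins URL, job name, and
threshold below are placeholders and assumptions, not an agreed part of the
process.

#!/usr/bin/env python3
"""Sketch: propose a merge lock down after N consecutive failed nightlies.

Assumptions: the nightly regression is a Jenkins job (the URL and job name
below are placeholders), and "repeated failures" means the last two
completed builds did not end in SUCCESS.
"""
import json
import urllib.request

JENKINS_URL = "https://build.example.org"   # placeholder CI endpoint
NIGHTLY_JOB = "nightly-regression"          # placeholder job name
THRESHOLD = 2                               # consecutive failures that trigger a lock down


def recent_results(limit=THRESHOLD):
    """Return the results of the most recent completed builds, newest first."""
    url = "%s/job/%s/api/json?tree=builds[number,result]" % (JENKINS_URL, NIGHTLY_JOB)
    with urllib.request.urlopen(url) as resp:
        builds = json.load(resp)["builds"]
    # result is None while a build is still running; skip those.
    return [b["result"] for b in builds if b["result"] is not None][:limit]


def should_lock_down(results):
    """Lock down only when every one of the recent completed builds failed."""
    return len(results) >= THRESHOLD and all(r != "SUCCESS" for r in results)


if __name__ == "__main__":
    results = recent_results()
    if should_lock_down(results):
        print("Last %d nightly results %s: propose merge lock down" % (len(results), results))
    else:
        print("Nightly results %s: no lock down needed" % results)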

Shyam

[1] Maintainer meeting notes:
https://lists.gluster.org/pipermail/maintainers/2018-September/005054.html
(see Round table section)
On 09/03/2018 01:47 AM, Pranith Kumar Karampuri wrote:
> 
> 
> On Wed, Aug 22, 2018 at 5:54 PM Shyam Ranganathan wrote:
> 
> On 08/18/2018 12:45 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > On Tue, Aug 14, 2018 at 5:29 PM Shyam Ranganathan wrote:
> >
> >     On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
> >     >
> >     >
> >     > On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan wrote:
> >     >     Maintainers,
> >     >
> >     >     The following thread talks about a merge during a merge
> >     lockdown, with
> >     >     differing view points (this mail is not to discuss the view
> >     points).
> >     >
> >     >     The root of the problem is that we leave the current process
> >     to good
> >     >     faith. If we have a simple rule that we will not merge
> >     anything during a
> >     >     lock down period, this confusion and any future
> repetitions of
> >     the same
> >     >     would not occur.
> >     >
> >     >     I propose that we follow the simpler rule, and would
> like to hear
> >     >     thoughts around this.
> >     >
> >     >     This also means that in the future, we may not need to
> remove
> >     commit
> >     >     access for other maintainers, as we do *not* follow a good
> >     faith policy,
> >     >     and instead depend on being able to revert and announce
> on the
> >     threads
> >     >     why we do so.
> >     >
> >     >
> >     > I think it is a good opportunity to establish guidelines and
> >     process so
> >     > that we don't end up in this state in future where one needs
> to lock
> >     > down the branch to make it stable. From that p.o.v.
> discussion on this
> >     > thread about establishing a process for lock down probably
> doesn't add
> >     > much value. My personal opinion for this instance at least
> is that
> >     it is
> >     > good that it was locked down. I tend to forget things and not
> >     having the
> >     > access to commit helped follow the process automatically :-).
> >
> >     The intention is that master and release branches are always
> maintained
> >     in good working order. This involves, tests and related checks
> passing
> >     *always*.
> >
> >     When this situation is breached, correcting it immediately is
> better
> >     than letting it build up, as that would entail longer times
> and more
> >     people to fix things up.
> >
> >     In an ideal world, if nightly runs fail, the next thing done
> would be to
> >     examine patches that were added between the 2 runs, and see if
> they are
> >     the cause for failure, and back them out.
> >
> >     Hence calling to backout patches is something that would
> happen more
> >     regularly in the future if things are breaking.
> >
> >
> > I'm with you till here.
> >  
> >
> >
> >     Lock down may happen if 2 consecutive nightly builds fail, so
> as to
> >     rectify the situation ASAP, and then move onto other work.
> >
> >     In short, what I wanted to say is that preventing lock downs
> in the
> >     future, is not a state we aspire for.
> >
> >
> > What are the problems you foresee in aspiring for preventing lock
> downs
> > for everyone?
> 
> Any project will have test failures, when things are put together and
> exercised in different environments.
> 
> For us, these environments, at present, are nightly regression on
> centOS7, Brick-mux enabled regression, lcov, clang and in the future we
> hope to increase 

Re: [Gluster-Maintainers] Lock down period merge process

2018-09-02 Thread Pranith Kumar Karampuri
On Wed, Aug 22, 2018 at 5:54 PM Shyam Ranganathan 
wrote:

> On 08/18/2018 12:45 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > On Tue, Aug 14, 2018 at 5:29 PM Shyam Ranganathan wrote:
> >
> > On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
> > >
> > >
> > > On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan wrote:
> > >
> > > Maintainers,
> > >
> > > The following thread talks about a merge during a merge
> > lockdown, with
> > > differing view points (this mail is not to discuss the view
> > points).
> > >
> > > The root of the problem is that we leave the current process
> > to good
> > > faith. If we have a simple rule that we will not merge
> > anything during a
> > > lock down period, this confusion and any future repetitions of
> > the same
> > > would not occur.
> > >
> > > I propose that we follow the simpler rule, and would like to
> hear
> > > thoughts around this.
> > >
> > > This also means that in the future, we may not need to remove
> > commit
> > > access for other maintainers, as we do *not* follow a good
> > faith policy,
> > > and instead depend on being able to revert and announce on the
> > threads
> > > why we do so.
> > >
> > >
> > > I think it is a good opportunity to establish guidelines and
> > process so
> > > that we don't end up in this state in future where one needs to
> lock
> > > down the branch to make it stable. From that p.o.v. discussion on
> this
> > > thread about establishing a process for lock down probably doesn't
> add
> > > much value. My personal opinion for this instance at least is that
> > it is
> > > good that it was locked down. I tend to forget things and not
> > having the
> > > access to commit helped follow the process automatically :-).
> >
> > The intention is that master and release branches are always
> maintained
> > in good working order. This involves, tests and related checks
> passing
> > *always*.
> >
> > When this situation is breached, correcting it immediately is better
> > than letting it build up, as that would entail longer times and more
> > people to fix things up.
> >
> > In an ideal world, if nightly runs fail, the next thing done would
> be to
> > examine patches that were added between the 2 runs, and see if they
> are
> > the cause for failure, and back them out.
> >
> > Hence calling to backout patches is something that would happen more
> > regularly in the future if things are breaking.
> >
> >
> > I'm with you till here.
> >
> >
> >
> > Lock down may happen if 2 consecutive nightly builds fail, so as to
> > rectify the situation ASAP, and then move onto other work.
> >
> > In short, what I wanted to say is that preventing lock downs in the
> > future, is not a state we aspire for.
> >
> >
> > What are the problems you foresee in aspiring for preventing lock downs
> > for everyone?
>
> Any project will have test failures, when things are put together and
> exercised in different environments.
>
> For us, these environments, at present, are nightly regression on
> centOS7, Brick-mux enabled regression, lcov, clang and in the future we
> hope to increase this to ASAN, performance baselines, memory leak
> checks, etc.
>
> The whole intent of adding such tests and checks to the test pipeline,
> is to ensure that regressions to the good state, are caught early and
> regularly.
>
> When these tests and checks actually show up errors/issues, is when we
> need to pay attention to the same and focus first on getting us back on
> track.
>
> The above juncture is when we need the lockdown or commit blackout
> (or whatever we want to call it). The intent is to ensure that we do not
> add more patches and further destabilize the branch, but stabilize it
> first and then get other changes in later.
>
> There is a certain probability with which the above event will happen,
> and that can be reduced, if we are more stringent in our development
> practices, and ensuring code health is maintained (both by more checks
> in smoke and more tests per patch or even otherwise, and also by keener
> reviews and other such means).
>
> We also need to be proactive in monitoring and fixing failures in the
> tests, so that we can address them quickly, rather than someone calling
> out a lockdown.
>
> Now, is your question that we should avoid the above state altogether?
> Or, how to retain commit access for all, but still have such states as
> above, where only patches that help stabilization are merged?
>
> For the former, I do not have an answer, we can reduce the event
> probability as stated above, but it will never disappear 

Re: [Gluster-Maintainers] Lock down period merge process

2018-08-21 Thread Pranith Kumar Karampuri
On Sat, Aug 18, 2018 at 10:15 AM Pranith Kumar Karampuri <pkara...@redhat.com> wrote:

>
>
> On Tue, Aug 14, 2018 at 5:29 PM Shyam Ranganathan 
> wrote:
>
>> On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
>> >
>> >
>> > On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan wrote:
>> >
>> > Maintainers,
>> >
>> > The following thread talks about a merge during a merge lockdown,
>> with
>> > differing view points (this mail is not to discuss the view points).
>> >
>> > The root of the problem is that we leave the current process to good
>> > faith. If we have a simple rule that we will not merge anything
>> during a
>> > lock down period, this confusion and any future repetitions of the
>> same
>> > would not occur.
>> >
>> > I propose that we follow the simpler rule, and would like to hear
>> > thoughts around this.
>> >
>> > This also means that in the future, we may not need to remove commit
>> > access for other maintainers, as we do *not* follow a good faith
>> policy,
>> > and instead depend on being able to revert and announce on the
>> threads
>> > why we do so.
>> >
>> >
>> > I think it is a good opportunity to establish guidelines and process so
>> > that we don't end up in this state in future where one needs to lock
>> > down the branch to make it stable. From that p.o.v. discussion on this
>> > thread about establishing a process for lock down probably doesn't add
>> > much value. My personal opinion for this instance at least is that it is
>> > good that it was locked down. I tend to forget things and not having the
>> > access to commit helped follow the process automatically :-).
>>
>> The intention is that master and release branches are always maintained
>> in good working order. This involves, tests and related checks passing
>> *always*.
>>
>> When this situation is breached, correcting it immediately is better
>> than letting it build up, as that would entail longer times and more
>> people to fix things up.
>>
>> In an ideal world, if nightly runs fail, the next thing done would be to
>> examine patches that were added between the 2 runs, and see if they are
>> the cause for failure, and back them out.
>>
>> Hence calling to backout patches is something that would happen more
>> regularly in the future if things are breaking.
>>
>
> I'm with you till here.
>
>
>>
>> Lock down may happen if 2 consecutive nightly builds fail, so as to
>> rectify the situation ASAP, and then move onto other work.
>>
>> In short, what I wanted to say is that preventing lock downs in the
>> future, is not a state we aspire for.
>
>
> What are the problems you foresee in aspiring for preventing lock downs
> for everyone?
>

Bringing this up just in case you missed this mail.


>
>
>> Lock downs may/will happen, it is
>> done to get the branches stable quicker, than spend long times trying to
>> find what caused the instability in the first place.
>>
>
>
>
>
>>
>> >
>> >
>> >
>> > Please note, if there are extraneous situations (say reported
>> security
>> > vulnerabilities that need fixes ASAP) we may need to loosen up the
>> > stringency, as that would take precedence over the lock down. These
>> > exceptions though, can be called out and hence treated as such.
>> >
>> > Thoughts?
>> >
>> >
>> > This is again my personal opinion. We don't need to merge patches in any
>> > branch unless we need to make an immediate release with that patch. For
>> > example if there is a security issue reported we *must* make a release
>> > with the fix immediately so it makes sense to merge it and do the
>> release.
>>
>> Agree, keeps the rule simple during lock down and not open to
>> interpretations.
>>
>> >
>> >
>> >
>> > Shyam
>> >
>> > PS: Added Yaniv to the CC as he reported the deviance
>> >
>> >  Forwarded Message 
>> > Subject:Re: [Gluster-devel] Release 5: Master branch health
>> > report
>> > (Week of 30th July)
>> > Date:   Tue, 7 Aug 2018 23:22:09 +0300
>> > From:   Yaniv Kaul <yk...@redhat.com>
>> > To: Shyam Ranganathan
>> > CC: GlusterFS Maintainers, Gluster Devel
>> > <gluster-de...@gluster.org>, Nigel Babu <nig...@redhat.com>
>> >
>> >
>> >
>> >
>> >
>> > On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan <
>> srang...@redhat.com
>> > 
>> > >> wrote:
>> >
>> > On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
>> > > The intention is to stabilize master and not add more patches
>> > > that may destabilize it.
>> > >
>> > >
>> > > https://review.gluster.org/#/c/20603/ has been merged.
>> > > As far as I can see, it has nothing to do with stabilization
>> and

Re: [Gluster-Maintainers] Lock down period merge process

2018-08-17 Thread Pranith Kumar Karampuri
On Tue, Aug 14, 2018 at 5:29 PM Shyam Ranganathan 
wrote:

> On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
> >
> >
> > On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan  > > wrote:
> >
> > Maintainers,
> >
> > The following thread talks about a merge during a merge lockdown,
> with
> > differing view points (this mail is not to discuss the view points).
> >
> > The root of the problem is that we leave the current process to good
> > faith. If we have a simple rule that we will not merge anything
> during a
> > lock down period, this confusion and any future repetitions of the
> same
> > would not occur.
> >
> > I propose that we follow the simpler rule, and would like to hear
> > thoughts around this.
> >
> > This also means that in the future, we may not need to remove commit
> > access for other maintainers, as we do *not* follow a good faith
> policy,
> > and instead depend on being able to revert and announce on the
> threads
> > why we do so.
> >
> >
> > I think it is a good opportunity to establish guidelines and process so
> > that we don't end up in this state in future where one needs to lock
> > down the branch to make it stable. From that p.o.v. discussion on this
> > thread about establishing a process for lock down probably doesn't add
> > much value. My personal opinion for this instance at least is that it is
> > good that it was locked down. I tend to forget things and not having the
> > access to commit helped follow the process automatically :-).
>
> The intention is that master and release branches are always maintained
> in good working order. This involves tests and related checks passing
> *always*.
>
> When this situation is breached, correcting it immediately is better
> than letting it build up, as that would entail longer times and more
> people to fix things up.
>
> In an ideal world, if nightly runs fail, the next thing done would be to
> examine patches that were added between the 2 runs, and see if they are
> the cause for failure, and back them out.
>
> Hence calling for patches to be backed out is something that would happen
> more regularly in the future if things are breaking.
>

I'm with you till here.


>
> Lock down may happen if 2 consecutive nightly builds fail, so as to
> rectify the situation ASAP, and then move onto other work.
>
> In short, what I wanted to say is that preventing lock downs in the
> future is not a state we aspire to.


What problems do you foresee in aspiring to prevent lock downs for
everyone?


> Lock downs may/will happen; they are
> done to get the branches stable quicker than spending a long time trying to
> find what caused the instability in the first place.
>




>
> >
> >
> >
> > Please note, if there are extraneous situations (say reported
> security
> > vulnerabilities that need fixes ASAP) we may need to loosen up the
> > stringency, as that would take precedence over the lock down. These
> > exceptions though, can be called out and hence treated as such.
> >
> > Thoughts?
> >
> >
> > This is again my personal opinion. We don't need to merge patches in any
> > branch unless we need to make an immediate release with that patch. For
> > example if there is a security issue reported we *must* make a release
> > with the fix immediately so it makes sense to merge it and do the
> release.
>
> Agree, keeps the rule simple during lock down and not open to
> interpretations.
>
> >
> >
> >
> > Shyam
> >
> > PS: Added Yaniv to the CC as he reported the deviance
> >
> >  Forwarded Message 
> > Subject:Re: [Gluster-devel] Release 5: Master branch health
> > report
> > (Week of 30th July)
> > Date:   Tue, 7 Aug 2018 23:22:09 +0300
> > From:   Yaniv Kaul <yk...@redhat.com>
> > To: Shyam Ranganathan
> > CC: GlusterFS Maintainers, Gluster Devel
> > <gluster-de...@gluster.org>, Nigel Babu <nig...@redhat.com>
> >
> >
> >
> >
> >
> > On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan  > 
> > >> wrote:
> >
> > On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > > The intention is to stabilize master and not add more patches
> > > that may destabilize it.
> > >
> > >
> > > https://review.gluster.org/#/c/20603/ has been merged.
> > > As far as I can see, it has nothing to do with stabilization
> and
> > should
> > > be reverted.
> >
> > Posted this on the gerrit review as well:
> >
> > 
> > 4.1 does not have nightly tests, those run on master only.
> >
> >
> > That should change of course. We cannot strive for stability
> otherwise,
> > AFAIK.
> >
> >
> > 

Re: [Gluster-Maintainers] Lock down period merge process

2018-08-14 Thread Raghavendra Gowdappa
On Thu, Aug 9, 2018 at 9:59 AM, Nigel Babu  wrote:

> I would trust tooling that prevents merges rather than good faith.
>

+1

> I have worked on projects where we trust good faith, but still enforce that
> with tooling[1]. It's highly likely for one or two committers to be unaware
> of an ongoing lock down. As the number of maintainers increases, the chances
> of someone coming back from PTO and accidentally merging something are high.
>
> The extraneous situation exception applies even now. I expect the janitors
> who have commit access in the event of a lock down to use their judgment to
> merge such patches.
>
> [1]: https://mozilla-releng.net/treestatus
>
>
> On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan 
> wrote:
>
>> Maintainers,
>>
>> The following thread talks about a merge during a merge lockdown, with
>> differing view points (this mail is not to discuss the view points).
>>
>> The root of the problem is that we leave the current process to good
>> faith. If we have a simple rule that we will not merge anything during a
>> lock down period, this confusion and any future repetitions of the same
>> would not occur.
>>
>> I propose that we follow the simpler rule, and would like to hear
>> thoughts around this.
>>
>> This also means that in the future, we may not need to remove commit
>> access for other maintainers, as we do *not* follow a good faith policy,
>> and instead depend on being able to revert and announce on the threads
>> why we do so.
>>
>> Please note, if there are extraneous situations (say reported security
>> vulnerabilities that need fixes ASAP) we may need to loosen up the
>> stringency, as that would take precedence over the lock down. These
>> exceptions though, can be called out and hence treated as such.
>>
>> Thoughts?
>>
>> Shyam
>>
>> PS: Added Yaniv to the CC as he reported the deviance
>>
>>  Forwarded Message 
>> Subject:Re: [Gluster-devel] Release 5: Master branch health report
>> (Week of 30th July)
>> Date:   Tue, 7 Aug 2018 23:22:09 +0300
>> From:   Yaniv Kaul 
>> To: Shyam Ranganathan 
>> CC: GlusterFS Maintainers , Gluster Devel
>> , Nigel Babu 
>>
>>
>>
>>
>>
>> On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan > > wrote:
>>
>> On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
>> > The intention is to stabilize master and not add more patches
>> > that may destabilize it.
>> >
>> >
>> > https://review.gluster.org/#/c/20603/ has been merged.
>> > As far as I can see, it has nothing to do with stabilization and
>> should
>> > be reverted.
>>
>> Posted this on the gerrit review as well:
>>
>> 
>> 4.1 does not have nightly tests, those run on master only.
>>
>>
>> That should change of course. We cannot strive for stability otherwise,
>> AFAIK.
>>
>>
>> Stability of master does not (will not), in the near term guarantee
>> stability of release branches, unless patches that impact code already
>> on release branches, get fixes on master and are back ported.
>>
>> Release branches get fixes back ported (as is normal), this fix and
>> its
>> merge should not impact current master stability in any way, and
>> neither
>> stability of 4.1 branch.
>> 
>>
>> The current hold is on master, not on release branches. I agree that
>> merging further code changes on release branches (for example geo-rep
>> issues that are backported (see [1]), as there are tests that fail
>> regularly on master), may further destabilize the release branch. This
>> patch is not one of those.
>>
>>
>> Two issues I have with the merge:
>> 1. It just makes comparing master branch to release branch harder. For
>> example, to understand if there's a test that fails on master but
>> succeeds on release branch, or vice versa.
>> 2. It means we are not focused on stabilizing master branch.
>> Y.
>>
>>
>> Merging patches on release branches are allowed by release owners
>> only,
>> and usual practice is keeping the backlog low (merging weekly) in
>> these
>> cases as per the dashboard [1].
>>
>> Allowing for the above 2 reasons this patch was found,
>> - Not on master
>> - Not stabilizing or destabilizing the release branch
>> and hence was merged.
>>
>> If maintainers disagree I can revert the same.
>>
>> Shyam
>>
>> [1] Release 4.1 dashboard:
>>
>> https://review.gluster.org/#/projects/glusterfs,dashboards/
>> dashboard:4-1-dashboard
>>
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> https://lists.gluster.org/mailman/listinfo/maintainers
>>
>
>
> --
> nigelb
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org

Re: [Gluster-Maintainers] Lock down period merge process

2018-08-14 Thread Nigel Babu
On Tue, Aug 14, 2018 at 5:31 PM Shyam Ranganathan 
wrote:

> The intention is that master and release branches are always maintained
> in good working order. This involves tests and related checks passing
> *always*.
>
> When this situation is breached, correcting it immediately is better
> than letting it build up, as that would entail longer times and more
> people to fix things up.
>
> In an ideal world, if nightly runs fail, the next thing done would be to
> examine patches that were added between the 2 runs, and see if they are
> the cause for failure, and back them out.
>

To be clear, at this point, we'll need to lock down master as well until
it's green again. If we know that master is broken, it's best we find the
culprit and back it out/land a fix. If it's backed out, the author has a
chance to fix the issue and re-land their patch. The problem with a broken
master is that we may break it yet again and it will be more difficult to
track down future failures.

This is more of an inconvenience, but it is what will help the long-term
health of the project.



>
> Hence calling for patches to be backed out is something that would happen
> more regularly in the future if things are breaking.
>
> Lock down may happen if 2 consecutive nightly builds fail, so as to
> rectify the situation ASAP, and then move onto other work.
>
> In short, what I wanted to say is that preventing lock downs in the
> future is not a state we aspire to. Lock downs may/will happen; they are
> done to get the branches stable quicker than spending a long time trying to
> find what caused the instability in the first place.
>

Also adding a note that release branches have already been locked for a few
releases, in case nobody has noticed. Only release managers can land patches
on release branches. This is done specifically so that the same process can
happen there as well.


-- 
nigelb
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-08-14 Thread Shyam Ranganathan
On 08/09/2018 01:24 AM, Pranith Kumar Karampuri wrote:
> 
> 
> On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan  > wrote:
> 
> Maintainers,
> 
> The following thread talks about a merge during a merge lockdown, with
> differing view points (this mail is not to discuss the view points).
> 
> The root of the problem is that we leave the current process to good
> faith. If we have a simple rule that we will not merge anything during a
> lock down period, this confusion and any future repetitions of the same
> would not occur.
> 
> I propose that we follow the simpler rule, and would like to hear
> thoughts around this.
> 
> This also means that in the future, we may not need to remove commit
> access for other maintainers, as we do *not* follow a good faith policy,
> and instead depend on being able to revert and announce on the threads
> why we do so.
> 
> 
> I think it is a good opportunity to establish guidelines and process so
> that we don't end up in this state in future where one needs to lock
> down the branch to make it stable. From that p.o.v. discussion on this
> thread about establishing a process for lock down probably doesn't add
> much value. My personal opinion for this instance at least is that it is
> good that it was locked down. I tend to forget things and not having the
> access to commit helped follow the process automatically :-).

The intention is that master and release branches are always maintained
in good working order. This involves tests and related checks passing
*always*.

When this situation is breached, correcting it immediately is better
than letting it build up, as that would entail longer times and more
people to fix things up.

In an ideal world, if nightly runs fail, the next thing done would be to
examine patches that were added between the 2 runs, and see if they are
the cause for failure, and back them out.
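
For the "examine patches added between the 2 runs" step, a helper like the
sketch below would list the revert candidates, newest first. The
nightly-good/nightly-bad refs are placeholders for however the nightly job
records the commit it tested:

#!/usr/bin/env python3
"""Sketch: list commits that landed between two nightly runs.

Assumes the nightly job records the commit it tested, e.g. as git refs;
the ref names used here are placeholders, not existing tooling.
"""
import subprocess


def commits_between(good_ref, bad_ref):
    """Return commits reachable from bad_ref but not from good_ref, newest first."""
    out = subprocess.run(
        ["git", "log", "--oneline", "%s..%s" % (good_ref, bad_ref)],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.splitlines()


if __name__ == "__main__":
    for line in commits_between("nightly-good", "nightly-bad"):
        print(line)  # each line is a candidate to examine and possibly back out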

Hence calling for patches to be backed out is something that would happen
more regularly in the future if things are breaking.

Lock down may happen if 2 consecutive nightly builds fail, so as to
rectify the situation ASAP, and then move onto other work.

In short, what I wanted to say is that preventing lock downs in the
future is not a state we aspire to. Lock downs may/will happen; they are
done to get the branches stable quicker than spending a long time trying to
find what caused the instability in the first place.
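
As a rough illustration (not existing tooling), the "2 consecutive nightly
builds fail" trigger could itself be automated with something as small as
the sketch below; the results feed and the action taken on lock down are
assumptions for the example:

#!/usr/bin/env python3
"""Sketch: decide whether to lock down after consecutive nightly failures.

Assumes nightly results are available as an ordered list of
{"build": ..., "result": "SUCCESS" | "FAILURE"} records, newest first;
in practice these would come from Jenkins rather than the inline sample.
"""


def should_lock_down(nightlies, threshold=2):
    """True if the newest `threshold` nightly runs all failed."""
    recent = nightlies[:threshold]
    return len(recent) == threshold and all(n["result"] == "FAILURE" for n in recent)


if __name__ == "__main__":
    runs = [  # sample data only
        {"build": "nightly-812", "result": "FAILURE"},
        {"build": "nightly-811", "result": "FAILURE"},
        {"build": "nightly-810", "result": "SUCCESS"},
    ]
    if should_lock_down(runs):
        print("Two consecutive nightly failures: close the tree and start bisecting.")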

>  
> 
> 
> Please note, if there are extraneous situations (say reported security
> vulnerabilities that need fixes ASAP) we may need to loosen up the
> stringency, as that would take precedence over the lock down. These
> exceptions though, can be called out and hence treated as such.
> 
> Thoughts?
> 
> 
> This is again my personal opinion. We don't need to merge patches in any
> branch unless we need to make an immediate release with that patch. For
> example if there is a security issue reported we *must* make a release
> with the fix immediately so it makes sense to merge it and do the release.

Agree, keeps the rule simple during lock down and not open to
interpretations.

>  
> 
> 
> Shyam
> 
> PS: Added Yaniv to the CC as he reported the deviance
> 
>  Forwarded Message 
> Subject:        Re: [Gluster-devel] Release 5: Master branch health
> report
> (Week of 30th July)
> Date:   Tue, 7 Aug 2018 23:22:09 +0300
> From:   Yaniv Kaul <yk...@redhat.com>
> To:     Shyam Ranganathan
> CC:     GlusterFS Maintainers, Gluster Devel
> <gluster-de...@gluster.org>, Nigel Babu <nig...@redhat.com>
> 
> 
> 
> 
> 
> On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan  
> >> wrote:
> 
>     On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
>     >     The intention is to stabilize master and not add more patches
>     >     that may destabilize it.
>     >
>     >
>     > https://review.gluster.org/#/c/20603/ has been merged.
>     > As far as I can see, it has nothing to do with stabilization and
>     should
>     > be reverted.
> 
>     Posted this on the gerrit review as well:
> 
>     
>     4.1 does not have nightly tests, those run on master only.
> 
> 
> That should change of course. We cannot strive for stability otherwise,
> AFAIK. 
> 
> 
>     Stability of master does not (will not), in the near term guarantee
>     stability of release branches, unless patches that impact code
> already
>     on release branches, get fixes on master and are back ported.
> 
>     Release branches get fixes back ported (as is normal), this fix
> and its
>     merge should not impact current master stability in any way, and
> neither
>   

Re: [Gluster-Maintainers] Lock down period merge process

2018-08-14 Thread Shyam Ranganathan
On 08/09/2018 12:29 AM, Nigel Babu wrote:
> I would trust tooling that prevents merges rather than good faith. I
> have worked on projects where we trust good faith, but still enforce
> that with tooling[1]. It's highly likely for one or two committers to be
> unaware of an ongoing lock down. As the number of maintainers increases,
> the chances of someone coming back from PTO and accidentally merging
> something are high.

Agree, I would also go with a few having merge rights, to prevent above
cases.

> 
> The extraneous situation exception applies even now. I expect the
> janitors who have commit access in the event of a lock down to use their
> judgment to merge such patches.
> 
> [1]: https://mozilla-releng.net/treestatus
> 
> 
> On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan  > wrote:
> 
> Maintainers,
> 
> The following thread talks about a merge during a merge lockdown, with
> differing view points (this mail is not to discuss the view points).
> 
> The root of the problem is that we leave the current process to good
> faith. If we have a simple rule that we will not merge anything during a
> lock down period, this confusion and any future repetitions of the same
> would not occur.
> 
> I propose that we follow the simpler rule, and would like to hear
> thoughts around this.
> 
> This also means that in the future, we may not need to remove commit
> access for other maintainers, as we do *not* follow a good faith policy,
> and instead depend on being able to revert and announce on the threads
> why we do so.
> 
> Please note, if there are extraneous situations (say reported security
> vulnerabilities that need fixes ASAP) we may need to loosen up the
> stringency, as that would take precedence over the lock down. These
> exceptions though, can be called out and hence treated as such.
> 
> Thoughts?
> 
> Shyam
> 
> PS: Added Yaniv to the CC as he reported the deviance
> 
>  Forwarded Message 
> Subject:        Re: [Gluster-devel] Release 5: Master branch health
> report
> (Week of 30th July)
> Date:   Tue, 7 Aug 2018 23:22:09 +0300
> From:   Yaniv Kaul <yk...@redhat.com>
> To:     Shyam Ranganathan
> CC:     GlusterFS Maintainers, Gluster Devel
> <gluster-de...@gluster.org>, Nigel Babu <nig...@redhat.com>
> 
> 
> 
> 
> 
> On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan  
> >> wrote:
> 
>     On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
>     >     The intention is to stabilize master and not add more patches
>     >     that may destabilize it.
>     >
>     >
>     > https://review.gluster.org/#/c/20603/ has been merged.
>     > As far as I can see, it has nothing to do with stabilization and
>     should
>     > be reverted.
> 
>     Posted this on the gerrit review as well:
> 
>     
>     4.1 does not have nightly tests, those run on master only.
> 
> 
> That should change of course. We cannot strive for stability otherwise,
> AFAIK. 
> 
> 
>     Stability of master does not (will not), in the near term guarantee
>     stability of release branches, unless patches that impact code
> already
>     on release branches, get fixes on master and are back ported.
> 
>     Release branches get fixes back ported (as is normal), this fix
> and its
>     merge should not impact current master stability in any way, and
> neither
>     stability of 4.1 branch.
>     
> 
>     The current hold is on master, not on release branches. I agree that
>     merging further code changes on release branches (for example
> geo-rep
>     issues that are backported (see [1]), as there are tests that fail
>     regularly on master), may further destabilize the release
> branch. This
>     patch is not one of those.
> 
> 
> Two issues I have with the merge:
> 1. It just makes comparing master branch to release branch harder. For
> example, to understand if there's a test that fails on master but
> succeeds on release branch, or vice versa. 
> 2. It means we are not focused on stabilizing master branch.
> Y.
> 
> 
>     Merging patches on release branches are allowed by release
> owners only,
>     and usual practice is keeping the backlog low (merging weekly)
> in these
>     cases as per the dashboard [1].
> 
>     Allowing for the above 2 reasons this patch was found,
>     - Not on master
>     - Not stabilizing or destabilizing the release branch
>     and hence was merged.
> 
>     If maintainers disagree I can revert the same.
> 
>     Shyam
> 
>     [1] Release 4.1 

Re: [Gluster-Maintainers] Lock down period merge process

2018-08-08 Thread Pranith Kumar Karampuri
On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan 
wrote:

> Maintainers,
>
> The following thread talks about a merge during a merge lockdown, with
> differing view points (this mail is not to discuss the view points).
>
> The root of the problem is that we leave the current process to good
> faith. If we have a simple rule that we will not merge anything during a
> lock down period, this confusion and any future repetitions of the same
> would not occur.
>
> I propose that we follow the simpler rule, and would like to hear
> thoughts around this.
>
> This also means that in the future, we may not need to remove commit
> access for other maintainers, as we do *not* follow a good faith policy,
> and instead depend on being able to revert and announce on the threads
> why we do so.
>

I think it is a good opportunity to establish guidelines and process so
that we don't end up in this state in future where one needs to lock down
the branch to make it stable. From that p.o.v. discussion on this thread
about establishing a process for lock down probably doesn't add much value.
My personal opinion for this instance at least is that it is good that it
was locked down. I tend to forget things and not having the access to
commit helped follow the process automatically :-).


>
> Please note, if there are extraneous situations (say reported security
> vulnerabilities that need fixes ASAP) we may need to loosen up the
> stringency, as that would take precedence over the lock down. These
> exceptions though, can be called out and hence treated as such.
>
> Thoughts?
>

This is again my personal opinion. We don't need to merge patches in any
branch unless we need to make an immediate release with that patch. For
example if there is a security issue reported we *must* make a release with
the fix immediately so it makes sense to merge it and do the release.


>
> Shyam
>
> PS: Added Yaniv to the CC as he reported the deviance
>
>  Forwarded Message 
> Subject:Re: [Gluster-devel] Release 5: Master branch health report
> (Week of 30th July)
> Date:   Tue, 7 Aug 2018 23:22:09 +0300
> From:   Yaniv Kaul 
> To: Shyam Ranganathan 
> CC: GlusterFS Maintainers , Gluster Devel
> , Nigel Babu 
>
>
>
>
>
> On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan  > wrote:
>
> On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > The intention is to stabilize master and not add more patches
> > that may destabilize it.
> >
> >
> > https://review.gluster.org/#/c/20603/ has been merged.
> > As far as I can see, it has nothing to do with stabilization and
> should
> > be reverted.
>
> Posted this on the gerrit review as well:
>
> 
> 4.1 does not have nightly tests, those run on master only.
>
>
> That should change of course. We cannot strive for stability otherwise,
> AFAIK.
>
>
> Stability of master does not (will not), in the near term guarantee
> stability of release branches, unless patches that impact code already
> on release branches, get fixes on master and are back ported.
>
> Release branches get fixes back ported (as is normal), this fix and its
> merge should not impact current master stability in any way, and
> neither
> stability of 4.1 branch.
> 
>
> The current hold is on master, not on release branches. I agree that
> merging further code changes on release branches (for example geo-rep
> issues that are backported (see [1]), as there are tests that fail
> regularly on master), may further destabilize the release branch. This
> patch is not one of those.
>
>
> Two issues I have with the merge:
> 1. It just makes comparing master branch to release branch harder. For
> example, to understand if there's a test that fails on master but
> succeeds on release branch, or vice versa.
> 2. It means we are not focused on stabilizing master branch.
> Y.
>
>
> Merging patches on release branches are allowed by release owners only,
> and usual practice is keeping the backlog low (merging weekly) in these
> cases as per the dashboard [1].
>
> Allowing for the above 2 reasons this patch was found,
> - Not on master
> - Not stabilizing or destabilizing the release branch
> and hence was merged.
>
> If maintainers disagree I can revert the same.
>
> Shyam
>
> [1] Release 4.1 dashboard:
>
>
> https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>


-- 
Pranith
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-08-08 Thread Nigel Babu
I would trust tooling that prevents merges rather than good faith. I have
worked on projects where we trust good faith, but still enforce that with
tooling[1]. It's highly likely for one or two committers to be unaware of
an ongoing lock down. As the number of maintainers increases, the chances of
someone coming back from PTO and accidentally merging something are high.

The extraneous situation exception applies even now. I expect the janitors
who have commit access in the event of a lock down to use their judgment to
merge such patches.

[1]: https://mozilla-releng.net/treestatus
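
To make that concrete, a minimal sketch of such a gate (assuming a
treestatus-style service; the endpoint URL and tree name below are
placeholders, not anything we run today) could be a merge-blocking CI check
along these lines:

#!/usr/bin/env python3
"""Sketch of a merge gate that asks a tree-status service before allowing a merge.

Assumes a treestatus-style endpoint that returns JSON such as
{"status": "open" | "closed", "reason": "..."} for a named tree.
The URL and the tree name are placeholders, not existing infrastructure.
"""
import json
import sys
import urllib.request

TREESTATUS_URL = "https://treestatus.example.org/trees/{tree}"  # placeholder


def tree_is_open(tree):
    """Return True if the tree is open for merges, False during a lock down."""
    with urllib.request.urlopen(TREESTATUS_URL.format(tree=tree), timeout=10) as resp:
        state = json.load(resp)
    if state.get("status") != "open":
        print("Tree '%s' is closed: %s" % (tree, state.get("reason", "no reason given")))
        return False
    return True


if __name__ == "__main__":
    # Run as a merge-blocking job: a non-zero exit refuses the merge.
    sys.exit(0 if tree_is_open("glusterfs-master") else 1)

Wired in as a voting job, closing the tree in one place would stop every
committer at once, instead of relying on each of us remembering that a lock
down is in effect.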


On Thu, Aug 9, 2018 at 1:25 AM Shyam Ranganathan 
wrote:

> Maintainers,
>
> The following thread talks about a merge during a merge lockdown, with
> differing view points (this mail is not to discuss the view points).
>
> The root of the problem is that we leave the current process to good
> faith. If we have a simple rule that we will not merge anything during a
> lock down period, this confusion and any future repetitions of the same
> would not occur.
>
> I propose that we follow the simpler rule, and would like to hear
> thoughts around this.
>
> This also means that in the future, we may not need to remove commit
> access for other maintainers, as we do *not* follow a good faith policy,
> and instead depend on being able to revert and announce on the threads
> why we do so.
>
> Please note, if there are extraneous situations (say reported security
> vulnerabilities that need fixes ASAP) we may need to loosen up the
> stringency, as that would take precedence over the lock down. These
> exceptions though, can be called out and hence treated as such.
>
> Thoughts?
>
> Shyam
>
> PS: Added Yaniv to the CC as he reported the deviance
>
>  Forwarded Message 
> Subject:Re: [Gluster-devel] Release 5: Master branch health report
> (Week of 30th July)
> Date:   Tue, 7 Aug 2018 23:22:09 +0300
> From:   Yaniv Kaul 
> To: Shyam Ranganathan 
> CC: GlusterFS Maintainers , Gluster Devel
> , Nigel Babu 
>
>
>
>
>
> On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan  > wrote:
>
> On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > The intention is to stabilize master and not add more patches
> > that may destabilize it.
> >
> >
> > https://review.gluster.org/#/c/20603/ has been merged.
> > As far as I can see, it has nothing to do with stabilization and
> should
> > be reverted.
>
> Posted this on the gerrit review as well:
>
> 
> 4.1 does not have nightly tests, those run on master only.
>
>
> That should change of course. We cannot strive for stability otherwise,
> AFAIK.
>
>
> Stability of master does not (will not), in the near term guarantee
> stability of release branches, unless patches that impact code already
> on release branches, get fixes on master and are back ported.
>
> Release branches get fixes back ported (as is normal), this fix and its
> merge should not impact current master stability in any way, and
> neither
> stability of 4.1 branch.
> 
>
> The current hold is on master, not on release branches. I agree that
> merging further code changes on release branches (for example geo-rep
> issues that are backported (see [1]), as there are tests that fail
> regularly on master), may further destabilize the release branch. This
> patch is not one of those.
>
>
> Two issues I have with the merge:
> 1. It just makes comparing master branch to release branch harder. For
> example, to understand if there's a test that fails on master but
> succeeds on release branch, or vice versa.
> 2. It means we are not focused on stabilizing master branch.
> Y.
>
>
> Merging patches on release branches are allowed by release owners only,
> and usual practice is keeping the backlog low (merging weekly) in these
> cases as per the dashboard [1].
>
> Allowing for the above 2 reasons this patch was found,
> - Not on master
> - Not stabilizing or destabilizing the release branch
> and hence was merged.
>
> If maintainers disagree I can revert the same.
>
> Shyam
>
> [1] Release 4.1 dashboard:
>
>
> https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>


-- 
nigelb
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] Lock down period merge process

2018-08-08 Thread Shyam Ranganathan
Maintainers,

The following thread talks about a merge during a merge lockdown, with
differing view points (this mail is not to discuss the view points).

The root of the problem is that we leave the current process to good
faith. If we have a simple rule that we will not merge anything during a
lock down period, this confusion and any future repetitions of the same
would not occur.

I propose that we follow the simpler rule, and would like to hear
thoughts around this.

This also means that in the future, we may not need to remove commit
access for other maintainers, as we do *not* follow a good faith policy,
and instead depend on being able to revert and announce on the threads
why we do so.

Please note, if there are extraneous situations (say reported security
vulnerabilities that need fixes ASAP) we may need to loosen up the
stringency, as that would take precedence over the lock down. These
exceptions though, can be called out and hence treated as such.

Thoughts?

Shyam

PS: Added Yaniv to the CC as he reported the deviance

 Forwarded Message 
Subject:Re: [Gluster-devel] Release 5: Master branch health report
(Week of 30th July)
Date:   Tue, 7 Aug 2018 23:22:09 +0300
From:   Yaniv Kaul 
To: Shyam Ranganathan 
CC: GlusterFS Maintainers , Gluster Devel
, Nigel Babu 





On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan <srang...@redhat.com> wrote:

On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
>     The intention is to stabilize master and not add more patches
>     that may destabilize it.
>
>
> https://review.gluster.org/#/c/20603/ has been merged.
> As far as I can see, it has nothing to do with stabilization and
should
> be reverted.

Posted this on the gerrit review as well:


4.1 does not have nightly tests, those run on master only.


That should change of course. We cannot strive for stability otherwise,
AFAIK. 


Stability of master does not (will not), in the near term guarantee
stability of release branches, unless patches that impact code already
on release branches, get fixes on master and are back ported.

Release branches get fixes back ported (as is normal), this fix and its
merge should not impact current master stability in any way, and neither
stability of 4.1 branch.


The current hold is on master, not on release branches. I agree that
merging further code changes on release branches (for example geo-rep
issues that are backported (see [1]), as there are tests that fail
regularly on master), may further destabilize the release branch. This
patch is not one of those.


Two issues I have with the merge:
1. It just makes comparing master branch to release branch harder. For
example, to understand if there's a test that fails on master but
succeeds on release branch, or vice versa. 
2. It means we are not focused on stabilizing master branch.
Y.


Merging patches on release branches are allowed by release owners only,
and usual practice is keeping the backlog low (merging weekly) in these
cases as per the dashboard [1].

Allowing for the above 2 reasons this patch was found,
- Not on master
- Not stabilizing or destabilizing the release branch
and hence was merged.

If maintainers disagree I can revert the same.

Shyam

[1] Release 4.1 dashboard:

https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard

___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers