[openstack-dev] [nova] libvirt version_cap, a postmortem

2014-08-30 Thread Mark McLoughlin

Hey

The libvirt version_cap debacle continues to come up in conversation and
one perception of the whole thing appears to be:

  A controversial patch was "ninjaed" by three Red Hat nova-cores and 
  then the same individuals piled on with -2s when a revert was proposed
  to allow further discussion.

I hope it's clear to everyone why that's a pretty painful thing to hear.
However, I do see that I didn't behave perfectly here. I apologize for
that.

In order to understand where this perception came from, I've gone back
over the discussions spread across gerrit and the mailing list in order
to piece together a precise timeline. I've appended that below.

Some conclusions I draw from that tedious exercise:

 - Some people came at this from the perspective that we already have 
   a firm, unwritten policy that all code must have functional written 
   tests. Others see that "test all the things" is interpreted as a
   worthy aspiration, but is only one of a number of nuanced factors
   that need to be taken into account when considering the addition of
   a new feature.

   i.e. the former camp saw Dan Smith's devref addition as attempting 
   to document an existing policy (perhaps even a more forgiving 
   version of an existing policy), whereas others see it as a dramatic 
   shift to a draconian implementation of "test all the things".

 - Dan Berrange, Russell and I didn't feel like we were "ninjaing a
   controversial patch" - you can see our perspective expressed in 
   multiple places. The patch would have helped the "live snapshot" 
   issue, and has other useful applications. It does not affect the 
   broader testing debate.

   Johannes was a solitary voice expressing concerns with the patch, 
   and you could see that Dan was particularly engaged in trying to 
   address those concerns and repeating his feeling that the patch was 
   orthogonal to the testing debate.

   That all being said - the patch did merge too quickly.

 - What exacerbates the situation - particularly when people attempt to 
   look back at what happened - is how spread out our conversations 
   are. You look at the version_cap review and don't see any of the 
   related discussions on the devref policy review nor the mailing list 
   threads. Our disjoint methods of communicating contribute to 
   misunderstandings.

 - When it came to the revert, a couple of things resulted in 
   misunderstandings, hurt feelings and frayed tempers - (a) that our 
   "retrospective veto revert policy" wasn't well understood and (b) 
   a feeling that there was private, in-person grumbling about us at 
   the mid-cycle while we were absent, with no attempt to talk to us 
   directly.


To take an even further step back - successful communities like ours
require a huge amount of trust between the participants. Trust requires
communication and empathy. If communication breaks down and the pressure
we're all under erodes our empathy for each others' positions, then
situations can easily get horribly out of control.

This isn't a pleasant situation and we should all strive for better.
However, I tend to measure our "flamewars" against this:

  https://mail.gnome.org/archives/gnome-2-0-list/2001-June/msg00132.html

GNOME in June 2001 was my introduction to full-time open-source
development, so this episode sticks out in my mind. The two individuals
in that email were/are immensely capable and reasonable people, yet ...

So far, we're doing pretty okay compared to that and many other
open-source flamewars. Let's make sure we continue that way by avoiding
letting situations fester.


Thanks, and sorry for being a windbag,
Mark.

---

= July 1 =

The starting point is this review:

   https://review.openstack.org/103923

Dan Smith proposes a policy that the libvirt driver may not use libvirt
features until they have been available in Ubuntu or Fedora for at least
30 days.

The commit message mentions:

  "broken us in the past when we add a new feature that requires a newer
   libvirt than we test with, and we discover that it's totally broken
   when we upgrade in the gate."

which AIUI is a reference to the libvirt "live snapshot" issue the
previous week, which is described here:

  https://review.openstack.org/102643

where upgrading to Ubuntu Trusty meant the libvirt version in use in the
gate went from 0.9.8 to 1.2.2, which caused the "live snapshot" code
paths in Nova to be exercised for the first time, which appeared to be
related to some serious gate instability (although the exact root cause
wasn't identified).

Some background on the libvirt version upgrade can be seen here:

  
  http://lists.openstack.org/pipermail/openstack-dev/2014-March/thread.html#30284

= July 1 - July 8 =

Back and forth debate mostly between Dan Smith and Dan Berrange. Sean
votes +2, Dan Berrange votes -2.

= July 14 =

Russell adds his support to Dan Berrange's position, votes -2. Some
debate between Dan and Dan continues. Joe Gordon votes +2. Matt
Riedemann expresses support-

Re: [openstack-dev] [nova] libvirt version_cap, a postmortem

2014-08-31 Thread Gary Kotton
Hi,
Very nice write-up. I take my hat off to you for taking the time to do a
postmortem. As a community we really need to work on how we communicate
with one another. At the end of the day we all share the common goal of
the project's success.
At times I feel like things are done very quickly without any discussion
at all, for example the reverting of patches. I can understand when the
gate is broken that this is a must, but in other cases I think that we
need to discuss things in order to build trust and for people in the
community to learn what they may or may not have done correctly.
Thanks
Gary


Re: [openstack-dev] [nova] libvirt version_cap, a postmortem

2014-08-31 Thread Dean Troyer
On Sat, Aug 30, 2014 at 11:08 AM, Mark McLoughlin wrote:

> In order to understand where this perception came from, I've gone back
> over the discussions spread across gerrit and the mailing list in order
> to piece together a precise timeline. I've appended that below.
>

Thanks for doing this, Mark. As someone who only saw the edges of the
discussion, I found that this clarifies the goals of what was going on.

The key thing that jumps out at me isn't in the details of these events; it
is something you mentioned about the diverse communications paths.  Often
we don't see the whole conversation even when it is available via review
comments or ML archives.

Without straying too far off here, I'm sifting through some thoughts about
how we might be able to help link these discussions together without adding
too much (or any!) overhead to them, something that would be available
_during_ the discussion to pull it all together.  I'll follow up after some
Sunday afternoon right-brain processing while I'm at the park.

Film at eleven...
dt

-- 

Dean Troyer
dtro...@gmail.com
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] libvirt version_cap, a postmortem

2014-09-01 Thread Kashyap Chamarthy
On Sat, Aug 30, 2014 at 05:08:16PM +0100, Mark McLoughlin wrote:
> 
> Hey
> 
> The libvirt version_cap debacle continues to come up in conversation and
> one perception of the whole thing appears to be:
> 
>   A controversial patch was "ninjaed" by three Red Hat nova-cores and 
>   then the same individuals piled on with -2s when a revert was proposed
>   to allow further discussion.

As someone who tried to be a little helper in troubleshooting this "live
snapshot" bug (mentioned below) when it surfaced, I've been following
this discussion and the ensuing threads about version_cap closely.

And, FWIW, not for a moment did I feel there was any such intention of
"ninjaing" on the part of the said folks; I felt (and still do) that it
was done in 'good technical faith'. Maybe I'm biased by having observed
the work and integrity of these people in different open source
communities over the years.

Thanks for taking time to do this write-up, Mark.

PS: Since my email says @redhat.com, I hope people reading this thread
won't misinterpret this comment.

> I hope it's clear to everyone why that's a pretty painful thing to hear.
> However, I do see that I didn't behave perfectly here. I apologize for
> that.
> 
> In order to understand where this perception came from, I've gone back
> over the discussions spread across gerrit and the mailing list in order
> to piece together a precise timeline. I've appended that below.
> 
> Some conclusions I draw from that tedious exercise:
> 
>  - Some people came at this from the perspective that we already have
>    a firm, unwritten policy that all code must have functional written
>    tests. Others see that "test all the things" is interpreted as a
>    worthy aspiration, but is only one of a number of nuanced factors
>    that need to be taken into account when considering the addition of
>    a new feature.
>
>    i.e. the former camp saw Dan Smith's devref addition as attempting
>    to document an existing policy (perhaps even a more forgiving
>    version of an existing policy), whereas others see it as a dramatic
>    shift to a draconian implementation of "test all the things".
>
>  - Dan Berrange, Russell and I didn't feel like we were "ninjaing a
>    controversial patch" - you can see our perspective expressed in
>    multiple places. The patch would have helped the "live snapshot"
>    issue, and has other useful applications. It does not affect the
>    broader testing debate.
>
>    Johannes was a solitary voice expressing concerns with the patch,
>    and you could see that Dan was particularly engaged in trying to
>    address those concerns and repeating his feeling that the patch was
>    orthogonal to the testing debate.
>
>    That all being said - the patch did merge too quickly.
>
>  - What exacerbates the situation - particularly when people attempt to
>    look back at what happened - is how spread out our conversations
>    are. You look at the version_cap review and don't see any of the
>    related discussions on the devref policy review nor the mailing list
>    threads. Our disjoint methods of communicating contribute to
>    misunderstandings.
>
>  - When it came to the revert, a couple of things resulted in
>    misunderstandings, hurt feelings and frayed tempers - (a) that our
>    "retrospective veto revert policy" wasn't well understood and (b)
>    a feeling that there was private, in-person grumbling about us at
>    the mid-cycle while we were absent, with no attempt to talk to us
>    directly.
> 
> 
> To take an even further step back - successful communities like ours
> require a huge amount of trust between the participants. Trust requires
> communication and empathy. If communication breaks down and the pressure
> we're all under erodes our empathy for each others' positions, then
> situations can easily get horribly out of control.

--
/kashyap



Re: [openstack-dev] [nova] libvirt version_cap, a postmortem

2014-09-03 Thread Joe Gordon
On Sat, Aug 30, 2014 at 9:08 AM, Mark McLoughlin wrote:

>
> Hey
>
> The libvirt version_cap debacle continues to come up in conversation and
> one perception of the whole thing appears to be:
>
>   A controversial patch was "ninjaed" by three Red Hat nova-cores and
>   then the same individuals piled on with -2s when a revert was proposed
>   to allow further discussion.
>
> I hope it's clear to everyone why that's a pretty painful thing to hear.
> However, I do see that I didn't behave perfectly here. I apologize for
> that.
>
> In order to understand where this perception came from, I've gone back
> over the discussions spread across gerrit and the mailing list in order
> to piece together a precise timeline. I've appended that below.
>
> Some conclusions I draw from that tedious exercise:
>

Thank you for going through and doing this.


>
>  - Some people came at this from the perspective that we already have
>    a firm, unwritten policy that all code must have functional written
>    tests. Others see that "test all the things" is interpreted as a
>    worthy aspiration, but is only one of a number of nuanced factors
>    that need to be taken into account when considering the addition of
>    a new feature.
>

Confusion over our testing policy sounds like the crux of one of the issues
here. Having so many unwritten policies has led to confusion in the past,
which is why I started
http://docs.openstack.org/developer/nova/devref/policies.html. Hopefully, by
writing these things down, this sort of confusion will arise less often in
the future.

Until this whole debacle I didn't even know there was a dissenting opinion
on what our testing policy is. In every conversation I have seen up until
this point, the question was always how to raise the bar on testing.  I
don't expect us to be able to get to the bottom of this issue in a ML
thread, but hopefully we can begin the testing policy conversation here so
that we may be able to make a breakthrough at the summit.



>
>    i.e. the former camp saw Dan Smith's devref addition as attempting
>    to document an existing policy (perhaps even a more forgiving
>    version of an existing policy), whereas others see it as a dramatic
>    shift to a draconian implementation of "test all the things".
>
>  - Dan Berrange, Russell and I didn't feel like we were "ninjaing a
>    controversial patch" - you can see our perspective expressed in
>    multiple places. The patch would have helped the "live snapshot"
>    issue, and has other useful applications. It does not affect the
>    broader testing debate.
>
>    Johannes was a solitary voice expressing concerns with the patch,
>    and you could see that Dan was particularly engaged in trying to
>    address those concerns and repeating his feeling that the patch was
>    orthogonal to the testing debate.
>
>    That all being said - the patch did merge too quickly.
>
>  - What exacerbates the situation - particularly when people attempt to
>    look back at what happened - is how spread out our conversations
>    are. You look at the version_cap review and don't see any of the
>    related discussions on the devref policy review nor the mailing list
>    threads. Our disjoint methods of communicating contribute to
>    misunderstandings.
>
>  - When it came to the revert, a couple of things resulted in
>    misunderstandings, hurt feelings and frayed tempers - (a) that our
>    "retrospective veto revert policy" wasn't well understood and (b)
>    a feeling that there was private, in-person grumbling about us at
>    the mid-cycle while we were absent, with no attempt to talk to us
>    directly.
>

While I cannot speak for anyone else, I did grumble a bit at the mid-cycle
about the behavior on Dan's first devref patch,
https://review.openstack.org/#/c/103923/. This was the first time I saw 3
'-2's on a single patch revision. To me 1 or 2 '-2's gives the perception
of 'hold on there, let's discuss this more first,' but 3 '-2's is just
piling on and is very confrontational in nature. I was taken aback by this
behavior and still don't know what to say or even if my reaction is
justified.



Re: [openstack-dev] [nova] libvirt version_cap, a postmortem

2014-09-05 Thread John Garbutt
On 3 September 2014 21:57, Joe Gordon wrote:
> On Sat, Aug 30, 2014 at 9:08 AM, Mark McLoughlin wrote:
>> Hey
>>
>> The libvirt version_cap debacle continues to come up in conversation and
>> one perception of the whole thing appears to be:
>>
>>   A controversial patch was "ninjaed" by three Red Hat nova-cores and
>>   then the same individuals piled on with -2s when a revert was proposed
>>   to allow further discussion.
>>
>> I hope it's clear to everyone why that's a pretty painful thing to hear.
>> However, I do see that I didn't behave perfectly here. I apologize for
>> that.
>>
>> In order to understand where this perception came from, I've gone back
>> over the discussions spread across gerrit and the mailing list in order
>> to piece together a precise timeline. I've appended that below.
>>
>> Some conclusions I draw from that tedious exercise:
>
> Thank you for going through and doing this.

+1

>>  - Some people came at this from the perspective that we already have
>>    a firm, unwritten policy that all code must have functional written
>>    tests. Others see that "test all the things" is interpreted as a
>>    worthy aspiration, but is only one of a number of nuanced factors
>>    that need to be taken into account when considering the addition of
>>    a new feature.
>
> Confusion over our testing policy sounds like the crux of one of the issues
> here. Having so many unwritten policies has led to confusion in the past,
> which is why I started
> http://docs.openstack.org/developer/nova/devref/policies.html. Hopefully, by
> writing these things down, this sort of confusion will arise less often in
> the future.
>
> Until this whole debacle I didn't even know there was a dissenting opinion
> on what our testing policy is. In every conversation I have seen up until
> this point, the question was always how to raise the bar on testing.  I
> don't expect us to be able to get to the bottom of this issue in a ML
> thread, but hopefully we can begin the testing policy conversation here so
> that we may be able to make a breakthrough at the summit.

+1

I certainly feel that we need a test policy we are all happy to
enforce. I am sure we can resolve this. I have some ideas, but I feel
like we should meet in person to discuss this one. I am really bad at
trying to discuss this kind of thing in text form.

> While I cannot speak for anyone else, I did grumble a bit at the mid-cycle
> about the behavior on Dan's first devref patch,
> https://review.openstack.org/#/c/103923/. This was the first time I saw 3
> '-2's on a single patch revision. To me 1 or 2 '-2's gives the perception of
> 'hold on there, let's discuss this more first,' but 3 '-2's is just piling on
> and is very confrontational in nature. I was taken aback by this behavior
> and still don't know what to say or even if my reaction is justified.

People were angry; this highlighted that disagreement.
That led to us trying to resolve the immediate point of conflict.
It would be worse if there had been no communication.

>> To take an even further step back - successful communities like ours
>> require a huge amount of trust between the participants. Trust requires
>> communication and empathy. If communication breaks down and the pressure
>> we're all under erodes our empathy for each others' positions, then
>> situations can easily get horribly out of control.
>>
>> This isn't a pleasant situation and we should all strive for better.

+1

I think we have now identified where we don't agree.

Looking forward to resolving this, in person, at the summit.

Thanks,
John
