Re: [openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-05 Thread Armando M.
On 3 February 2016 at 18:49, Armando M.  wrote:

>
>
> On 3 February 2016 at 04:28, Sean Dague  wrote:
>
>> On 02/02/2016 10:03 PM, Matthew Treinish wrote:
>> > On Tue, Feb 02, 2016 at 05:09:47PM -0800, Armando M. wrote:
>> >> Folks,
>> >>
>> >> We have some IPv6 related bugs [1,2,3] that have been lingering for some
>> >> time now. They have been hurting the gate (e.g. [4], the most recent
>> >> offending failure) and since it looks like they have been without owners
>> >> or a plan of action for some time, I made the hard decision of skipping
>> >> the affected tests [5] before the busy times ahead.
>> >
>> > So TBH I don't think the failure rate for these tests is really at a point
>> > necessitating a skip:
>> >
>> > http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
>> > http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
>> > http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
>> >
>> > (also just a cool side-note, you can see the very obvious performance regression
>> > caused by the keystonemiddleware release and when we excluded that version in
>> > requirements)
>> >
>> > Well, test_dualnet_dhcp6_stateless_from_os is kinda there with a ~10% failure
>> > rate, but the other 2 really aren't. I normally would be -1 on the skip patch
>> > because of that. We try to save the skips for cases where the bugs are really
>> > severe and preventing productivity at a large scale.
>> >
>> > But, in this case these IPv6 tests are kind of out of place in tempest. Having
>> > all the permutations of possible IP allocation configurations always seemed a
>> > bit too heavy handed. These tests are also consistently in the top 10 slowest
>> > for a run. We really should have trimmed down this set a while ago so we
>> > only have a single case in tempest. Neutron should own the other possible
>> > configurations as an in-tree test.
>> >
>> > Brian Haley has a patch up from Dec. that was trying to clean it up:
>> >
>> > https://review.openstack.org/#/c/239868/
>> >
>> > We probably should revisit that soon, since quite clearly no one is looking at
>> > these right now.
>>
>> We definitely shouldn't be running all the IPv6 tests.
>>
>> But I also think a low failure rate is not a valid reason to keep a test.
>> Unreliable tests that don't have anyone looking into them should be
>> deleted. They provide negative value, because people just recheck past
>> them even if their code made the race worse, so any legitimate issues
>> they expose are ignored.
>>
>> If the neutron PTL wants tests pulled, we should just do it.
>>
>>
> Thanks for the support! Having said that, I think it's important to make a
> judgement call on a case by case basis, because removing tests blindly
> might well backfire.
>
> In this specific instance and all things considered, merging [2] (or even
> better [1]) feels like a net gain.
>
> Cheers,
> Armando
>
> [1] https://review.openstack.org/#/c/239868/
> [2] https://review.openstack.org/#/c/275457/
>
>

Btw I did respin [1], because I am still seeing intermittent failures.

[1] https://review.openstack.org/#/c/275457/

>> -Sean
>>
>> --
>> Sean Dague
>> http://dague.net
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>


Re: [openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-03 Thread Armando M.
On 3 February 2016 at 04:28, Sean Dague  wrote:

> On 02/02/2016 10:03 PM, Matthew Treinish wrote:
> > On Tue, Feb 02, 2016 at 05:09:47PM -0800, Armando M. wrote:
> >> Folks,
> >>
> >> We have some IPv6 related bugs [1,2,3] that have been lingering for some
> >> time now. They have been hurting the gate (e.g. [4], the most recent
> >> offending failure) and since it looks like they have been without owners
> >> or a plan of action for some time, I made the hard decision of skipping
> >> the affected tests [5] before the busy times ahead.
> >
> > So TBH I don't think the failure rate for these tests is really at a point
> > necessitating a skip:
> >
> > http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
> > http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
> > http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
> >
> > (also just a cool side-note, you can see the very obvious performance regression
> > caused by the keystonemiddleware release and when we excluded that version in
> > requirements)
> >
> > Well, test_dualnet_dhcp6_stateless_from_os is kinda there with a ~10% failure
> > rate, but the other 2 really aren't. I normally would be -1 on the skip patch
> > because of that. We try to save the skips for cases where the bugs are really
> > severe and preventing productivity at a large scale.
> >
> > But, in this case these IPv6 tests are kind of out of place in tempest. Having
> > all the permutations of possible IP allocation configurations always seemed a
> > bit too heavy handed. These tests are also consistently in the top 10 slowest
> > for a run. We really should have trimmed down this set a while ago so we
> > only have a single case in tempest. Neutron should own the other possible
> > configurations as an in-tree test.
> >
> > Brian Haley has a patch up from Dec. that was trying to clean it up:
> >
> > https://review.openstack.org/#/c/239868/
> >
> > We probably should revisit that soon, since quite clearly no one is looking at
> > these right now.
>
> We definitely shouldn't be running all the IPv6 tests.
>
> But I also think a low failure rate is not a valid reason to keep a test.
> Unreliable tests that don't have anyone looking into them should be
> deleted. They provide negative value, because people just recheck past
> them even if their code made the race worse, so any legitimate issues
> they expose are ignored.
>
> If the neutron PTL wants tests pulled, we should just do it.
>
>
Thanks for the support! Having said that, I think it's important to make a
judgement call on a case by case basis, because removing tests blindly
might well backfire.

In this specific instance and all things considered, merging [2] (or even
better [1]) feels like a net gain.

Cheers,
Armando

[1] https://review.openstack.org/#/c/239868/
[2] https://review.openstack.org/#/c/275457/


> -Sean
>
> --
> Sean Dague
> http://dague.net
>


Re: [openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-03 Thread Sean Dague
On 02/02/2016 10:03 PM, Matthew Treinish wrote:
> On Tue, Feb 02, 2016 at 05:09:47PM -0800, Armando M. wrote:
>> Folks,
>>
>> We have some IPv6 related bugs [1,2,3] that have been lingering for some
>> time now. They have been hurting the gate (e.g. [4], the most recent
>> offending failure) and since it looks like they have been without owners
>> or a plan of action for some time, I made the hard decision of skipping
>> the affected tests [5] before the busy times ahead.
>
> So TBH I don't think the failure rate for these tests is really at a point
> necessitating a skip:
>
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
>
> (also just a cool side-note, you can see the very obvious performance regression
> caused by the keystonemiddleware release and when we excluded that version in
> requirements)
>
> Well, test_dualnet_dhcp6_stateless_from_os is kinda there with a ~10% failure
> rate, but the other 2 really aren't. I normally would be -1 on the skip patch
> because of that. We try to save the skips for cases where the bugs are really
> severe and preventing productivity at a large scale.
>
> But, in this case these IPv6 tests are kind of out of place in tempest. Having
> all the permutations of possible IP allocation configurations always seemed a
> bit too heavy handed. These tests are also consistently in the top 10 slowest
> for a run. We really should have trimmed down this set a while ago so we
> only have a single case in tempest. Neutron should own the other possible
> configurations as an in-tree test.
>
> Brian Haley has a patch up from Dec. that was trying to clean it up:
>
> https://review.openstack.org/#/c/239868/
>
> We probably should revisit that soon, since quite clearly no one is looking at
> these right now.

We definitely shouldn't be running all the IPv6 tests.

But I also think a low failure rate is not a valid reason to keep a test.
Unreliable tests that don't have anyone looking into them should be
deleted. They provide negative value, because people just recheck past
them even if their code made the race worse, so any legitimate issues
they expose are ignored.
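
To put rough numbers on the recheck point, here is a small sketch. It assumes
each run fails independently with a fixed probability, which is a
simplification, and it borrows the ~10% figure quoted earlier in the thread
purely as an example input. The function names are made up for illustration.

```python
def pass_within(attempts, failure_rate):
    """Probability that at least one of `attempts` independent runs passes."""
    return 1.0 - failure_rate ** attempts


def expected_runs(failure_rate):
    """Average number of runs until the first pass (geometric distribution)."""
    return 1.0 / (1.0 - failure_rate)


# Example input only: the ~10% failure rate mentioned in the thread.
rate = 0.10
two_tries = pass_within(2, rate)   # chance of getting through with one recheck
avg_runs = expected_runs(rate)     # average runs spent per change
```

Under those assumptions a single recheck is almost always enough to get a
change through, which is why an intermittent failure is easy to ignore.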

If the neutron PTL wants tests pulled, we should just do it.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-02 Thread Armando M.
On 2 February 2016 at 19:03, Matthew Treinish  wrote:

> On Tue, Feb 02, 2016 at 05:09:47PM -0800, Armando M. wrote:
> > Folks,
> >
> > We have some IPv6 related bugs [1,2,3] that have been lingering for some
> > time now. They have been hurting the gate (e.g. [4], the most recent
> > offending failure) and since it looks like they have been without owners
> > or a plan of action for some time, I made the hard decision of skipping
> > the affected tests [5] before the busy times ahead.
>
> So TBH I don't think the failure rate for these tests is really at a point
> necessitating a skip:
>
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
>
> (also just a cool side-note, you can see the very obvious performance regression
> caused by the keystonemiddleware release and when we excluded that version in
> requirements)
>
> Well, test_dualnet_dhcp6_stateless_from_os is kinda there with a ~10% failure
> rate, but the other 2 really aren't. I normally would be -1 on the skip patch
> because of that. We try to save the skips for cases where the bugs are really
> severe and preventing productivity at a large scale.
>

I am being overly aggressive here, just because I am conscious of the time
of the year :)


>
> But, in this case these IPv6 tests are kind of out of place in tempest. Having
> all the permutations of possible IP allocation configurations always seemed a
> bit too heavy handed. These tests are also consistently in the top 10 slowest
> for a run. We really should have trimmed down this set a while ago so we
> only have a single case in tempest. Neutron should own the other possible
> configurations as an in-tree test.
>

+1


>
> Brian Haley has a patch up from Dec. that was trying to clean it up:
>
> https://review.openstack.org/#/c/239868/
>
> We probably should revisit that soon, since quite clearly no one is looking at
> these right now.
>
>
I thought that had merged already...my memory doesn't serve me as it used
to anymore :(


>
> -Matt Treinish
>
>
> >
> > Now one might argue that skipping them is counterproductive because it may
> > allow other regressions to sneak in, but I am hoping that this
> > controversial action will indeed smoke out the right folks.
> >
> > Comments welcome.
> >
> > Regards,
> > Armando
> >
> > [1] https://bugs.launchpad.net/neutron/+bug/1477192
> > [2] https://bugs.launchpad.net/neutron/+bug/1509004
> > [3] https://bugs.launchpad.net/openstack-gate/+bug/1540983
> > [4]
> >
> http://logs.openstack.org/37/264937/5/gate/gate-tempest-dsvm-neutron-full/afeaabd//logs/testr_results.html.gz
> > [5] https://review.openstack.org/#/c/275457/
>
>


Re: [openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-02 Thread Brian Haley

On 02/02/2016 10:03 PM, Matthew Treinish wrote:

> On Tue, Feb 02, 2016 at 05:09:47PM -0800, Armando M. wrote:
>> Folks,
>>
>> We have some IPv6 related bugs [1,2,3] that have been lingering for some
>> time now. They have been hurting the gate (e.g. [4], the most recent
>> offending failure) and since it looks like they have been without owners
>> or a plan of action for some time, I made the hard decision of skipping
>> the affected tests [5] before the busy times ahead.
>
> So TBH I don't think the failure rate for these tests is really at a point
> necessitating a skip:
>
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
> http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
>
> (also just a cool side-note, you can see the very obvious performance regression
> caused by the keystonemiddleware release and when we excluded that version in
> requirements)
>
> Well, test_dualnet_dhcp6_stateless_from_os is kinda there with a ~10% failure
> rate, but the other 2 really aren't. I normally would be -1 on the skip patch
> because of that. We try to save the skips for cases where the bugs are really
> severe and preventing productivity at a large scale.
>
> But, in this case these IPv6 tests are kind of out of place in tempest. Having
> all the permutations of possible IP allocation configurations always seemed a
> bit too heavy handed. These tests are also consistently in the top 10 slowest
> for a run. We really should have trimmed down this set a while ago so we
> only have a single case in tempest. Neutron should own the other possible
> configurations as an in-tree test.
>
> Brian Haley has a patch up from Dec. that was trying to clean it up:
>
> https://review.openstack.org/#/c/239868/


I just updated that to mark six of the eight tests as "slow" per your previous
comment, so that only the dual-NIC/dual-stack tests run in the gate; the
others will run in the periodic nightly job:


http://status.openstack.org/openstack-health/#/job/periodic-tempest-dsvm-all-master

That should help lessen the impact until we can determine whether it's the test or Neutron.

-Brian
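
As a rough illustration of the "slow" split described above, here is a
self-contained sketch using only the standard library. It is not Tempest's
actual mechanism: the `attr` decorator and `select_tests` helper are
hypothetical stand-ins, the test bodies are placeholders, and which test
carries the tag here is an assumption made only for the example.

```python
import unittest


def attr(**tags):
    """Hypothetical stand-in for a test-attribute decorator: store tags on the test."""
    def decorator(func):
        func.tags = dict(getattr(func, "tags", {}), **tags)
        return func
    return decorator


class TestGettingAddress(unittest.TestCase):
    # Placeholder bodies; the real scenarios boot servers and check addresses.

    @attr(type="slow")  # assumed for illustration: left to the periodic job
    def test_dhcp6_stateless_from_os(self):
        self.assertTrue(True)

    def test_dualnet_dhcp6_stateless_from_os(self):  # untagged: runs in the gate
        self.assertTrue(True)


def select_tests(case, include_slow):
    """Return the test names a job would run, skipping "slow" ones if asked."""
    selected = []
    for name in unittest.TestLoader().getTestCaseNames(case):
        tags = getattr(getattr(case, name), "tags", {})
        if include_slow or tags.get("type") != "slow":
            selected.append(name)
    return selected


gate_tests = select_tests(TestGettingAddress, include_slow=False)
periodic_tests = select_tests(TestGettingAddress, include_slow=True)
```

The point of tagging rather than skipping is that the tests keep running
somewhere, so their results stay available while the root cause is
investigated.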


> We probably should revisit that soon, since quite clearly no one is looking at
> these right now.
>
>
> -Matt Treinish
>
>>
>> Now one might argue that skipping them is counterproductive because it may
>> allow other regressions to sneak in, but I am hoping that this
>> controversial action will indeed smoke out the right folks.
>>
>> Comments welcome.
>>
>> Regards,
>> Armando
>>
>> [1] https://bugs.launchpad.net/neutron/+bug/1477192
>> [2] https://bugs.launchpad.net/neutron/+bug/1509004
>> [3] https://bugs.launchpad.net/openstack-gate/+bug/1540983
>> [4]
>> http://logs.openstack.org/37/264937/5/gate/gate-tempest-dsvm-neutron-full/afeaabd//logs/testr_results.html.gz
>> [5] https://review.openstack.org/#/c/275457/






Re: [openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-02 Thread Matthew Treinish
On Tue, Feb 02, 2016 at 05:09:47PM -0800, Armando M. wrote:
> Folks,
> 
> We have some IPv6 related bugs [1,2,3] that have been lingering for some
> time now. They have been hurting the gate (e.g. [4], the most recent
> offending failure) and since it looks like they have been without owners
> or a plan of action for some time, I made the hard decision of skipping
> the affected tests [5] before the busy times ahead.

So TBH I don't think the failure rate for these tests is really at a point
necessitating a skip:

http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os

(also just a cool side-note, you can see the very obvious performance regression
caused by the keystonemiddleware release and when we excluded that version in
requirements)

Well, test_dualnet_dhcp6_stateless_from_os is kinda there with a ~10% failure
rate, but the other 2 really aren't. I normally would be -1 on the skip patch
because of that. We try to save the skips for cases where the bugs are really
severe and preventing productivity at a large scale. 

But, in this case these IPv6 tests are kind of out of place in tempest. Having
all the permutations of possible IP allocation configurations always seemed a
bit too heavy handed. These tests are also consistently in the top 10 slowest
for a run. We really should have trimmed down this set a while ago so we
only have a single case in tempest. Neutron should own the other possible
configurations as an in-tree test.

Brian Haley has a patch up from Dec. that was trying to clean it up:

https://review.openstack.org/#/c/239868/

We probably should revisit that soon, since quite clearly no one is looking at
these right now.


-Matt Treinish


> 
> Now one might argue that skipping them is counterproductive because it may
> allow other regressions to sneak in, but I am hoping that this
> controversial action will indeed smoke out the right folks.
> 
> Comments welcome.
> 
> Regards,
> Armando
> 
> [1] https://bugs.launchpad.net/neutron/+bug/1477192
> [2] https://bugs.launchpad.net/neutron/+bug/1509004
> [3] https://bugs.launchpad.net/openstack-gate/+bug/1540983
> [4]
> http://logs.openstack.org/37/264937/5/gate/gate-tempest-dsvm-neutron-full/afeaabd//logs/testr_results.html.gz
> [5] https://review.openstack.org/#/c/275457/





[openstack-dev] [QA][Neutron] IPv6 related intermittent test failures

2016-02-02 Thread Armando M.
Folks,

We have some IPv6 related bugs [1,2,3] that have been lingering for some
time now. They have been hurting the gate (e.g. [4], the most recent
offending failure) and since it looks like they have been without owners
or a plan of action for some time, I made the hard decision of skipping
the affected tests [5] before the busy times ahead.

Now one might argue that skipping them is counterproductive because it may
allow other regressions to sneak in, but I am hoping that this
controversial action will indeed smoke out the right folks.

Comments welcome.

Regards,
Armando

[1] https://bugs.launchpad.net/neutron/+bug/1477192
[2] https://bugs.launchpad.net/neutron/+bug/1509004
[3] https://bugs.launchpad.net/openstack-gate/+bug/1540983
[4]
http://logs.openstack.org/37/264937/5/gate/gate-tempest-dsvm-neutron-full/afeaabd//logs/testr_results.html.gz
[5] https://review.openstack.org/#/c/275457/