Re: [Openstack] Lossage in nova test suite?

2012-06-04 Thread Chris Behrens
Ok, I take that back.  I do see one issue.  It looks like each test run is 
leaving 2 stuck runner.py's for me… even if the tests complete successfully.



On Jun 4, 2012, at 4:57 PM, Chris Behrens wrote:

 The only thing I notice is about a 50% increase in the unit test run time 
 very recently… I don't know when that started..  maybe today.  Not seeing the 
 100s of python processes.  Hm!
 
 - Chris
 
 
 On Jun 4, 2012, at 4:00 PM, Kevin L. Mitchell wrote:
 
 Today I've noticed some significant problems with nova's test suite
 leaving literally hundreds of python processes out.  I'm guessing that
 this has to do with the unit tests for the multiprocess patch, which was
 just approved.  This could be causing problems with jenkins, too…
 
 Anybody have any other insights?
 -- 
 Kevin L. Mitchell kevin.mitch...@rackspace.com
 
 
 ___
 Mailing list: https://launchpad.net/~openstack
 Post to : openstack@lists.launchpad.net
 Unsubscribe : https://launchpad.net/~openstack
 More help   : https://help.launchpad.net/ListHelp
 




Re: [Openstack] Lossage in nova test suite?

2012-06-04 Thread Chris Behrens
Sorry, I replied before fully investigating.  They *are* forking over and over 
as fast as possible… which makes them hard to kill.  Each process forks a child 
and the parent exits immediately..  and that repeats endlessly.  My VM is so 
slow right now that even a tight pkill loop doesn't hit them before they've 
forked again.

http://paste.openstack.org/show/18338/

Only recourse may be to reboot my VM.
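
A plain kill loop loses to the fork-and-exit pattern described above, because each PID is gone by the time the signal arrives. One common workaround, sketched here and not taken from this thread, is to SIGSTOP every matching process first (a stopped process cannot fork) and only then SIGKILL them all. The helper names (`matching_pids`, `send`, `freeze_then_kill`) are hypothetical, and the scan assumes a Linux /proc filesystem.

```python
import os
import signal


def matching_pids(pattern):
    """Return PIDs whose command line contains `pattern` (Linux /proc only)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit() or int(entry) == os.getpid():
            continue  # skip non-process entries and this script itself
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited between listdir() and open()
        if pattern in cmdline:
            pids.append(int(entry))
    return pids


def send(pids, sig):
    """Send `sig` to each PID, ignoring processes that are already gone."""
    for pid in pids:
        try:
            os.kill(pid, sig)
        except (ProcessLookupError, PermissionError):
            pass


def freeze_then_kill(pattern, max_rounds=50):
    """Stop matching processes until no new ones appear, then SIGKILL them.

    Returns the number of distinct PIDs that were sent SIGSTOP.
    """
    stopped = set()
    for _ in range(max_rounds):
        fresh = set(matching_pids(pattern)) - stopped
        if not fresh:
            break  # everything that matches is already frozen
        send(fresh, signal.SIGSTOP)
        stopped |= fresh
    send(matching_pids(pattern), signal.SIGKILL)
    return len(stopped)
```

For the case in this thread the call would be something like `freeze_then_kill("runner.py")`. The approach is still racy in principle, which is why the loop rescans until a pass finds nothing new; `max_rounds` is an arbitrary bound chosen for the sketch.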


On Jun 4, 2012, at 4:57 PM, Chris Behrens wrote:

 The only thing I notice is about a 50% increase in the unit test run time 
 very recently… I don't know when that started..  maybe today.  Not seeing the 
 100s of python processes.  Hm!
 
 - Chris
 
 
 On Jun 4, 2012, at 4:00 PM, Kevin L. Mitchell wrote:
 
 Today I've noticed some significant problems with nova's test suite
 leaving literally hundreds of python processes out.  I'm guessing that
 this has to do with the unit tests for the multiprocess patch, which was
 just approved.  This could be causing problems with jenkins, too…
 
 Anybody have any other insights?
 -- 
 Kevin L. Mitchell kevin.mitch...@rackspace.com
 
 


Re: [Openstack] Lossage in nova test suite?

2012-06-04 Thread Gabe Westmaas
Should we revert this change till we get it cleared up?

On 6/4/12 8:29 PM, James E. Blair cor...@inaugust.com wrote:

On 06/04/2012 04:00 PM, Kevin L. Mitchell wrote:
 Today I've noticed some significant problems with nova's test suite
 leaving literally hundreds of python processes out.  I'm guessing that
 this has to do with the unit tests for the multiprocess patch, which was
 just approved.  This could be causing problems with jenkins, too…

 Anybody have any other insights?

Yes, several Jenkins slaves have been taken out by running nova unit
tests.  The one that I am able to log into seems to be continuously
respawning python processes.  Other slaves are inaccessible due to having
exhausted their RAM.

I note that all of the tests run after that change merged carry this
warning notice from Jenkins:

   Process leaked file descriptors. See
http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build
for more information

So I think it's fair to say that Jenkins corroborates your suspicion
that the change introduced a problem with leaking processes.

This is affecting any Jenkins slave that the nova unit tests job runs
on, which in turn affects jobs for unrelated projects that happen to
later run on that slave.

In addition to correcting this problem, I believe we should add a build
step to Jenkins to ensure that all of the test processes have terminated
correctly.

-Jim
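
A minimal sketch of the kind of post-build check proposed above, assuming a Linux slave with /proc and assuming leaked processes can be recognised by a substring of their command line. The function names (`leftover_processes`, `check_no_leaks`) are hypothetical; this is not an existing Jenkins feature, only a script a build step could run.

```python
import os
import sys


def leftover_processes(pattern):
    """Return (pid, cmdline) pairs for this user's processes matching `pattern`."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit() or int(entry) == os.getpid():
            continue  # skip non-process entries and the checker itself
        try:
            # Only consider processes owned by the user running the build.
            if os.stat(f"/proc/{entry}").st_uid != os.getuid():
                continue
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited while we were looking at it
        if pattern in cmdline:
            found.append((int(entry), cmdline.strip()))
    return found


def check_no_leaks(pattern):
    """Report leftovers on stderr; return 1 if any were found, else 0."""
    leaked = leftover_processes(pattern)
    for pid, cmdline in leaked:
        print(f"leaked process {pid}: {cmdline}", file=sys.stderr)
    return 1 if leaked else 0
```

A build step could finish with `sys.exit(check_no_leaks("runner.py"))` so that the job fails when test processes survive the run. The pattern to match is an assumption and would need to suit whatever the test runner actually spawns.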



Re: [Openstack] Lossage in nova test suite?

2012-06-04 Thread Joe Gordon
On Mon, Jun 4, 2012 at 5:47 PM, Gabe Westmaas
gabe.westm...@rackspace.com wrote:

 Should we revert this change till we get it cleared up?


+1



 On 6/4/12 8:29 PM, James E. Blair cor...@inaugust.com wrote:

 On 06/04/2012 04:00 PM, Kevin L. Mitchell wrote:
  Today I've noticed some significant problems with nova's test suite
  leaving literally hundreds of python processes out.  I'm guessing that
  this has to do with the unit tests for the multiprocess patch, which was
  just approved.  This could be causing problems with jenkins, too…
 
  Anybody have any other insights?
 
 Yes, several Jenkins slaves have been taken out by running nova unit
 tests.  The one that I am able to log into seems to be continuously
 respawning python processes.  Other slaves are inaccessible due to having
 exhausted their RAM.
 
 I note that all of the tests run after that change merged carry this
 warning notice from Jenkins:
 
    Process leaked file descriptors. See
 http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build
 for more information
 
 So I think it's fair to say that Jenkins corroborates your suspicion
 that the change introduced a problem with leaking processes.
 
 This is affecting any Jenkins slave that the nova unit tests job runs
 on, which in turn affects jobs for unrelated projects that happen to
 later run on that slave.
 
 In addition to correcting this problem, I believe we should add a build
 step to Jenkins to ensure that all of the test processes have terminated
 correctly.
 
 -Jim
 


Re: [Openstack] Lossage in nova test suite?

2012-06-04 Thread James E. Blair

On 06/04/2012 05:47 PM, Gabe Westmaas wrote:

Should we revert this change till we get it cleared up?


Here's a proposal to do that:

  https://review.openstack.org/#/c/8166/

-Jim



Re: [Openstack] Lossage in nova test suite?

2012-06-04 Thread Johannes Erdfelt
On Mon, Jun 04, 2012, James E. Blair cor...@inaugust.com wrote:
 On 06/04/2012 05:47 PM, Gabe Westmaas wrote:
 Should we revert this change till we get it cleared up?
 
 Here's a proposal to do that:
 
   https://review.openstack.org/#/c/8166/

I approved it because of how bad a state Jenkins is in right now, but it
probably won't merge because the gate-nova-python27 job currently fails
immediately.

It seems Jenkins needs some TLC to get things back to where changes can
be merged again.

JE

