[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2016-02-22 Thread Sean Dague
** Changed in: nova/juno
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1353939

Title:
  Rescue fails with 'Failed to terminate process: Device or resource
  busy' in the n-cpu log

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive juno series:
  Fix Released
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) juno series:
  Won't Fix
Status in OpenStack Compute (nova) kilo series:
  Fix Released
Status in nova package in Ubuntu:
  Invalid

Bug description:
  [Impact]

   * Users may sometimes fail to shut down an instance if the associated qemu
     process is in uninterruptible sleep (typically waiting on I/O); a quick
     check for that state is sketched below.
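
  For illustration only (not part of the original report), the check might
  read the "State:" line in /proc/<pid>/status, which mirrors the STAT column
  of ps and shows "D" while the process sits in uninterruptible sleep:

      # Hypothetical helper; assumes a Linux /proc filesystem.
      def is_uninterruptible(pid):
          with open('/proc/%d/status' % pid) as f:
              for line in f:
                  if line.startswith('State:'):
                      # e.g. "State:  D (disk sleep)"
                      return line.split()[1] == 'D'
          return False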

  [Test Case]

   * 1. Create some I/O load in a VM.
     2. Look at the associated qemu process and make sure it shows STAT D in
        the ps output.
     3. Shut down the instance.
     4. With the patch in place, nova retries the libvirt call to shut down
        the instance up to 3 times, waiting for the signal to be delivered to
        the qemu process (see the sketch after this list).
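
  A minimal sketch of the retry behaviour described in step 4, assuming a
  hypothetical destroy_instance() callable that stands in for nova's libvirt
  shutdown call; this is an illustration, not the actual nova patch.

      import time

      def shutdown_with_retries(destroy_instance, retries=3, wait=5):
          for attempt in range(1, retries + 1):
              try:
                  destroy_instance()
                  return True
              except RuntimeError:
                  # e.g. "Failed to terminate process ... with SIGKILL:
                  # Device or resource busy" -- the qemu process is still in
                  # uninterruptible sleep, so give the pending SIGKILL time
                  # to be delivered once its I/O completes, then try again.
                  if attempt == retries:
                      raise
                  time.sleep(wait)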

  [Regression Potential]

   * None


  message: "Failed to terminate process" AND
  message:'InstanceNotRescuable' AND message: 'Exception during message
  handling' AND tags:"screen-n-cpu.txt"

  The above logstash query reports back only the failed jobs; the 'Failed to
  terminate process' error also appears alongside other failed rescue tests,
  but tempest does not always report them as errors at the end.

  message: "Failed to terminate process" AND tags:"screen-n-cpu.txt"

  Usual console log:
  Details: (ServerRescueTestJSON:test_rescue_unrescue_instance) Server 
0573094d-53da-40a5-948a-747d181462f5 failed to reach RESCUE status and task 
state "None" within the required time (196 s). Current status: SHUTOFF. Current 
task state: None.

  http://logs.openstack.org/82/107982/2/gate/gate-tempest-dsvm-postgres-full/90726cb/console.html#_2014-08-07_03_50_26_520

  Usual n-cpu exception:
  
http://logs.openstack.org/82/107982/2/gate/gate-tempest-dsvm-postgres-full/90726cb/logs/screen-n-cpu.txt.gz#_2014-08-07_03_32_02_855

  2014-08-07 03:32:02.855 ERROR oslo.messaging.rpc.dispatcher 
[req-39ce7a3d-5ceb-41f5-8f9f-face7e608bd1 ServerRescueTestJSON-2035684545 
ServerRescueTestJSON-1017508309] Exception during message handling: Instance 
0573094d-53da-40a5-948a-747d181462f5 cannot be rescued: Driver Error: Failed to 
terminate process 26425 with SIGKILL: Device or resource busy
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher Traceback 
(most recent call last):
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 
134, in _dispatch_and_reply
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher 
incoming.message))
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 
177, in _dispatch
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return 
self._do_dispatch(endpoint, method, ctxt, args)
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 
123, in _do_dispatch
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher result 
= getattr(endpoint, method)(ctxt, **new_args)
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 408, in decorated_function
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return 
function(self, context, *args, **kwargs)
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 88, in wrapped
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher payload)
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 71, in wrapped
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return 
f(self, context, *args, **kw)
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 292, in decorated_function
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher pass
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
  2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher 

[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-12-15 Thread Billy Olsen
This fix was made available in 1:2014.2.4-0ubuntu1~cloud4 of nova in the
Ubuntu Cloud Archive for Juno.

** Changed in: cloud-archive/juno
   Status: In Progress => Fix Committed

** Changed in: cloud-archive/juno
   Status: Fix Committed => Fix Released


[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-11-24 Thread Corey Bryant
** Changed in: nova (Ubuntu)
   Status: New => Invalid

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/juno
   Importance: Undecided
   Status: New

** Changed in: cloud-archive
   Status: New => Invalid

** Changed in: cloud-archive/juno
   Status: New => In Progress


[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-11-18 Thread Chuck Short
** Also affects: nova/juno
   Importance: Undecided
   Status: New


[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-07-29 Thread Alan Pevec
** Changed in: nova/kilo
   Status: Fix Committed => Fix Released


[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-07-23 Thread Alan Pevec
** Also affects: nova/kilo
   Importance: Undecided
   Status: New


[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-07-21 Thread Liang Chen
** Description changed:


[Yahoo-eng-team] [Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log

2015-06-24 Thread Thierry Carrez
** Changed in: nova
   Status: Fix Committed => Fix Released

** Changed in: nova
   Milestone: None => liberty-1
