Re: [Openstack] Diagnosing RPC timeouts when attaching volumes

2012-06-22 Thread Lars Kellogg-Stedman
> The timeout occurs when nova-compute is trying to do an rpc call to
> nova-volume.  It looks like this is just the compute log.  Do you have
> an error in the volume log?

There were no errors in the volume log.  It may have been a networking
problem caused by the local iptables firewall on the volume server
getting reset...but it's part of a larger issue we're struggling with,
which is that in general OpenStack makes it very hard to track down
errors along the RPC chain.
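
Since a firewall reset is one suspected cause, a quick reachability check
from the volume host to the message broker can rule that in or out. The
following is only a minimal sketch: the broker host name and port 5672 (the
usual AMQP port) are assumptions about this deployment, not something
confirmed in this thread, and can_connect is a hypothetical helper name.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: the qpid broker runs on the controller and listens on 5672.
    broker_host = "os-controller.int.seas.harvard.edu"
    broker_port = 5672
    if can_connect(broker_host, broker_port):
        print("broker reachable")
    else:
        print("broker NOT reachable; check iptables on both hosts")
```

Running this on both the compute host and the volume host shows whether
either one has lost its path to the broker. It only tests TCP reachability,
so a passing check does not prove that the services are consuming messages.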

Thanks!

-- 
Lars Kellogg-Stedman l...@seas.harvard.edu       |
Senior Technologist                                | http://ac.seas.harvard.edu/
Academic Computing                                 | http://code.seas.harvard.edu/
Harvard School of Engineering and Applied Sciences |

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] Diagnosing RPC timeouts when attaching volumes

2012-06-21 Thread Lars Kellogg-Stedman
Hello all,

I've run into a series of frustrating problems trying to get images to
attach correctly to running instances.  The current issue is that
after running a nova volume-attach ... command, I get the following
in compute.log on the compute host:

  2012-06-21 12:32:03 ERROR nova.rpc.impl_qpid [req-a4720bff-afa5-48a3-a01c-d6697d53e835 22bb8e502d3944ad953e72fc77879c2f 76e2726cacca4be0bde6d8840f88c136] Timed out waiting for RPC response: None

...followed by a page or two of tracebacks from nova.rpc.impl_qpid and
nova.compute.manager, which seem to be buried so far inside decorators
and RPC calls that I have a hard time figuring out what is actually
happening.  It *looks* like an Exception inside of attach_volume(),
which is a good sign, I guess.
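
One way to follow a single request through the tracebacks is to pull the
req-... identifier out of the error line and search every service log for
it. This is a rough sketch, not part of nova: the function names are
hypothetical, the log paths in the usage note are guesses, and it assumes
the request ID is carried across the RPC call into the other service's log.

```python
import re

# Matches the "[req-<uuid> <user> <tenant>]" context seen in the log line above.
REQUEST_CONTEXT = re.compile(
    r"\[(?P<request_id>req-[0-9a-f-]{36}) (?P<user>\S+) (?P<tenant>\S+)\]"
)


def extract_request_id(text):
    """Return the first req-... identifier found in text, or None."""
    match = REQUEST_CONTEXT.search(text)
    return match.group("request_id") if match else None


def lines_for_request(request_id, log_paths):
    """Yield (path, line) for every log line that mentions request_id."""
    for path in log_paths:
        with open(path, errors="replace") as handle:
            for line in handle:
                if request_id in line:
                    yield path, line.rstrip("\n")
```

For example, calling lines_for_request() with the ID from the error above
and a list such as ["/var/log/nova/compute.log", "/var/log/nova/volume.log"]
(assumed paths) would show whether nova-volume ever logged that request.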

I've posted the complete traceback here: https://gist.github.com/2966898

There is a nova-volume service running (on the compute host, because this is
where the disk space was available):

  Binary            Host                                 Zone  Status    State  Updated_At
  nova-network      os-controller.int.seas.harvard.edu   nova  disabled  XXX    2012-06-21 14:01:41
  nova-cert         os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:22
  nova-scheduler    os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:22
  nova-consoleauth  os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:22
  nova-compute      os-host.int.seas.harvard.edu         nova  enabled   :-)    2012-06-21 16:35:22
  nova-volume       os-host.int.seas.harvard.edu         nova  enabled   :-)    2012-06-21 16:35:15
  nova-console      os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:16
  nova-network      os-host.int.seas.harvard.edu         nova  enabled   :-)    2012-06-21 16:35:17

Creating volumes works just fine.

-- 
Lars Kellogg-Stedman l...@seas.harvard.edu       |
Senior Technologist                                | http://ac.seas.harvard.edu/
Academic Computing                                 | http://code.seas.harvard.edu/
Harvard School of Engineering and Applied Sciences |



Re: [Openstack] Diagnosing RPC timeouts when attaching volumes

2012-06-21 Thread Russell Bryant
On 06/21/2012 12:56 PM, Lars Kellogg-Stedman wrote:
> Hello all,
>
> I've run into a series of frustrating problems trying to get images to
> attach correctly to running instances.  The current issue is that
> after running a nova volume-attach ... command, I get the following
> in compute.log on the compute host:
>
>   2012-06-21 12:32:03 ERROR nova.rpc.impl_qpid [req-a4720bff-afa5-48a3-a01c-d6697d53e835 22bb8e502d3944ad953e72fc77879c2f 76e2726cacca4be0bde6d8840f88c136] Timed out waiting for RPC response: None
>
> ...followed by a page or two of tracebacks from nova.rpc.impl_qpid and
> nova.compute.manager, which seem to be buried so far inside decorators
> and RPC calls that I have a hard time figuring out what is actually
> happening.  It *looks* like an Exception inside of attach_volume(),
> which is a good sign, I guess.
>
> I've posted the complete traceback here: https://gist.github.com/2966898
>
> There is a nova-volume service running (on the compute host, because this is
> where the disk space was available):
>
>   Binary            Host                                 Zone  Status    State  Updated_At
>   nova-network      os-controller.int.seas.harvard.edu   nova  disabled  XXX    2012-06-21 14:01:41
>   nova-cert         os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:22
>   nova-scheduler    os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:22
>   nova-consoleauth  os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:22
>   nova-compute      os-host.int.seas.harvard.edu         nova  enabled   :-)    2012-06-21 16:35:22
>   nova-volume       os-host.int.seas.harvard.edu         nova  enabled   :-)    2012-06-21 16:35:15
>   nova-console      os-controller.int.seas.harvard.edu   nova  enabled   :-)    2012-06-21 16:35:16
>   nova-network      os-host.int.seas.harvard.edu         nova  enabled   :-)    2012-06-21 16:35:17
>
> Creating volumes works just fine.

The timeout occurs when nova-compute is trying to do an rpc call to
nova-volume.  It looks like this is just the compute log.  Do you have
an error in the volume log?

-- 
Russell Bryant


