Public bug reported:

Description:

Ceph RBD live migration does not update the rbd_user/libvirt secret UUID
to the receiving host's values, causing live migration to fail.

Steps to reproduce:

Compute node A:

/etc/nova/nova.conf:
rbd_user=compute_node_A
rbd_secret_uuid = secretA

Secret file:
/etc/libvirt/secrets/secretA.xml

Compute node B:

/etc/nova/nova.conf:
rbd_user=compute_node_B
rbd_secret_uuid = secretB

Secret file:
/etc/libvirt/secrets/secretB.xml
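
A quick way to confirm that two compute nodes are set up as described is
to compare their RBD settings. The following is only a minimal sketch: it
assumes the two options live in the [libvirt] section of nova.conf (as the
CONF.libvirt references further down suggest), and it uses inline strings
in place of the real files. The helper names are hypothetical and are not
part of nova.

```python
import configparser

# Inline stand-ins for /etc/nova/nova.conf on each node (assumed layout).
NODE_A_CONF = """
[libvirt]
rbd_user = compute_node_A
rbd_secret_uuid = secretA
"""

NODE_B_CONF = """
[libvirt]
rbd_user = compute_node_B
rbd_secret_uuid = secretB
"""

RBD_KEYS = ("rbd_user", "rbd_secret_uuid")


def libvirt_rbd_settings(conf_text):
    """Return the RBD auth options from the [libvirt] section, or None if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("libvirt"):
        return {key: None for key in RBD_KEYS}
    return {key: parser.get("libvirt", key, fallback=None) for key in RBD_KEYS}


def mismatched_keys(settings_a, settings_b):
    """List the option names whose values differ between the two hosts."""
    return sorted(key for key in RBD_KEYS if settings_a[key] != settings_b[key])


settings_a = libvirt_rbd_settings(NODE_A_CONF)
settings_b = libvirt_rbd_settings(NODE_B_CONF)
mismatches = mismatched_keys(settings_a, settings_b)
print(mismatches)
```

Any option that shows up in the mismatch list is one whose source-host
value would not be valid on the destination host.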


Expected result:

Live migration completes

Current result:

Live migration fails because it sets the secret/key/id to the
information from compute_node_A instead of compute_node_B.

Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29 18:50:40.613 
175448 ERROR nova.virt.libvirt.driver [req-77ce1a5a-6588-420d-8c77-7b106e4ca3f0 
4c8a770be6c54c23bbf20e8a63803d63 2d98cd4d4fdf43f5b9db5e39846922d8 - - -] 
[instance: b4407d16-8946-45a0-8e58-3a1bf8b0edfc] Live Migration failure: 
internal error: process exited while connecting to monitor: 
2016-09-29T18:50:40.220091Z qemu-system-x86_64: -drive 
file=rbd:nova/b4407d16-8946-45a0-8e58-3a1bf8b0edfc_disk:id=nova-compute-c07:key=somecephkey:auth_supported=cephx\;none:mon_host=[fd2d\:dec4\:cf59\:3c12\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c13\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c14\:0\:1\:\:]\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=none:
 error connecting
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 
2016-09-29T18:50:40.246712Z qemu-system-x86_64: network script /etc/qemu-ifdown 
failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 
2016-09-29T18:50:40.274406Z qemu-system-x86_64: network script /etc/qemu-ifdown 
failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: Traceback (most recent 
call last):
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in 
fire_timers
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     timer()
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     cb(*args, **kw)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     waiter.switch(result)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     result = 
function(*args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/nova/utils.py", line 1145, in context_wrapper
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     return func(*args, 
**kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6104, in 
_live_migration_operation
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     instance=instance)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     self.force_reraise()
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     
six.reraise(self.type_, self.value, self.tb)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6064, in 
_live_migration_operation
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     migration_flags)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     result = 
proxy_call(self._autowrap, f, *args, **kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     rv = execute(f, *args, 
**kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     six.reraise(c, e, tb)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     rv = meth(*args, 
**kwargs)
Sep 29 18:50:40 compute_node_A nova-compute[175448]:   File 
"/usr/lib/python2.7/dist-packages/libvirt.py", line 1833, in migrateToURI3
Sep 29 18:50:40 compute_node_A nova-compute[175448]:     if ret == -1: raise 
libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Sep 29 18:50:40 compute_node_A nova-compute[175448]: libvirtError: internal 
error: process exited while connecting to monitor: 2016-09-29T18:50:40.220091Z 
qemu-system-x86_64: -drive 
file=rbd:nova/b4407d16-8946-45a0-8e58-3a1bf8b0edfc_disk:id=nova-compute-c07:key=somecephkey:auth_supported=cephx\;none:mon_host=[fd2d\:dec4\:cf59\:3c12\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c13\:0\:1\:\:]\:6789\;[fd2d\:dec4\:cf59\:3c14\:0\:1\:\:]\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=none:
 error connecting
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 
2016-09-29T18:50:40.246712Z qemu-system-x86_64: network script /etc/qemu-ifdown 
failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 
2016-09-29T18:50:40.274406Z qemu-system-x86_64: network script /etc/qemu-ifdown 
failed with status 256
Sep 29 18:50:40 compute_node_A nova-compute[175448]: 2016-09-29 18:50:40.703 
175448 ERROR nova.virt.libvirt.driver [req-77ce1a5a-6588-420d-8c77-7b106e4ca3f0 
4c8a770be6c54c23bbf20e8a63803d63 2d98cd4d4fdf43f5b9db5e39846922d8 - - -] 
[instance: b4407d16-8946-45a0-8e58-3a1bf8b0edfc] Migration operation has aborted

Environment
===========
1. OpenStack Mitaka from Ubuntu

2. KVM

3. Ceph

4. Neutron with Calico

----

More information:

The issue occurs when the new libvirt XML config is generated for the
disk configuration:

https://github.com/openstack/nova/blob/643aed652d0e51e36dbe7cb106285b51e3b5941b/nova/virt/libvirt/volume/net.py#L67

        if (conf.source_protocol == 'rbd' and
                CONF.libvirt.rbd_secret_uuid):
            conf.auth_secret_uuid = CONF.libvirt.rbd_secret_uuid
            auth_enabled = True  # Force authentication locally
            if CONF.libvirt.rbd_user:
                conf.auth_username = CONF.libvirt.rbd_user

When live migrating from node A to node B, this code reads rbd_user and
rbd_secret_uuid from the local /etc/nova/nova.conf file on node A (using
the CONF object) instead of getting that information from the remote
host that the VM is about to be migrated to (node B).

Node A's nova.conf file does not match node B's nova.conf file when it
comes to "rbd_user" and "rbd_secret_uuid".

This causes the migration to fail: the disk configuration sent to
compute_node_B carries compute_node_A's rbd_user and secret UUID, so
compute_node_B has no valid credentials to authenticate to Ceph with.
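
The difference between the current and the expected behaviour can be
sketched as follows. This is an illustration only, not nova code: the
RbdAuth class, the two constants, and the disk_auth_for_migration helper
are hypothetical names standing in for each host's rbd_user and
rbd_secret_uuid settings, and the sketch does not claim how a fix should
transport the destination's values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RbdAuth:
    """Hypothetical container for one host's RBD auth settings."""
    user: Optional[str]
    secret_uuid: Optional[str]


# Stand-ins for the values from each node's nova.conf in the report.
NODE_A = RbdAuth(user="compute_node_A", secret_uuid="secretA")
NODE_B = RbdAuth(user="compute_node_B", secret_uuid="secretB")


def disk_auth_for_migration(local: RbdAuth,
                            dest: Optional[RbdAuth] = None) -> RbdAuth:
    """Choose the auth values written into the disk XML for the destination.

    With dest=None this mirrors the reported behaviour: only the local
    (source) host's settings are consulted. When the destination's settings
    are supplied and include a secret UUID, they take precedence.
    """
    if dest is not None and dest.secret_uuid:
        return dest
    return local


# Reported behaviour: node A's values end up in the XML sent to node B.
current = disk_auth_for_migration(NODE_A)

# Expected behaviour: node B's own values are used on node B.
expected = disk_auth_for_migration(NODE_A, dest=NODE_B)

print(current)
print(expected)
```

In the first call the XML would reference secretA, which exists only on
node A; in the second it references secretB, which node B can resolve.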

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1629114

Title:
  Ceph RBD live-migration failure due to wrong rbd_user/rbd_secret

Status in OpenStack Compute (nova):
  New
