I can tell that during migrate_disk_and_power_off from the virt driver
on the source host, the bdm.connection_info has the multiattach flag
set, because we get this message when disconnecting the volume from the
first instance on the first host:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_12_057443

Mar 19 19:48:12.057443 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: INFO nova.virt.libvirt.driver [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] [instance: 0eed0237-245e-4a18-9e30-9e72accd36c6] Detected multiple connections on this host for volume: 652600d5-f6dc-4089-ba95-d71d7640cafa, skipping target disconnect.

That message comes from the _should_disconnect_target code.
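
Roughly, that check works like this (a hedged paraphrase, not the exact
nova code; the real method cross-references the volume's attachments
with the instances on this host rather than reading a 'host' key off
the attachment record):

    def _should_disconnect_target(self, context, volume, host):
        # Count this volume's attachments that live on this compute
        # host; if another instance here still uses the volume, keep
        # the target connected and just log the "Detected multiple
        # connections" message seen above.
        attachments = volume.get('attachments', [])
        connection_count = sum(
            1 for a in attachments if a.get('host') == host)
        return connection_count < 2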

Then I eventually see _terminate_volume_connections, which creates a
new empty attachment between the server and the volume:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_12_307926

Mar 19 19:48:12.307926 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: 
DEBUG cinderclient.v3.client [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 
tempest-AttachVolumeMultiAttachTest-1944074929 
tempest-AttachVolumeMultiAttachTest-1944074929] REQ: curl -g -i -X POST 
http://198.72.124.95/volume/v3/774fc170bdb24f20959a551dd666ef06/attachments -H 
"Accept: application/json" -H "User-Agent: python-cinderclient" -H 
"OpenStack-API-Version: volume 3.44" -H "X-Auth-Token: 
{SHA1}6ba4939df9b73a2d082ac56165bd1e04d7f33e6d" -H "Content-Type: 
application/json" -H "X-OpenStack-Request-ID: 
req-f044373a-53dd-4de7-b881-ff511ecb3a70" -d '{"attachment": {"instance_uuid": 
"0eed0237-245e-4a18-9e30-9e72accd36c6", "connector": null, "volume_uuid": 
"652600d5-f6dc-4089-ba95-d71d7640cafa"}}' {{(pid=27735) _http_log_request 
/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:372}}
...
Mar 19 19:48:12.412178 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: 
RESP BODY: {"attachment": {"status": "reserved", "detached_at": "", 
"connection_info": {}, "attached_at": "", "attach_mode": null, "instance": 
"0eed0237-245e-4a18-9e30-9e72accd36c6", "volume_id": 
"652600d5-f6dc-4089-ba95-d71d7640cafa", "id": 
"c33fe43e-ac88-48df-93bd-3293746213b6"}}

And deletes the existing attachment:

Mar 19 19:48:12.418026 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG cinderclient.v3.client [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] REQ: curl -g -i -X DELETE http://198.72.124.95/volume/v3/774fc170bdb24f20959a551dd666ef06/attachments/448fc228-0e4c-49b4-b0a9-00f3b17468dd -H "OpenStack-API-Version: volume 3.44" -H "User-Agent: python-cinderclient" -H "X-OpenStack-Request-ID: req-f044373a-53dd-4de7-b881-ff511ecb3a70" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}6ba4939df9b73a2d082ac56165bd1e04d7f33e6d" {{(pid=27735) _http_log_request /usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:372}}
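
Paraphrasing _terminate_volume_connections (a sketch of the Queens-era
flow, not verbatim): it swaps the host-bound attachment for a bare one
so the volume stays reserved for the instance while it has no host:

    def _terminate_volume_connections(self, context, instance, bdms):
        for bdm in bdms:
            if not bdm.is_volume:
                continue
            # Create a fresh attachment with no connector, which keeps
            # the volume reserved for the instance (the POST with
            # "connector": null above)...
            attachment = self.volume_api.attachment_create(
                context, bdm.volume_id, instance.uuid)
            # ...then delete the attachment that was bound to the
            # source host's connector (the DELETE above).
            self.volume_api.attachment_delete(
                context, bdm.attachment_id)
            bdm.attachment_id = attachment['id']
            bdm.save()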

Later, once we're in _finish_resize, we call _update_volume_attachments.
I can see that starting here, where we get the host connector from the
virt driver:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_13_521171

Mar 19 19:48:13.521171 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG os_brick.utils [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] ==> get_connector_properties: call u"{'execute': None, 'my_ip': '198.72.124.95', 'enforce_multipath': True, 'host': 'ubuntu-xenial-inap-mtl01-0003062768', 'root_helper': 'sudo nova-rootwrap /etc/nova/rootwrap.conf', 'multipath': False}" {{(pid=27735) trace_logging_wrapper /usr/local/lib/python2.7/dist-packages/os_brick/utils.py:146}}

And then we update the new attachment with the host connector:

Mar 19 19:48:13.536842 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG cinderclient.v3.client [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] REQ: curl -g -i -X PUT http://198.72.124.95/volume/v3/774fc170bdb24f20959a551dd666ef06/attachments/c33fe43e-ac88-48df-93bd-3293746213b6 -H "Accept: application/json" -H "User-Agent: python-cinderclient" -H "OpenStack-API-Version: volume 3.44" -H "X-Auth-Token: {SHA1}6ba4939df9b73a2d082ac56165bd1e04d7f33e6d" -H "Content-Type: application/json" -H "X-OpenStack-Request-ID: req-f044373a-53dd-4de7-b881-ff511ecb3a70" -d '{"attachment": {"connector": {"initiator": "iqn.1993-08.org.debian:01:d44f672dfb2a", "ip": "198.72.124.95", "platform": "x86_64", "host": "ubuntu-xenial-inap-mtl01-0003062768", "do_local_attach": false, "mountpoint": "/dev/vdb", "os_type": "linux2", "multipath": false}}}' {{(pid=27735) _http_log_request /usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:372}}
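
In code terms, that PUT is nova binding the new attachment to the
destination connector, along the lines of this hedged sketch:

    connector = self.driver.get_volume_connector(instance)
    for bdm in bdms:
        if bdm.is_volume and bdm.attachment_id:
            # Cinder maps the export to this host's connector and
            # builds the connection_info returned by later GETs.
            self.volume_api.attachment_update(
                context, bdm.attachment_id, connector,
                mountpoint=bdm.device_name)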

I think I now see the problem.

_finish_resize in the compute manager calls this code:

        block_device_info = self._get_instance_block_device_info(
            context, instance, refresh_conn_info=True, bdms=bdms)

refresh_conn_info=True eventually gets us here:

https://github.com/openstack/nova/blob/f80b4e50093002f84b43ff245a605fbe44d34711/nova/virt/block_device.py#L639
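
The refresh path there replaces the BDM's connection_info wholesale
with whatever cinder returns for the attachment; paraphrased (hedged,
not verbatim):

    def refresh_connection_info(self, context, instance,
                                volume_api, virt_driver):
        if self['attachment_id']:
            attachment = volume_api.attachment_get(
                context, self['attachment_id'])
            # Keys that nova merged into connection_info client-side
            # at attach time, like 'multiattach', are not stored in
            # cinder's attachment record, so they are dropped here.
            self['connection_info'] = attachment.get('connection_info')
            self.save()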

I see the GET call to attachments here:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_15_652636

Mar 19 19:48:15.684627 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: RESP BODY: {"attachment": {"status": "attaching", "detached_at": "", "connection_info": {"auth_password": "5GDZJC3pBdPEJUgh", "attachment_id": "c33fe43e-ac88-48df-93bd-3293746213b6", "target_discovered": false, "encrypted": false, "driver_volume_type": "iscsi", "qos_specs": null, "target_iqn": "iqn.2010-10.org.openstack:volume-652600d5-f6dc-4089-ba95-d71d7640cafa", "target_portal": "198.72.124.95:3260", "volume_id": "652600d5-f6dc-4089-ba95-d71d7640cafa", "target_lun": 1, "access_mode": "rw", "auth_username": "67J6MYJni5Yu5VEEw8Q8", "auth_method": "CHAP"}, "attached_at": "2018-03-19T19:48:16.000000", "attach_mode": "rw", "instance": "0eed0237-245e-4a18-9e30-9e72accd36c6", "volume_id": "652600d5-f6dc-4089-ba95-d71d7640cafa", "id": "c33fe43e-ac88-48df-93bd-3293746213b6"}}

And that connection_info gets updated in the BlockDeviceMapping object,
which we later use to set the shareable flag when connecting the volume
on the destination host. That's where I see the guest xml lacks the
shareable flag, because the BDM.connection_info no longer has the
multiattach flag set:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_15_706213

block_device_info={'swap': None, 'root_device_name': u'/dev/vda', 'ephemerals': [], 'block_device_mapping': [{'guest_format': None, 'boot_index': None, 'mount_device': u'/dev/vdb', 'connection_info': {'status': u'attaching', 'detached_at': u'', u'volume_id': u'652600d5-f6dc-4089-ba95-d71d7640cafa', 'attach_mode': u'rw', 'driver_volume_type': u'iscsi', 'instance': u'0eed0237-245e-4a18-9e30-9e72accd36c6', 'attached_at': u'2018-03-19T19:48:16.000000', 'serial': u'652600d5-f6dc-4089-ba95-d71d7640cafa', 'data': {u'auth_password': u'***', u'target_discovered': False, u'encrypted': False, u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-652600d5-f6dc-4089-ba95-d71d7640cafa', u'target_portal': u'198.72.124.95:3260', u'volume_id': u'652600d5-f6dc-4089-ba95-d71d7640cafa', u'target_lun': 1, u'access_mode': u'rw', u'auth_username': u'67J6MYJni5Yu5VEEw8Q8', u'auth_method': u'CHAP'}}, 'disk_bus': 'virtio', 'device_type': 'disk', 'attachment_id': 'c33fe43e-ac88-48df-93bd-3293746213b6', 'delete_on_termination': False}]}
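
On the destination host that refreshed connection_info is what the
libvirt driver consults when building the disk config, roughly:

    conf = vol_driver.get_config(connection_info, disk_info)
    if connection_info.get('multiattach', False):
        # Never true after the refresh above, since 'multiattach' was
        # dropped, so the guest XML gets no <shareable/> element.
        conf.shareable = True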

I have no idea why this didn't fail the resize tests before with older
libvirt and qemu; maybe something in those versions preserved the
shareable behavior. In any case, this is clearly broken just by code
inspection.

** Also affects: nova/queens
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/1757190

Title:
  resize fails with volume multiattach with libvirt 4.0.0 (and qemu
  2.11.1): Failed to get shared "write" lock

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) queens series:
  New

Bug description:
  Seeing this in a patch https://review.openstack.org/#/c/554317/ that
  runs the nova-multiattach job with the Queens Ubuntu Cloud Archive
  which has libvirt 4.0.0 and qemu 2.11.1:

  http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/libvirt/qemu/instance-00000066.txt.gz

  2018-03-19T19:48:16.175548Z qemu-system-x86_64: -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1: Failed to get shared "write" lock
  Is another process using the image?

  
  
  http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz?level=TRACE#_Mar_19_19_48_16_261051

  Mar 19 19:48:17.132940 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: ERROR nova.compute.manager [None req-3a092a4b-7ae7-4f29-9f78-97bf1dc0d46d service nova] [instance: 0eed0237-245e-4a18-9e30-9e72accd36c6] Setting instance vm_state to ERROR: libvirtError: internal error: process exited while connecting to monitor: 2018-03-19T19:48:16.147136Z qemu-system-x86_64: -drive file=/dev/sdb,format=raw,if=none,id=drive-virtio-disk1,serial=652600d5-f6dc-4089-ba95-d71d7640cafa,cache=none,aio=native: 'serial' is deprecated, please use the corresponding option of '-device' instead
  Mar 19 19:48:17.133724 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: 2018-03-19T19:48:16.155115Z qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
  Mar 19 19:48:17.134022 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: 2018-03-19T19:48:16.175548Z qemu-system-x86_64: -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1: Failed to get shared "write" lock

  That last error likely means the 'shareable' element isn't in the disk
  config xml, and it's not:

  <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/sdb'/>
      <target dev='vdb' bus='virtio'/>
      <serial>652600d5-f6dc-4089-ba95-d71d7640cafa</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
  </disk>
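
  With the multiattach flag intact, nova would have marked the disk
  shareable, so we'd expect the element to look something like:

  <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/sdb'/>
      <target dev='vdb' bus='virtio'/>
      <shareable/>
      ...
  </disk>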
