[Yahoo-eng-team] [Bug 1989232] Re: MultiAttachVolumeSwap fails or takes a long time to detach volume
Thanks Aboubacar for the well-written bug report. I agree that we have a
race between the disconnect and the swap operation. Both use a lock, but
they use different locks, so the two operations can overlap (a standalone
sketch of the mismatch follows below).

Failed case:

  Sep 30 02:51:47 Lock "connect_volume" "released" by
  "os_brick.initiator.connectors.iscsi.ISCSIConnector.disconnect_volume"
  :: held 0.156s {{(pid=2571444) inner
  /usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py:400}}

  Sep 30 02:51:55 Lock "0d711b7b-4693-4a7e-9a94-ca4186b4a670" "released" by
  "nova.compute.manager.ComputeManager.swap_volume.<locals>._do_locked_swap_volume"
  :: held 153.400s {{(pid=2571444) inner
  /usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py:400}}

Successful case:

  Sep 29 17:12:21 Lock "4aeaeb5d-295f-4149-9330-a016d9da1730" "released" by
  "nova.compute.manager.ComputeManager.swap_volume.<locals>._do_locked_swap_volume"
  :: held 632.783s {{(pid=2571444) inner
  /usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py:400}}

  Sep 29 17:12:25 Lock "connect_volume" "released" by
  "os_brick.initiator.connectors.iscsi.ISCSIConnector.disconnect_volume"
  :: held 0.142s {{(pid=2571444) inner
  /usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py:400}}

** Also affects: os-brick
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => Triaged

** Changed in: nova
   Importance: Undecided => Medium

** Tags added: volumes
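For illustration, a minimal standalone sketch of the lock mismatch, assuming
only oslo.concurrency (this is not the actual nova/os-brick code; the function
bodies are empty stand-ins and the lock names are copied from the log excerpts
above):

    # Locks with different names are backed by different semaphores, so
    # these two callables never serialize against each other.
    from oslo_concurrency import lockutils

    @lockutils.synchronized('connect_volume')
    def disconnect_volume():
        # Stand-in for the os-brick disconnect path, which holds the
        # shared "connect_volume" lock name.
        pass

    @lockutils.synchronized('0d711b7b-4693-4a7e-9a94-ca4186b4a670')
    def _do_locked_swap_volume():
        # Stand-in for nova's swap path, which locks on the volume UUID.
        pass

    # Called from two threads, nothing orders these calls: each
    # synchronized() decorator acquires only the lock matching its own
    # name, which is exactly the overlap seen in the failed case above.
    # Sharing a single lock name between the two paths (or nesting one
    # lock inside the other) would close the window.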
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1989232

Title:
  MultiAttachVolumeSwap fails or takes a long time to detach volume

Status in OpenStack Compute (nova):
  Triaged
Status in os-brick:
  New
Status in tempest:
  New

Bug description:
  ERROR:
  tempest.api.compute.admin.test_volume_swap.TestMultiAttachVolumeSwap.test_volume_swap_with_multiattach
  fails during tempest iSCSI tests because the volume takes a long time to
  detach, or fails to detach, from the instance. The logs herein show an
  example of a failure to detach.

  EXPECTED BEHAVIOR: Volume successfully detaches and test passes.

  HOW TO DUPLICATE: Run:

    tox -e all -- tempest.api.compute.admin.test_volume_swap.TestMultiAttachVolumeSwap.test_volume_swap_with_multiattach | tee -a console.log.out

  CONFIG:
  - DevStack Zed release
  - Single node using iSCSI
  - Host OS: Ubuntu 20.04
      Distributor ID: Ubuntu
      Description:    Ubuntu 20.04.3 LTS
      Release:        20.04
      Codename:       focal

  From tempest console.log:

  tempest.api.compute.admin.test_volume_swap.TestMultiAttachVolumeSwap.test_volume_swap_with_multiattach[id-e8f8f9d1-d7b7-4cd2-8213-ab85ef697b6e,slow,volume]
  ---------------------------------------------------------------------------

  Captured traceback:
  ~~~~~~~~~~~~~~~~~~~
      Traceback (most recent call last):
        File "/opt/stack/tempest/tempest/lib/decorators.py", line 81, in wrapper
          return f(*func_args, **func_kwargs)
        File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
          return f(*func_args, **func_kwargs)
        File "/opt/stack/tempest/tempest/api/compute/admin/test_volume_swap.py", line 245, in test_volume_swap_with_multiattach
          waiters.wait_for_volume_resource_status(self.volumes_client,
        File "/opt/stack/tempest/tempest/common/waiters.py", line 301, in wait_for_volume_resource_status
          time.sleep(client.build_interval)
        File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/fixtures/_fixtures/timeout.py", line 52, in signal_handler
          raise TimeoutException()
      fixtures._fixtures.timeout.TimeoutException

  Captured traceback-1:
  ~~~~~~~~~~~~~~~~~~~~~
      Traceback (most recent call last):
        File "/opt/stack/tempest/tempest/common/waiters.py", line 385, in wait_for_volume_attachment_remove_from_server
          raise lib_exc.TimeoutException(message)
      tempest.lib.exceptions.TimeoutException: Request timed out
      Details: Volume a54c67b7-786e-4ba7-94ea-d1e0a722424a failed to detach
      from server 986b2dd5-542a-4344-a929-9ac7bbf35d7c within the required
      time (3600 s) from the compute API perspective

  In tempest/common/waiters.py (wait_for_volume_attachment_remove_from_server):

    373     while any(volume for volume in volumes if volume['volumeId'] == volume_id):
    374         time.sleep(client.build_interval)
    375
    376         timed_out = int(time.time()) - start >= client.build_timeout
    377         if timed_out:
    378             console_output = client.get_console_output(server_id)['output']
    379             LOG.debug('Console output for %s\nbody=\n%s',
    380                       server_id, console_output)
    381             message = ('Volume %s failed to detach from server %s within '
    382                        'the required time (%s s) from the compute API '
    383                        'perspective' %
    384                        (volume_id, server_id, client.build_timeout))
    385             raise lib_exc.TimeoutException(message)
    386         try:
    387             volumes = client.list_volume_attachments(
    388                 server_id)['volumeAttachments']
    389         except lib_exc.NotFound:
    390             # Ignore 404s on detach in case the server is deleted or the volume
    391             # is already detached.
    392             return
    393     return
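For reference, the polling loop quoted above reduces to the following
self-contained sketch (illustrative only; list_attachments here is a
hypothetical stand-in for client.list_volume_attachments, and the defaults
mirror the 3600 s build_timeout from the failure message):

    # A minimal sketch of the tempest waiter pattern quoted above.
    import time

    def wait_for_detach(list_attachments, volume_id, server_id,
                        build_interval=1, build_timeout=3600):
        # Poll until the volume disappears from the server's attachment
        # list, or give up after build_timeout seconds.
        start = int(time.time())
        volumes = list_attachments(server_id)
        while any(v for v in volumes if v['volumeId'] == volume_id):
            time.sleep(build_interval)
            if int(time.time()) - start >= build_timeout:
                raise TimeoutError(
                    'Volume %s failed to detach from server %s within '
                    'the required time (%s s)'
                    % (volume_id, server_id, build_timeout))
            volumes = list_attachments(server_id)

Because the waiter only gives up after build_timeout elapses (3600 s in this
run), a detach that never completes stalls the test for a full hour before the
TimeoutException fires, which is why the symptom shows up as either "fails" or
"takes a long time".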
[Yahoo-eng-team] [Bug 1989232] Re: MultiAttachVolumeSwap fails or takes a long time to detach volume
** Also affects: nova
   Importance: Undecided
       Status: New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1989232/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp