This issue was fixed in the openstack/nova rocky-eol release.

** Changed in: nova/rocky
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774249

Title:
  update_available_resource will raise DiskNotFound after resize but
  before confirm

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive queens series:
  Fix Released
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Fix Committed
Status in OpenStack Compute (nova) rocky series:
  Fix Released
Status in OpenStack Compute (nova) stein series:
  Fix Released
Status in OpenStack Compute (nova) train series:
  Fix Released
Status in nova package in Ubuntu:
  Invalid
Status in nova source package in Bionic:
  Fix Released

Bug description:
  Originally reported in RH Bugzilla:
  https://bugzilla.redhat.com/show_bug.cgi?id=1584315

  Tested on OSP12 (Pike), but appears to be still present on master.
  Should only occur if nova compute is configured to use local file
  instance storage.

  Create instance A on compute X

  Resize instance A to compute Y
    Domain is powered off
    /var/lib/nova/instances/<uuid A> renamed to <uuid A>_resize on X
    Domain is *not* undefined

  On compute X:
    update_available_resource runs as a periodic task
    First action is to update self
    rt calls driver.get_available_resource()
    ...calls _get_disk_over_committed_size_total
    ...iterates over all defined domains, including the ones whose disks we renamed
    ...fails because a referenced disk no longer exists

  Results in errors in nova-compute.log:

  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager [req-bd52371f-c6ec-4a83-9584-c00c5377acd8 - - - - -] Error updating resources for node compute-0.localdomain.: DiskNotFound: No disk at /var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager Traceback (most recent call last):
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6695, in update_available_resource_for_node
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 641, in update_available_resource
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5892, in get_available_resource
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     disk_over_committed = self._get_disk_over_committed_size_total()
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7393, in _get_disk_over_committed_size_total
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     config, block_device_info)
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7301, in _get_instance_disk_info_from_config
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     dk_size = disk_api.get_allocated_disk_size(path)
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/disk/api.py", line 156, in get_allocated_disk_size
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     return images.qemu_img_info(path).disk_size
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/images.py", line 57, in qemu_img_info
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager     raise exception.DiskNotFound(location=path)
  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager DiskNotFound: No disk at /var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk

  And resource tracker is no longer updated. We can find lots of these
  in the gate.

  Note that change Icec2769bf42455853cbe686fb30fda73df791b25 nearly
  mitigates this, but doesn't because task_state is not set while the
  instance is awaiting confirm.

  =================================================================================

  [Impact]

  See above

  [Test Plan]

  Deploy OpenStack Queens with one compute node.

  Create a VM instance.
  Eg: openstack server create --wait --image $image_name --flavor $flavor --key-name testkey --nic net-id=${net_id} test-instance-1234

  Get the details for that instance and copy the instance_name.
  Eg: openstack server show test-instance-1234 -c OS-EXT-SRV-ATTR:instance_name -f value

  Get the disk location used based on the instance name we retrieved before.
  Eg: disk_location=`juju run -a nova-compute -- virsh domblklist $var_name | grep nova | awk -v N=2 '{print $N}'`

  Move that file in a different location.
  Eg: juju run -a nova-compute -- mv $disk_location "$disk_location"_backup

  Check the nova compute logs on the compute node for a warning.
  Eg: juju run -a nova-compute -- grep "DiskNotFound" /var/log/nova/nova-compute.log

  The output should look like the following:

  ```
  2021-09-22 11:07:46.009 26176 WARNING nova.virt.libvirt.driver [req-6e8eb87e-4024-4908-9b7f-0648ecd03eaf - - - - -] Periodic task is updating the host stats, it is trying to get disk info for instance-00000001, but the backing disk storage was removed by a concurrent operation such as resize. Error: No disk at /var/lib/nova/instances/3bd9578f-e7d7-48bc-bdef-d2d4cb25ea29/disk: DiskNotFound: No disk at /var/lib/nova/instances/3bd9578f-e7d7-48bc-bdef-d2d4cb25ea29/disk
  ```

  [Where problems could occur]

  Users who were relying on an error could be affected.
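
  The warning quoted in the test plan suggests the shape of the
  mitigation: the periodic disk accounting skips an instance whose
  backing file has disappeared instead of aborting the whole update.
  A minimal sketch of that pattern follows. Every name in it
  (DiskNotFound, get_disk_size, total_disk_size) is a hypothetical
  local stand-in written for illustration, and it is not nova's
  actual code.

```python
import logging
import os

LOG = logging.getLogger(__name__)


class DiskNotFound(Exception):
    """Hypothetical stand-in for nova's DiskNotFound exception."""


def get_disk_size(path):
    """Return the size of the file at path in bytes.

    Raises DiskNotFound if the file does not exist, mimicking the
    failure seen in the traceback (illustrative helper only).
    """
    if not os.path.exists(path):
        raise DiskNotFound("No disk at %s" % path)
    return os.path.getsize(path)


def total_disk_size(disk_paths_by_instance):
    """Sum disk sizes, skipping instances whose disk is missing.

    disk_paths_by_instance maps an instance name to its disk path.
    """
    total = 0
    for name, path in disk_paths_by_instance.items():
        try:
            total += get_disk_size(path)
        except DiskNotFound as exc:
            # Log and continue, so that one missing disk (for example
            # one renamed by a resize) does not stop the accounting
            # for every other instance on the host.
            LOG.warning("Skipping disk info for %s: %s", name, exc)
    return total
```

  With this structure a missing disk costs one warning per periodic
  run, while the remaining instances are still counted.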