[Yahoo-eng-team] [Bug 2035911] [NEW] Race conditions attaching/detaching volumes
Public bug reported:

For cinder volume attach and detach operations Nova is properly using os-brick's `guard_connection` as a context manager to protect against race conditions between the local disconnection of one volume plus the call to Cinder to unmap/unexport it, and another volume's local connection plus map/export. This is the code:

    def detach(self, context, instance, volume_api, virt_driver,
               attachment_id=None, destroy_bdm=False):
        volume = self._get_volume(context, volume_api, self.volume_id)
        # Let OS-Brick handle high level locking that covers the local
        # os-brick detach and the Cinder call to unmap the volume. Not
        # all volume backends or hosts require locking.
        with brick_utils.guard_connection(volume):
            self._do_detach(context, instance, volume_api, virt_driver,
                            attachment_id, destroy_bdm)

    @update_db
    def attach(self, context, instance, volume_api, virt_driver,
               do_driver_attach=False, **kwargs):
        volume = self._get_volume(context, volume_api, self.volume_id)
        volume_api.check_availability_zone(context, volume,
                                           instance=instance)
        # Let OS-Brick handle high level locking that covers the call to
        # Cinder that exports & maps the volume, and the local os-brick
        # attach. Not all volume backends or hosts require locking.
        with brick_utils.guard_connection(volume):
            self._do_attach(context, instance, volume, volume_api,
                            virt_driver, do_driver_attach)

But there are many other places where Nova attaches or detaches volumes without going through those two methods (`detach` and `attach`). One example is deleting an instance that has cinder volumes attached; another is finishing an instance live migration. Nova needs to always use the `guard_connection` context manager.

Some places will be easy to fix, such as the `_remove_volume_connection` method in nova/compute/manager.py. Others look harder, like `_shutdown_instance` in the same file, because the volumes are locally detached through the `self.driver.destroy` call, and only after all the volumes (potentially from different backends) have been locally removed does it call `self.volume_api.attachment_delete` for each of them.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2035911
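A minimal sketch of the easy case, assuming a simplified signature for `_remove_volume_connection` (the real Nova method takes more arguments); only the idea of keeping the local detach and the Cinder call under one guard comes from the bug text, the body below is illustrative:

    from nova.virt import block_device as driver_block_device
    from os_brick import utils as brick_utils

    def _remove_volume_connection(self, context, bdm, instance):
        # Fetch the volume so guard_connection can decide whether this
        # backend/host combination actually needs locking.
        volume = self.volume_api.get(context, bdm.volume_id)
        with brick_utils.guard_connection(volume):
            # The local os-brick detach and the Cinder unmap/unexport
            # stay inside the same guarded section.
            driver_bdm = driver_block_device.convert_volume(bdm)
            driver_bdm.driver_detach(context, instance,
                                     self.volume_api, self.driver)
            self.volume_api.attachment_delete(context, bdm.attachment_id)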
[Yahoo-eng-team] [Bug 2035375] [NEW] Detaching multiple NVMe-oF volumes may leave the subsystem in connecting state
Public bug reported:

When detaching multiple NVMe-oF volumes from the same host we may end up with an NVMe subsystem stuck in "connecting" state, and we'll see a bunch of nvme errors in dmesg.

This happens on storage systems that share the same subsystem for multiple volumes, because Nova has not been updated to support the tri-state "shared_targets" option that groups the detach and unmap of volumes to prevent race conditions.

This is related to the issue mentioned in an os-brick commit message:
https://review.opendev.org/c/openstack/os-brick/+/836062/12//COMMIT_MSG

** Affects: nova
     Importance: Undecided
     Assignee: Gorka Eguileor (gorka)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Gorka Eguileor (gorka)

https://bugs.launchpad.net/bugs/2035375
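A rough sketch of the tri-state semantics, assuming the interpretation from the linked commit message (False: the backend never shares targets; True: lock unless the host can avoid the scan race; None: targets are always shared, e.g. one NVMe-oF subsystem for many volumes, so always serialize). The real logic lives in os_brick.utils.guard_connection; this is only an illustration:

    import contextlib

    from oslo_concurrency import lockutils

    def guard_connection_sketch(volume, host_avoids_scan_race):
        shared = volume['shared_targets']
        if shared is False:
            return contextlib.nullcontext()
        if shared is None or not host_avoids_scan_race:
            # Serialize attach/detach per cinder-volume service so one
            # volume's unmap cannot race another volume's connection.
            return lockutils.lock(volume['service_uuid'], external=True)
        return contextlib.nullcontext()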
[Yahoo-eng-team] [Bug 2035368] [NEW] Debug config option in cinder and glance sections not working
Public bug reported:

Nova has the ability to individually enable debug mode for the cinder related libraries (cinderclient and os-brick) and for glanceclient, using their respective configuration sections and setting `debug = true` there, regardless of the default debug setting.

Unfortunately these options don't work as expected and have no effect.

** Affects: nova
     Importance: Undecided
     Assignee: Gorka Eguileor (gorka)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Gorka Eguileor (gorka)

https://bugs.launchpad.net/bugs/2035368
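A minimal sketch of what honoring those options could look like; the option locations ([cinder] debug and [glance] debug) come from the bug, while the wiring itself is an assumption for illustration:

    import logging

    def apply_library_debug(conf):
        # Raise the libraries to DEBUG regardless of the global default
        # when their config section asks for it.
        if conf.cinder.debug:
            for name in ('cinderclient', 'os_brick'):
                logging.getLogger(name).setLevel(logging.DEBUG)
        if conf.glance.debug:
            logging.getLogger('glanceclient').setLevel(logging.DEBUG)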
[Yahoo-eng-team] [Bug 2023210] [NEW] Wrong discard_granularity and discard_max_bytes reported in guest OS
Public bug reported:

Assuming we have configured everything correctly in OpenStack to use discard on Cinder block devices, and we have checked that discard works, the wrong values for `discard_granularity` and `discard_max_bytes` are still reported to the VM's guest operating system.

We can confirm this by checking the values on the host and in the guest and seeing that they don't match:

- `/sys/block/<device>/queue/discard_max_bytes`
- `/sys/block/<device>/queue/discard_granularity`

The problem is that there is no code in Nova to set these values in libvirt, nor does Cinder or os-brick have any code to detect the right values to set. The libvirt functionality to set `discard_granularity` and `max_unmap_size` already exists: https://bugzilla.redhat.com/show_bug.cgi?id=1408553

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2023210
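A small helper to gather the host-side values for comparison with the same files inside the guest; `dev` is whatever block device backs the attached volume (e.g. sdb), which is an assumption for the example:

    def read_discard_limits(dev):
        base = '/sys/block/%s/queue' % dev
        with open(base + '/discard_granularity') as f:
            granularity = int(f.read())
        with open(base + '/discard_max_bytes') as f:
            max_bytes = int(f.read())
        return granularity, max_bytes

    # Run on the host and inside the guest; per this bug the two
    # results will not match.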
[Yahoo-eng-team] [Bug 2023079] [NEW] Sparseness is not preserved on live migration
Public bug reported:

When doing a live volume migration (block migration) thin volumes effectively become thick, because all bytes are copied from the source to the destination. I understand that only the filesystem can know which blocks are unallocated, but Nova can set the "detect_zeroes" option when doing the block mirroring.

This doesn't seem to work for all drivers, but I have confirmed that it works for NFS and RBD volumes. Probably someone more knowledgeable should look into why it doesn't work for iSCSI and such.

It requires some additional CPU computational power, so we shouldn't be setting it for normal operations, but I believe the big savings in time and network bandwidth during live migration are well worth it.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2023079
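For reference, detect_zeroes is an attribute of the disk <driver> element in libvirt domain XML. A sketch of injecting it into the destination disk definition used for the block mirroring; the helper name is illustrative, and note that libvirt requires discard='unmap' alongside detect_zeroes='unmap':

    from lxml import etree

    def add_detect_zeroes(disk_xml):
        # disk_xml is a <disk> element's XML, e.g. the destination
        # definition passed to the block copy job.
        disk = etree.fromstring(disk_xml)
        driver = disk.find('driver')
        driver.set('discard', 'unmap')
        driver.set('detect_zeroes', 'unmap')
        return etree.tostring(disk, encoding='unicode')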
[Yahoo-eng-team] [Bug 2023078] [NEW] Wrong discard value after online volume migration
Public bug reported:

Nova incorrectly sets the libvirt XML after an online volume migration when the source is a backend that doesn't support discard (Cinder doesn't return `discard: true` in the connection dictionary) and the destination is one that does. It seems like Nova doesn't rebuild the disk XML, so it's missing the discard='unmap' setting it should have for the new volume.

This bug results in trimming/unmapping commands not working on the new volume until the next time Nova connects the volume. For example an instance reboot will not be enough, but a shelve and unshelve will do the trick and fstrim will work again.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2023078
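For context, this is roughly where the value comes from on a fresh connection: Nova's libvirt volume code derives the disk driver's discard mode from the connection info Cinder returns, along these lines (a paraphrase for illustration, not the exact Nova source):

    def set_discard(conf, connection_info):
        # conf is the libvirt disk config object being built.
        if connection_info['data'].get('discard', False):
            conf.driver_discard = 'unmap'

Per the bug, this derivation is skipped after an online migration because the old disk XML is reused instead of being rebuilt from the new volume's connection info.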
[Yahoo-eng-team] [Bug 2020699] [NEW] Nova's rescue and unrescue assume os-brick connect_volume is idempotent
Public bug reported:

The rescue and unrescue operations in Nova assume that calls to `connect_volume` in os-brick are idempotent, which is currently true, but was never something os-brick guaranteed.

With the recent CVE [1][2] we realized that os-brick cannot assume in `connect_volume` that, if there are devices present for the provided connection information, they belong to the right volume; and even if it is the right volume, it cannot assume that sysfs holds the right information (like the volume size). So it needs to clean things up to the best of its ability before actually connecting, and it also needs to confirm, just before returning a path to the caller, that the device it's going to return is actually correct and consistent (as in: the multipath only has member devices with the same size and SCSI ID).

This means that os-brick's `connect_volume` will no longer be idempotent by design once this patch [3] merges to prevent data leaks in corner cases.

This will break the rescue and unrescue Nova operations, because the rescue call stashes the original XML [4] and then unstashes it on unrescue [5], but in between Nova calls `connect_volume` for the rescue instance, effectively disconnecting the original device path. This means that reusing that original path either points to a non-existent device or to a volume of another instance.

We can see an example of the non-existent device case in the failed CI job [6], where test `tempest.api.compute.servers.test_server_rescue.ServerStableDeviceRescueTest.test_stable_device_rescue_disk_virtio_with_volume_attached` fails with a nova-compute error [7]:

    libvirt.libvirtError: Cannot access storage file '/dev/sdd': No such file or directory

[1]: https://nvd.nist.gov/vuln/detail/CVE-2023-2088
[2]: https://bugs.launchpad.net/nova/+bug/2004555
[3]: https://review.opendev.org/c/openstack/os-brick/+/882841
[4]: https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4229-L4232
[5]: https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4323-L4328
[6]: https://a30336fa6a8fca5c6dba-fe779e5654b21fdff79727b204dfb7d6.ssl.cf1.rackcdn.com/882841/3/check/os-brick-src-tempest-lvm-lio-barbican/8ef7adf/testr_results.html
[7]: https://zuul.opendev.org/t/openstack/build/8ef7adf6a82248d8b9f94eb5b5bba73c/log/controller/logs/screen-n-cpu.txt?severity=4#77239

** Affects: nova
     Importance: High
         Status: Triaged

** Tags: cinder libvirt rescue volumes

https://bugs.launchpad.net/bugs/2020699
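A sketch of the failing sequence, with illustrative helper names standing in for the linked driver code [4][5]:

    def rescue(instance):
        stash_xml(instance, current_guest_xml(instance))  # [4]
        # Connecting the rescue disks may tear down and re-create the
        # original volume's device nodes once [3] merges.
        connect_volume(rescue_disks)

    def unrescue(instance):
        xml = unstash_xml(instance)  # [5]
        # The stashed XML still references the old path (e.g.
        # /dev/sdd), which may now be missing or belong to another
        # volume.
        define_and_start_domain(xml)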
[Yahoo-eng-team] [Bug 1967157] Re: Fails to extend in-use (non LUKS v1) encrypted volumes
Fix available in Zed

** Changed in: os-brick
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1967157

Title: Fails to extend in-use (non LUKS v1) encrypted volumes

Status in OpenStack Compute (nova): Fix Released
Status in os-brick: Fix Released

Bug description:

The patch fixing bug #1861071 resolved the issue of extending LUKS v1 volumes when Nova connects them via libvirt instead of through os-brick, but the Nova side still fails to extend in-use volumes when they don't go through libvirt (i.e., LUKS v2). The logs will show a very similar error, but the user won't know that this has happened and Cinder will show the new size:

    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [req-100471fa-c198-40ac-b713-adc395e480f1 req-3a1ea13e-916b-4851-be67-6d849bf4aa3a service nova] [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] resizing block device failed.: libvirt.libvirtError: internal error: unable to execut>
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] Traceback (most recent call last):
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2809, in extend_volume
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     connection_info, encryption)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2763, in _resize_attached_encrypted_volume
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     decrypted_device_new_size, block_device, instance)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2712, in _resize_attached_volume
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     block_device.resize(new_size)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 789, in resize
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     self._guest._domain.blockResize(self._disk, size, flags=flags)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 193, in doit
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     result = proxy_call(self._autowrap, f, *args, **kwargs)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 151, in proxy_call
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     rv = execute(f, *args, **kwargs)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 132, in execute
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     six.reraise(c, e, tb)
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/usr/local/lib/python3.6/site-packages/six.py", line 719, in reraise
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     raise value
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 86, in tworker
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]     rv = meth(*args, **kwargs)
[Yahoo-eng-team] [Bug 1922052] Re: Missing os-brick commands in debug mode
Moving this bug to os-brick since that's where we are going to fix it, although the issue was caused by a change in Nova's code and will only be visible in Nova.

** Project changed: nova => os-brick

https://bugs.launchpad.net/bugs/1922052

Title: Missing os-brick commands in debug mode

Status in os-brick: Triaged
Status in oslo.privsep: Fix Released
[Yahoo-eng-team] [Bug 1922052] Re: Missing os-brick commands in debug mode
Fix available in Xena (2.6.0)

** Changed in: oslo.privsep
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1922052

Title: Missing os-brick commands in debug mode

Status in OpenStack Compute (nova): Triaged
Status in oslo.privsep: Fix Released
[Yahoo-eng-team] [Bug 1988751] [NEW] Nova not detaching Cinder volume if volume is available
Public bug reported:

It is known that Cinder and Nova can sometimes get out of sync, where one of them thinks that a volume is attached and the other doesn't.

With the Cinder attachment API an operator can now fix the case where Nova says the volume is detached and Cinder says it's not: the operator just needs to run the "cinder attachment-delete" command on the specific attachment.

The opposite situation, where Cinder says the volume is available and Nova says it's attached, currently cannot be fixed without modifying the Nova database and making manual os-brick calls (or the appropriate CLI calls) to detach the volume.

Ideally Nova should be able to call os-brick using the BDM information to locally detach the volume (using the force option, which can lose data) and then not call Cinder to do the detach, since Cinder already says the volume is not mapped.

One way to reproduce this:

- Create a VM
- Create a volume
- Attach the volume
- Delete the attachment in Cinder with "cinder attachment-delete <attachment-id>"
- Try to detach the volume in Nova

The error we'll see is something like:

    ERROR (BadRequest): Invalid volume: Invalid input received: Invalid volume: Unable to detach volume. Volume status must be 'in-use' and attach_status must be 'attached' to detach. (HTTP 400) (Request-ID: req-ec02147a-6b5b-40d2-991c-3d49207f5c9b) (HTTP 400) (Request-ID: req-d8ab82c5-cb32-446e-a8e9-fd8e30be0995)

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1988751
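A minimal sketch of the local-only cleanup the bug asks for. The `disconnect_volume` parameters (force, ignore_errors) are part of os-brick's connector API; the BDM field names and the root helper are illustrative assumptions:

    from os_brick.initiator import connector as brick_connector

    def local_force_detach(bdm):
        conn_info = bdm.connection_info  # stored by Nova at attach time
        conn = brick_connector.InitiatorConnector.factory(
            conn_info['driver_volume_type'], root_helper='sudo')
        # Local OS-level detach only; we deliberately skip the Cinder
        # call because Cinder already considers the volume unmapped.
        conn.disconnect_volume(conn_info['data'], device_info=None,
                               force=True, ignore_errors=True)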
[Yahoo-eng-team] [Bug 1967157] [NEW] Fails to extend in-use (non LUKS v1) encrypted volumes
Public bug reported:

The patch fixing bug #1861071 resolved the issue of extending LUKS v1 volumes when Nova connects them via libvirt instead of through os-brick, but the Nova side still fails to extend in-use volumes when they don't go through libvirt (i.e., LUKS v2). The logs will show a very similar error, but the user won't know that this has happened and Cinder will show the new size. The traceback is the same as in the entry above and ends with:

    ... libvirtError('virDomainBlockResize() failed')
    Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] libvirt.libvirtError: internal error: unable to execute QEMU command 'block_resize': Cannot grow device files

** Affects: nova
     Importance: Undecided
     Assignee: Gorka Eguileor (gorka)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Gorka Eguileor (gorka)

https://bugs.launchpad.net/bugs/1967157
[Yahoo-eng-team] [Bug 1798224] Re: DeprecationWarning: The behavior of .best_match for the Accept classes is currently being maintained for backward compatibility, but the method will be deprecated in the future
** Also affects: cinder
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1798224

Title: DeprecationWarning: The behavior of .best_match for the Accept classes is currently being maintained for backward compatibility, but the method will be deprecated in the future

Status in Cinder: In Progress
Status in OpenStack Compute (nova): Fix Released

Bug description:

When executing 'tox -e py35', the following deprecation warning is shown. It should be fixed.

    2018-10-16 03:36:49.117553 | ubuntu-xenial | {5} nova.tests.unit.api.openstack.compute.test_disk_config.DiskConfigTestCaseV21.test_update_server_override_auto [0.544275s] ... ok
    2018-10-16 03:36:49.117626 | ubuntu-xenial |
    2018-10-16 03:36:49.117666 | ubuntu-xenial | Captured stderr:
    2018-10-16 03:36:49.117703 | ubuntu-xenial | (snipped...)
    2018-10-16 03:36:49.118228 | ubuntu-xenial | b'/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/webob/acceptparse.py:1379: DeprecationWarning: The behavior of .best_match for the Accept classes is currently being maintained for backward compatibility, but the method will be deprecated in the future, as its behavior is not specified in (and currently does not conform to) RFC 7231.'
    2018-10-16 03:36:49.118288 | ubuntu-xenial | b' DeprecationWarning,'
    2018-10-16 03:36:49.118319 | ubuntu-xenial | b''
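For reference, a sketch of the non-deprecated webob API that replaces .best_match (acceptable_offers exists in webob >= 1.8 and returns (offer, quality) pairs, best first); the offers list is illustrative:

    from webob import Request

    req = Request.blank(
        '/', headers={'Accept': 'text/html;q=0.8,application/json'})
    matches = req.accept.acceptable_offers(
        ['application/json', 'text/html'])
    best = matches[0][0] if matches else None  # 'application/json'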
[Yahoo-eng-team] [Bug 1922052] [NEW] Missing os-brick commands in debug mode
Public bug reported:

To debug os-brick's attach and detach code, developers and system administrators rely on seeing which commands are actually being executed by os-brick. The os-brick library relies on the DEBUG logs from the libraries it uses (such as ``oslo_concurrency.processutils``) for this purpose instead of duplicating log entries by logging the calls and stdout/stderr itself.

The default configuration in Nova no longer logs those os-brick commands when running in debug mode. This issue was introduced when fixing bug #1784062, as the fix was to set ALL privsep calls to log only INFO level messages.

The current workaround is to set the ``default_log_levels`` configuration option in Nova and include ``oslo.privsep.daemon=DEBUG`` in it. The default should be the other way around: os-brick should be allowed to emit DEBUG messages in debug mode.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1922052
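The workaround can be applied in nova.conf (append ``oslo.privsep.daemon=DEBUG`` to ``default_log_levels``) or programmatically; a minimal sketch with oslo.log, whose ``get_default_log_levels`` and ``set_defaults`` are public API:

    from oslo_log import log as logging

    levels = logging.get_default_log_levels()
    levels.append('oslo.privsep.daemon=DEBUG')
    logging.set_defaults(default_log_levels=levels)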
[Yahoo-eng-team] [Bug 1829343] [NEW] requirements-check job fails
Public bug reported:

The requirements-check job will always fail with the current Nova doc requirements:

    2019-05-14 15:14:03.375075 | TASK [Run requirements check script]
    2019-05-14 15:14:15.756135 | ubuntu-bionic | sys.version_info(major=3, minor=6, micro=7, releaselevel='final', serial=0)
    2019-05-14 15:14:15.756335 | ubuntu-bionic | selecting default requirements directory for normal mode
    2019-05-14 15:14:15.756388 | ubuntu-bionic | Branch: master
    2019-05-14 15:14:15.756464 | ubuntu-bionic | Source: src/opendev.org/openstack/nova
    2019-05-14 15:14:15.756573 | ubuntu-bionic | Requirements: /home/zuul/src/opendev.org/openstack/requirements
    2019-05-14 15:14:15.756631 | ubuntu-bionic | git log -n 1 --format=%H
    2019-05-14 15:14:15.756735 | ubuntu-bionic | Patch under test: b'f029534938012a19e4eee2d0927f4ccb2747b8fe'
    2019-05-14 15:14:15.756870 | ubuntu-bionic | git --git-dir /home/zuul/src/opendev.org/openstack/requirements/.git rev-parse HEAD
    2019-05-14 15:14:15.756988 | ubuntu-bionic | requirements git sha: b'85f9aad798323c5f6dd8ce1a3dd5e009c2f944a1'
    2019-05-14 15:14:15.757056 | ubuntu-bionic | virtualenv /tmp/tmp62m_7g3d/venv
    2019-05-14 15:14:15.757192 | ubuntu-bionic | /tmp/tmp62m_7g3d/venv/bin/pip install /home/zuul/src/opendev.org/openstack/requirements
    2019-05-14 15:14:15.757289 | ubuntu-bionic | Checking b'f029534938012a19e4eee2d0927f4ccb2747b8fe'
    2019-05-14 15:14:15.757350 | ubuntu-bionic | Processing requirements.txt
    2019-05-14 15:14:15.757417 | ubuntu-bionic | Processing test-requirements.txt
    2019-05-14 15:14:15.757482 | ubuntu-bionic | Processing doc/requirements.txt
    2019-05-14 15:14:15.757539 | ubuntu-bionic | Processing .[osprofiler]
    2019-05-14 15:14:15.757603 | ubuntu-bionic | Validating requirements.txt
    2019-05-14 15:14:15.757671 | ubuntu-bionic | Validating test-requirements.txt
    2019-05-14 15:14:15.757736 | ubuntu-bionic | Validating doc/requirements.txt
    2019-05-14 15:14:15.757999 | ubuntu-bionic | Requirement(package='sphinx', location='', specifiers='!=1.6.6,!=1.6.7,>=1.6.2', markers='', comment='# BSD', extras=frozenset()) 'markers': '' does not match "python_version=='2.7'"
    2019-05-14 15:14:15.758262 | ubuntu-bionic | Requirement(package='sphinx', location='', specifiers='!=1.6.6,!=1.6.7,>=1.6.2', markers='', comment='# BSD', extras=frozenset()) 'markers': '' does not match "python_version>='3.4'"
    2019-05-14 15:14:15.758537 | ubuntu-bionic | Could not find a global requirements entry to match package sphinx. If the package is already included in the global list, the name or platform markers there may not match the local settings.
    2019-05-14 15:14:15.758596 | ubuntu-bionic | Validating osprofiler
    2019-05-14 15:14:15.758684 | ubuntu-bionic | Validating lower constraints of requirements.txt
    2019-05-14 15:14:15.758777 | ubuntu-bionic | Validating lower constraints of test-requirements.txt
    2019-05-14 15:14:15.758865 | ubuntu-bionic | Validating lower constraints of osprofiler
    2019-05-14 15:14:15.758938 | ubuntu-bionic | *** Incompatible requirement found!
    2019-05-14 15:14:15.759063 | ubuntu-bionic | *** See https://docs.openstack.org/requirements/latest/
    2019-05-14 15:14:16.085404 | ubuntu-bionic | ERROR
    2019-05-14 15:14:16.085856 | ubuntu-bionic | {
    2019-05-14 15:14:16.085977 | ubuntu-bionic |     "delta": "0:00:11.176407",
    2019-05-14 15:14:16.086084 | ubuntu-bionic |     "end": "2019-05-14 15:14:15.781517",
    2019-05-14 15:14:16.086187 | ubuntu-bionic |     "msg": "non-zero return code",
    2019-05-14 15:14:16.086288 | ubuntu-bionic |     "rc": 1,
    2019-05-14 15:14:16.086391 | ubuntu-bionic |     "start": "2019-05-14 15:14:04.605110"
    2019-05-14 15:14:16.086490 | ubuntu-bionic | }
    2019-05-14 15:14:16.170292 |

** Affects: nova
     Importance: Undecided
     Assignee: Gorka Eguileor (gorka)
         Status: In Progress

** Changed in: nova
     Assignee: (unassigned) => Gorka Eguileor (gorka)

https://bugs.launchpad.net/bugs/1829343
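The failure says the local sphinx entry has no environment markers while global-requirements expects the two markers quoted in the output. A sketch of what the corresponding doc/requirements.txt lines could look like (the specifiers are taken from the output above; the exact entries in global-requirements may differ):

    sphinx!=1.6.6,!=1.6.7,>=1.6.2;python_version=='2.7'  # BSD
    sphinx!=1.6.6,!=1.6.7,>=1.6.2;python_version>='3.4'  # BSD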
[Yahoo-eng-team] [Bug 1800515] [NEW] Unnecessary locking when connecting volumes
Public bug reported:

Cinder introduced the "shared_targets" and "service_uuid" fields on volumes to allow volume consumers to protect themselves from unintended leftover devices when handling iSCSI connections with shared targets. The way to protect against the automatic scans that happen on detach/map race conditions is by locking, allowing only one attach or one detach operation per server at a given time.

When using an up-to-date Open-iSCSI initiator we don't need to use locks, as it can disable automatic LUN scans (which are the real cause of the leftover devices), and OS-Brick already supports this feature.

Currently Nova is blindly locking whenever "shared_targets" is set to True, even when the iSCSI initiator and OS-Brick are already preventing such races, which introduces unnecessary serialization when connecting volumes.

** Affects: nova
     Importance: Undecided
     Assignee: Gorka Eguileor (gorka)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Gorka Eguileor (gorka)

https://bugs.launchpad.net/bugs/1800515
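A sketch of the conditional locking the bug argues for; the capability check is an illustrative stand-in, since the actual detection of manual-scan support happens inside os-brick:

    import contextlib

    from oslo_concurrency import lockutils

    def volume_guard(volume, initiator_avoids_scans):
        # Lock only when the backend shares targets AND the host's
        # initiator cannot disable automatic LUN scans.
        if volume.shared_targets and not initiator_avoids_scans:
            return lockutils.lock(volume.service_uuid, external=True)
        return contextlib.nullcontext()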
[Yahoo-eng-team] [Bug 1703954] Re: Attach/Detach encrypted volume problems with real paths
** Changed in: os-brick
       Status: New => Fix Released

** Changed in: cinder
     Assignee: (unassigned) => Gorka Eguileor (gorka)

https://bugs.launchpad.net/bugs/1703954

Title: Attach/Detach encrypted volume problems with real paths

Status in Cinder: New
Status in OpenStack Compute (nova): Incomplete
Status in os-brick: Fix Released
[Yahoo-eng-team] [Bug 1703954] [NEW] Attach/Detach encrypted volume problems with real paths
Public bug reported:

OS-Brick 1.14 and 1.15 return real paths instead of symbolic links, which results in the encryption attach_volume call replacing the real device with a link to the crypt dm.

The issue comes from the Nova flow when attaching an encrypted volume:

1- Attach volume
2- Generate libvirt configuration with path from step 1
3- Encrypt attach volume

Since step 2 has already generated the config with the path from step 1, step 3 must preserve this path. When step 1 returns a symbolic link we just forcefully replace it with a link to the crypt dm and everything is OK, but when we return a real path it does the same thing, which means we'll be replacing for example /dev/sda with a symlink, which will then break the detach process and all future attachments.

If the flow order were changed to 1, 3, 2, then the encrypt attach volume step could provide a different path to be used for the libvirt config generation.

** Affects: cinder
     Importance: Undecided
         Status: New

** Affects: nova
     Importance: Undecided
         Status: New

** Affects: os-brick
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1703954
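A sketch of the proposed 1-3-2 ordering; all helper names are illustrative, not Nova's real methods:

    def attach_encrypted_volume(volume):
        path = connect_volume(volume)            # step 1
        # Step 3 moved up: the encryptor can now hand back the path the
        # guest must actually use (e.g. the crypt dm device) instead of
        # overwriting the device behind an already-generated config.
        path = attach_encryptor(volume, path)
        return generate_libvirt_config(volume, path)   # step 2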
[Yahoo-eng-team] [Bug 1593055] Re: Retype an in-use volume failed in mitaka
*** This bug is a duplicate of bug 1381153 *** https://bugs.launchpad.net/bugs/1381153

Just for reference, this is a well-known issue with some QEMU versions that don't support the blockcopy command. You just need to build and install the right QEMU binary if your distro doesn't provide one. There are some instructions around that may help you: https://www.jrssite.com/wordpress/?p=302 (see the version-check sketch after the traceback below).

** Changed in: nova
   Status: New => Invalid

** This bug has been marked a duplicate of bug 1381153
   Cannot create instance live snapshots in Centos7 (icehouse)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1593055

Title:
  Retype an in-use volume failed in mitaka

Status in Cinder: Invalid
Status in OpenStack Compute (nova): Invalid

Bug description:
  To reproduce:
  1. create a VM
  2. create a volume on the LVM backend
  3. attach the volume to the VM
  4. retype the volume to another backend

  From the log this looks like a QEMU version problem. My QEMU version is:

  [root@localhost logs]# rpm -qa | grep qemu
  qemu-img-1.5.3-105.el7_2.4.x86_64
  libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
  ipxe-roms-qemu-20130517-8.gitc4bce43.el7_2.1.noarch
  qemu-kvm-1.5.3-105.el7_2.4.x86_64
  qemu-kvm-common-1.5.3-105.el7_2.4.x86_64

  Error logs:

  2016-06-16 19:03:46.892 ERROR oslo_messaging.rpc.server [req-8c37204e-3484-448a-8aed-35f38403178d admin admin] Exception during handling message
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server Traceback (most recent call last):
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 153, in dispatch
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 122, in _do_dispatch
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception.py", line 110, in wrapped
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     payload)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     self.force_reraise()
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     return f(self, context, *args, **kw)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 359, in decorated_function
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     LOG.warning(msg, e, instance=instance)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     self.force_reraise()
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 328, in decorated_function
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 387, in decorated_function
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info())
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server     self.force_reraise()
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
  2016-06-16 19:03:46.892 TRACE
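Since the fix here is environmental (a QEMU new enough for blockcopy), a host can be sanity-checked by asking libvirt for its own and the hypervisor's versions before attempting the operation. Below is a minimal sketch using the libvirt Python bindings; the minimum version numbers are illustrative assumptions, not the thresholds Nova actually enforces.

  # Minimal sketch (not Nova code) of checking whether the local
  # libvirt/QEMU pair is new enough for blockcopy-based operations.
  # The thresholds below are assumptions; check your distro's notes.
  import libvirt

  # libvirt encodes versions as major * 1000000 + minor * 1000 + release.
  MIN_LIBVIRT = 1 * 1000000 + 2 * 1000 + 2   # assumed minimum 1.2.2
  MIN_QEMU = 1 * 1000000 + 3 * 1000          # assumed minimum 1.3.0

  def supports_blockcopy(uri='qemu:///system'):
      conn = libvirt.open(uri)
      try:
          lib_ver = conn.getLibVersion()  # libvirt library version
          hv_ver = conn.getVersion()      # hypervisor (QEMU) version
          return lib_ver >= MIN_LIBVIRT and hv_ver >= MIN_QEMU
      finally:
          conn.close()

  if __name__ == '__main__':
      print('blockcopy supported:', supports_blockcopy())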
[Yahoo-eng-team] [Bug 1268439] Re: range method is not same in py3.x and py2.x
Fix released as LP bug 1530249

** Changed in: cinder
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1268439

Title:
  range method is not same in py3.x and py2.x

Status in Ceilometer: Fix Released
Status in Cinder: Fix Released
Status in Glance: Fix Released
Status in heat: Fix Released
Status in OpenStack Identity (keystone): Invalid
Status in neutron: Fix Released
Status in python-ceilometerclient: Fix Released
Status in python-neutronclient: Invalid
Status in python-swiftclient: Fix Released
Status in OpenStack Object Storage (swift): In Progress

Bug description:
  In py3.x, range behaves like xrange in py2.x: it returns a lazy range
  object rather than a list. In py3.x, if you want a list you must use
  list(range(value)). Reviewing the code, I found many places that use
  range expecting a list; in a py3.x environment these raise errors, so
  we must fix this issue (see the example below).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1268439/+subscriptions
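A short illustration of the incompatibility and the usual portable fixes (using six, which these projects already depended on at the time):

  # Under Python 2, range(3) returns the list [0, 1, 2]; under Python 3
  # it returns a lazy range object, so list-only operations break.
  nums = range(3)
  # nums.append(3)         # works on py2, AttributeError on py3

  # Portable fix 1: build an explicit list on both versions.
  nums = list(range(3))
  nums.append(3)

  # Portable fix 2: iterate lazily on both versions.
  from six.moves import range  # py2: xrange, py3: builtin range
  for i in range(10**8):       # no huge list is materialized
      break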
[Yahoo-eng-team] [Bug 1449639] Re: RBD: On image creation error, image is not deleted
** Description changed:

  When an exception is raised while adding/creating an image, and the
  image has already been created, the new image is not properly deleted.
  The fault lies in the `_delete_image` call of the Store.add method,
  which passes incorrect arguments.
+
+ This also affects Glance (Icehouse), since back then glance_store
+ functionality was included there.

** Also affects: glance
   Importance: Undecided
   Status: New

** Changed in: glance
   Assignee: (unassigned) => Gorka Eguileor (gorka)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1449639

Title:
  RBD: On image creation error, image is not deleted

Status in OpenStack Image Registry and Delivery Service (Glance): New
Status in OpenStack Glance backend store-drivers library (glance_store): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1449639/+subscriptions
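The underlying pattern the fix needs is: create the image, and on any later failure delete it again, passing the delete helper exactly the arguments it expects. A minimal sketch of that pattern follows; all names in it (RBDStore, _create_image, _write_data, _delete_image, loc) are hypothetical stand-ins, not the actual glance_store API.

  # Sketch of create-then-cleanup-on-error, with hypothetical names.
  class RBDStore(object):
      def add(self, image_id, data):
          loc = self._create_image(image_id)  # image now exists in the pool
          try:
              self._write_data(loc, data)
          except Exception:
              # The bug: cleanup was invoked with the wrong arguments, so
              # it failed and left the half-written image behind. The fix
              # is to pass exactly what _delete_image expects.
              self._delete_image(loc)
              raise
          return loc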
[Yahoo-eng-team] [Bug 1422699] [NEW] glance api doesn't abort start up on Store configuration errors
Public bug reported:

The glance api service does not abort start up when errors in the
glance-api.conf file are encountered. It would make sense to abort
service start up when a BadStoreConfiguration exception is encountered,
instead of just sending the error to the logs and disabling adding
images to that Store.

For example, if a Filesystem Storage Backend with multiple stores is
configured with a duplicate directory:

  filesystem_store_datadirs=/mnt/nfs1/images/:200
  filesystem_store_datadirs=/mnt/nfs1/images/:100

the logs will have the error:

  ERROR glance_store._drivers.filesystem [-] Directory /mnt/nfs1/image specified multiple times in filesystem_store_datadirs option of filesystem configuration
  TRACE glance_store._drivers.filesystem None
  TRACE glance_store._drivers.filesystem
  WARNING glance_store.driver [-] Failed to configure store correctly: None Disabling add method.

The service will start, and when a client tries to add an image they
will receive a 410 Gone error saying: "Error in store configuration.
Adding images to store is disabled."

This affects not only the filesystem storage backend but all glance-store
drivers that encounter an error in the configuration and raise a
BadStoreConfiguration exception.

How reproducible: Every time

Steps to Reproduce:
1. Configure Glance to use the Filesystem Storage Backend with multiple
   stores and duplicate a filesystem_store_datadirs entry.
2. Run glance api.

Expected behavior:
The glance api service should not have started and should have reported
that the directory was specified multiple times (a fail-fast validation
sketch follows below).

** Affects: glance
   Importance: Undecided
   Status: New

** Affects: glance-store
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1422699

Title:
  glance api doesn't abort start up on Store configuration errors

Status in OpenStack Image Registry and Delivery Service (Glance): New
Status in OpenStack Glance backend store-drivers library (glance_store): New
To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1422699/+subscriptions
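The expected behavior amounts to validating the store configuration eagerly and raising instead of logging-and-disabling. A minimal sketch of that idea follows; the names used (BadStoreConfiguration, validate_datadirs) mirror the concept but are not the actual glance_store code paths.

  # Sketch of fail-fast validation of duplicate datadirs.
  class BadStoreConfiguration(Exception):
      pass

  def validate_datadirs(datadir_opts):
      """datadir_opts: list of 'path:priority' strings from the config."""
      seen = set()
      for entry in datadir_opts:
          path, _, _priority = entry.partition(':')
          if path in seen:
              # Raise instead of logging and disabling the add method,
              # so the service refuses to start with a broken config.
              raise BadStoreConfiguration(
                  'Directory %s specified multiple times in '
                  'filesystem_store_datadirs' % path)
          seen.add(path)

  if __name__ == '__main__':
      try:
          validate_datadirs(['/mnt/nfs1/images/:200',
                             '/mnt/nfs1/images/:100'])
      except BadStoreConfiguration as exc:
          # At service start up, abort rather than swallow the error.
          raise SystemExit('aborting start up: %s' % exc)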
[Yahoo-eng-team] [Bug 1422699] Re: glance api doesn't abort start up on Store configuration errors
** Also affects: glance-store
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1422699

Title:
  glance api doesn't abort start up on Store configuration errors

Status in OpenStack Image Registry and Delivery Service (Glance): New
Status in OpenStack Glance backend store-drivers library (glance_store): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1422699/+subscriptions