[Yahoo-eng-team] [Bug 2035911] [NEW] Race conditions attaching/detaching volumes

2023-09-14 Thread Gorka Eguileor
Public bug reported:

For Cinder volume attach and detach operations Nova properly uses
os-brick's `guard_connection` context manager to protect against race
conditions between the local disconnection of a volume (plus the Cinder
call to unmap/unexport it) and other volumes' local connection and
map/export.

This is the code:

def detach(self, context, instance, volume_api, virt_driver,
           attachment_id=None, destroy_bdm=False):
    volume = self._get_volume(context, volume_api, self.volume_id)
    # Let OS-Brick handle high level locking that covers the local os-brick
    # detach and the Cinder call to unmap the volume.  Not all volume
    # backends or hosts require locking.
    with brick_utils.guard_connection(volume):
        self._do_detach(context, instance, volume_api, virt_driver,
                        attachment_id, destroy_bdm)


@update_db
def attach(self, context, instance, volume_api, virt_driver,
           do_driver_attach=False, **kwargs):
    volume = self._get_volume(context, volume_api, self.volume_id)
    volume_api.check_availability_zone(context, volume,
                                       instance=instance)
    # Let OS-Brick handle high level locking that covers the call to
    # Cinder that exports & maps the volume, and for the local os-brick
    # attach.  Not all volume backends or hosts require locking.
    with brick_utils.guard_connection(volume):
        self._do_attach(context, instance, volume, volume_api,
                        virt_driver, do_driver_attach)

But there are many other places where Nova attaches or detaches volumes
without using those two methods (`attach` and `detach`).

One example is deleting an instance that has Cinder volumes attached;
another is finishing an instance live migration.

Nova needs to always use the `guard_connection` context manager.

There are places where it will be easy to fix, such as the
`_remove_volume_connection` method in nova/compute/manager.py.
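
A minimal sketch of what that easier case could look like, reusing the
`brick_utils.guard_connection` pattern from the snippets above;
`locally_detach` is a hypothetical stand-in for whatever helper performs the
os-brick detach on the host, not an existing Nova function:

from os_brick import utils as brick_utils


def remove_volume_connection_guarded(context, instance, bdm, volume_api,
                                     locally_detach):
    # Sketch only: hold the guard across both the local detach and the
    # Cinder unmap so nothing can race with another volume's map/connect
    # on this host (same pattern as the detach()/attach() methods above).
    volume = volume_api.get(context, bdm.volume_id)
    with brick_utils.guard_connection(volume):
        locally_detach(context, instance, bdm)  # os-brick disconnect
        volume_api.attachment_delete(context, bdm.attachment_id)  # Cinder unmap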

But there are others where it looks like it will be harder, like
`_shutdown_instance` in the same file, because the volumes are locally
detached through the `self.driver.destroy` call, and only after all the
volumes (potentially from different backends) have been locally removed
does it call `self.volume_api.attachment_delete` for each of them.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2035911

Title:
  Race conditions attaching/detaching volumes

Status in OpenStack Compute (nova):
  New

Bug description:
  For Cinder volume attach and detach operations Nova properly uses
  os-brick's `guard_connection` context manager to protect against race
  conditions between the local disconnection of a volume (plus the Cinder
  call to unmap/unexport it) and other volumes' local connection and
  map/export.

  This is the code:

  def detach(self, context, instance, volume_api, virt_driver,
             attachment_id=None, destroy_bdm=False):
      volume = self._get_volume(context, volume_api, self.volume_id)
      # Let OS-Brick handle high level locking that covers the local os-brick
      # detach and the Cinder call to unmap the volume.  Not all volume
      # backends or hosts require locking.
      with brick_utils.guard_connection(volume):
          self._do_detach(context, instance, volume_api, virt_driver,
                          attachment_id, destroy_bdm)


  @update_db
  def attach(self, context, instance, volume_api, virt_driver,
             do_driver_attach=False, **kwargs):
      volume = self._get_volume(context, volume_api, self.volume_id)
      volume_api.check_availability_zone(context, volume,
                                         instance=instance)
      # Let OS-Brick handle high level locking that covers the call to
      # Cinder that exports & maps the volume, and for the local os-brick
      # attach.  Not all volume backends or hosts require locking.
      with brick_utils.guard_connection(volume):
          self._do_attach(context, instance, volume, volume_api,
                          virt_driver, do_driver_attach)

  But there are many other places where Nova attaches or detaches volumes
  without using those two methods (`attach` and `detach`).

  One example is deleting an instance that has Cinder volumes attached;
  another is finishing an instance live migration.

  Nova needs to always use the `guard_connection` context manager.

  There are places where it will be easy to fix, such as the
  `_remove_volume_connection` method in nova/compute/manager.py.

  But there are others where it looks like it will be harder, like
  `_shutdown_instance` in the same file.

[Yahoo-eng-team] [Bug 2035375] [NEW] Detaching multiple NVMe-oF volumes may leave the subsystem in connecting state

2023-09-13 Thread Gorka Eguileor
Public bug reported:

When detaching multiple NVMe-oF volumes from the same host we may end
up with an NVMe subsystem in "connecting" state, and we'll see a bunch of
nvme errors in dmesg.

This happens on storage systems that share the same subsystem for
multiple volumes because Nova has not been updated to support the tri-
state "shared_targets" option that groups the detach and unmap of
volumes to prevent race conditions.

This is related to the issue mentioned in an os-brick commit message:
https://review.opendev.org/c/openstack/os-brick/+/836062/12//COMMIT_MSG
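
For illustration only, this is my reading of what the tri-state field means
for a consumer deciding whether to serialize detach and unmap (based on the
commit message above, not on actual Nova or os-brick code):

def needs_detach_lock(shared_targets, initiator_supports_manual_scans):
    # Illustrative interpretation of the tri-state value (an assumption):
    #   False -> backend never shares targets, no lock needed.
    #   True  -> targets are shared, lock unless the initiator can disable
    #            the automatic scans that cause the races.
    #   None  -> always lock, e.g. NVMe-oF backends where many volumes share
    #            one subsystem and detach + unmap must be grouped.
    if shared_targets is False:
        return False
    if shared_targets is None:
        return True
    return not initiator_supports_manual_scans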

** Affects: nova
 Importance: Undecided
 Assignee: Gorka Eguileor (gorka)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Gorka Eguileor (gorka)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2035375

Title:
  Detaching multiple NVMe-oF volumes may leave the subsystem in
  connecting state

Status in OpenStack Compute (nova):
  New

Bug description:
  When detaching multiple NVMe-oF volumes from the same host we may end
  up with an NVMe subsystem in "connecting" state, and we'll see a bunch
  of nvme errors in dmesg.

  This happens on storage systems that share the same subsystem for
  multiple volumes because Nova has not been updated to support the tri-
  state "shared_targets" option that groups the detach and unmap of
  volumes to prevent race conditions.

  This is related to the issue mentioned in an os-brick commit message:
  https://review.opendev.org/c/openstack/os-brick/+/836062/12//COMMIT_MSG

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2035375/+subscriptions




[Yahoo-eng-team] [Bug 2035368] [NEW] Debug config option in cinder and glance sections not working

2023-09-13 Thread Gorka Eguileor
Public bug reported:

Nova has the ability to individually set debug mode for the Cinder-related
libraries (cinderclient and os-brick) and for glanceclient by setting
`debug = true` in their respective configuration sections, regardless of
the default debug setting.

Unfortunately these options don't work as expected and have no effect.

** Affects: nova
 Importance: Undecided
 Assignee: Gorka Eguileor (gorka)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Gorka Eguileor (gorka)

** Description changed:

- Nova has the possibility of individually setting debug mode for cinder
+ Nova has the hability of individually setting debug mode for cinder
  related libraries (cinderclient and os-brick) and for glanceclient using
  their respective configuration sections and setting `debug = true`
  regardless of the default debug setting.
  
  Unfortunately these options don't work as expected and have no effect.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2035368

Title:
  Debug config option in cinder and glance sections not working

Status in OpenStack Compute (nova):
  New

Bug description:
  Nova has the ability to individually set debug mode for the
  Cinder-related libraries (cinderclient and os-brick) and for
  glanceclient by setting `debug = true` in their respective configuration
  sections, regardless of the default debug setting.

  Unfortunately these options don't work as expected and have no effect.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2035368/+subscriptions




[Yahoo-eng-team] [Bug 2023210] [NEW] Wrong discard_granularity and discard_max_bytes reported in guest OS

2023-06-07 Thread Gorka Eguileor
Public bug reported:

Assuming we have configured everything correctly in OpenStack to use
discard on Cinder block devices and we have checked that discard works,
the wrong values for `discard_granularity` and `discard_max_bytes` are
reported to the VM's guest operating system.

We can confirm this by checking the values on the host and the guest and
seeing that they don't match:

- `/sys/block/<device>/queue/discard_max_bytes`
- `/sys/block/<device>/queue/discard_granularity`
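
A trivial way to gather those values from Python on both sides for
comparison (illustrative only; replace <device> with the real block device
name on the host and in the guest):

from pathlib import Path


def discard_settings(device):
    # Read the discard values the kernel reports for a block device.
    queue = Path('/sys/block') / device / 'queue'
    return {name: int((queue / name).read_text())
            for name in ('discard_granularity', 'discard_max_bytes')}

# Run e.g. discard_settings('sdb') on the host and discard_settings('vdb')
# inside the guest, then compare the two dictionaries.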

The problem is that there is no code in Nova to set these values in
libvirt, nor does Cinder or os-brick have any code to detect the right
values to set.

The libvirt functionality to set `discard_granularity` and
`max_unmap_size` already exists
(https://bugzilla.redhat.com/show_bug.cgi?id=1408553).

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2023210

Title:
  Wrong discard_granularity and discard_max_bytes reported in guest OS

Status in OpenStack Compute (nova):
  New

Bug description:
  Assuming we have configured everything correctly in OpenStack to use
  discard on Cinder block devices and we have checked that discard works,
  the wrong values for `discard_granularity` and `discard_max_bytes` are
  reported to the VM's guest operating system.

  We can confirm this by checking the values on the host and the guest
  and seeing that they don't match:

  - `/sys/block/<device>/queue/discard_max_bytes`
  - `/sys/block/<device>/queue/discard_granularity`

  The problem is that there is no code in Nova to set these values in
  libvirt, nor does Cinder or os-brick have any code to detect the right
  values to set.

  The libvirt functionality to set `discard_granularity` and
  `max_unmap_size` already exists
  (https://bugzilla.redhat.com/show_bug.cgi?id=1408553).

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2023210/+subscriptions




[Yahoo-eng-team] [Bug 2023079] [NEW] Sparseness is not preserved on live migration

2023-06-06 Thread Gorka Eguileor
Public bug reported:

When doing a live volume migration (block migration) thin volumes
effectively become thick because all bytes are copied from the source to
the destination.

I understand that only the filesystem can know which blocks are
unallocated, but Nova can set the "detect_zeroes" option when doing the
block mirroring.

This doesn't seem to work for all drivers, but I have confirmed that it
works for NFS and RBD volumes.  Probably someone more knowledgeable
should look into why it doesn't work for iSCSI and such.

It requires some additional CPU computational power, so we shouldn't be
setting it for normal operations, but I believe the big difference in
time and network bandwidth during live migration is well worth it.
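
For reference, this is roughly the disk `<driver>` element that would enable
zero detection; built here with plain ElementTree as an illustration, since
I'm not quoting Nova's actual libvirt config code (attribute values are
examples):

import xml.etree.ElementTree as ET

# Example only: a raw disk driver element asking QEMU to detect zeroed
# blocks and turn them into unmapped holes on the destination, which is
# what would keep thin volumes thin during the block mirroring.
driver = ET.Element('driver', {
    'name': 'qemu',
    'type': 'raw',
    'discard': 'unmap',
    'detect_zeroes': 'unmap',
})
print(ET.tostring(driver, encoding='unicode'))
# <driver name="qemu" type="raw" discard="unmap" detect_zeroes="unmap" />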

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2023079

Title:
  Sparseness is not preserved on live migration

Status in OpenStack Compute (nova):
  New

Bug description:
  When doing a live volume migration (block migration) thin volumes
  effectively become thick because all bytes are copied from the source
  to the destination.

  I understand that only the filesystem can know which blocks are
  unallocated, but Nova can set the "detect_zeroes" option when doing the
  block mirroring.

  This doesn't seem to work for all drivers, but I have confirmed that
  it works for NFS and RBD volumes.  Probably someone more knowledgeable
  should look into why it doesn't work for iSCSI and such.

  It requires some additional CPU computational power, so we shouldn't
  be setting it for normal operations, but I believe the big difference
  in time and network bandwidth during live migration is well worth it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2023079/+subscriptions




[Yahoo-eng-team] [Bug 2023078] [NEW] Wrong discard value after online volume migration

2023-06-06 Thread Gorka Eguileor
Public bug reported:

Nova incorrectly sets the libvirt XML after an online volume migration
from a backend that doesn't support discard (Cinder doesn't return
`discard: true` in the connection dictionary) to one that does.

It seems like Nova doesn't rebuild the disk XML, so it's missing the
`discard=unmap` attribute it should have for the new volume.

This bug results in the trimming/unmapping commands not working on the
new volume until the next time Nova connects the volume.

For example, an instance reboot will not be enough, but a shelve and
unshelve will do the trick and `fstrim` will work again.
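
A quick illustrative check of an instance's domain XML (e.g. the output of
`virsh dumpxml`) for disks that ended up without the flag; this is a helper
sketch, not Nova code:

import xml.etree.ElementTree as ET


def disks_missing_unmap(domain_xml):
    # Return the target dev names of disks whose <driver> lacks discard='unmap'.
    root = ET.fromstring(domain_xml)
    missing = []
    for disk in root.findall('./devices/disk'):
        driver = disk.find('driver')
        target = disk.find('target')
        if driver is None or driver.get('discard') != 'unmap':
            missing.append(target.get('dev') if target is not None else '?')
    return missing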

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2023078

Title:
  Wrong discard value after online volume migration

Status in OpenStack Compute (nova):
  New

Bug description:
  Nova incorrectly sets the libvirt XML after an online volume migration
  from a backend that doesn't support discard (Cinder doesn't return
  `discard: true` in the connection dictionary) to one that does.

  It seems like Nova doesn't rebuild the disk XML, so it's missing the
  `discard=unmap` attribute it should have for the new volume.

  This bug results in the trimming/unmapping commands not working on the
  new volume until the next time Nova connects the volume.

  For example, an instance reboot will not be enough, but a shelve and
  unshelve will do the trick and `fstrim` will work again.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2023078/+subscriptions




[Yahoo-eng-team] [Bug 2020699] [NEW] Nova's rescue and unrescue assumes os-brick connect_volume is idempotent

2023-05-24 Thread Gorka Eguileor
Public bug reported:

The rescue and unrescue operations in Nova assume that calls to
`connect_volume` in os-brick are idempotent, which is currently true,
but it was never something we guaranteed in os-brick.

With the recent CVE [1][2] we realized that os-brick's `connect_volume`
cannot assume that a device already present for the provided connection
information is the right volume, and even if it is the right volume it
cannot assume that sysfs has the right information for it (like the
volume size). So it needs to clean things up to the best of its ability
before actually connecting, and, just in case, confirm right before
returning a path to the caller that the device it's going to return is
actually correct and consistent (as in the multipath only has devices
with the same size and SCSI ID).

This means that os-brick's `connect_volume` will no longer be idempotent
by design once this patch [3] merges, in order to prevent data leaks in
corner cases.

This will break the Nova rescue and unrescue operations, because the
rescue call stashes the original XML [4] and unrescue restores it [5],
but in between Nova calls `connect_volume` for the rescue instance,
effectively disconnecting the original device path.

This means that reusing that original path either points to a
non-existent device or to a volume of another instance.

We can see an example of the non-existent device case in the failed CI
job [6] where test
`tempest.api.compute.servers.test_server_rescue.ServerStableDeviceRescueTest.test_stable_device_rescue_disk_virtio_with_volume_attached`
fails with a nova-compute error [7]:

  libvirt.libvirtError: Cannot access storage file '/dev/sdd': No such
file or directory


[1]: https://nvd.nist.gov/vuln/detail/CVE-2023-2088

[2]: https://bugs.launchpad.net/nova/+bug/2004555

[3]: https://review.opendev.org/c/openstack/os-brick/+/882841

[4]:
https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4229-L4232

[5]:
https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4323-L4328

[6]: https://a30336fa6a8fca5c6dba-fe779e5654b21fdff79727b204dfb7d6.ssl.cf1.rackcdn.com/882841/3/check/os-brick-src-tempest-lvm-lio-barbican/8ef7adf/testr_results.html

[7]: https://zuul.opendev.org/t/openstack/build/8ef7adf6a82248d8b9f94eb5b5bba73c/log/controller/logs/screen-n-cpu.txt?severity=4#77239

** Affects: nova
 Importance: High
 Status: Triaged


** Tags: cinder libvirt rescue volumes

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2020699

Title:
  Nova's rescue and unrescue assumes os-brick connect_volume is
  idempotent

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  The rescue and unrescue operations in Nova assume that calls to
  `connect_volume` in os-brick are idempotent, which is currently true,
  but it was never something we guaranteed in os-brick.

  With the recent CVE [1][2] we realized that os-brick's `connect_volume`
  cannot assume that a device already present for the provided connection
  information is the right volume, and even if it is the right volume it
  cannot assume that sysfs has the right information for it (like the
  volume size). So it needs to clean things up to the best of its ability
  before actually connecting, and, just in case, confirm right before
  returning a path to the caller that the device it's going to return is
  actually correct and consistent (as in the multipath only has devices
  with the same size and SCSI ID).

  This means that os-brick's `connect_volume` will no longer be
  idempotent by design once this patch [3] merges, in order to prevent
  data leaks in corner cases.

  This will break the Nova rescue and unrescue operations, because the
  rescue call stashes the original XML [4] and unrescue restores it [5],
  but in between Nova calls `connect_volume` for the rescue instance,
  effectively disconnecting the original device path.

  This means that reusing that original path either points to a
  non-existent device or to a volume of another instance.

  We can see an example of the non-existent device case in the failed CI
  job [6] where test
  
`tempest.api.compute.servers.test_server_rescue.ServerStableDeviceRescueTest.test_stable_device_rescue_disk_virtio_with_volume_attached`
  fails with a nova-compute error [7]:

libvirt.libvirtError: Cannot access storage file '/dev/sdd': No such
  file or directory


  [1]: https://nvd.nist.gov/vuln/detail/CVE-2023-2088

  [2]: https://bugs.launchpad.net/nova/+bug/2004555

  [3]: https://review.opendev.org/c/openstack/os-brick/+/882841

  [4]:
  
https://github.com/openstack/nova/blob/71b105a4cfea054827e09b5b8df6be845909275a/nova/virt/libvirt/driver.py#L4229-L4232

  [5]:
  

[Yahoo-eng-team] [Bug 1967157] Re: Fails to extend in-use (non LUKS v1) encrypted volumes

2023-03-03 Thread Gorka Eguileor
Fix available in Zed

** Changed in: os-brick
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1967157

Title:
  Fails to extend in-use (non LUKS v1) encrypted volumes

Status in OpenStack Compute (nova):
  Fix Released
Status in os-brick:
  Fix Released

Bug description:
  Patch fixing bug #1861071 resolved the issue of extending LUKS v1
  volumes when Nova connects them via libvirt instead of through
  os-brick, but the Nova side still fails to extend in-use volumes when
  they don't go through libvirt (i.e., LUKS v2).

  The logs will show a very similar error, but the user won't know that
  this has happened and Cinder will show the new size:

  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [req-100471fa-c198-40ac-b713-adc395e480f1 
req-3a1ea13e-916b-4851-be67-6d849bf4aa3a service nova] [instance: 
3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] resizing block device failed.: 
libvirt.libvirtError: internal error: unable to execut>
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
Traceback (most recent call last):
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2809, in extend_volume
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
connection_info, encryption)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2763, in 
_resize_attached_encrypted_volume
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
decrypted_device_new_size, block_device, instance)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2712, in 
_resize_attached_volume
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
block_device.resize(new_size)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 789, in resize
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
self._guest._domain.blockResize(self._disk, size, flags=flags)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 193, in 
doit
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
result = proxy_call(self._autowrap, f, *args, **kwargs)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 151, in 
proxy_call
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
rv = execute(f, *args, **kwargs)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 132, in 
execute
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
six.reraise(c, e, tb)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/six.py", line 719, in reraise
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
raise value
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 86, in 
tworker
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
rv = meth(*args, **kwargs)
  Mar 29 

[Yahoo-eng-team] [Bug 1922052] Re: Missing os-brick commands in debug mode

2023-01-26 Thread Gorka Eguileor
Moving bug to os-brick since that's where we are going to fix it,
although the issue was caused by a change in Nova's code and it will
only be visible in Nova.

** Project changed: nova => os-brick

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1922052

Title:
  Missing os-brick commands in debug mode

Status in os-brick:
  Triaged
Status in oslo.privsep:
  Fix Released

Bug description:
  To debug os-brick's attach and detach code, developers and system
  administrators rely on seeing what commands are actually being
  executed by os-brick.

  The os-brick library relies on the DEBUG logs from the libraries (such
  as ``oslo_concurrency.processutils``) for this purpose instead of
  duplicating log entries by logging the calls and stdout-stderr itself.

  The default configuration in Nova no longer logs those os-brick
  commands when running on debug mode.

  This issue was introduced when fixing bug #1784062, as the fix was to
  set ALL privsep calls to log only INFO level messages.

  The current solution is to set the ``default_log_levels``
  configuration option in nova and include
  ``oslo.privsep.daemon=DEBUG`` in it.

  The default for os-brick should be the other way around: it should
  allow emitting DEBUG messages in debug mode.

To manage notifications about this bug go to:
https://bugs.launchpad.net/os-brick/+bug/1922052/+subscriptions




[Yahoo-eng-team] [Bug 1922052] Re: Missing os-brick commands in debug mode

2023-01-26 Thread Gorka Eguileor
Fix available in Xena (2.6.0)

** Changed in: oslo.privsep
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1922052

Title:
  Missing os-brick commands in debug mode

Status in OpenStack Compute (nova):
  Triaged
Status in oslo.privsep:
  Fix Released

Bug description:
  To debug os-brick's attach and detach code, developers and system
  administrators rely on seeing what commands are actually being
  executed by os-brick.

  The os-brick library relies on the DEBUG logs from the libraries (such
  as ``oslo_concurrency.processutils``) for this purpose instead of
  duplicating log entries by logging the calls and stdout-stderr itself.

  The default configuration in Nova no longer logs those os-brick
  commands when running on debug mode.

  This issue was introduced when fixing bug #1784062, as the fix was to
  set ALL privsep calls to log only INFO level messages.

  The current solution is to set the ``default_log_levels``
  configuration option in nova and include
  ``oslo.privsep.daemon=DEBUG`` in it.

  The default for os-brick should be the other way around: it should
  allow emitting DEBUG messages in debug mode.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1922052/+subscriptions




[Yahoo-eng-team] [Bug 1988751] [NEW] Nova not detaching Cinder volume if volume is available

2022-09-05 Thread Gorka Eguileor
Public bug reported:

It is known that Cinder and Nova can sometimes be out of sync, where one
of them thinks that a volume is attached and the other doesn't.

With the Cinder attachment API an operator can now fix the issue where
Nova says the volume is detached and Cinder says it's not.  The operator
just needs to run the "cinder attachment-delete" command on the specific
attachment.

The opposite situation, where Cinder says it's available and Nova says
it's attached, cannot currently be fixed without modifying the Nova
database and making manual os-brick calls or the appropriate CLI calls
to detach the volume.

Ideally Nova should be able to call os-brick using the BDM information
to locally detach the volume (using the force option which can lose
data) and then not call cinder to do the detach since it already says
the volume is not mapped.
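
A hedged sketch of that local-only detach using public os-brick calls; the
connection information would come from the instance's BDM record, and the
variable names here are assumptions:

from os_brick.initiator import connector


def force_local_detach(connection_info, root_helper='sudo'):
    # Sketch only: remove the local device without calling Cinder at all.
    conn = connector.InitiatorConnector.factory(
        connection_info['driver_volume_type'], root_helper,
        use_multipath=True)
    # force=True may discard unflushed data, the trade-off the report
    # explicitly accepts for this recovery scenario.
    conn.disconnect_volume(connection_info['data'], device_info=None,
                           force=True, ignore_errors=True)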

One way to reproduce this is:

- Create a VM
- Create a volume
- Attach the volume
- Delete the attachment in cinder with "cinder attachment-delete <attachment_id>"
- Try to detach the volume in Nova

The error we'll see is something like:

ERROR (BadRequest): Invalid volume: Invalid input received: Invalid
volume: Unable to detach volume. Volume status must be 'in-use' and
attach_status must be 'attached' to detach. (HTTP 400) (Request-ID: req-
ec02147a-6b5b-40d2-991c-3d49207f5c9b) (HTTP 400) (Request-ID:
req-d8ab82c5-cb32-446e-a8e9-fd8e30be0995)

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1988751

Title:
  Nova not detaching Cinder volume if volume is available

Status in OpenStack Compute (nova):
  New

Bug description:
  It is known that Cinder and Nova can sometimes be out of sync, where
  one of them thinks that a volume is attached and the other doesn't.

  With the Cinder attachment API an operator can now fix the issue where
  Nova says the volume is detached and Cinder says it's not.  The
  operator just needs to run the "cinder attachment-delete" command on
  the specific attachment.

  The opposite situation, where Cinder says it's available and Nova says
  it's attached, cannot currently be fixed without modifying the Nova
  database and making manual os-brick calls or the appropriate CLI calls
  to detach the volume.

  Ideally Nova should be able to call os-brick using the BDM information
  to locally detach the volume (using the force option which can lose
  data) and then not call cinder to do the detach since it already says
  the volume is not mapped.

  One way to reproduce this is:

  - Create a VM
  - Create a volume
  - Attach the volume
  - Delete the attachment in cinder with "cinder attachment-delete <attachment_id>"
  - Try to detach the volume in Nova

  The error we'll see is something like:

  ERROR (BadRequest): Invalid volume: Invalid input received: Invalid
  volume: Unable to detach volume. Volume status must be 'in-use' and
  attach_status must be 'attached' to detach. (HTTP 400) (Request-ID:
  req-ec02147a-6b5b-40d2-991c-3d49207f5c9b) (HTTP 400) (Request-ID:
  req-d8ab82c5-cb32-446e-a8e9-fd8e30be0995)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1988751/+subscriptions




[Yahoo-eng-team] [Bug 1967157] [NEW] Fails to extend in-use (non LUKS v1) encrypted volumes

2022-03-30 Thread Gorka Eguileor
irtError('virDomainBlockResize() failed')
Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
libvirt.libvirtError: internal error: unable to execute QEMU command 
'block_resize': Cannot grow device files

** Affects: nova
 Importance: Undecided
 Assignee: Gorka Eguileor (gorka)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Gorka Eguileor (gorka)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1967157

Title:
  Fails to extend in-use (non LUKS v1) encrypted volumes

Status in OpenStack Compute (nova):
  New

Bug description:
  Patch fixing bug #1861071 resolved the issue of extending LUKS v1
  volumes when Nova connects them via libvirt instead of through
  os-brick, but the Nova side still fails to extend in-use volumes when
  they don't go through libvirt (i.e., LUKS v2).

  The logs will show a very similar error, but the user won't know that
  this has happened and Cinder will show the new size:

  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [req-100471fa-c198-40ac-b713-adc395e480f1 
req-3a1ea13e-916b-4851-be67-6d849bf4aa3a service nova] [instance: 
3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] resizing block device failed.: 
libvirt.libvirtError: internal error: unable to execut>
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
Traceback (most recent call last):
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2809, in extend_volume
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
connection_info, encryption)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2763, in 
_resize_attached_encrypted_volume
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
decrypted_device_new_size, block_device, instance)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2712, in 
_resize_attached_volume
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
block_device.resize(new_size)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 789, in resize
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
self._guest._domain.blockResize(self._disk, size, flags=flags)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 193, in 
doit
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
result = proxy_call(self._autowrap, f, *args, **kwargs)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 151, in 
proxy_call
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
rv = execute(f, *args, **kwargs)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/eventlet/tpool.py", line 132, in 
execute
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
six.reraise(c, e, tb)
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9]   
File "/usr/local/lib/python3.6/site-packages/six.py", line 719, in reraise
  Mar 29 21:25:39 ssmc.localdomain nova-compute[1376242]: ERROR 
nova.virt.libvirt.driver [instance: 3f206ec4-fad5-48b8-9cb2-c3e6f00f30c9] 
raise val

[Yahoo-eng-team] [Bug 1798224] Re: DeprecationWarning: The behavior of .best_match for the Accept classes is currently being maintained for backward compatibility, but the method will be deprecated in

2021-05-13 Thread Gorka Eguileor
** Also affects: cinder
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1798224

Title:
  DeprecationWarning: The behavior of .best_match for the Accept classes
  is currently being maintained for backward compatibility, but the
  method will be deprecated in the future

Status in Cinder:
  In Progress
Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  When executing 'tox -e py35', the following deprecation warning is shown.
  It should be fixed.

  2018-10-16 03:36:49.117553 | ubuntu-xenial | {5} 
nova.tests.unit.api.openstack.compute.test_disk_config.DiskConfigTestCaseV21.test_update_server_override_auto
 [0.544275s] ... ok
  2018-10-16 03:36:49.117626 | ubuntu-xenial |
  2018-10-16 03:36:49.117666 | ubuntu-xenial | Captured stderr:
  2018-10-16 03:36:49.117703 | ubuntu-xenial | 
  (snipped...)
  2018-10-16 03:36:49.118228 | ubuntu-xenial | 
b'/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/webob/acceptparse.py:1379:
 DeprecationWarning: The behavior of .best_match for the Accept classes is 
currently being maintained for backward compatibility, but the method will be 
deprecated in the future, as its behavior is not specified in (and currently 
does not conform to) RFC 7231.'
  2018-10-16 03:36:49.118288 | ubuntu-xenial | b'  DeprecationWarning,'
  2018-10-16 03:36:49.118319 | ubuntu-xenial | b''

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1798224/+subscriptions



[Yahoo-eng-team] [Bug 1922052] [NEW] Missing os-brick commands in debug mode

2021-03-31 Thread Gorka Eguileor
Public bug reported:

To debug os-brick's attach and detach code, developers and system
administrators rely on seeing what commands are actually being executed
by os-brick.

The os-brick library relies on the DEBUG logs from the libraries (such
as ``oslo_concurrency.processutils``) for this purpose instead of
duplicating log entries by logging the calls and stdout-stderr itself.

The default configuration in Nova no longer logs those os-brick commands
when running on debug mode.

This issue was introduced when fixing bug #1784062, as the fix was to set
ALL privsep calls to log only INFO level messages.

The current solution is to set the ``default_log_levels`` configuration
option in nova and include ``oslo.privsep.daemon=DEBUG`` in it.
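
For reference, this is roughly what that workaround looks like when applied
programmatically with oslo.log (equivalent to adding the entry to
``default_log_levels`` in nova.conf); a sketch, not the actual Nova code:

from oslo_log import log as logging

# Extend oslo.log's defaults so privsep-executed commands (and therefore
# the os-brick attach/detach commands) show up again in debug mode.
extra_levels = ['oslo.privsep.daemon=DEBUG']
logging.set_defaults(
    default_log_levels=logging.get_default_log_levels() + extra_levels)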

The default for os-brick should be the other way around: it should allow
emitting DEBUG messages in debug mode.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1922052

Title:
  Missing os-brick commands in debug mode

Status in OpenStack Compute (nova):
  New

Bug description:
  To debug os-brick's attach and detach code, developers and system
  administrators rely on seeing what commands are actually being
  executed by os-brick.

  The os-brick library relies on the DEBUG logs from the libraries (such
  as ``oslo_concurrency.processutils``) for this purpose instead of
  duplicating log entries by logging the calls and stdout-stderr itself.

  The default configuration in Nova no longer logs those os-brick
  commands when running on debug mode.

  This issue was introduced when fixing bug #1784062, as the fix was to
  set ALL privsep calls to log only INFO level messages.

  The current solution is to set the ``default_log_levels``
  configuration option in nova and include
  ``oslo.privsep.daemon=DEBUG`` in it.

  The default for os-brick should be the other way around: it should
  allow emitting DEBUG messages in debug mode.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1922052/+subscriptions



[Yahoo-eng-team] [Bug 1829343] [NEW] requirements-check job fails

2019-05-16 Thread Gorka Eguileor
Public bug reported:

requirements-check job will always fail with current Nova doc
requirements:

2019-05-14 15:14:03.375075 | TASK [Run requirements check script]
2019-05-14 15:14:15.756135 | ubuntu-bionic | sys.version_info(major=3, minor=6, 
micro=7, releaselevel='final', serial=0)
2019-05-14 15:14:15.756335 | ubuntu-bionic | selecting default requirements 
directory for normal mode
2019-05-14 15:14:15.756388 | ubuntu-bionic | Branch: master
2019-05-14 15:14:15.756464 | ubuntu-bionic | Source: 
src/opendev.org/openstack/nova
2019-05-14 15:14:15.756573 | ubuntu-bionic | Requirements: 
/home/zuul/src/opendev.org/openstack/requirements
2019-05-14 15:14:15.756631 | ubuntu-bionic | git log -n 1 --format=%H
2019-05-14 15:14:15.756735 | ubuntu-bionic | Patch under test: 
b'f029534938012a19e4eee2d0927f4ccb2747b8fe'
2019-05-14 15:14:15.756870 | ubuntu-bionic | git --git-dir 
/home/zuul/src/opendev.org/openstack/requirements/.git rev-parse HEAD
2019-05-14 15:14:15.756988 | ubuntu-bionic | requirements git sha: 
b'85f9aad798323c5f6dd8ce1a3dd5e009c2f944a1'
2019-05-14 15:14:15.757056 | ubuntu-bionic | virtualenv /tmp/tmp62m_7g3d/venv
2019-05-14 15:14:15.757192 | ubuntu-bionic | /tmp/tmp62m_7g3d/venv/bin/pip 
install /home/zuul/src/opendev.org/openstack/requirements
2019-05-14 15:14:15.757289 | ubuntu-bionic | Checking 
b'f029534938012a19e4eee2d0927f4ccb2747b8fe'
2019-05-14 15:14:15.757350 | ubuntu-bionic | Processing requirements.txt
2019-05-14 15:14:15.757417 | ubuntu-bionic | Processing test-requirements.txt
2019-05-14 15:14:15.757482 | ubuntu-bionic | Processing doc/requirements.txt
2019-05-14 15:14:15.757539 | ubuntu-bionic | Processing .[osprofiler]
2019-05-14 15:14:15.757603 | ubuntu-bionic | Validating requirements.txt
2019-05-14 15:14:15.757671 | ubuntu-bionic | Validating test-requirements.txt
2019-05-14 15:14:15.757736 | ubuntu-bionic | Validating doc/requirements.txt
2019-05-14 15:14:15.757999 | ubuntu-bionic | Requirement(package='sphinx', 
location='', specifiers='!=1.6.6,!=1.6.7,>=1.6.2', markers='', comment='# BSD', 
extras=frozenset()) 'markers': '' does not match "python_version=='2.7'"
2019-05-14 15:14:15.758262 | ubuntu-bionic | Requirement(package='sphinx', 
location='', specifiers='!=1.6.6,!=1.6.7,>=1.6.2', markers='', comment='# BSD', 
extras=frozenset()) 'markers': '' does not match "python_version>='3.4'"
2019-05-14 15:14:15.758537 | ubuntu-bionic | Could not find a global 
requirements entry to match package sphinx. If the package is already included 
in the global list, the name or platform markers there may not match the local 
settings.
2019-05-14 15:14:15.758596 | ubuntu-bionic | Validating osprofiler
2019-05-14 15:14:15.758684 | ubuntu-bionic | Validating lower constraints of 
requirements.txt
2019-05-14 15:14:15.758777 | ubuntu-bionic | Validating lower constraints of 
test-requirements.txt
2019-05-14 15:14:15.758865 | ubuntu-bionic | Validating lower constraints of 
osprofiler
2019-05-14 15:14:15.758938 | ubuntu-bionic | *** Incompatible requirement found!
2019-05-14 15:14:15.759063 | ubuntu-bionic | *** See 
https://docs.openstack.org/requirements/latest/
2019-05-14 15:14:16.085404 | ubuntu-bionic | ERROR
2019-05-14 15:14:16.085856 | ubuntu-bionic | {
2019-05-14 15:14:16.085977 | ubuntu-bionic |   "delta": "0:00:11.176407",
2019-05-14 15:14:16.086084 | ubuntu-bionic |   "end": "2019-05-14 
15:14:15.781517",
2019-05-14 15:14:16.086187 | ubuntu-bionic |   "msg": "non-zero return code",
2019-05-14 15:14:16.086288 | ubuntu-bionic |   "rc": 1,
2019-05-14 15:14:16.086391 | ubuntu-bionic |   "start": "2019-05-14 
15:14:04.605110"
2019-05-14 15:14:16.086490 | ubuntu-bionic | }
2019-05-14 15:14:16.170292 |

** Affects: nova
 Importance: Undecided
 Assignee: Gorka Eguileor (gorka)
 Status: In Progress

** Changed in: nova
 Assignee: (unassigned) => Gorka Eguileor (gorka)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1829343

Title:
  requirements-check job fails

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  requirements-check job will always fail with current Nova doc
  requirements:

  2019-05-14 15:14:03.375075 | TASK [Run requirements check script]
  2019-05-14 15:14:15.756135 | ubuntu-bionic | sys.version_info(major=3, 
minor=6, micro=7, releaselevel='final', serial=0)
  2019-05-14 15:14:15.756335 | ubuntu-bionic | selecting default requirements 
directory for normal mode
  2019-05-14 15:14:15.756388 | ubuntu-bionic | Branch: master
  2019-05-14 15:14:15.756464 | ubuntu-bionic | Source: 
src/opendev.org/openstack/nova
  2019-05-14 15:14:15.756573 | ubuntu-bionic | Requirements: 
/home/zuul/src/opendev.org/openstack/requirements
  2019-05-14 15:14:15.756631 | ubuntu-bionic | git log -n 1 --format=

[Yahoo-eng-team] [Bug 1800515] [NEW] Unnecessary locking when connecting volumes

2018-10-29 Thread Gorka Eguileor
Public bug reported:

Cinder introduced "shared_targets" and "service_uuid" fields in volumes
to allow volume consumers to protect themselves from unintended leftover
devices when handling iSCSI connections with shared targets.

The way to protect from the automatic scans that happen on detach/map
race conditions is by locking and only allowing one attach or one detach
operation for each server to happen at a given time.

When using an up-to-date Open-iSCSI initiator we don't need to use
locks, as it can disable automatic LUN scans (which are the real cause
of the leftover devices), and os-brick already supports this feature.

Currently Nova is blindly locking whenever "shared_targets" is set to
True, even when the iSCSI initiator and os-brick are already preventing
such races, which introduces unnecessary serialization on the connection
of volumes.
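
As a rough sketch of the requested behaviour (skip the serialization when it
is not needed), using oslo.concurrency for the lock;
``initiator_supports_manual_scans`` is a hypothetical check, not an existing
Nova or os-brick function:

import contextlib

from oslo_concurrency import lockutils


@contextlib.contextmanager
def maybe_lock_connection(volume, initiator_supports_manual_scans):
    # Only serialize attach/detach when the shared-target race is possible.
    if volume['shared_targets'] and not initiator_supports_manual_scans:
        # One attach/detach at a time per cinder-volume service, which is
        # what Nova does today whenever shared_targets is True.
        with lockutils.lock(volume['service_uuid']):
            yield
    else:
        # The initiator can disable automatic LUN scans (or targets are
        # not shared), so no serialization is required.
        yield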

** Affects: nova
 Importance: Undecided
 Assignee: Gorka Eguileor (gorka)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Gorka Eguileor (gorka)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1800515

Title:
  Unnecessary locking when connecting volumes

Status in OpenStack Compute (nova):
  New

Bug description:
  Cinder introduced "shared_targets" and "service_uuid" fields in
  volumes to allow volume consumers to protect themselves from
  unintended leftover devices when handling iSCSI connections with
  shared targets.

  The way to protect from the automatic scans that happen on detach/map
  race conditions is by locking and only allowing one attach or one
  detach operation for each server to happen at a given time.

  When using an up-to-date Open-iSCSI initiator we don't need to use
  locks, as it can disable automatic LUN scans (which are the real cause
  of the leftover devices), and os-brick already supports this feature.

  Currently Nova is blindly locking whenever "shared_targets" is set to
  True, even when the iSCSI initiator and os-brick are already preventing
  such races, which introduces unnecessary serialization on the
  connection of volumes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1800515/+subscriptions



[Yahoo-eng-team] [Bug 1703954] Re: Attach/Detach encrypted volume problems with real paths

2017-12-15 Thread Gorka Eguileor
** Changed in: os-brick
   Status: New => Fix Released

** Changed in: cinder
 Assignee: (unassigned) => Gorka Eguileor (gorka)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1703954

Title:
  Attach/Detach encrypted volume problems with real paths

Status in Cinder:
  New
Status in OpenStack Compute (nova):
  Incomplete
Status in os-brick:
  Fix Released

Bug description:
  OS-Brick on 1.14 and 1.15 returns real paths instead of returning
  symbolic links, which results in the encryption attach_volume call
  replacing the real device with a link to the crypt dm.

  The issue comes from the Nova flow when attaching an encrypted volume:

  1- Attach volume
  2- Generate libvirt configuration with path from step 1
  3- Encrypt attach volume

  Since step 2 has already generated the config with the path from step
  1 then step 3 must preserve this path.

  When step 1 returns a symbolic link we just forcefully replace it with
  a link to the crypt dm and everything is OK, but when we return a real
  path it does the same thing, which means we'll be replacing for
  example /dev/sda with a symlink, which will then break the detach
  process, and all future attachments.

  If flow order was changed to be 1, 3, 2 then the encrypt attach volume
  could give a different path to be used for the libvirt config
  generation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1703954/+subscriptions



[Yahoo-eng-team] [Bug 1703954] [NEW] Attach/Detach encrypted volume problems with real paths

2017-07-12 Thread Gorka Eguileor
Public bug reported:

OS-Brick on 1.14 and 1.15 returns real paths instead of returning
symbolic links, which results in the encryption attach_volume call
replacing the real device with a link to the crypt dm.

The issue comes from the Nova flow when attaching an encrypted volume:

1- Attach volume
2- Generate libvirt configuration with path from step 1
3- Encrypt attach volume

Since step 2 has already generated the config with the path from step 1
then step 3 must preserve this path.

When step 1 returns a symbolic link we just forcefully replace it with a
link to the crypt dm and everything is OK, but when we return a real
path it does the same thing, which means we'll be replacing for example
/dev/sda with a symlink, which will then break the detach process, and
all future attachments.

If flow order was changed to be 1, 3, 2 then the encrypt attach volume
could give a different path to be used for the libvirt config
generation.

** Affects: cinder
 Importance: Undecided
 Status: New

** Affects: nova
 Importance: Undecided
 Status: New

** Affects: os-brick
 Importance: Undecided
 Status: New

** Also affects: cinder
   Importance: Undecided
   Status: New

** Also affects: nova
   Importance: Undecided
   Status: New

** Description changed:

  OS-Brick on 1.14 and 1.15 returns real paths instead of returning
  symbolic links, which results in the encryption attach_volume call
  replacing the real device with a link to the crypt dm.
  
  The issue comes from the Nova flow when attaching an encrypted volume:
  
  1- Attach volume
  2- Generate libvirt configuration with path from step 1
  3- Encrypt attach volume
  
  Since step 2 has already generated the config with the path from step 1
  then step 3 must preserve this path.
  
  When step 1 returns a symbolic link we just forcefully replace it with a
  link to the crypt dm and everything is OK, but when we return a real
- path it does the same thing.
+ path it does the same thing, which means we'll be replacing for example
+ /dev/sda with a symlink, which will then break the detach process, and
+ all future attachments.
  
  If flow order was changed to be 1, 3, 2 then the encrypt attach volume
  could give a different path to be used for the libvirt config
  generation.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1703954

Title:
  Attach/Detach encrypted volume problems with real paths

Status in Cinder:
  New
Status in OpenStack Compute (nova):
  New
Status in os-brick:
  New

Bug description:
  OS-Brick on 1.14 and 1.15 returns real paths instead of returning
  symbolic links, which results in the encryption attach_volume call
  replacing the real device with a link to the crypt dm.

  The issue comes from the Nova flow when attaching an encrypted volume:

  1- Attach volume
  2- Generate libvirt configuration with path from step 1
  3- Encrypt attach volume

  Since step 2 has already generated the config with the path from step
  1 then step 3 must preserve this path.

  When step 1 returns a symbolic link we just forcefully replace it with
  a link to the crypt dm and everything is OK, but when we return a real
  path it does the same thing, which means we'll be replacing for
  example /dev/sda with a symlink, which will then break the detach
  process, and all future attachments.

  If flow order was changed to be 1, 3, 2 then the encrypt attach volume
  could give a different path to be used for the libvirt config
  generation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1703954/+subscriptions



[Yahoo-eng-team] [Bug 1593055] Re: Retype an in-use volume failed in mitaka

2016-06-22 Thread Gorka Eguileor
*** This bug is a duplicate of bug 1381153 ***
https://bugs.launchpad.net/bugs/1381153

Just for reference, this is a well known issue with some QEMU versions
that don't support the blockcopy command.  You just need to get the right
QEMU binary built and installed if you don't have the right one
available in your distro.  There are some instructions around that may
help you: https://www.jrssite.com/wordpress/?p=302

** Changed in: nova
   Status: New => Invalid

** This bug has been marked a duplicate of bug 1381153
   Cannot create instance live snapshots in Centos7 (icehouse)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1593055

Title:
  Retype an in-use volume failed in mitaka

Status in Cinder:
  Invalid
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  reproduce:
  1 create an vm 
  2 create a volume in lvm
  3 attach volume to vm
  4 retype volume to another backend

  Looking at the log, it seems like a QEMU version problem. My QEMU version is:
  [root@localhost logs]# rpm -qa | grep qemu
  qemu-img-1.5.3-105.el7_2.4.x86_64
  libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
  ipxe-roms-qemu-20130517-8.gitc4bce43.el7_2.1.noarch
  qemu-kvm-1.5.3-105.el7_2.4.x86_64
  qemu-kvm-common-1.5.3-105.el7_2.4.x86_64

  
  error logs:

  2016-06-16 19:03:46.892 ERROR oslo_messaging.rpc.server [req-8c37204e-3484-448a-8aed-35f38403178d admin admin] Exception during handling message
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server Traceback (most recent call last):
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 153, in dispatch
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 122, in _do_dispatch
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server result = func(ctxt, **new_args)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception.py", line 110, in wrapped
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server payload)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server self.force_reraise()
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server return f(self, context, *args, **kw)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 359, in decorated_function
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server LOG.warning(msg, e, instance=instance)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server self.force_reraise()
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 328, in decorated_function
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 387, in decorated_function
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server self.force_reraise()
  2016-06-16 19:03:46.892 TRACE oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
  2016-06-16 19:03:46.892 TRACE 

[Yahoo-eng-team] [Bug 1268439] Re: range method is not same in py3.x and py2.x

2016-01-18 Thread Gorka Eguileor
Fix released as LP bug 1530249

** Changed in: cinder
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1268439

Title:
  range method is not same in py3.x and py2.x

Status in Ceilometer:
  Fix Released
Status in Cinder:
  Fix Released
Status in Glance:
  Fix Released
Status in heat:
  Fix Released
Status in OpenStack Identity (keystone):
  Invalid
Status in neutron:
  Fix Released
Status in python-ceilometerclient:
  Fix Released
Status in python-neutronclient:
  Invalid
Status in python-swiftclient:
  Fix Released
Status in OpenStack Object Storage (swift):
  In Progress

Bug description:
  In py3.x, range behaves like xrange in py2.x.
  In py3.x, if you want to get a list, you must use:
  list(range(value))

  I reviewed the code and found that many places use range as if it
  returned a list; in a py3.x environment this will cause errors, so we
  must fix this issue.
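
  A short, self-contained illustration of the difference (plain Python,
  independent of any project code):

    r = range(5)            # py2: the list [0, 1, 2, 3, 4]; py3: range(0, 5)
    lst = list(range(5))    # [0, 1, 2, 3, 4] on both

    # Iterating over range() works the same on both versions, but code that
    # assumes a real list breaks on py3:
    #   range(5) + [5]        -> TypeError on py3 (works on py2)
    #   list(range(5)) + [5]  -> [0, 1, 2, 3, 4, 5] on both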

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1268439/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1449639] Re: RBD: On image creation error, image is not deleted

2015-04-28 Thread Gorka Eguileor
** Description changed:

  When an exception rises while adding/creating an image, and the image
  has been created, this new image is not properly deleted.
  
  The fault lies in the `_delete_image` call of the Store.add method that
  is providing incorrect arguments.
+ 
+ This also affects Glance (Icehouse), since back then glance_store
+ functionality was included there.

** Also affects: glance
   Importance: Undecided
   Status: New

** Changed in: glance
 Assignee: (unassigned) => Gorka Eguileor (gorka)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1449639

Title:
  RBD: On image creation error, image is not deleted

Status in OpenStack Image Registry and Delivery Service (Glance):
  New
Status in OpenStack Glance backend store-drivers library (glance_store):
  New

Bug description:
  When an exception is raised while adding/creating an image, and the image
  has already been created, the new image is not properly deleted.

  The fault lies in the `_delete_image` call of the Store.add method,
  which is passing incorrect arguments.

  This also affects Glance (Icehouse), since back then glance_store
  functionality was included there.
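
  A hedged sketch of the cleanup pattern the fix needs (helper names are
  hypothetical, not glance_store's real signatures): if add() fails after
  the RBD image was created, the delete call must be given the same
  identifiers that were used to create it, otherwise the partial image is
  left orphaned in the pool:

    def add(store, image_id, data_chunks):
        image_created = False
        try:
            store.create_image(image_id)             # hypothetical helper
            image_created = True
            for chunk in data_chunks:
                store.write_chunk(image_id, chunk)   # hypothetical helper
        except Exception:
            if image_created:
                # The bug: calling the cleanup with the wrong arguments makes
                # it fail, leaving the half-written image behind.
                store.delete_image(image_id)         # hypothetical helper
            raise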

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1449639/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1422699] [NEW] glance api doesn't abort start up on Store configuration errors

2015-02-17 Thread Gorka Eguileor
Public bug reported:

Glance api service does not abort start up when errors in glance-api.cfg file 
are encountered.
It would make sense to abort service start up when a BadStoreConfiguration 
exception is encountered, instead of just sending the error to the logs and 
disabling adding images to that Store.

For example, if a Filesystem Storage Backend with multiple stores is configured
with a duplicate directory:
filesystem_store_datadirs=/mnt/nfs1/images/:200
filesystem_store_datadirs=/mnt/nfs1/images/:100

Logs will have the error:
ERROR glance_store._drivers.filesystem [-] Directory /mnt/nfs1/image specified 
multiple times in filesystem_store_datadirs option of filesystem configuration
TRACE glance_store._drivers.filesystem None
TRACE glance_store._drivers.filesystem
WARNING glance_store.driver [-] Failed to configure store correctly: None 
Disabling add method.

The service will start, and when a client tries to add an image they will
receive a 410 Gone error saying: Error in store configuration. Adding images
to store is disabled.

This affects not only the filesystem storage backend but all glance-
storage drivers that encounter an error in the configuration and raise a
BadStoreConfiguration exception.

How reproducible:
Every time

Steps to Reproduce:
1. Configure Glance to use the Filesystem Storage Backend with multiple stores
and duplicate a filesystem_store_datadirs entry.
2. Run glance api

Expected behavior:
Glance api service should not have started and should have reported that the 
directory was specified multiple times.
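
A rough sketch of the desired fail-fast behaviour (function and call names
are assumptions, not the actual glance code path): instead of logging
BadStoreConfiguration and disabling the store's add method, start-up would
propagate the error and exit:

    from glance_store import exceptions  # exception module path assumed

    def configure_store_or_abort(store):
        try:
            store.configure()  # hypothetical driver configuration hook
        except exceptions.BadStoreConfiguration as exc:
            # Current behaviour: log the error and disable add(), so clients
            # only find out via a 410 Gone at upload time.  Proposed
            # behaviour: stop glance-api immediately so the operator sees the
            # misconfiguration at start-up.
            raise SystemExit("Aborting glance-api start-up: %s" % exc)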

** Affects: glance
 Importance: Undecided
 Status: New

** Affects: glance-store
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1422699

Title:
  glance api doesn't abort start up on Store configuration errors

Status in OpenStack Image Registry and Delivery Service (Glance):
  New
Status in OpenStack Glance backend store-drivers library (glance_store):
  New

Bug description:
  Glance api service does not abort start up when errors in glance-api.cfg file 
are encountered.
  It would make sense to abort service start up when a BadStoreConfiguration 
exception is encountered, instead of just sending the error to the logs and 
disabling adding images to that Store.

  For example, if a Filesystem Storage Backend with multiple stores is
  configured with a duplicate directory:
  filesystem_store_datadirs=/mnt/nfs1/images/:200
  filesystem_store_datadirs=/mnt/nfs1/images/:100

  Logs will have the error:
  ERROR glance_store._drivers.filesystem [-] Directory /mnt/nfs1/image 
specified multiple times in filesystem_store_datadirs option of filesystem 
configuration
  TRACE glance_store._drivers.filesystem None
  TRACE glance_store._drivers.filesystem
  WARNING glance_store.driver [-] Failed to configure store correctly: None 
Disabling add method.

  The service will start, and when a client tries to add an image they will
  receive a 410 Gone error saying: Error in store configuration. Adding
  images to store is disabled.

  This affects not only the filesystem storage backend but all glance-
  storage drivers that encounter an error in the configuration and raise
  a BadStoreConfiguration exception.

  How reproducible:
  Every time

  Steps to Reproduce:
  1. Configure Glance to use the Filesystem Storage Backend with multiple
  stores and duplicate a filesystem_store_datadirs entry.
  2. Run glance api

  Expected behavior:
  Glance api service should not have started and should have reported that the 
directory was specified multiple times.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1422699/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1422699] Re: glance api doesn't abort start up on Store configuration errors

2015-02-17 Thread Gorka Eguileor
** Also affects: glance-store
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1422699

Title:
  glance api doesn't abort start up on Store configuration errors

Status in OpenStack Image Registry and Delivery Service (Glance):
  New
Status in OpenStack Glance backend store-drivers library (glance_store):
  New

Bug description:
  Glance api service does not abort start up when errors in glance-api.cfg file 
are encountered.
  It would make sense to abort service start up when a BadStoreConfiguration 
exception is encountered, instead of just sending the error to the logs and 
disabling adding images to that Store.

  For example, if a Filesystem Storage Backend with multiple stores is
  configured with a duplicate directory:
  filesystem_store_datadirs=/mnt/nfs1/images/:200
  filesystem_store_datadirs=/mnt/nfs1/images/:100

  Logs will have the error:
  ERROR glance_store._drivers.filesystem [-] Directory /mnt/nfs1/image 
specified multiple times in filesystem_store_datadirs option of filesystem 
configuration
  TRACE glance_store._drivers.filesystem None
  TRACE glance_store._drivers.filesystem
  WARNING glance_store.driver [-] Failed to configure store correctly: None 
Disabling add method.

  The service will start, and when a client tries to add an image they will
  receive a 410 Gone error saying: Error in store configuration. Adding
  images to store is disabled.

  This affects not only the filesystem storage backend but all glance-
  storage drivers that encounter an error in the configuration and raise
  a BadStoreConfiguration exception.

  How reproducible:
  Every time

  Steps to Reproduce:
  1. Configure Glance to use the Filesystem Storage Backend with multiple
  stores and duplicate a filesystem_store_datadirs entry.
  2. Run glance api

  Expected behavior:
  Glance api service should not have started and should have reported that the 
directory was specified multiple times.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1422699/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp