I'm still quite confused by this, but here are some thoughts on the nova
os.chmod() call mentioned in an earlier comment, and a change to it that
would fix this.

If I chmod the tmp dir that gets created by nova (e.g.
/var/lib/nova/instances/snapshots/tmpkajuir8o) to 755 just before the
snapshot (after the nova chmod), the snapshot is successful.

As mentioned in
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1896617/comments/18,
the upstream nova code sets permissions for the tmp dir with:

os.chmod(tmpdir, 0o701)

That code has been that way since 2015, so it's not new in ussuri, see
git blame:

824c3706a3e nova/virt/libvirt/driver.py (Nicolas Simonds 2015-07-23 12:47:24 -0500 2388)     # NOTE(xqueralt): libvirt needs o+x in the tempdir
824c3706a3e nova/virt/libvirt/driver.py (Nicolas Simonds 2015-07-23 12:47:24 -0500 2389)     os.chmod(tmpdir, 0o701)

However, this seems like a heavy-handed chmod if the goal, as the
comment above it mentions, is to give libvirt o+x in the tempdir. I say
this because it replaces the whole mode, overriding any default
permissions that were set previously by the operating system.

It seems that this should really be a lighter touch such as the
following (equivalent to chmod o+x tmpdir):

st = os.stat(tmpdir)
os.chmod(tmpdir, st.st_mode | stat.S_IXOTH)
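
To make the difference concrete, here is a minimal, self-contained
sketch comparing the two approaches. The helper names and the 0o750
starting mode are made up for illustration and are not taken from nova.
Note that the two calls only differ when the directory already has bits
beyond 0o700: tempfile.mkdtemp(), for example, creates directories that
are accessible only by the creating user, in which case both end up at
0o701.

```python
import os
import stat
import tempfile


def mode_of(path):
    """Return only the permission bits of path (for example 0o755)."""
    return stat.S_IMODE(os.stat(path).st_mode)


def chmod_overwrite(path):
    """Replace the whole mode with 0o701, like the nova line quoted above."""
    os.chmod(path, 0o701)


def chmod_add_other_exec(path):
    """Keep the existing permission bits and only add o+x."""
    os.chmod(path, mode_of(path) | stat.S_IXOTH)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        overwritten = os.path.join(parent, "overwritten")
        extended = os.path.join(parent, "extended")
        for d in (overwritten, extended):
            os.mkdir(d)
            os.chmod(d, 0o750)  # hypothetical starting mode, an assumption

        chmod_overwrite(overwritten)
        chmod_add_other_exec(extended)

        print(oct(mode_of(overwritten)))  # 0o701: group r-x was dropped
        print(oct(mode_of(extended)))     # 0o751: group r-x was kept
```

Masking with stat.S_IMODE() keeps the file-type bits out of the value
passed to os.chmod(); that is a small tidy-up on my side, not something
the snippet above requires.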

That would fix this bug for us, but it still doesn't explain what
changed in Ubuntu to cause this failure. We did make some permissions
changes in the nova package in focal, but comparing against the
file/directory permissions listed above in comment #21 (with
ussuri-proposed), I'm seeing no differences.
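
For anyone who wants to repeat that comparison, here is a rough sketch
of a helper that prints the mode of every directory on the way down to
a given path, roughly what "namei -l" shows. The function name is
hypothetical, and whether a missing o+x on some parent directory is
actually what breaks the snapshot is only a guess on my part.

```python
import os
import stat
import sys


def path_permissions(path):
    """Return (component, mode string, other-has-x) from the root down to path."""
    current = os.path.abspath(path)
    components = []
    while True:
        components.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent

    rows = []
    for component in reversed(components):
        st = os.stat(component)
        rows.append((
            component,
            stat.filemode(st.st_mode),         # e.g. "drwxr-xr-x"
            bool(st.st_mode & stat.S_IXOTH),   # can "other" traverse it?
        ))
    return rows


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for component, mode, other_x in path_permissions(target):
        print(f"{mode}  o+x={other_x}  {component}")
```

Running it (as a user that can stat every component, e.g. root) against
a tmp dir such as /var/lib/nova/instances/snapshots/tmpkajuir8o on both
a working and a failing host would give two listings that can be
diffed.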

** Changed in: nova
       Status: Invalid => New

https://bugs.launchpad.net/bugs/1896617

Title:
  Creation of image (or live snapshot) from the existing VM fails if
  libvirt-image-backend is configured to qcow2 starting from Ussuri

Status in OpenStack nova-compute charm:
  Invalid
Status in OpenStack Compute (nova):
  New
Status in nova package in Ubuntu:
  Triaged

Bug description:
  tl;dr

  1) Creating an image from an existing VM fails if the qcow2 image
  backend is used, but everything is fine when using the rbd image
  backend in nova-compute.
  2) openstack server image create --name <name of the new image>
  <instance name or uuid> fails with a seemingly unrelated error:

  $ openstack server image create --wait 842fa12c-19ee-44cb-bb31-36d27ec9d8fc
  HTTP 404 Not Found: No image found with ID f4693860-cd8d-4088-91b9-56b2f173ffc7

  == Details ==

  Two Tempest tests ([1] and [2]) from the 2018.02 Refstack test lists
  [0] are failing with the following exception:

  49701867-bedc-4d7d-aa71-7383d877d90c
  Traceback (most recent call last):
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/api/compute/base.py", line 369, in create_image_from_server
      waiters.wait_for_image_status(client, image_id, wait_until)
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/common/waiters.py", line 161, in wait_for_image_status
      image = show_image(image_id)
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/lib/services/compute/images_client.py", line 74, in show_image
      resp, body = self.get("images/%s" % image_id)
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/lib/common/rest_client.py", line 298, in get
      return self.request('GET', url, extra_headers, headers)
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/lib/services/compute/base_compute_client.py", line 48, in request
      method, url, extra_headers, headers, body, chunked)
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/lib/common/rest_client.py", line 687, in request
      self._error_checker(resp, resp_body)
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/lib/common/rest_client.py", line 793, in _error_checker
      raise exceptions.NotFound(resp_body, resp=resp)
  tempest.lib.exceptions.NotFound: Object not found
  Details: {'code': 404, 'message': 'Image not found.'}

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/api/compute/images/test_images_oneserver.py", line 69, in test_create_delete_image
      wait_until='ACTIVE')
    File "/home/ubuntu/snap/fcbtest/14/.rally/verification/verifier-2d9cbf4d-fcbb-491d-848d-5137a9bde99e/repo/tempest/api/compute/base.py", line 384, in create_image_from_server
      image_id=image_id)
  tempest.exceptions.SnapshotNotFoundException: Server snapshot image d82e95b0-9c62-492d-a08c-5bb118d3bf56 not found.

  So far I was able to identify the following:

  1) https://github.com/openstack/tempest/blob/master/tempest/api/compute/images/test_images_oneserver.py#L69
  invokes a "create image from server"
  2) It fails with the following error message in the nova-compute logs:
  https://pastebin.canonical.com/p/h6ZXdqjRRm/

  The same occurs when "openstack server image create --wait" is
  executed; however, according to
  https://docs.openstack.org/nova/ussuri/admin/migrate-instance-with-snapshot.html
  the VM has to be shut down before the image is created:

  "Shut down the source VM before you take the snapshot to ensure that
  all data is flushed to disk. If necessary, list the instances to view
  the instance name. Use the openstack server stop command to shut down
  the instance:"

  This step is definitely being skipped by the test (i.e. it's trying
  to take the snapshot of a live VM).

  FWIW, I'm using libvirt-image-backend: qcow2 in my nova-compute
  application params, and I was able to confirm that if the above
  parameter is changed to "libvirt-image-backend: rbd", the tests
  pass successfully.

  Also, I was able to find a similar issue:
  https://bugs.launchpad.net/nova/+bug/1885418 but it doesn't contain any
  useful information other than confirmation that OpenStack Ussuri +
  libvirt backend has some problem with live snapshotting.

  [0] https://refstack.openstack.org/api/v1/guidelines/2018.02/tests?target=platform&type=required&alias=true&flag=false
  [1] tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_delete_image[id-3731d080-d4c5-4872-b41a-64d0d0021314]
  [2] tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name[id-3b7c6fe4-dfe7-477c-9243-b06359db51e6]
