[Yahoo-eng-team] [Bug 1480514] [NEW] Removing an error instance fails when serial_console is enabled
Public bug reported: While fixing https://bugs.launchpad.net/nova/+bug/1478607 I found that I could not remove instances that went into an error state because their XML configuration failed. This is caused by the following block: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L894

When nova destroys an instance, it cleans up the related resources. If the serial console is enabled, nova looks up the ports that were assigned to the instance and releases them. But because the instance failed during creation, nova raises InstanceNotFound.

The block does look like it handles the InstanceNotFound exception. However, _get_serial_ports_from_instance contains the yield keyword, so it is a generator: it does not raise the exception immediately, but only when the program iterates over the yielded items. Therefore InstanceNotFound is raised at L894 instead of L889. You can check the following sample code: http://www.tutorialspoint.com/execute_python_online.php?PID=0Bw_CjBb95KQMU05ycERQdUFfcms

** Affects: nova
   Importance: Undecided
   Assignee: lyanchih (lyanchih)
   Status: New

** Changed in: nova
     Assignee: (unassigned) => lyanchih (lyanchih)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1480514

Title: Removing an error instance fails when serial_console is enabled
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1480514/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
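The delayed-exception behavior described above can be reproduced with a few lines of plain Python. This is a standalone sketch, not nova code; the exception class and generator are stand-ins for nova's InstanceNotFound and _get_serial_ports_from_instance:

```python
class InstanceNotFound(Exception):
    """Stand-in for nova's exception; any exception type behaves the same."""


def get_serial_ports():
    # Stand-in for _get_serial_ports_from_instance: the lookup fails,
    # but because of `yield` the body runs lazily.
    raise InstanceNotFound("instance not found")
    yield  # unreachable, but makes this function a generator


# Calling the generator function does NOT raise (the L889 analogue)...
ports = get_serial_ports()
print("no exception on call")

# ...iterating over it does (the L894 analogue).
try:
    for port in ports:
        pass
except InstanceNotFound:
    print("exception raised during iteration")
```

This is why a try/except wrapped around the call site alone never catches the error: the except block must cover the loop that consumes the generator.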
[Yahoo-eng-team] [Bug 1479214] [NEW] nova can't attach a volume to a specific device name
Public bug reported: The nova attach-volume CLI supports an option named device, which specifies where the volume should be mounted. But it does not work: the volume is attached to a device determined by nova-compute instead. This bug may be caused by the following code: https://github.com/openstack/nova/blob/c5db407bb22e453a4bca22de1860bb6ce6090782/nova/virt/libvirt/driver.py#L6823

It ignores the device name the user assigned and auto-selects a disk from blockinfo.

My nova git revision is 14d00296b179fcf115cf13d37b2f0b5b734d298d.

** Affects: nova
   Importance: Undecided
   Assignee: lyanchih (lyanchih)
   Status: New

** Changed in: nova
     Assignee: (unassigned) => lyanchih (lyanchih)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1479214

Title: nova can't attach a volume to a specific device name
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1479214/+subscriptions
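The behavior described above can be sketched in a few standalone lines (illustrative names only, not actual nova code): the user's requested device name is accepted by the API but dropped, and the next free name is computed instead.

```python
def next_device_name(existing_devices):
    # Stand-in for nova's auto-selection from blockinfo: pick the
    # first unused /dev/vdX letter.
    letters = "abcdefghijklmnopqrstuvwxyz"
    used = {name[-1] for name in existing_devices}
    free = next(l for l in letters if l not in used)
    return "/dev/vd" + free


def attach_volume(requested_device, existing_devices):
    # The bug: requested_device is never consulted.
    return next_device_name(existing_devices)


# User asks for /dev/vde, but the volume lands on /dev/vdc:
print(attach_volume("/dev/vde", ["/dev/vda", "/dev/vdb"]))  # /dev/vdc
```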
[Yahoo-eng-team] [Bug 1478199] [NEW] Unrescue does not remove the rescue disk in ceph when image_type=rbd
Public bug reported: This bug happens when using libvirt/QEMU with images_type=rbd. Rescuing an instance produces a rescue kernel and ramdisk on local disk; it also produces a rescue disk which is saved in ceph via rbd. When a user unrescues the instance, nova removes the local rescue kernel and ramdisk, but the rescue disk created during the rescue step still exists. You can use the rbd or rados command to check whether the objects still exist in the pool, for example:

sudo rbd --pool $POOL_NAME ls | grep .rescue
or
sudo rados --pool $POOL_NAME ls | grep .rescue

Why does this happen? The unrescue action removes local rescue files and LVM disks, but it does not remove rbd disks. Therefore we need to add a condition on the libvirt images_type setting so that the correct type of disk is removed.

** Affects: nova
   Importance: Undecided
   Assignee: lyanchih (lyanchih)
   Status: In Progress

** Changed in: nova
     Assignee: (unassigned) => lyanchih (lyanchih)

** Changed in: nova
       Status: New => In Progress

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1478199

Title: Unrescue does not remove the rescue disk in ceph when image_type=rbd
Status in OpenStack Compute (nova): In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1478199/+subscriptions
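The fix described above amounts to choosing the removal backend from images_type rather than always assuming local files. A hedged, backend-agnostic sketch (the removal callables are injected here purely for illustration; in nova they would be the local-file and RBD deletion paths):

```python
def cleanup_rescue_disks(images_type, disk_names, remove_local, remove_rbd):
    """Remove each rescue disk with the backend that created it."""
    for name in disk_names:
        if images_type == 'rbd':
            remove_rbd(name)    # e.g. delete the object from the ceph pool
        else:
            remove_local(name)  # e.g. os.remove on the local file


# Usage sketch: with images_type='rbd' the rescue disk is removed
# from ceph instead of being left behind in the pool.
removed_from_ceph = []
cleanup_rescue_disks('rbd', ['disk.rescue'],
                     lambda n: None, removed_from_ceph.append)
print(removed_from_ceph)  # ['disk.rescue']
```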
[Yahoo-eng-team] [Bug 1476530] [NEW] iSCSI session still connected after detaching a volume from a paused instance
Public bug reported: My test environment is lvm/iSCSI with libvirt/QEMU.

How to reproduce:
1. Create an instance
2. Create a volume
3. Attach the volume to the instance
4. Pause the instance
5. Detach the volume from the instance

Nova will not disconnect from the volume. You can run the following command to verify:

sudo iscsiadm -m node --rescan

It will display the session that was built in the previous steps. You will also find that the device still exists in /sys/block.

This happens because nova searches the block devices defined in the XML of every guest, and then disconnects the iSCSI devices that exist in /dev/disk/by-path but are not defined in any guest. However, the paused instance's XML definition still contains the device we want to remove, so nova does not disconnect from the volume.

There are two kinds of workaround:
1. Log out of the iSCSI connection manually (sudo iscsiadm -m node -T Target --logout)
2. Reattach the same volume (lol)

But we still need to handle this bug for paused instances.

** Affects: nova
   Importance: Undecided
   Assignee: lyanchih (lyanchih)
   Status: In Progress

** Changed in: nova
     Assignee: (unassigned) => lyanchih (lyanchih)

** Changed in: nova
       Status: New => In Progress

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1476530

Title: iSCSI session still connected after detaching a volume from a paused instance
Status in OpenStack Compute (nova): In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1476530/+subscriptions
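The cleanup logic described above can be sketched as a set difference (a standalone illustration, not nova code): only host devices that no guest XML still references get disconnected, so a device that a paused guest still lists is never selected.

```python
def devices_to_disconnect(host_devices, devices_in_guest_xml):
    # nova disconnects devices present on the host (/dev/disk/by-path)
    # that are no longer defined in any guest's XML.
    return set(host_devices) - set(devices_in_guest_xml)


host = {"ip-10.0.0.5:3260-iscsi-iqn.example-lun-1"}  # illustrative path

# The paused guest's XML still references the detached volume,
# so nothing is selected for logout:
print(devices_to_disconnect(host, host))            # set()

# Only once no guest references it would it be disconnected:
print(devices_to_disconnect(host, set()) == host)   # True
```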
[Yahoo-eng-team] [Bug 1427141] Re: console auth token timeout has no impact
** Changed in: nova
       Status: Confirmed => Invalid

** Changed in: nova
     Assignee: lyanchih (lyanchih) => (unassigned)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1427141

Title: console auth token timeout has no impact
Status in OpenStack Compute (nova): Invalid

Bug description:

  Issue
  =====
  The console feature (VNC, SERIAL, ...) returns a connection with an auth token. This connection *never* times out.

  Steps to reproduce
  ==================
  The steps below are suitable for testing with the serial console but the behavior is the same with VNC.

  * enable the console feature in nova.conf
      [serial_console]
      enabled=True
  * set the token timeout value in nova.conf to a value which fits your testing (e.g.) console_token_ttl=10
  * start the nova-serialproxy service (e.g. with devstack [1])
  * start an instance
  * connect to the serial console of that launched instance (e.g. Horizon with console tab or another client [2])
  * execute a command (e.g. date)
  * wait until the timespan defined by console_token_ttl has elapsed
  * execute another command (e.g. date)

  Expected behavior
  =================
  The command in the console is refused after the timespan has elapsed.

  Actual behavior
  ===============
  The connection is kept open and each command is executed after the defined timespan. This looks weird in the case when Horizon times out but the console tab is still working.

  Logs & Env.
  ===========
  OpenStack is installed and started with devstack. The logs [3] show that the expired token gets removed when a new token is appended. The append of a new token happens only when the console tab is reopened and the old token is expired.

  Nova version:
  pedebug@OS-CTRL:/opt/stack/nova$ git log --oneline -n5
  017574e Merge "Added retries in 'network_set_host' function"
  a957d56 libvirt: Adjust Nova to support FCP on System z systems
  36bae5a Merge "fake: fix public API signatures to match virt driver"
  13223b5 Merge "Don't assume contents of values after aggregate_update"
  c4a9cc5 Merge "Fix VNC access, when reverse DNS lookups fail"

  References
  ==========
  [1] Devstack guide; Nova and devstack; http://docs.openstack.org/developer/devstack/guides/nova.html
  [2] larsk/novaconsole; github; https://github.com/larsks/novaconsole/
  [3] http://paste.openstack.org/show/184866/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1427141/+subscriptions
[Yahoo-eng-team] [Bug 1467570] Re: Nova can't provision instance from snapshot with a ceph backend
** No longer affects: horizon

** Changed in: nova
     Assignee: lyanchih (lyanchih) => (unassigned)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1467570

Title: Nova can't provision instance from snapshot with a ceph backend
Status in OpenStack Compute (nova): Invalid

Bug description:

  This is a weird issue that does not happen in our Juno setup, but happens in our Kilo setup. The configuration between the two setups is pretty much the same, with only kilo-specific changes done (namely, moving lines around to new sections).

  Here's how to reproduce:
  1. Provision an instance.
  2. Make a snapshot of this instance.
  3. Try to provision an instance with that snapshot.

  Nova-compute will complain that it can't find the disk and the instance will fall into error.

  Here's what the default behavior is supposed to be from my observations:
  - When the image is uploaded into ceph, a snapshot is created automatically inside ceph (this is NOT an instance snapshot per se, but a ceph-internal snapshot).
  - When an instance is booted from an image in nova, this snapshot gets a clone in the nova ceph pool. Nova then uses that clone as the instance's disk. This is called copy-on-write cloning.

  Here's when things get funky:
  - When an instance is booted from a snapshot, the copy-on-write cloning does not happen. Nova looks for the disk and, of course, fails to find it in its pool, thus failing to provision the instance. There's no trace anywhere of the copy-on-write clone failing (in part because ceph doesn't log client commands, from what I see).

  The compute logs I got are in this pastebin: http://pastebin.com/ADHTEnhn

  There are a few things I notice here that I'd like to point out:
  - Nova creates an ephemeral drive file, then proceeds to delete it before using rbd_utils instead. While strange, this may be the intended but somewhat dirty behavior, as nova considers it an ephemeral instance before realizing that it's actually a ceph instance and doesn't need its ephemeral disk. Or maybe these conjectures are completely wrong and this is part of the issue.
  - Nova creates the image (I'm guessing it's the copy-on-write cloning happening here). What exactly happens here isn't very clear, but then it complains that it can't find the clone in its pool to use as a block device.

  This issue does not happen on ephemeral storage.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1467570/+subscriptions
[Yahoo-eng-team] [Bug 1474283] [NEW] Boot instance from volume snapshot fails
Public bug reported: How to reproduce:
1. Create a new volume whose source is an image
2. Snapshot the volume created at step 1
3. Launch an instance from the snapshot volume created at step 2

Horizon will then display "Block Device Mapping is Invalid: failed to get volume". This is because horizon sends the volume source type instead of the snapshot source type to nova, so the nova API tries to fetch a volume through the volume id instead of a snapshot. The data of nova's create-server request is built incorrectly at the following link and line:
https://github.com/openstack/horizon/blob/master/openstack_dashboard/dashboards/project/instances/workflows/create_instance.py#L874

** Affects: horizon
   Importance: Undecided
   Assignee: lyanchih (lyanchih)
   Status: New

** Changed in: horizon
     Assignee: (unassigned) => lyanchih (lyanchih)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1474283

Title: Boot instance from volume snapshot fails
Status in OpenStack Dashboard (Horizon): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1474283/+subscriptions
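The mismatch described above can be illustrated with the block device mapping entry in nova's create-server request (a hedged sketch; the UUID is a placeholder): when booting from a volume snapshot, source_type must be "snapshot" so that nova resolves the UUID as a snapshot rather than a volume.

```python
snapshot_id = "0f5c6c6e-0000-0000-0000-000000000000"  # placeholder UUID

# What horizon sends (wrong): nova tries to fetch a *volume* with the
# snapshot's ID and fails with "failed to get volume".
bdm_sent = {
    "source_type": "volume",
    "destination_type": "volume",
    "uuid": snapshot_id,
    "boot_index": 0,
}

# What nova expects for a snapshot-backed boot: same entry, but with
# source_type "snapshot" so the ID is looked up as a snapshot.
bdm_expected = dict(bdm_sent, source_type="snapshot")
print(bdm_expected["source_type"])  # snapshot
```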
[Yahoo-eng-team] [Bug 1473949] [NEW] gate-nova-python34 sometimes fails on test_save_updates_numa_topology
Public bug reported: After I submitted a change for review, gate-nova-python34 FAILED; see the following log file:
http://logs.openstack.org/19/201019/1/check/gate-nova-python34/1e74b65/console.html

The assert messages are:

AssertionError: Expected call: instance_extra_update_by_uuid(<nova.context.RequestContext object at 0x7fb95f499dd8>, 'fake-uuid', {'numa_topology': '{nova_object.version: 1.1, nova_object.name: InstanceNUMATopology, nova_object.changes: [cells, instance_uuid], nova_object.data: {cells: [{nova_object.version: 1.2, nova_object.name: InstanceNUMACell, nova_object.changes: [memory, id, cpuset], nova_object.data: {pagesize: null, cpu_pinning_raw: null, cpu_topology: null, id: 0, cpuset: [0], memory: 128}, nova_object.namespace: nova}, {nova_object.version: 1.2, nova_object.name: InstanceNUMACell, nova_object.changes: [memory, id, cpuset], nova_object.data: {pagesize: null, cpu_pinning_raw: null, cpu_topology: null, id: 1, cpuset: [1], memory: 128}, nova_object.namespace: nova}], instance_uuid: fake-uuid}, nova_object.namespace: nova}'})
2015-07-13 07:28:22.759 | Actual call: instance_extra_update_by_uuid(<nova.context.RequestContext object at 0x7fb95f499dd8>, 'fake-uuid', {'numa_topology': '{nova_object.version: 1.1, nova_object.name: InstanceNUMATopology, nova_object.changes: [cells, instance_uuid], nova_object.data: {cells: [{nova_object.version: 1.2, nova_object.name: InstanceNUMACell, nova_object.changes: [memory, cpuset, id], nova_object.data: {pagesize: null, cpu_pinning_raw: null, cpu_topology: null, id: 0, cpuset: [0], memory: 128}, nova_object.namespace: nova}, {nova_object.version: 1.2, nova_object.name: InstanceNUMACell, nova_object.changes: [memory, cpuset, id], nova_object.data: {pagesize: null, cpu_pinning_raw: null, cpu_topology: null, id: 1, cpuset: [1], memory: 128}, nova_object.namespace: nova}], instance_uuid: fake-uuid}, nova_object.namespace: nova}'})

You can notice that the difference between these two values is the order of nova_object.changes in the cells objects: they contain the same elements in a different order. This is because the order of _changed_fields is not always the same. The changed fields are collected in a set before being serialized, and set iteration order is arbitrary: under Python 3.4, string hash randomization (enabled by default) makes the order vary between runs, while under Python 2.7 the stable string hashes happen to yield a reproducible order, which is why python27 does not hit this problem.

** Affects: nova
   Importance: Undecided
   Assignee: lyanchih (lyanchih)
   Status: New

** Changed in: nova
     Assignee: (unassigned) => lyanchih (lyanchih)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1473949

Title: gate-nova-python34 sometimes fails on test_save_updates_numa_topology
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1473949/+subscriptions
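The ordering behavior can be demonstrated with plain Python (a standalone sketch, not nova code): set iteration order is arbitrary, and with string hash randomization it can differ between interpreter runs, so serializing a set directly produces unstable output; sorting first makes the output deterministic.

```python
changed_fields = {"memory", "id", "cpuset"}

unstable = list(changed_fields)  # order depends on hashing; may vary per run
stable = sorted(changed_fields)  # deterministic regardless of interpreter

print(stable)  # ['cpuset', 'id', 'memory']
```

A fix along these lines — sorting (or otherwise fixing the order of) the changed fields before comparison or serialization — makes the test assertion independent of hash order.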
[Yahoo-eng-team] [Bug 1445637] Re: Instance resource quota not observed for non-ephemeral storage
The cinder client already offers a qos command. Instance quota settings for non-ephemeral disks should be set via the cinder CLI instead of inherited from the instance's flavor.

** Changed in: nova
       Status: In Progress => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1445637

Title: Instance resource quota not observed for non-ephemeral storage
Status in OpenStack Compute (nova): Invalid

Bug description:

  I'm using a nova built from stable/kilo and trying to implement instance IO resource quotas for disk as per https://wiki.openstack.org/wiki/InstanceResourceQuota#IO. While this works when building an instance from ephemeral storage, it does not when booting from a bootable cinder volume. I realize I can implement this using cinder quota but I want to apply the same settings in nova regardless of the underlying disk.

  Steps to reproduce:
  nova flavor-create iolimited 1 8192 64 4
  nova flavor-key 1 set quota:disk_read_iops_sec=1
  Boot an instance using the above flavor.

  The guest XML is missing the iotune entries. Expected result:

  <snip>
    <target dev='vda' bus='virtio'/>
    <iotune>
      <read_iops_sec>1</read_iops_sec>
    </iotune>
  </snip>

  This relates somewhat to https://bugs.launchpad.net/nova/+bug/1405367 but that case is purely hit when booting from RBD-backed ephemeral storage. Essentially, for non-ephemeral disks, a call is made to _get_volume_config() which creates a generic LibvirtConfigGuestDisk object but no further processing is done to add extra specs (if any). I've essentially copied the disk_qos() method from the associated code review (https://review.openstack.org/#/c/143939/) to implement my own patch (attached).

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1445637/+subscriptions
[Yahoo-eng-team] [Bug 1445637] Re: Instance resource quota not observed for non-ephemeral storage
I'm sorry, I was too hasty in changing this to Invalid. Originally I thought that since non-ephemeral disks are managed by cinder, those settings should depend on it: even if you assign a higher value in the flavor, the rate is still limited by cinder, so you cannot observe the real rate. But I also think a flavor is a hardware template, so its settings should apply as well. Maybe we could select the minimum quota value between the cinder and flavor settings.

** Changed in: nova
       Status: Invalid => In Progress

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1445637

Title: Instance resource quota not observed for non-ephemeral storage
Status in OpenStack Compute (nova): In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1445637/+subscriptions
[Yahoo-eng-team] [Bug 1467570] Re: Nova can't provision instance from snapshot with a ceph backend
** Also affects: horizon
   Importance: Undecided
       Status: New

** Changed in: horizon
     Assignee: (unassigned) => lyanchih (lyanchih)

** Changed in: horizon
       Status: New => Confirmed

** Changed in: nova
       Status: In Progress => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1467570

Title: Nova can't provision instance from snapshot with a ceph backend
Status in OpenStack Dashboard (Horizon): Confirmed
Status in OpenStack Compute (Nova): Invalid

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1467570/+subscriptions