Public bug reported:
From https://issues.redhat.com/browse/OSPRH-13142:
Description of problem:
For boot-from-volume instances, 'openstack server rescue <vm> --image
<image>' fails with the following issues:
1. It attempts to attach two disks, <instance_uuid>_disk and
<instance_uuid>_disk.rescue. Only <instance_uuid>_disk.rescue is actually
created, so the rescue fails with the following error:
2024-01-23 16:32:14.338 2 ERROR oslo_messaging.rpc.server nova.exception.InstanceNotRescuable: Instance dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2 cannot be rescued: Driver Error: internal error: process exited while connecting to monitor: 2024-01-23T16:32:13.017966Z qemu-kvm: -blockdev {"driver":"rbd","pool":"vms","image":"dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk","server":[{"host":"172.16.1.100","port":"6789"}],"user":"openstack","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-auth-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk: No such file or directory
If you look in ceph, only the .rescue image exists.
# rbd --id openstack -p vms ls -l
NAME                                              SIZE    PARENT  FMT  PROT  LOCK
dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk.rescue  10 GiB          2          excl
However, the instance is configured with both disks:
# virsh domblklist instance-00000003
Target Source
----------------------------------------------------------------
vda vms/dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk.rescue
vdb vms/dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk
If I manually copy UUID_disk.rescue to UUID_disk, the instance will
boot into RESCUE mode (see the rbd cp command below). It seems the UUID_disk
volume is not needed and should not be configured in this RESCUE situation.
2. The RESCUED instance does not attach the Cinder root volume. The
Cinder root volume also does not re-attach after "unrescuing" the instance.
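For reference, the manual copy mentioned in item 1 is just an RBD-level copy
of the rescue disk onto the missing image name. It is the same command used
later in the reproducer, shown here with the first instance's UUID substituted:
# rbd --id openstack cp vms/dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk.rescue vms/dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk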
Reproducer:
$ openstack volume create --size 10 --image rhel8 rootvol1
$ openstack volume list
+--------------------------------------+----------+-----------+------+-------------+
| ID                                   | Name     | Status    | Size | Attached to |
+--------------------------------------+----------+-----------+------+-------------+
| f855dfe6-ad5a-4497-87ff-16ac5856f596 | rootvol1 | available |   10 |             |
+--------------------------------------+----------+-----------+------+-------------+
$ openstack server create --key-name default --flavor rhel --volume rootvol1 --network external test1
$ openstack server show test1 -c status -c image -c volumes_attached
+------------------+--------------------------------------------------------------------------+
| Field            | Value                                                                    |
+------------------+--------------------------------------------------------------------------+
| image            | N/A (booted from volume)                                                 |
| status           | ACTIVE                                                                   |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+--------------------------------------------------------------------------+
$ openstack server rescue test1 --image rhel8
$ openstack server show test1 -c status -c image -c volumes_attached -c fault --fit
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| Field            | Value                                                                                                                                     |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| fault            | {'code': 400, 'created': '2024-01-23T20:12:17Z', 'message': 'Instance ac3d46c0-c8d5-45df-bd17-d467baaa5a98 cannot be rescued: Driver      |
|                  | Error: internal error: process exited while connecting to monitor: 2024-01-23T20:12:17.612453Z qemu-kvm: -blockdev                       |
|                  | {"driver":"rbd","pool":"vms","image":"ac3d46c0-c8d5-45df-bd17-d467ba'}                                                                    |
| image            | N/A (booted from volume)                                                                                                                  |
| status           | ERROR                                                                                                                                     |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596'                                                                  |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------+
# virsh domblklist instance-00000004
Target Source
----------------------------------------------------------------
vda vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue
vdb vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
# rbd --id openstack -p vms ls -l
NAME                                              SIZE    PARENT  FMT  PROT  LOCK
ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue  10 GiB          2
NOTE: if I manually create the _disk image here, the instance will boot
into rescue mode; however, the Cinder volume is not attached.
# rbd --id openstack cp vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
Image copy: 100% complete...done.
RESCUE now completes and the instance is accessible (without the Cinder
root volume attached).
$ openstack server show test1 -c status -c image -c volumes_attached -c fault --fit
+------------------+--------------------------------------------------------------------------+
| Field            | Value                                                                    |
+------------------+--------------------------------------------------------------------------+
| image            | N/A (booted from volume)                                                 |
| status           | RESCUE                                                                   |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+--------------------------------------------------------------------------+
The volume still shows as in-use:
$ openstack volume list
+--------------------------------------+----------+--------+------+--------------------------------+
| ID                                   | Name     | Status | Size | Attached to                    |
+--------------------------------------+----------+--------+------+--------------------------------+
| f855dfe6-ad5a-4497-87ff-16ac5856f596 | rootvol1 | in-use |   10 | Attached to test1 on /dev/vda  |
+--------------------------------------+----------+--------+------+--------------------------------+
But it is not actually attached to the domain:
# virsh domblklist instance-00000004
Target Source
----------------------------------------------------------------
vda vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue
vdb vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
The other ugly thing: unrescue does not revert the domain back to the original
disk config.
$ openstack server unrescue test1
$ openstack server show test1 -c status -c image -c volumes_attached -c fault --fit
+------------------+--------------------------------------------------------------------------+
| Field            | Value                                                                    |
+------------------+--------------------------------------------------------------------------+
| image            | N/A (booted from volume)                                                 |
| status           | ACTIVE                                                                   |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+--------------------------------------------------------------------------+
The above looks good, but the instance is still booted on rescue disks.
# virsh domblklist instance-00000004
Target Source
----------------------------------------------------------------
vda vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue
vdb vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
A hard reboot will fix it:
$ openstack server reboot --hard test1
Now the instance is back to booting from the volume:
# virsh domblklist instance-00000004
Target Source
---------------------------------------------------------------
vda volumes/volume-f855dfe6-ad5a-4497-87ff-16ac5856f596
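A note on the bug title: Nova's stable device rescue of a volume-backed
instance requires the rescue image to carry the hw_rescue_device and/or
hw_rescue_bus image properties and compute API microversion 2.87 or later;
presumably the broken behaviour above comes from the non-stable rescue path
being taken when those properties are missing. A rough sketch of setting them
on the rhel8 image used in this reproducer (the virtio/disk values are an
assumed, typical choice, not something verified here):
$ openstack image set rhel8 --property hw_rescue_bus=virtio --property hw_rescue_device=disk
$ openstack --os-compute-api-version 2.87 server rescue --image rhel8 test1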
Version-Release number of selected component (if applicable):
Wallaby
How reproducible:
100%
Steps to Reproduce:
1. See above
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2110738
Title:
Stable rescue fails when necessary image properties not set
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2110738/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp