Re: block snapshot issue with RBD
Am 29.05.2024 um 12:14 hat Fiona Ebner geschrieben:
> I bisected this issue to d3007d348a ("block: Fix crash when loading
> snapshot on inactive node").
>
> > diff --git a/block/snapshot.c b/block/snapshot.c
> > index ec8cf4810b..c4d40e80dd 100644
> > --- a/block/snapshot.c
> > +++ b/block/snapshot.c
> > @@ -196,8 +196,10 @@ bdrv_snapshot_fallback(BlockDriverState *bs)
> >  int bdrv_can_snapshot(BlockDriverState *bs)
> >  {
> >      BlockDriver *drv = bs->drv;
> > +
> >      GLOBAL_STATE_CODE();
> > -    if (!drv || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> > +
> > +    if (!drv || !bdrv_is_inserted(bs) || !bdrv_is_writable(bs)) {
> >          return 0;
> >      }
>
> So I guess the issue is that the blockdev is not writable when in
> "postmigrate" state?

That makes sense. The error message really isn't great, but after
migration, the image is assumed to be owned by the destination, so we
can't use it any more. 'cont' basically asserts that the migration
failed and we can get ownership back.

I don't think we can do without a manual command reactivating the image
on the source, but we could have one that does this without resuming
the VM.

Kevin
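The effect of the check change Fiona bisected can be sketched in plain
Python (a hedged model with invented names, not QEMU code): a source-side
node in "postmigrate" state is not read-only, yet it is no longer writable,
so the old check accepted it while the new one refuses it.

```python
# Hypothetical model of the bdrv_can_snapshot() condition before and
# after commit d3007d348a. All names here are invented for the sketch.

def can_snapshot_old(has_driver, inserted, read_only):
    """Old check: only read-only nodes were rejected."""
    return has_driver and inserted and not read_only

def can_snapshot_new(has_driver, inserted, read_only, active):
    """New check: a node must be writable, i.e. neither read-only
    nor inactive (as after handing the image to the destination)."""
    writable = not read_only and active
    return has_driver and inserted and writable

# A source-side RBD node after "migrate": not read-only, but inactive.
assert can_snapshot_old(True, True, False) is True                  # old code allowed it
assert can_snapshot_new(True, True, False, active=False) is False   # now refused
assert can_snapshot_new(True, True, False, active=True) is True     # after "cont"
```

This is why the failure shows up only in "postmigrate" and disappears
after "cont", which reactivates the node on the source.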
Re: block snapshot issue with RBD
Hi,

Am 28.05.24 um 20:19 schrieb Jin Cao:
> Hi Ilya
>
> On 5/28/24 11:13 AM, Ilya Dryomov wrote:
>> On Mon, May 27, 2024 at 9:06 PM Jin Cao wrote:
>>>
>>> Supplementary info: VM is paused after the "migrate" command. After
>>> being resumed with "cont", snapshot_delete_blkdev_internal works
>>> again, which is confusing, as disk snapshots generally recommend
>>> that I/O be paused, and a frozen VM satisfies this requirement.
>>
>> Hi Jin,
>>
>> This doesn't seem to be related to RBD. Given that the same error is
>> observed when using the RBD driver with the raw format, I would dig in
>> the direction of migration somehow "installing" the raw format (which
>> is on-disk compatible with the rbd format).
>
> Thanks for the hint.
>
>> Also, did you mean to say "snapshot_blkdev_internal" instead of
>> "snapshot_delete_blkdev_internal" in both instances?
>
> Sorry for my copy-and-paste mistake. Yes, it's snapshot_blkdev_internal.
>
> --
> Sincerely,
> Jin Cao
>
>>
>> Thanks,
>>
>> Ilya
>>
>>>
>>> --
>>> Sincerely
>>> Jin Cao
>>>
>>> On 5/27/24 10:56 AM, Jin Cao wrote:
>>>> CC block and migration related addresses.
>>>>
>>>> On 5/27/24 12:03 AM, Jin Cao wrote:
>>>>> Hi,
>>>>>
>>>>> I encountered an RBD block snapshot issue after doing migration.
>>>>>
>>>>> Steps
>>>>> -----
>>>>>
>>>>> 1. Start QEMU with:
>>>>>    ./qemu-system-x86_64 -name VM -machine q35 -accel kvm -cpu
>>>>>    host,migratable=on -m 2G -boot menu=on,strict=on
>>>>>    rbd:image/ubuntu-22.04-server-cloudimg-amd64.raw -net nic -net user
>>>>>    -cdrom /home/my/path/of/cloud-init.iso -monitor stdio
>>>>>
>>>>> 2. Do a block snapshot with the monitor command
>>>>>    snapshot_delete_blkdev_internal. It works as expected: the
>>>>>    snapshot is visible with the command
>>>>>    `rbd snap ls pool_name/image_name`.
>>>>>
>>>>> 3. Do a pseudo migration with the monitor command:
>>>>>    migrate -d exec:cat>/tmp/vm.out
>>>>>
>>>>> 4. Do a block snapshot again with snapshot_delete_blkdev_internal,
>>>>>    then I get:
>>>>>    Error: Block format 'raw' used by device 'ide0-hd0' does not
>>>>>    support internal snapshots
>>>>>
>>>>> I was hoping to do the second block snapshot successfully, and it
>>>>> feels abnormal that the RBD block snapshot function is disrupted
>>>>> after migration.
>>>>>
>>>>> BTW, I get the same block snapshot error when I start QEMU with:
>>>>> "-drive format=raw,file=rbd:pool_name/image_name"
>>>>>
>>>>> My question is: how could I proceed with RBD block snapshot after
>>>>> the pseudo migration?

I bisected this issue to d3007d348a ("block: Fix crash when loading
snapshot on inactive node").

> diff --git a/block/snapshot.c b/block/snapshot.c
> index ec8cf4810b..c4d40e80dd 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -196,8 +196,10 @@ bdrv_snapshot_fallback(BlockDriverState *bs)
>  int bdrv_can_snapshot(BlockDriverState *bs)
>  {
>      BlockDriver *drv = bs->drv;
> +
>      GLOBAL_STATE_CODE();
> -    if (!drv || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> +
> +    if (!drv || !bdrv_is_inserted(bs) || !bdrv_is_writable(bs)) {
>          return 0;
>      }

So I guess the issue is that the blockdev is not writable when in
"postmigrate" state?

Best Regards,
Fiona
Re: block snapshot issue with RBD
Hi Ilya

On 5/28/24 11:13 AM, Ilya Dryomov wrote:
> On Mon, May 27, 2024 at 9:06 PM Jin Cao wrote:
>>
>> Supplementary info: VM is paused after the "migrate" command. After
>> being resumed with "cont", snapshot_delete_blkdev_internal works
>> again, which is confusing, as disk snapshots generally recommend that
>> I/O be paused, and a frozen VM satisfies this requirement.
>
> Hi Jin,
>
> This doesn't seem to be related to RBD. Given that the same error is
> observed when using the RBD driver with the raw format, I would dig in
> the direction of migration somehow "installing" the raw format (which
> is on-disk compatible with the rbd format).

Thanks for the hint.

> Also, did you mean to say "snapshot_blkdev_internal" instead of
> "snapshot_delete_blkdev_internal" in both instances?

Sorry for my copy-and-paste mistake. Yes, it's snapshot_blkdev_internal.

--
Sincerely,
Jin Cao

> Thanks,
>
> Ilya
>
>>
>> --
>> Sincerely
>> Jin Cao
>>
>> On 5/27/24 10:56 AM, Jin Cao wrote:
>>> CC block and migration related addresses.
>>>
>>> On 5/27/24 12:03 AM, Jin Cao wrote:
>>>> Hi,
>>>>
>>>> I encountered an RBD block snapshot issue after doing migration.
>>>>
>>>> Steps
>>>> -----
>>>>
>>>> 1. Start QEMU with:
>>>>    ./qemu-system-x86_64 -name VM -machine q35 -accel kvm -cpu
>>>>    host,migratable=on -m 2G -boot menu=on,strict=on
>>>>    rbd:image/ubuntu-22.04-server-cloudimg-amd64.raw -net nic -net user
>>>>    -cdrom /home/my/path/of/cloud-init.iso -monitor stdio
>>>>
>>>> 2. Do a block snapshot with the monitor command
>>>>    snapshot_delete_blkdev_internal. It works as expected: the
>>>>    snapshot is visible with the command
>>>>    `rbd snap ls pool_name/image_name`.
>>>>
>>>> 3. Do a pseudo migration with the monitor command:
>>>>    migrate -d exec:cat>/tmp/vm.out
>>>>
>>>> 4. Do a block snapshot again with snapshot_delete_blkdev_internal,
>>>>    then I get:
>>>>    Error: Block format 'raw' used by device 'ide0-hd0' does not
>>>>    support internal snapshots
>>>>
>>>> I was hoping to do the second block snapshot successfully, and it
>>>> feels abnormal that the RBD block snapshot function is disrupted
>>>> after migration.
>>>>
>>>> BTW, I get the same block snapshot error when I start QEMU with:
>>>> "-drive format=raw,file=rbd:pool_name/image_name"
>>>>
>>>> My question is: how could I proceed with RBD block snapshot after
>>>> the pseudo migration?
Re: block snapshot issue with RBD
On Mon, May 27, 2024 at 9:06 PM Jin Cao wrote:
>
> Supplementary info: VM is paused after the "migrate" command. After
> being resumed with "cont", snapshot_delete_blkdev_internal works
> again, which is confusing, as disk snapshots generally recommend that
> I/O be paused, and a frozen VM satisfies this requirement.

Hi Jin,

This doesn't seem to be related to RBD. Given that the same error is
observed when using the RBD driver with the raw format, I would dig in
the direction of migration somehow "installing" the raw format (which
is on-disk compatible with the rbd format).

Also, did you mean to say "snapshot_blkdev_internal" instead of
"snapshot_delete_blkdev_internal" in both instances?

Thanks,

Ilya

>
> --
> Sincerely
> Jin Cao
>
> On 5/27/24 10:56 AM, Jin Cao wrote:
> > CC block and migration related addresses.
> >
> > On 5/27/24 12:03 AM, Jin Cao wrote:
> >> Hi,
> >>
> >> I encountered an RBD block snapshot issue after doing migration.
> >>
> >> Steps
> >> -----
> >>
> >> 1. Start QEMU with:
> >>    ./qemu-system-x86_64 -name VM -machine q35 -accel kvm -cpu
> >>    host,migratable=on -m 2G -boot menu=on,strict=on
> >>    rbd:image/ubuntu-22.04-server-cloudimg-amd64.raw -net nic -net user
> >>    -cdrom /home/my/path/of/cloud-init.iso -monitor stdio
> >>
> >> 2. Do a block snapshot with the monitor command
> >>    snapshot_delete_blkdev_internal. It works as expected: the
> >>    snapshot is visible with the command
> >>    `rbd snap ls pool_name/image_name`.
> >>
> >> 3. Do a pseudo migration with the monitor command:
> >>    migrate -d exec:cat>/tmp/vm.out
> >>
> >> 4. Do a block snapshot again with snapshot_delete_blkdev_internal,
> >>    then I get:
> >>    Error: Block format 'raw' used by device 'ide0-hd0' does not
> >>    support internal snapshots
> >>
> >> I was hoping to do the second block snapshot successfully, and it
> >> feels abnormal that the RBD block snapshot function is disrupted
> >> after migration.
> >>
> >> BTW, I get the same block snapshot error when I start QEMU with:
> >> "-drive format=raw,file=rbd:pool_name/image_name"
> >>
> >> My question is: how could I proceed with RBD block snapshot after
> >> the pseudo migration?
Re: block snapshot issue with RBD
Supplementary info: VM is paused after the "migrate" command. After being
resumed with "cont", snapshot_delete_blkdev_internal works again, which is
confusing, as disk snapshots generally recommend that I/O be paused, and a
frozen VM satisfies this requirement.

--
Sincerely
Jin Cao

On 5/27/24 10:56 AM, Jin Cao wrote:
> CC block and migration related addresses.
>
> On 5/27/24 12:03 AM, Jin Cao wrote:
>> Hi,
>>
>> I encountered an RBD block snapshot issue after doing migration.
>>
>> Steps
>> -----
>>
>> 1. Start QEMU with:
>>    ./qemu-system-x86_64 -name VM -machine q35 -accel kvm -cpu
>>    host,migratable=on -m 2G -boot menu=on,strict=on
>>    rbd:image/ubuntu-22.04-server-cloudimg-amd64.raw -net nic -net user
>>    -cdrom /home/my/path/of/cloud-init.iso -monitor stdio
>>
>> 2. Do a block snapshot with the monitor command
>>    snapshot_delete_blkdev_internal. It works as expected: the snapshot
>>    is visible with the command `rbd snap ls pool_name/image_name`.
>>
>> 3. Do a pseudo migration with the monitor command:
>>    migrate -d exec:cat>/tmp/vm.out
>>
>> 4. Do a block snapshot again with snapshot_delete_blkdev_internal,
>>    then I get:
>>    Error: Block format 'raw' used by device 'ide0-hd0' does not
>>    support internal snapshots
>>
>> I was hoping to do the second block snapshot successfully, and it
>> feels abnormal that the RBD block snapshot function is disrupted
>> after migration.
>>
>> BTW, I get the same block snapshot error when I start QEMU with:
>> "-drive format=raw,file=rbd:pool_name/image_name"
>>
>> My question is: how could I proceed with RBD block snapshot after the
>> pseudo migration?
Re: block snapshot issue with RBD
CC block and migration related addresses.

On 5/27/24 12:03 AM, Jin Cao wrote:
> Hi,
>
> I encountered an RBD block snapshot issue after doing migration.
>
> Steps
> -----
>
> 1. Start QEMU with:
>    ./qemu-system-x86_64 -name VM -machine q35 -accel kvm -cpu
>    host,migratable=on -m 2G -boot menu=on,strict=on
>    rbd:image/ubuntu-22.04-server-cloudimg-amd64.raw -net nic -net user
>    -cdrom /home/my/path/of/cloud-init.iso -monitor stdio
>
> 2. Do a block snapshot with the monitor command
>    snapshot_delete_blkdev_internal. It works as expected: the snapshot
>    is visible with the command `rbd snap ls pool_name/image_name`.
>
> 3. Do a pseudo migration with the monitor command:
>    migrate -d exec:cat>/tmp/vm.out
>
> 4. Do a block snapshot again with snapshot_delete_blkdev_internal,
>    then I get:
>    Error: Block format 'raw' used by device 'ide0-hd0' does not
>    support internal snapshots
>
> I was hoping to do the second block snapshot successfully, and it
> feels abnormal that the RBD block snapshot function is disrupted
> after migration.
>
> BTW, I get the same block snapshot error when I start QEMU with:
> "-drive format=raw,file=rbd:pool_name/image_name"
>
> My question is: how could I proceed with RBD block snapshot after the
> pseudo migration?
block snapshot issue with RBD
Hi,

I encountered an RBD block snapshot issue after doing migration.

Steps
-----

1. Start QEMU with:
   ./qemu-system-x86_64 -name VM -machine q35 -accel kvm -cpu
   host,migratable=on -m 2G -boot menu=on,strict=on
   rbd:image/ubuntu-22.04-server-cloudimg-amd64.raw -net nic -net user
   -cdrom /home/my/path/of/cloud-init.iso -monitor stdio

2. Do a block snapshot with the monitor command
   snapshot_delete_blkdev_internal. It works as expected: the snapshot is
   visible with the command `rbd snap ls pool_name/image_name`.

3. Do a pseudo migration with the monitor command:
   migrate -d exec:cat>/tmp/vm.out

4. Do a block snapshot again with snapshot_delete_blkdev_internal, then
   I get:
   Error: Block format 'raw' used by device 'ide0-hd0' does not support
   internal snapshots

I was hoping to do the second block snapshot successfully, and it feels
abnormal that the RBD block snapshot function is disrupted after
migration.

BTW, I get the same block snapshot error when I start QEMU with:
"-drive format=raw,file=rbd:pool_name/image_name"

My question is: how could I proceed with RBD block snapshot after the
pseudo migration?
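The steps above boil down to a monitor session roughly like the following
(a sketch, not a verbatim log; the snapshot names are illustrative and the
device name is taken from the error message):

```
(qemu) snapshot_blkdev_internal ide0-hd0 snap0
(qemu) migrate -d "exec:cat > /tmp/vm.out"
(qemu) info status
VM status: paused (postmigrate)
(qemu) snapshot_blkdev_internal ide0-hd0 snap1
Error: Block format 'raw' used by device 'ide0-hd0' does not support internal snapshots
(qemu) cont
(qemu) snapshot_blkdev_internal ide0-hd0 snap1
```

Note that "info status" showing "postmigrate" is the key detail: the
second snapshot fails only while the VM is in that state.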