Hi Folks,

I'm running Ceph 18 (Reef) with OpenStack for my lab (and home services) in a 
three-node cluster on Ubuntu 22.04. I'm quite new to these platforms and still 
learning. This is my build, for what it's worth: 
https://blog.rhysgoodwin.com/it/openstack-ceph-hyperconverged/

I've got myself into some trouble. Here's the sequence of events:

I don't recall exactly when, but at some stage I must have tried an image 
migration from one pool to another. The source pool/image is 
infra-pool/sophosbuild; I don't know what the target would have been. In any 
case, on my travels I found the infra-pool/sophosbuild image in the trash:
rhys@hcn03:/imagework# rbd trash ls --all infra-pool
65a87bb2472fe sophosbuild
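
(In hindsight, the long listing would have shown the trash source directly; I 
believe something like this works, though I didn't capture it at the time:

rbd trash ls --all --long infra-pool

which should include a SOURCE column reading 'migration' for this entry.)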

I tried to delete it but got the following:

rhys@hcn03:/imagework# rbd trash rm infra-pool/65a87bb2472fe
2023-10-06T04:23:13.775+0000 7f28bbfff640 -1 librbd::image::RefreshRequest: 
image being migrated
2023-10-06T04:23:13.775+0000 7f28bbfff640 -1 librbd::image::OpenRequest: failed 
to refresh image: (30) Read-only file system
2023-10-06T04:23:13.775+0000 7f28bbfff640 -1 librbd::ImageState: 0x7f28a804b600 
failed to open image: (30) Read-only file system
2023-10-06T04:23:13.775+0000 7f28a2ffd640 -1 librbd::image::RemoveRequest: 
0x7f28a8000b90 handle_open_image: error opening image: (30) Read-only file 
system
Removing image: 0% complete...failed.
rbd: remove error: (30) Read-only file system

Next, I tried to restore the image, and this also failed:
rhys@hcn03:/imagework:# rbd trash restore infra-pool/65a87bb2472fe
librbd::api::Trash: restore: Current trash source 'migration' does not match 
expected: user,mirroring,unknown (4)

Probably stupidly, I followed the steps in this post: 
https://www.spinics.net/lists/ceph-users/msg72786.html to change the byte at 
offset 0x07 of the image's trash omap value from 0x02 
(TRASH_IMAGE_SOURCE_MIGRATION) to 0x00 (TRASH_IMAGE_SOURCE_USER).
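
For the record, what I ran was roughly the following. This is a sketch from 
memory, assuming the trash metadata lives in the pool's rbd_trash object under 
the key id_<image-id>; the setomapval input syntax may differ between 
releases:

rados -p infra-pool getomapval rbd_trash id_65a87bb2472fe /tmp/trash_val
# flip the byte at offset 0x07: 0x02 (MIGRATION) -> 0x00 (USER)
printf '\x00' | dd of=/tmp/trash_val bs=1 seek=7 count=1 conv=notrunc
rados -p infra-pool setomapval rbd_trash id_65a87bb2472fe --input-file /tmp/trash_val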

After this I was able to restore the image successfully.
However, I still could not delete it:
rhys@hcn03:/imagework:# rbd rm infra-pool/sophosbuild
2023-10-06T05:52:30.708+0000 7ff5937fe640 -1 librbd::image::RefreshRequest: 
image being migrated
2023-10-06T05:52:30.708+0000 7ff5937fe640 -1 librbd::image::OpenRequest: failed 
to refresh image: (30) Read-only file system
2023-10-06T05:52:30.708+0000 7ff5937fe640 -1 librbd::ImageState: 0x564d3f83d680 
failed to open image: (30) Read-only file system
Removing image: 0% complete...failed.
rbd: delete error: (30) Read-only file system
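
I assume the image header still carries its migration state, which is why 
every read-write open fails with EROFS. If it's useful, I can post the output 
of the following (these do read-only opens, so I'd expect them to still work):

rbd status infra-pool/sophosbuild
rbd info infra-pool/sophosbuild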

Next, I tried to abort the migration:
root@hcn03:/imagework# rbd migration abort infra-pool/sophosbuild
This ran for a few minutes but failed at 99% (sorry, the terminal scroll-back 
is lost).
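
I can re-run the abort and capture the failure this time. I assume turning up 
client-side debug logging is the way to see where it dies, e.g. something 
like:

rbd migration abort infra-pool/sophosbuild --debug-rbd=20 --log-to-stderr=true 2>abort.log

and I'm happy to share the tail of that log if it would help.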

So now I'm stuck. I don't know how to get rid of this image, and while 
everything else in the cluster is healthy, the dashboard throws errors when it 
tries to enumerate the images in that pool.

I'm considering migrating the good images off this pool and deleting the pool, 
but I don't even know whether I'll be allowed to delete the pool while this 
issue is present.
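
If it comes to that, my rough plan would be something like the following (a 
sketch only, assuming a new pool called infra-pool2 and that pool deletion is 
enabled on the mons):

rbd migration prepare infra-pool/someimage infra-pool2/someimage
rbd migration execute infra-pool2/someimage
rbd migration commit infra-pool2/someimage
# ...repeat for each good image, then:
ceph config set mon mon_allow_pool_delete true
ceph osd pool rm infra-pool infra-pool --yes-i-really-really-mean-it

But I'd rather not nuke the pool if the stuck image can be cleaned up in 
place.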

Any advice would be much appreciated.

Kind regards,
Rhys