OK, so you are in the same situation as I was. I would expect that you can keep the mon up without gdb if you set max_mds of 'prod' back to 3. Then you should try to revert the 'rmfailed' by either building your own 'addfailed' command or using the one built by Patrick. Details can be found in the referenced thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KQ5A5OWRIUEOJBC7VILBGDIKPQGJQIWN/
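The first suggestion above is a single ceph CLI call. A minimal sketch, assuming the mon quorum stays up long enough to accept the command (the filesystem name 'prod' is from this thread):

```shell
# Sketch, not a verified recovery procedure: raise max_mds of the 'prod'
# filesystem back to 3, so the three ranks still listed as 'in' in the
# fsmap no longer exceed max_mds after the earlier 'rmfailed'.
ceph fs set prod max_mds 3

# Note: 'addfailed' is NOT a stock ceph command; it has to be built from a
# patched source tree as described in the thread linked above. The stock
# command that caused this situation is the inverse operation:
#   ceph mds rmfailed <fs_name>:<rank> --yes-i-really-mean-it
```

These commands act on a live cluster, so they are shown here only to make the advice concrete, not as a tested procedure.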
From: Ben Timby<mailto:bti...@smartfile.com>
Sent: October 10, 2021 21:50
To: 胡 玮文<mailto:huw...@outlook.com>
Cc: ceph-users@ceph.io<mailto:ceph-users@ceph.io>
Subject: Re: [ceph-users] Cluster inaccessible

Thanks for your reply. I tried something similar (but wrong) based on your messages in the referenced threads, but I was missing the "gdb commands...end" sequence, so I just kept hitting my breakpoint over and over. However, I was able to get the monitor running with your guidance. Indeed, ceph rmfailed was run leading up to the crash. Here is the output of ceph fs dump:

# ceph fs dump
dumped fsmap epoch 68022
e68022
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1

Filesystem 'cephfs' (1)
fs_name cephfs
epoch 68022
flags 12
created 2021-08-29T00:22:55.564386+0000
modified 2021-10-10T13:45:03.446746+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 10491
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {0=414534}
failed
damaged
stopped
data_pools [2]
metadata_pool 3
inline_data disabled
balancer
standby_count_wanted 1
[mds.ceph-metadata-04{0:414534} state up:active seq 51 laggy since 2021-10-10T13:45:03.427312+0000 addr [v2:10.100.5.54:6800/1373631131,v1:10.100.5.54:6801/1373631131] compat {c=[1],r=[1],i=[7ff]}]

Filesystem 'prod' (2)
fs_name prod
epoch 68021
flags 12
created 2021-08-29T00:30:46.944134+0000
modified 2021-10-10T03:16:41.859392+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 9953
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0,1,2
up {}
failed
damaged
stopped 3
data_pools [5]
metadata_pool 4
inline_data disabled
balancer
standby_count_wanted 1

Filesystem 'beta' (3)
fs_name beta
epoch 68022
flags 12
created 2021-08-29T00:31:49.339070+0000
modified 2021-10-10T13:45:03.446747+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 10492
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {0=414465}
failed
damaged
stopped
data_pools [8]
metadata_pool 7
inline_data disabled
balancer
standby_count_wanted 1
[mds.ceph-metadata-01{0:414465} state up:active seq 68 laggy since 2021-10-10T13:45:03.427278+0000 addr [v2:10.100.5.51:6800/391530905,v1:10.100.5.51:6801/391530905] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:

[mds.ceph-metadata-05{-1:414540} state up:standby seq 1 laggy since 2021-10-10T13:45:03.427335+0000 addr [v2:10.100.5.55:6800/3933739738,v1:10.100.5.55:6801/3933739738] compat {c=[1],r=[1],i=[7ff]}]
[mds.ceph-metadata-03{-1:414692} state up:standby seq 1 laggy since 2021-10-10T13:45:03.427356+0000 addr [v2:10.100.5.53:6800/1681025107,v1:10.100.5.53:6801/1681025107] compat {c=[1],r=[1],i=[7ff]}]
[mds.ceph-metadata-02{-1:414695} state up:standby seq 1 laggy since 2021-10-10T13:45:03.427376+0000 addr [v2:10.100.5.52:6800/103847136,v1:10.100.5.52:6801/103847136] compat {c=[1],r=[1],i=[7ff]}]
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io