Hi all,

We have a CephFS cluster in production for about 2 months and, for the past 2-3 weeks, we have regularly been experiencing MDS crash loops (every 3-4 hours when there is some user activity). A temporary fix is to remove the MDSs in error (or unknown) state, stop the samba & nfs-ganesha gateways, then wipe all sessions. Sometimes we have to repeat this procedure 2 or 3 times to get our CephFS back and working...

Looking at the MDS log files, I noticed that all crashes have the following stack trace:

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.6/rpm/el8/BUILD/ceph-16.2.6/src/mds/Server.cc: 7503: FAILED ceph_assert(in->first <= straydn->first)

ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7eff2644bcce]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x276ee8) [0x7eff2644bee8]
 3: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x106a) [0x559c8f83331a]
 4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x4d9) [0x559c8f837fe9]
 5: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xefb) [0x559c8f84e82b]
 6: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x3fc) [0x559c8f859aac]
 7: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x12b) [0x559c8f86258b]
 8: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0xbb4) [0x559c8f7bf374]
 9: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7bb) [0x559c8f7c19eb]
 10: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x16) [0x559c8f7c1f86]
 11: (MDSContext::complete(int)+0x56) [0x559c8fac0906]
 12: (MDSRank::_advance_queues()+0x84) [0x559c8f7c0a54]
 13: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x204) [0x559c8f7c1434]
 14: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x55) [0x559c8f7c1fe5]
 15: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x128) [0x559c8f7b1f28]
 16: (DispatchQueue::entry()+0x126a) [0x7eff266894da]
 17: (DispatchQueue::DispatchThread::entry()+0x11) [0x7eff26739e21]
 18: /lib64/libpthread.so.0(+0x814a) [0x7eff2543214a]
 19: clone()

I found a similar case in the Ceph tracker ( https://tracker.ceph.com/issues/41147 ), so I suspected inode corruption and started a CephFS scrub (ceph tell mds.cephfsvol:0 scrub start / recursive,repair).
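For the record, the temporary workaround roughly amounts to the commands below. This is only a sketch: rank 0, the service unit names and the eviction step are illustrative for our setup, not a recommended procedure.

# fail the MDS rank stuck in error/unknown state so a standby can take over
ceph mds fail cephfsvol:0

# stop the gateways so they stop replaying requests against the MDS
# (unit names depend on your deployment)
systemctl stop smb nfs-ganesha

# list the remaining client sessions on the active MDS, then evict them
ceph tell mds.cephfsvol:0 session ls
ceph tell mds.cephfsvol:0 client evict id=<session id>

Once the ranks are back to up:active we restart the gateways. I am following the scrub with "ceph tell mds.cephfsvol:0 scrub status" and the task status section of "ceph -s".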
As we have a lot of files (about 200 million entries for 200 TB), I don't know how long it will take, nor:
- whether this will correct the situation
- what to do to avoid the same situation in the future

Some information about our Ceph cluster (Pacific 16.2.6, deployed with containers):

**********************************************************
# ceph -s
**********************************************************
  cluster:
    id:     2943b4fe-2063-11ec-a560-e43d1a1bc30f
    health: HEALTH_WARN
            1 MDSs report oversized cache

  services:
    mon:        5 daemons, quorum cephp03,cephp06,cephp05,cephp01,cephp02 (age 12d)
    mgr:        cephp01.smfvfd(active, since 12d), standbys: cephp02.equfuj
    mds:        2/2 daemons up, 4 standby
    osd:        264 osds: 264 up (since 12d), 264 in (since 9w)
    rbd-mirror: 1 daemon active (1 hosts)

  task status:
    scrub status:
        mds.cephfsvol.cephp02.wsokro: idle+waiting paths [/]
        mds.cephfsvol.cephp05.qneike: active paths [/]

  data:
    volumes: 1/1 healthy
    pools:   5 pools, 2176 pgs
    objects: 595.12M objects, 200 TiB
    usage:   308 TiB used, 3.3 PiB / 3.6 PiB avail
    pgs:     2167 active+clean
             7    active+clean+scrubbing+deep
             2    active+clean+scrubbing

  io:
    client:   39 KiB/s rd, 152 KiB/s wr, 27 op/s rd, 27 op/s wr

**********************************************************
# ceph fs get cephfsvol
**********************************************************
Filesystem 'cephfsvol' (1)
fs_name cephfsvol
epoch   106554
flags   12
created 2021-09-28T14:19:54.399567+0000
modified        2022-02-08T12:57:00.653514+0000
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   5497558138880
required_client_features       {}
last_failure    0
last_failure_osd_epoch  41205
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 2
in      0,1
up      {0=4044909,1=3325354}
failed
damaged
stopped
data_pools      [3,4]
metadata_pool   2
inline_data     disabled
balancer
standby_count_wanted    1
[mds.cephfsvol.cephp05.qneike{0:4044909} state up:active seq 1789 export targets 1 join_fscid=1 addr [v2:10.2.100.5:6800/2702983829,v1:10.2.100.5:6801/2702983829] compat {c=[1],r=[1],i=[7ff]}]
[mds.cephfsvol.cephp02.wsokro{1:32bdaa} state up:active seq 18a02 export targets 0 join_fscid=1 addr [v2:10.2.100.2:1a90/aa660301,v1:10.2.100.2:1a91/aa660301] compat {c=[1],r=[1],i=[7ff]}]
[peers=]

Any help or advice would be greatly appreciated....

Arnaud

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io