We are running Luminous on seven Ceph nodes, all of them set up as MDS.
Recently the MDS daemons have been failing very frequently, and once only one
MDS is left, CephFS degrades to the point of being unusable.
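
For completeness, this is how we have been watching the MDS state while it
happens (standard Ceph CLI, listed only so you know what we looked at):

    ceph mds stat        # one-line summary of up/active ranks
    ceph fs status       # per-rank view of the filesystem and standbys
    ceph health detail   # the MDS-related health warnings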

Checking the MDS log on one of the Ceph nodes, I found the following:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
/build/ceph-12.2.8/src/mds/Locker.cc: 5076: FAILED assert(lock->get_state() == LOCK_PRE_SCAN)

ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x564400e50e42]
2: (Locker::file_recover(ScatterLock*)+0x208) [0x564400c6ae18]
3: (MDCache::start_files_to_recover()+0xb3) [0x564400b98af3]
4: (MDSRank::clientreplay_start()+0x1f7) [0x564400ae04c7]
5: (MDSRankDispatcher::handle_mds_map(MMDSMap*, MDSMap*)+0x25c0) [0x564400aefd40]
6: (MDSDaemon::handle_mds_map(MMDSMap*)+0x154d) [0x564400ace3bd]
7: (MDSDaemon::handle_core_message(Message*)+0x7f3) [0x564400ad1273]
8: (MDSDaemon::ms_dispatch(Message*)+0x1c3) [0x564400ad15a3]
9: (DispatchQueue::entry()+0xeda) [0x5644011a547a]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x564400ee3fcd]
11: (()+0x7494) [0x7f7a2b106494]
12: (clone()+0x3f) [0x7f7a2a17eaff]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
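
Reading the backtrace, the assert fires while the rank enters client replay
after a failover: MDSRank::clientreplay_start() calls
MDCache::start_files_to_recover(), which calls Locker::file_recover() for each
lock queued for recovery. Judging from the message, the failing check is along
these lines (paraphrased from the trace, not copied from the 12.2.8 source):

    // mds/Locker.cc (paraphrased): file_recover() expects that
    // start_files_to_recover() already moved the lock into LOCK_PRE_SCAN
    void Locker::file_recover(ScatterLock *lock)
    {
      assert(lock->get_state() == LOCK_PRE_SCAN);  // Locker.cc:5076 in 12.2.8
      // ... lock recovery continues here ...
    }

So it looks like file_recover() is handed a lock that never went through the
pre-scan state during client replay.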

The full log is also attached. Could you please help us? Thanks!
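
If a more verbose log would help, we can raise the debug levels and try to
reproduce; I assume the usual injectargs route is the right way on Luminous
(MDS name taken from the attached log's file name):

    # raise MDS debug verbosity at runtime on the affected node
    ceph tell mds.lkp-ceph-node1 injectargs '--debug-mds 20 --debug-ms 1'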

BR
Oliver

<<attachment: ceph-mds.lkp-ceph-node1.zip>>
