Hi all,

We have an Octopus v15.2.17 cluster and noticed that one of our MDS hosts 
showed up in the OSD blacklist:

# ceph osd blacklist ls
192.168.32.87:6801/3841823949 2023-03-22T10:08:02.589698+0100
192.168.32.87:6800/3841823949 2023-03-22T10:08:02.589698+0100

I see an MDS restart that might be related; see log snippets below. There are 
no clients running on this host, only OSDs and one MDS. What could be the 
reason for the blacklist entries?
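
To make the timing easier to compare, here is a minimal sketch that parses the two blacklist entries above and measures how far their timestamps lie from the assert time in the log snippets. It assumes the second column of `ceph osd blacklist ls` is a plain ISO-style timestamp; the function name `parse_blacklist_line` and the dictionary keys are hypothetical and not part of any Ceph tooling.

```python
from datetime import datetime

# Timestamp layout as it appears in the blacklist listing and in the MDS log.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_blacklist_line(line):
    """Split one 'ip:port/nonce timestamp' line into its parts (hypothetical helper)."""
    addr, stamp = line.split()
    host_port, nonce = addr.split("/")
    host, port = host_port.rsplit(":", 1)
    return {
        "host": host,
        "port": int(port),
        "nonce": int(nonce),
        "stamp": datetime.strptime(stamp, TS_FORMAT),
    }


# The two entries copied from the listing above.
blacklist_lines = [
    "192.168.32.87:6801/3841823949 2023-03-22T10:08:02.589698+0100",
    "192.168.32.87:6800/3841823949 2023-03-22T10:08:02.589698+0100",
]
entries = [parse_blacklist_line(line) for line in blacklist_lines]

# Time of the failed assert, copied from the log snippets.
crash_time = datetime.strptime("2023-03-21T10:07:54.967936+0100", TS_FORMAT)

# Offset between each blacklist timestamp and the crash.
offsets = [entry["stamp"] - crash_time for entry in entries]
for entry, offset in zip(entries, offsets):
    print(entry["host"], entry["port"], offset)
```

With these values the offset is a few seconds more than 24 hours. That would fit entries created around the MDS crash/restart with a one-day expiry, but this reading is a guess on my part and not something the logs confirm.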

Thanks!

Log snippets:

Mar 21 10:07:54 ceph-23 journal: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 
7f99e63d5700 time 2023-03-21T10:07:54.967936+0100
Mar 21 10:07:54 ceph-23 journal: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
Mar 21 10:07:54 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:54 ceph-23 journal: 1: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x158) [0x7f99f4a25b92]
Mar 21 10:07:54 ceph-23 journal: 2: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:54 ceph-23 journal: 3: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:54 ceph-23 journal: 4: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:54 ceph-23 journal: 5: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:54 ceph-23 journal: 6: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:54 ceph-23 journal: 7: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:54 ceph-23 journal: 8: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:54 ceph-23 journal: 9: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:54 ceph-23 journal: 10: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:54 ceph-23 journal: *** Caught signal (Aborted) **
Mar 21 10:07:54 ceph-23 journal: in thread 7f99e63d5700 thread_name:MR_Finisher
Mar 21 10:07:54 ceph-23 journal: 2023-03-21T10:07:54.980+0100 7f99e63d5700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 
7f99e63d5700 time 2023-03-21T10:07:54.967936+0100
Mar 21 10:07:54 ceph-23 journal: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
Mar 21 10:07:54 ceph-23 journal: 
Mar 21 10:07:54 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:54 ceph-23 journal: 1: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x158) [0x7f99f4a25b92]
Mar 21 10:07:54 ceph-23 journal: 2: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:54 ceph-23 journal: 3: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:54 ceph-23 journal: 4: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:54 ceph-23 journal: 5: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:54 ceph-23 journal: 6: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:54 ceph-23 journal: 7: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:54 ceph-23 journal: 8: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:54 ceph-23 journal: 9: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:54 ceph-23 journal: 10: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:54 ceph-23 journal: 
Mar 21 10:07:54 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:54 ceph-23 journal: 1: (()+0x12ce0) [0x7f99f3605ce0]
Mar 21 10:07:54 ceph-23 journal: 2: (gsignal()+0x10f) [0x7f99f2062a9f]
Mar 21 10:07:54 ceph-23 journal: 3: (abort()+0x127) [0x7f99f2035e05]
Mar 21 10:07:54 ceph-23 journal: 4: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x1a9) [0x7f99f4a25be3]
Mar 21 10:07:54 ceph-23 journal: 5: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:54 ceph-23 journal: 6: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:54 ceph-23 journal: 7: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:54 ceph-23 journal: 8: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:54 ceph-23 journal: 9: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:54 ceph-23 journal: 10: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:54 ceph-23 journal: 11: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:54 ceph-23 journal: 12: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:54 ceph-23 journal: 13: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:54 ceph-23 journal: 2023-03-21T10:07:54.982+0100 7f99e63d5700 -1 
*** Caught signal (Aborted) **
Mar 21 10:07:54 ceph-23 journal: in thread 7f99e63d5700 thread_name:MR_Finisher
Mar 21 10:07:54 ceph-23 journal: 
Mar 21 10:07:54 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:54 ceph-23 journal: 1: (()+0x12ce0) [0x7f99f3605ce0]
Mar 21 10:07:54 ceph-23 journal: 2: (gsignal()+0x10f) [0x7f99f2062a9f]
Mar 21 10:07:54 ceph-23 journal: 3: (abort()+0x127) [0x7f99f2035e05]
Mar 21 10:07:54 ceph-23 journal: 4: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x1a9) [0x7f99f4a25be3]
Mar 21 10:07:54 ceph-23 journal: 5: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:54 ceph-23 journal: 6: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:54 ceph-23 journal: 7: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:54 ceph-23 journal: 8: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:54 ceph-23 journal: 9: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:54 ceph-23 journal: 10: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:54 ceph-23 journal: 11: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:54 ceph-23 journal: 12: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:54 ceph-23 journal: 13: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:54 ceph-23 journal: NOTE: a copy of the executable, or `objdump 
-rdS <executable>` is needed to interpret this.
Mar 21 10:07:54 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal:    -1> 2023-03-21T10:07:54.980+0100 
7f99e63d5700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 
7f99e63d5700 time 2023-03-21T10:07:54.967936+0100
Mar 21 10:07:55 ceph-23 journal: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:55 ceph-23 journal: 1: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x158) [0x7f99f4a25b92]
Mar 21 10:07:55 ceph-23 journal: 2: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:55 ceph-23 journal: 3: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:55 ceph-23 journal: 4: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:55 ceph-23 journal: 5: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:55 ceph-23 journal: 6: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:55 ceph-23 journal: 7: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:55 ceph-23 journal: 8: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:55 ceph-23 journal: 9: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:55 ceph-23 journal: 10: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal:     0> 2023-03-21T10:07:54.982+0100 
7f99e63d5700 -1 *** Caught signal (Aborted) **
Mar 21 10:07:55 ceph-23 journal: in thread 7f99e63d5700 thread_name:MR_Finisher
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:55 ceph-23 journal: 1: (()+0x12ce0) [0x7f99f3605ce0]
Mar 21 10:07:55 ceph-23 journal: 2: (gsignal()+0x10f) [0x7f99f2062a9f]
Mar 21 10:07:55 ceph-23 journal: 3: (abort()+0x127) [0x7f99f2035e05]
Mar 21 10:07:55 ceph-23 journal: 4: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x1a9) [0x7f99f4a25be3]
Mar 21 10:07:55 ceph-23 journal: 5: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:55 ceph-23 journal: 6: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:55 ceph-23 journal: 7: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:55 ceph-23 journal: 8: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:55 ceph-23 journal: 9: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:55 ceph-23 journal: 10: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:55 ceph-23 journal: 11: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:55 ceph-23 journal: 12: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:55 ceph-23 journal: 13: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:55 ceph-23 journal: NOTE: a copy of the executable, or `objdump 
-rdS <executable>` is needed to interpret this.
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: -9999> 2023-03-21T10:07:54.980+0100 
7f99e63d5700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 
7f99e63d5700 time 2023-03-21T10:07:54.967936+0100
Mar 21 10:07:55 ceph-23 journal: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/15.2.17/rpm/el8/BUILD/ceph-15.2.17/src/mds/ScatterLock.h:
 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:55 ceph-23 journal: 1: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x158) [0x7f99f4a25b92]
Mar 21 10:07:55 ceph-23 journal: 2: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:55 ceph-23 journal: 3: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:55 ceph-23 journal: 4: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:55 ceph-23 journal: 5: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:55 ceph-23 journal: 6: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:55 ceph-23 journal: 7: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:55 ceph-23 journal: 8: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:55 ceph-23 journal: 9: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:55 ceph-23 journal: 10: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: -9998> 2023-03-21T10:07:54.982+0100 
7f99e63d5700 -1 *** Caught signal (Aborted) **
Mar 21 10:07:55 ceph-23 journal: in thread 7f99e63d5700 thread_name:MR_Finisher
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: ceph version 15.2.17 
(8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Mar 21 10:07:55 ceph-23 journal: 1: (()+0x12ce0) [0x7f99f3605ce0]
Mar 21 10:07:55 ceph-23 journal: 2: (gsignal()+0x10f) [0x7f99f2062a9f]
Mar 21 10:07:55 ceph-23 journal: 3: (abort()+0x127) [0x7f99f2035e05]
Mar 21 10:07:55 ceph-23 journal: 4: (ceph::__ceph_assert_fail(char const*, char 
const*, int, char const*)+0x1a9) [0x7f99f4a25be3]
Mar 21 10:07:55 ceph-23 journal: 5: (()+0x27ddac) [0x7f99f4a25dac]
Mar 21 10:07:55 ceph-23 journal: 6: (MDCache::truncate_inode(CInode*, 
LogSegment*)+0x32c) [0x561bd623962c]
Mar 21 10:07:55 ceph-23 journal: 7: 
(C_MDS_inode_update_finish::finish(int)+0x133) [0x561bd6210a83]
Mar 21 10:07:55 ceph-23 journal: 8: (MDSContext::complete(int)+0x56) 
[0x561bd6422656]
Mar 21 10:07:55 ceph-23 journal: 9: (MDSIOContextBase::complete(int)+0x39c) 
[0x561bd6422b5c]
Mar 21 10:07:55 ceph-23 journal: 10: (MDSLogContextBase::complete(int)+0x44) 
[0x561bd6422cb4]
Mar 21 10:07:55 ceph-23 journal: 11: (Finisher::finisher_thread_entry()+0x1a5) 
[0x7f99f4ab6a95]
Mar 21 10:07:55 ceph-23 journal: 12: (()+0x81ca) [0x7f99f35fb1ca]
Mar 21 10:07:55 ceph-23 journal: 13: (clone()+0x43) [0x7f99f204ddd3]
Mar 21 10:07:55 ceph-23 journal: NOTE: a copy of the executable, or `objdump 
-rdS <executable>` is needed to interpret this.
Mar 21 10:07:55 ceph-23 journal: 
Mar 21 10:07:55 ceph-23 journal: reraise_fatal: default handler for signal 6 
didn't terminate the process?
Mar 21 10:07:58 ceph-23 dockerd-current: 
time="2023-03-21T10:07:58.119559277+01:00" level=warning 
msg="040c1e98a0669204e0e98bdbcdde893f8acf63444f3827358e663a13a2869478 cleanup: 
failed to unmount secrets: invalid argument"
Mar 21 10:07:58 ceph-23 kernel: overlayfs: upperdir is in-use as 
upperdir/workdir of another mount, accessing files from both mounts will result 
in undefined behavior.
Mar 21 10:07:58 ceph-23 kernel: overlayfs: workdir is in-use as 
upperdir/workdir of another mount, accessing files from both mounts will result 
in undefined behavior.
Mar 21 10:07:58 ceph-23 journal: 118 get_config 
/opt/ceph-container/bin/config.static.sh
Mar 21 10:07:58 ceph-23 journal: 5 start_mds 
/opt/ceph-container/bin/start_mds.sh
Mar 21 10:07:58 ceph-23 journal: 120 main /opt/ceph-container/bin/entrypoint.sh
Mar 21 10:07:58 ceph-23 journal: 2023-03-21 10:07:58  
/opt/ceph-container/bin/entrypoint.sh: static: does not generate config
Mar 21 10:07:58 ceph-23 journal: 58 start_mds 
/opt/ceph-container/bin/start_mds.sh
Mar 21 10:07:58 ceph-23 journal: 120 main /opt/ceph-container/bin/entrypoint.sh
Mar 21 10:07:58 ceph-23 journal: 2023-03-21 10:07:58  
/opt/ceph-container/bin/entrypoint.sh: SUCCESS
Mar 21 10:07:58 ceph-23 journal: starting mds.ceph-23 at 

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io