Regarding "Probably there is no IO": when I check "ceph -s", it reports activity. Not much, but some:
    io:
      client:   316 KiB/s rd, 10 KiB/s wr, 1 op/s rd, 1 op/s wr

And a lack of activity wouldn't explain why cephfs-top complains about not finding the cluster.

Re: using the options available:

    # cephfs-top --cluster ceph
    cluster ceph does not exist
    # cephfs-top --id 767edb64-4552-43a2-a128-34d693343904
    cluster ceph does not exist
    # cephfs-top --conffile /etc/ceph/ceph.conf
    cluster ceph does not exist

It really doesn't want to connect to the cluster! ;-)

In the mgr logfile, I see an issue that might be related (I had never checked the mgr log, as everything seemed to be working fine). It shows up every second or so:

    1 mgr finish mon failed to return metadata for mds.cph-a-002: (2) No such file or directory

This actually seems to be the same issue this user experienced:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-April/026126.html

The active mds gives the same error if I do a "ceph mds metadata <mds>":

    # ceph mds metadata cph-a-002
    {}
    Error ENOENT:

The secondary works:

    # ceph mds metadata cph-h-001
    {
        "addr": "[v2:10.0.1.101:6800/1267106302,v1:10.0.1.101:6801/1267106302]",
        "arch": "x86_64",
        "ceph_release": "pacific",
        "ceph_version": "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)",
        "ceph_version_short": "16.2.5",
    ....etc

If I reboot the mgr, I see some dubious messages:

    0 ms_deliver_dispatch: unhandled message 0x55b3d8c449a0 mon_map magic: 0 v1 from mon.1 v2:10.0.1.102:3300/0
    ...
    -1 client.0 error registering admin socket command: (17) File exists

Then it continues with the error message above every second or so.

I then rebooted the primary mon, which shows some interesting messages:

    3 rocksdb: [le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) bits/key. Dramatic filter space and/or accuracy improvement is available with format_version>=5.
    ...
    0 mon.cph-h-001@-1(???).osd e35181 crush map has features 3314933069573799936, adjusting msgr requires
    ...
    1 mon.cph-h-001@0(electing) e26 collect_metadata : no unique device id for : fallback method has no model nor serial'

In the active mds logs, not much is happening.

Hope this gives a better overview of what's (not) happening.
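In case it helps anyone repeat the metadata check on more than two daemons, here is a rough Python sketch of how the "ceph mds metadata <mds>" test above could be scripted. The function names are hypothetical, the only ceph command used is the one shown above, and it assumes the JSON is printed on stdout and that an empty object ("{}") means the mon has no metadata for that daemon.

```python
import json
import subprocess


def metadata_is_empty(raw: str) -> bool:
    """Return True if the output of `ceph mds metadata <name>` holds no metadata.

    Assumption: the command prints one JSON object; for the failing daemon
    this is "{}", possibly followed by an error line such as "Error ENOENT:".
    """
    start = raw.find("{")
    if start == -1:
        # No JSON object at all: treat as missing metadata.
        return True
    try:
        # raw_decode parses the first JSON value and ignores trailing text.
        obj, _ = json.JSONDecoder().raw_decode(raw, start)
    except json.JSONDecodeError:
        return True
    return not obj


def find_mds_without_metadata(names):
    """Run `ceph mds metadata <name>` for each name and collect the empty ones."""
    missing = []
    for name in names:
        result = subprocess.run(
            ["ceph", "mds", "metadata", name],
            capture_output=True,
            text=True,
        )
        if metadata_is_empty(result.stdout):
            missing.append(name)
    return missing
```

Calling find_mds_without_metadata(["cph-a-002", "cph-h-001"]) on a node with a working admin keyring should then list the daemons whose metadata the mon cannot return. This is untested against a live cluster.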