[ceph-users] Re: 15.2.8 mgr keep crashing every few days

2021-03-01 Thread Welby McRoberts
The release notes do have it; however, it's under different PR & issue numbers, as it's backported into octopus: mgr/ActivePyModules.cc: always release GIL before attempting to acquire a lock (pr#38801, Cory Snyder) [https://github.com/ceph/ceph/pull/38801,

[ceph-users] mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-10 Thread Welby McRoberts
Hi Folks, We've noticed that in a cluster of 21 nodes (5 mgrs & 504 OSDs, with 24 per node) the mgrs are, after a non-specific period of time, dropping out of the cluster. The logs only show the following: debug 2020-12-10T02:02:50.409+ 7f1005840700 0 log_channel(cluster) log [DBG] :

[ceph-users] Re: ceph pgs inconsistent, always the same checksum

2020-09-14 Thread Welby McRoberts
Hi Igor, We'll take a look at disabling swap on the nodes and see if that improves the situation. Having checked across all OSDs, we're not seeing bluestore_reads_with_retries as anything other than a zero value. We see the error anywhere from 3 to 10 times a week, but it's