[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-26 Thread Janek Bevendorff
I have had defer_client_eviction_on_laggy_osds set to false for a while and I haven't had any further warnings so far (obviously), but also all the other problems with laggy clients bringing our MDS to a crawl over time seem to have gone. So at least on our cluster, the new configurable seems

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-21 Thread Janek Bevendorff
Hi, I took a snapshot of MDS.0's logs. We have five active MDS in total, each one reporting laggy OSDs/clients, but I cannot find anything related to that in the log snippet. Anyhow, I uploaded the log for your reference with ceph-post-file ID 79b5138b-61d7-4ba7-b0a9-c6f02f47b881. This is

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Dhairya Parmar
Hi Janek, The PR Venky mentioned makes use of the OSD's laggy parameters (laggy_interval and laggy_probability) to determine whether any OSD is laggy. These laggy parameters can reset to 0 if the interval between the last modification to the OSDMap and the timestamp at which the OSD was marked down exceeds

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Venky Shankar
Hey Janek, I took a closer look at the various places where the MDS would consider a client laggy. It seems that a wide variety of reasons are taken into consideration, and not all of them are necessarily a reason to defer client eviction, so the warning is a bit misleading. I'll post a PR for this. In

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Venky Shankar
Hi Janek, On Tue, Sep 19, 2023 at 4:44 PM Janek Bevendorff < janek.bevendo...@uni-weimar.de> wrote: > Hi Venky, > > As I said: There are no laggy OSDs. The maximum ping I have for any OSD in > ceph osd perf is around 60ms (just a handful, probably aging disks). The > vast majority of OSDs have

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-19 Thread Janek Bevendorff
Hi Venky, As I said: There are no laggy OSDs. The maximum ping I have for any OSD in ceph osd perf is around 60ms (just a handful, probably aging disks). The vast majority of OSDs have ping times of less than 1ms. Same for the host machines, yet I'm still seeing this message. It seems that

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-19 Thread Venky Shankar
Hi Janek, On Mon, Sep 18, 2023 at 9:52 PM Janek Bevendorff < janek.bevendo...@uni-weimar.de> wrote: > Thanks! However, I still don't really understand why I am seeing this. > This is due to a change that was merged recently in pacific https://github.com/ceph/ceph/pull/52270 The MDS

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-18 Thread Janek Bevendorff
Thanks! However, I still don't really understand why I am seeing this. The first time I had this, one of the clients was a remote user dialling in via VPN, which could indeed be laggy. But I am also seeing it from neighbouring hosts that are on the same physical network with reliable ping

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-18 Thread Laura Flores
Hi Janek, There was some documentation added about it here: https://docs.ceph.com/en/pacific/cephfs/health-messages/ There is a description of what it means, and it's tied to an MDS configurable. On Mon, Sep 18, 2023 at 10:51 AM Janek Bevendorff < janek.bevendo...@uni-weimar.de> wrote: > Hey