[ceph-users] Re: MDS cache tuning

2021-06-02 Thread Andres Rojas Guerrero
Hi, after one week with only one MDS all the errors have vanished and the cluster is running smoothly! Thank you very much for the help!! On 27/5/21 at 9:50, Andres Rojas Guerrero wrote: Thank you very much, very good explanation!! On 27/5/21 at 9:42, Dan van der Ster wrote:

[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Andres Rojas Guerrero
Thank you very much, very good explanation!! On 27/5/21 at 9:42, Dan van der Ster wrote: between 100-200 -- *** Andrés Rojas Guerrero Unidad Sistemas Linux Area Arquitectura Tecnológica Secretaría General Adjunta de Informática Consejo

[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Dan van der Ster
I don't think # clients alone is a good measure by which to decide to deploy multiple MDSs -- idle clients create very little load, but just a few badly behaving clients can use all the MDS performance. (If you must hear a number, I can share that we have single MDSs with 2-3000 clients
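
As a rough sketch of how one might count client sessions per MDS (not from the thread; mds.<name> is a placeholder, the admin socket must be reachable on the MDS host, and jq is assumed to be installed):

    # count the client sessions held by this MDS daemon
    ceph daemon mds.<name> session ls | jq length

    # per-session detail, e.g. which clients hold the most caps
    ceph daemon mds.<name> session ls | jq '.[] | {id, num_caps}'

Field names in the session dump can differ slightly between releases.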

[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Andres Rojas Guerrero
Oh, very interesting!! I have reduced the number of MDSs to one. Just one more question, out of curiosity: above what number of clients should we consider them "many"? On 27/5/21 at 9:24, Dan van der Ster wrote: On Thu, May 27, 2021 at 9:21 AM Andres Rojas Guerrero wrote: On
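
For reference, a minimal sketch of reducing to a single active MDS by lowering max_mds on the filesystem (<fs_name> is a placeholder; on Nautilus the extra rank should stop automatically once the value is lowered):

    # keep only rank 0 active; the former rank 1 becomes a standby
    ceph fs set <fs_name> max_mds 1

    # confirm a single active rank remains
    ceph fs status <fs_name>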

[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Dan van der Ster
On Thu, May 27, 2021 at 9:21 AM Andres Rojas Guerrero wrote: > > > > On 26/5/21 at 16:51, Dan van der Ster wrote: > > I see you have two active MDSs. Is your cluster more stable if you use > > only a single active MDS? > > Good question!! I read in the Ceph docs: > > "You should configure

[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Andres Rojas Guerrero
On 26/5/21 at 16:51, Dan van der Ster wrote: I see you have two active MDSs. Is your cluster more stable if you use only a single active MDS? Good question!! I read in the Ceph docs: "You should configure multiple active MDS daemons when your metadata performance is bottlenecked on
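
A hedged sketch of how one might check whether the single active MDS is actually the metadata bottleneck before adding ranks (mds.<name> is a placeholder; exact counter names vary by release):

    # per-rank request rate is shown in the ACTIVITY column (Reqs/s)
    ceph fs status

    # raw MDS counters, e.g. requests and replies, for trending over time
    ceph daemon mds.<name> perf dump mds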

[ceph-users] Re: MDS cache tuning

2021-05-26 Thread Dan van der Ster
FS_DEGRADED indicates that your MDS restarted or stopped responding to health beacons. Are your MDSs going OOM? I see you have two active MDSs. Is your cluster more stable if you use only a single active MDS? -- Dan On Wed, May 26, 2021 at 2:44 PM Andres Rojas Guerrero wrote: > > Ok
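
A quick sketch of how to check for OOM kills on the MDS hosts (assumes systemd-managed daemons with the usual ceph-mds@<name> unit name):

    # on the MDS host: look for kernel OOM-killer activity
    dmesg -T | grep -i 'out of memory'
    journalctl -k | grep -i oom

    # see whether the MDS service was restarted recently
    systemctl status ceph-mds@<name>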

[ceph-users] Re: MDS cache tuning

2021-05-26 Thread Andres Rojas Guerrero
Ok thanks, I will try to update Nautilus. But I really don't understand the problem; warnings appear seemingly at random: [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST) cluster [INF] Health check cleared: FS_DEGRADED (was: 1 filesystem is degraded) : cluster
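
To see which requests are behind an MDS_SLOW_REQUEST warning, something like the following can help (a sketch; mds.<name> is a placeholder for the active MDS):

    # health detail names the MDS reporting slow requests
    ceph health detail

    # dump the operations currently in flight on that MDS
    ceph daemon mds.<name> ops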

[ceph-users] Re: MDS cache tuning

2021-05-26 Thread Dan van der Ster
I've seen your other thread. Using 78GB of RAM when the memory limit is set to 64GB is not highly unusual, and doesn't necessarily indicate any problem. It *would* be a problem if the MDS memory grows uncontrollably, however. Otherwise, check those new defaults for caps recall -- they were
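
A sketch of how to compare actual cache usage against the configured limit, and to read a couple of the Nautilus-era caps-recall options Dan refers to (the two option names below are examples, not a complete list; mds.<name> is a placeholder):

    # cache memory in use vs. mds_cache_memory_limit
    ceph daemon mds.<name> cache status

    # current recall-related settings on the running daemon
    ceph daemon mds.<name> config get mds_recall_max_caps
    ceph daemon mds.<name> config get mds_recall_max_decay_threshold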

[ceph-users] Re: MDS cache tuning

2021-05-26 Thread Andres Rojas Guerrero
Thanks for the answer. Yes, over the last few weeks I have had memory consumption problems on the MDS nodes that led, at least it seemed to me, to performance problems in CephFS. I have been varying, for example: mds_cache_memory_limit, mds_min_caps_per_client, mds_health_cache_threshold
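
For completeness, a sketch of how to read back what a running MDS currently uses for those options (mds.<name> is a placeholder; run against the admin socket on the MDS host):

    ceph daemon mds.<name> config get mds_cache_memory_limit
    ceph daemon mds.<name> config get mds_min_caps_per_client
    ceph daemon mds.<name> config get mds_health_cache_threshold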

[ceph-users] Re: MDS cache tuning

2021-05-26 Thread Dan van der Ster
Hi, The mds_cache_memory_limit should be set to something relative to the RAM size of the MDS -- maybe 50% is a good rule of thumb, because there are a few cases where the RSS can exceed this limit. Your experience will help guide what size you need (metadata pool IO activity will be really high
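
As a worked example of the 50% rule of thumb, assuming a hypothetical MDS host with 128 GB of RAM (the option takes a value in bytes; 64 GiB = 68719476736):

    # cluster-wide default for all MDS daemons
    ceph config set mds mds_cache_memory_limit 68719476736

    # or scoped to one daemon, if only that host is sized this way
    ceph config set mds.<name> mds_cache_memory_limit 68719476736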