We have a 19.2.3 cluster managed by cephadm with five management nodes and 21
OSD nodes. We have roughly 300 Linux clients across several subnets, all
using cephfs kernel mounts and running a wide range of applications.
Each management node has 760GB of memory. We’re currently using a single
active MDS daemon, but have experimented with multiple MDS daemons and
directory pinning.
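In case the context helps, the multi-MDS experiments were along these lines.
The fs name and directory paths below are placeholders for illustration, not
our actual layout:

  ceph fs set cephfs max_mds 2
  # pin one subtree to each rank (paths are placeholders)
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projects
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/scratch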
Each OSD for the cephfs data pool uses hdd (SATA) drives, with the DB and WAL
on nvme devices internal to each storage node. The cephfs metadata pool is on
nvme drives internal to the management nodes.
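For anyone replicating a similar split, the device-class based placement of
the metadata pool can be expressed with something like the following (the
rule and pool names here are placeholders, not necessarily ours):

  ceph osd crush rule create-replicated meta-nvme default host nvme
  ceph osd pool set cephfs.cephfs.meta crush_rule meta-nvme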
After a lot of testing in an attempt to quiet down persistent
MDS_CLIENT_LATE_RELEASE "clients failing to respond to capability release" and
MDS_CLIENT_RECALL "clients failing to respond to cache pressure" warnings,
we’ve ended up with the following settings. This seems to be working well. We
still get periodic MDS_CLIENT_RECALL warnings, but not nearly as many as we
were seeing, and they clear relatively quickly.
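When a warning does appear, we identify the clients involved with something
like this (the mds target syntax assumes a recent release and a filesystem
named "cephfs"):

  ceph health detail
  # per-session cap counts on the active rank
  ceph tell mds.cephfs:0 session ls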
I’d greatly appreciate any suggestions for further improvements, or any
concerns anyone has with these.
Many thanks,
Devin
———
cephfs session_timeout 120
mds mds_cache_memory_limit 549755813888
mds mds_cache_mid 0.700000
mds mds_cache_reservation 0.100000
mds mds_cache_trim_decay_rate 0.900000
mds mds_cache_trim_threshold 524288
mds mds_cap_revoke_eviction_timeout 30.000000
mds mds_health_cache_threshold 2.000000
mds mds_log_max_segments 256
mds mds_max_caps_per_client 5000000
mds mds_recall_global_max_decay_threshold 1048576
mds mds_recall_max_caps 5000
mds mds_recall_max_decay_rate 1.500000
mds mds_recall_max_decay_threshold 524288
mds mds_recall_warning_decay_rate 120.000000
mds mds_recall_warning_threshold 262144
———
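For completeness, these were applied with "ceph config set" (except
session_timeout, which is set per-filesystem), along these lines, where
"cephfs" stands in for our actual fs name:

  ceph fs set cephfs session_timeout 120
  ceph config set mds mds_cache_memory_limit 549755813888
  ceph config set mds mds_recall_max_caps 5000
  # ...and so on for the rest; "ceph config dump | grep mds_" to verify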