On 24/02/2021 12:40, Simon Oosthoek wrote:
> Hi
> 
> we've been running our Ceph cluster (Nautilus) for nearly 2 years now,
> and recently, due to a temporary situation, the cluster is at 80% full.
> 
> We are only using CephFS on the cluster.
> 
> I realize that normally we should be adding OSD nodes, but this is a
> temporary situation, and I expect the cluster to drop below 60% full quite soon.
> 
> Anyway, we are noticing some really problematic slowdowns. There are
> some things that could be related but we are unsure...
> 
> - Our 2 MDS nodes (1 active, 1 standby) are configured with 128 GB RAM,
> but are not using more than 2 GB; this looks either very inefficient or
> wrong ;-)

After looking at our monitoring history, it seems the MDS cache is
actually being used more fully, but most of our servers get a weekly
reboot by default, which obviously clears the MDS cache. I wonder
whether that's a smart idea for an MDS node...? ;-)
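
For what it's worth, a quick way to see the live numbers on the MDS host
itself (just a sketch; <name> is a placeholder for the actual MDS daemon
name):

  ceph daemon mds.<name> cache status                        # current cache usage of the running MDS
  ceph daemon mds.<name> config get mds_cache_memory_limit   # limit the running daemon is using

Comparing those just before and after the weekly reboot should show how
much warm cache gets thrown away each time.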

> 
> "ceph config dump |grep mds":
>   mds            basic    mds_cache_memory_limit              107374182400
>   mds            advanced mds_max_scrub_ops_in_progress       10
> 
> Perhaps we require more or different settings to properly use the MDS
> memory?
> 
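A small aside: 107374182400 bytes is 100 GiB, so the limit itself looks
reasonable for a 128 GB machine. If it ever needs adjusting, a sketch of
the generic config-database route (nothing specific to our setup):

  ceph config set mds mds_cache_memory_limit 107374182400   # value in bytes

The MDS also only grows its cache as clients touch metadata, so a freshly
restarted MDS sitting at ~2 GB does not by itself mean the limit is being
ignored.
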
> - On all our OSD nodes, the memory line is red in "atop", even though no
> swap is in use. The memory on the OSD nodes seems to be taking quite a
> beating; is this normal, or can we tweak settings to make it less stressed?
> 
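On the OSD side, BlueStore OSDs intentionally cache up to
osd_memory_target bytes each (4 GiB by default), so a "red" memory line in
atop is not unusual as long as nothing swaps. A sketch for checking where
the memory goes (osd.0 is just an example id; run it on the host that owns
that OSD):

  ceph daemon osd.0 config get osd_memory_target   # per-OSD memory target in bytes
  ceph daemon osd.0 dump_mempools                  # breakdown of the OSD's memory pools
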
> This is the first time we are having performance issues like this, I
> think. I'd like to learn some commands to help me analyse this...
> 
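A few read-only commands that usually give a first picture (none of these
change any state):

  ceph -s              # overall health, slow ops, recovery activity
  ceph health detail   # expands warnings, e.g. near-full OSDs
  ceph osd df tree     # per-OSD utilisation; at 80% full some OSDs may be close to the near-full ratio
  ceph osd perf        # per-OSD commit/apply latency
  ceph fs status       # MDS state, request rate, cached dentries/inodes
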
> I hope this will ring a bell with someone...
> 
> Cheers
> 
> /Simon
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
