Hello, Mr. Yan

On Thu, Dec 14, 2017 at 11:36 PM, Yan, Zheng <uker...@gmail.com> wrote:
> The client holds so many capabilities because the kernel keeps lots of
> inodes in its cache. The kernel does not trim inodes by itself if it has
> no memory pressure. It seems you have set the mds_cache_size config to a
> large value.

Yes, I have set mds_cache_size = 3000000.

I usually set this value according to the number of ceph.dir.rentries in
cephfs. Isn't that a good approach?

I have 2 directories in the cephfs root; the sum of their ceph.dir.rentries
is 4670933, for which I would set mds_cache_size to 5M (if I had enough RAM
for that in the MDS server).

# getfattr -d -m ceph.dir.* index
# file: index
ceph.dir.entries="776"
ceph.dir.files="0"
ceph.dir.rbytes="52742318965"
ceph.dir.rctime="1513334528.09909569540"
ceph.dir.rentries="709233"
ceph.dir.rfiles="459512"
ceph.dir.rsubdirs="249721"
ceph.dir.subdirs="776"

# getfattr -d -m ceph.dir.* mail
# file: mail
ceph.dir.entries="786"
ceph.dir.files="1"
ceph.dir.rbytes="15000378101390"
ceph.dir.rctime="1513334524.0993982498"
ceph.dir.rentries="3961700"
ceph.dir.rfiles="3531068"
ceph.dir.rsubdirs="430632"
ceph.dir.subdirs="785"

> mds cache size isn't large enough, so mds does not ask
> the client to trim its inode cache either.

I think you mean that the mds cache IS large enough, right? So it doesn't
bother the clients.

> This can affect performance. We should make mds recognize idle clients and
> ask idle clients to trim their caps more aggressively.

One recurrent problem I have, which I guess is caused by a network issue
(ceph cluster in a vrack), is that my MDS servers start switching which one
is active. This happens after a lease_timeout occurs in the mon; then I get
"dne in the mds map" from the active MDS and it suicides. Even though I use
standby-replay, the standby takes from 15 minutes up to 2 hours to take over
as active. I can see that it loads all inodes (by issuing "perf dump mds" on
the mds daemon).

So, the question is: if the number of caps were as low as it is supposed to
be (around 300k) instead of 5M, would the MDS become active faster in such
a failure?

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ
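
P.S. For anyone following along, a rough sketch of the commands behind the
numbers above. The mount point /mnt/cephfs and the mds daemon name "a" are
placeholders, not my actual names; double-check option spellings against
your ceph version:

# per-directory recursive entry count (summed across top-level dirs to size mds_cache_size)
getfattr --only-values -n ceph.dir.rentries /mnt/cephfs/index
getfattr --only-values -n ceph.dir.rentries /mnt/cephfs/mail

# capabilities currently held by each client session on the active MDS
ceph daemon mds.a session ls | grep num_caps

# MDS-side inode/caps counters (same source as the "perf dump mds" mentioned above)
ceph daemon mds.a perf dump mds

# raising the cache limit at runtime, if needed (persist it in ceph.conf under [mds] too)
ceph tell mds.* injectargs '--mds-cache-size 5000000'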