Hello, Mr. Yan

On Thu, Dec 14, 2017 at 11:36 PM, Yan, Zheng <uker...@gmail.com> wrote:
> The client holds so many capabilities because the kernel keeps lots of
> inodes in its cache. The kernel does not trim inodes by itself if it has
> no memory pressure. It seems you have set the mds_cache_size config to a
> large value.

Yes, I have set mds_cache_size = 3000000.

I usually set this value according to the number of ceph.dir.rentries in
cephfs. Isn't that a good approach?

I have 2 directories in the cephfs root; the sum of their ceph.dir.rentries
is 4670933, for which I would set mds_cache_size to 5M (if I had enough RAM
for that in the MDS server).

# getfattr -d -m ceph.dir.* index
# file: index
ceph.dir.entries="776"
ceph.dir.files="0"
ceph.dir.rbytes="52742318965"
ceph.dir.rctime="1513334528.09909569540"
ceph.dir.rentries="709233"
ceph.dir.rfiles="459512"
ceph.dir.rsubdirs="249721"
ceph.dir.subdirs="776"

# getfattr -d -m ceph.dir.* mail
# file: mail
ceph.dir.entries="786"
ceph.dir.files="1"
ceph.dir.rbytes="15000378101390"
ceph.dir.rctime="1513334524.0993982498"
ceph.dir.rentries="3961700"
ceph.dir.rfiles="3531068"
ceph.dir.rsubdirs="430632"
ceph.dir.subdirs="785"

> mds cache size isn't large enough, so mds does not ask
> the client to trim its inode cache either.

I think you mean that the mds cache IS large enough, right? So it doesn't
bother the clients.

> This can affect performance. We should make mds recognize idle clients and
> ask idle clients to trim their caps more aggressively.

One recurrent problem I have, which I guess is caused by a network issue
(ceph cluster in a vrack), is that my MDS servers start switching which one
is active. This happens after a lease_timeout occurs in the mon; then I get
"dne in the mds map" from the active MDS and it suicides. Even though I use
standby-replay, the standby takes from 15 minutes up to 2 hours to take over
as active. I can see that it loads all inodes (by issuing "perf dump mds" on
the mds daemon).

So, the question is: if the number of caps were as low as it is supposed to
be (around 300k) instead of 5M, would the MDS become active faster in such
a failure?

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ
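
P.S. For anyone following along, a rough sketch of the commands behind the
numbers above. The mount point /mnt/cephfs and the mds daemon name "a" are
placeholders, not my actual names; double-check option spellings against
your ceph version:

# per-directory recursive entry count (summed across top-level dirs to size mds_cache_size)
getfattr --only-values -n ceph.dir.rentries /mnt/cephfs/index
getfattr --only-values -n ceph.dir.rentries /mnt/cephfs/mail

# capabilities currently held by each client session on the active MDS
ceph daemon mds.a session ls | grep num_caps

# MDS-side inode/caps counters (same source as the "perf dump mds" mentioned above)
ceph daemon mds.a perf dump mds

# raising the cache limit at runtime, if needed (persist it in ceph.conf under [mds] too)
ceph tell mds.* injectargs '--mds-cache-size 5000000'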