On Tue, Jan 22, 2019 at 10:49 AM Albert Yue <transuranium....@gmail.com> wrote:
>
> Hi Yan Zheng,
>
> In your opinion, can we resolve this issue by moving the MDS to a machine
> with 512GB or 1TB of memory?
>

The problem is on the client side, especially clients with large memory.
I don't think enlarging the MDS cache size is a good idea. You can
periodically check each kernel client's /sys/kernel/debug/ceph/xxx/caps
and run 'echo 2 > /proc/sys/vm/drop_caches' on any client that holds
too many caps (for example, more than 10k).
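
A minimal sketch of that check, runnable from cron on each client. It
assumes the debugfs caps file begins with a "total <count>" line (the
exact layout varies between kernel versions) and uses an arbitrary 10k
threshold:

#!/bin/sh
# Sketch only: drop this client's dentry/inode caches when its CephFS
# cap count grows too large. Assumes each mount's debugfs caps file
# has a "total <N>" line; adjust the parsing if your kernel differs.
THRESHOLD=10000

for f in /sys/kernel/debug/ceph/*/caps; do
    [ -r "$f" ] || continue
    total=$(awk '$1 == "total" { print $2; exit }' "$f")
    if [ -n "$total" ] && [ "$total" -gt "$THRESHOLD" ]; then
        # "echo 2" frees reclaimable slab objects (dentries and
        # inodes), which lets the client release unused caps back
        # to the MDS.
        echo 2 > /proc/sys/vm/drop_caches
    fi
done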

> On Mon, Jan 21, 2019 at 10:49 PM Yan, Zheng <uker...@gmail.com> wrote:
>>
>> On Mon, Jan 21, 2019 at 11:16 AM Albert Yue <transuranium....@gmail.com> 
>> wrote:
>> >
>> > Dear Ceph Users,
>> >
>> > We have set up a CephFS cluster with 6 OSD machines, each with 16 8TB
>> > hard disks. The Ceph version is luminous 12.2.5. We created one data pool
>> > on these hard disks and a separate metadata pool on 3 SSDs. We created
>> > an MDS with a 65GB cache size.
>> >
>> > But our users keep complaining that CephFS is too slow. What we
>> > observed is that CephFS is fast right after we switch to a new MDS
>> > instance, but once the cache fills up (which happens very quickly), clients
>> > become very slow when performing basic filesystem operations such as `ls`.
>> >
>>
>> It seems that clients hold lots of unused inodes in their icache, which
>> prevents the MDS from trimming the corresponding objects from its own
>> cache. Mimic has the command "ceph daemon mds.x cache drop" to ask a
>> client to drop its cache. I'm also working on a patch that makes the
>> kernel client release unused inodes.
>>
>> For luminous, there is not much we can do, except periodically running
>> "echo 2 > /proc/sys/vm/drop_caches" on each client.
>>
>>
>> > What we know is that our users are putting lots of small files into
>> > CephFS; there are now around 560 million files. We didn't see high CPU
>> > wait on the MDS instance, and the metadata pool uses only around 200MB of space.
>> >
>> > My question is, what is the relationship between the metadata pool and
>> > the MDS? Is this performance issue caused by the hardware behind the
>> > metadata pool? The metadata pool uses only around 200MB of space, yet we see
>> > 3k IOPS on each of the three SSDs, so why can't the MDS cache all 200MB in memory?
>> >
>> > Thanks very much!
>> >
>> >
>> > Best Regards,
>> >
>> > Albert
>> >
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com