Hi,

we were rsync-streaming from 4 CephFS clients to a Ceph cluster with a cache tier on top of an erasure-coded pool.
This had been running for some time without real problems.

Today we added 2 more streams, and very soon we saw some strange behaviour:
- we are getting blocked requests on our cache pool OSDs
- our cache pool is often near or at its max ratio
- our data streams have very bursty IO (streaming a few hundred MB for a minute, then nothing)

Our OSDs are not overloaded (neither the EC OSDs nor the cache OSDs, checked with iostat), yet it seems the cache pool cannot evict objects in time and blocks until it has caught up, over and over again. If I raise the target_max_bytes limit, the streams resume until the pool fills up again.

The cache parameters we have set are these:
ceph osd pool set cache hit_set_type bloom
ceph osd pool set cache hit_set_count 1
ceph osd pool set cache hit_set_period 3600
ceph osd pool set cache target_max_bytes $((14*75*1024*1024*1024))
ceph osd pool set cache cache_target_dirty_ratio 0.4
ceph osd pool set cache cache_target_full_ratio 0.8
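For reference, here is what I understand those numbers to imply, assuming the dirty/full ratios are applied as fractions of target_max_bytes (my own arithmetic based on my reading of the docs, not output from the cluster):

```python
# Sketch of the thresholds implied by the settings above.
# Assumption: cache_target_dirty_ratio and cache_target_full_ratio
# are fractions of target_max_bytes at which the tiering agent
# starts flushing and evicting, respectively.

GiB = 1024 ** 3

target_max_bytes = 14 * 75 * GiB             # 1050 GiB cache capacity
dirty_threshold  = 0.4 * target_max_bytes    # agent starts flushing dirty objects
full_threshold   = 0.8 * target_max_bytes    # agent starts evicting clean objects

print(target_max_bytes)       # 1127428915200 bytes (= 1050 GiB)
print(int(dirty_threshold))   # 450971566080  (= 420 GiB)
print(int(full_threshold))    # 901943132160  (= 840 GiB)
```

So by my reading, flushing should kick in around 420 GiB of dirty data and eviction around 840 GiB, well before the pool is actually full.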


What could the issue be here? I tried to find some information about the 'cache agent', but could only find some old references.

Thank you!

Kenneth
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com