David,

2017-06-02 21:41 GMT+08:00 David Turner <drakonst...@gmail.com>:

> I'm thinking you have erasure coding in cephfs and only use cache tiering
> because you have to, correct? What is your use case for repeated file
> accesses? How much data is written into cephfs at a time?
>
these days, up to ten million tiny files are written into cephfs every day;
not directly, but via samba. Some processes run on windows, so there are
linux systems running a samba service and the windows processes write
through samba. After the data is written, it is rarely accessed; in the end
it is uploaded to a cloud storage outside.

> For me, my files are infrequently accessed after they are written or read
> from the EC back-end pool.  I set my cache pool to never leave any data in
> it older than an hour. I have a buddy with a similar setup with the
> difference that files added to cephfs will be heavily accessed and modified
> for the first day, but then only infrequently accessed. He has his settings to keep
> all of his accessed files in cache for 24 hours before they are all cleaned
> up.
>
> We do that by having a target_max_ratio of 0.0 and min_evict and min_flush
> ages set appropriately. The cluster is never choosing what to flush or evict
> based on maintaining the full ratio. As soon as the minimum ages are met, the
> object is added to the queue to be processed.
>
I can't find target_max_ratio; do you mean cache_target_full_ratio?

I thought cache_target_full_ratio only concerned the clean objects (evict);
am I misunderstanding something?
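
For reference, these are the knobs I can find; is something like the below
what you meant? (data_cache is our cache pool here, and the values are only
placeholders, not a recommendation):
---
# flush/evict only objects older than these ages (in seconds)
ceph osd pool set data_cache cache_min_flush_age 3600
ceph osd pool set data_cache cache_min_evict_age 3600
# as far as I understand, the agent evaluates the ratios against
# target_max_bytes / target_max_objects
ceph osd pool set data_cache target_max_bytes 1099511627776
ceph osd pool set data_cache cache_target_dirty_ratio 0.4
ceph osd pool set data_cache cache_target_full_ratio 0.8
---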

> This fixed our cluster speeds during times when the cache pool was cleaning
> up. The problem, we hypothesized, was being caused by the process of choosing
> what to clean vs what to keep.
>
> On Fri, Jun 2, 2017, 4:54 AM jiajia zhong <zhong2p...@gmail.com> wrote:
>
>> thank you for your guide :), It's making sense.
>>
>> 2017-06-02 16:17 GMT+08:00 Christian Balzer <ch...@gol.com>:
>>
>>>
>>> Hello,
>>>
>>> On Fri, 2 Jun 2017 14:30:56 +0800 jiajia zhong wrote:
>>>
>>> > christian, thanks for your reply.
>>> >
>>> > 2017-06-02 11:39 GMT+08:00 Christian Balzer <ch...@gol.com>:
>>> >
>>> > > On Fri, 2 Jun 2017 10:30:46 +0800 jiajia zhong wrote:
>>> > >
>>> > > > hi guys:
>>> > > >
>>> > > > Our ceph cluster is working with a cache tier.
>>> > > If so, then I suppose you read all the discussions here as well and
>>> > > not only the somewhat lacking documentation?
>>> > >
>>> > > > I am running "rados -p data_cache cache-try-flush-evict-all" to
>>> > > > evict all the objects.
>>> > > Why?
>>> > > And why all of it?
>>> >
>>> >
>>> > we found that when the flush/evict threshold was triggered, the
>>> > performance would make us a bit upset :), so I wish to flush/evict the
>>> > tier in spare time, e.g. in the middle of the night. In this scenario the
>>> > tier would not have to spend any effort on flush/evict while there are
>>> > heavy read/write operations on the cephfs which we are using.
>>> >
>>> >
>>> As I said, eviction (which is basically zeroing the data in cache) has
>>> very little impact.
>>> Flushing, i.e. moving data from the cache tier to the main pool, is
>>> another story.
>>>
>>> But what you're doing here is completely invalidating your cache (the
>>> eviction part), so the performance will be very bad after this as well.
>>>
>>> If you have low utilization periods, consider a cron job that lowers the
>>> dirty ratio (causing only flushes to happen) and then after a while (a few
>>> minutes should do; experiment) restores the old setting.
>>>
>>> For example:
>>> ---
>>> # Preemptive Flushing before midnight
>>> 45 23 * * * root ceph osd pool set cache cache_target_dirty_ratio 0.52
>>>
>>> # And back to normal levels
>>> 55 23 * * * root ceph osd pool set cache cache_target_dirty_ratio 0.60
>>> ---
>>>
>>> This will of course only help if the amount of data promoted into your
>>> cache per day is small enough to fit into the flushed space.
>>>
>>> Otherwise your cluster has no other choice but to start flushing when
>>> things get full.
>>>
>>> Christian
>>> > >
>>> > > > But it is a bit slow
>>> > > >
>>> > > Define slow, but it has to do a LOT of work and housekeeping to do this,
>>> > > so unless your cluster is very fast (probably not, or you wouldn't
>>> > > want/need a cache tier) and idle, that's the way it is.
>>> > >
>>> > > > 1. Is there any way to speed up the evicting?
>>> > > >
>>> > > Not really, see above.
>>> > >
>>> > > > 2. Is evicting triggered by itself good enough for the cluster?
>>> > > >
>>> > > See above, WHY are you manually flushing/evicting?
>>> > >
>>> > explained above.
>>> >
>>> >
>>> > > Are you aware that flushing is the part that's very I/O intensive, while
>>> > > evicting is a very low cost/impact operation?
>>> > >
>>> > not very sure, but that's what my instinct believed.
>>> >
>>> >
>>> > > In normal production, the various parameters that control this will do
>>> > > fine, if properly configured of course.
>>> > >
>>> > > > 3. Does the flushing and evicting slow down the whole cluster?
>>> > > >
>>> > > Of course, as any good sysadmin with the correct tools (atop, iostat,
>>> > > etc, graphing Ceph performance values with Grafana/Graphite) will be able
>>> > > to see instantly.
>>> >
>>> > actually, we are using graphite, but I could not see that instantly,
>>> > lol :(. I could only tell that the threshold had been triggered by
>>> > calculating it after the fact.
>>> >
>>> > btw, we have cephfs storing a huge number of small files (64T in total,
>>> > about 100K per file),
>>> >
>>> >
>>> > >
>>> > >
>>> > > Christian
>>> > > --
>>> > > Christian Balzer        Network/Systems Engineer
>>> > > ch...@gol.com           Rakuten Communications
>>> > >
>>>
>>>
>>> --
>>> Christian Balzer        Network/Systems Engineer
>>> ch...@gol.com           Rakuten Communications
>>>
>>
>
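
As a back-of-the-envelope check of the nightly pre-flush idea quoted above
(if I understand correctly that the ratios are evaluated against
target_max_bytes), something like:
---
# purely illustrative numbers, not from our cluster
# target_max_bytes              = 10 TB
# pre-flushed before midnight   = (0.60 - 0.52) * 10 TB ~= 800 GB
---
so the nightly window would only help us if one day's worth of promotions
fits into that headroom.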
