> Hey Dormando, thanks again for the comments... appreciate the help.
>
> Maybe I wasn't clear enough. I need only 1 minute of persistence, and I
> can lose data sometimes; I just can't keep losing data every minute due
> to constant evictions caused by the LRU. I actually wrote exactly that in
> my previous post. We're losing about 1 minute of non-meaningful data
> every week because of the restart we do when memory starts to fill up
> (even with our patch that reclaims via the linked list, we limit
> reclaiming to keep speed up)... so the memory fills up after a week, not
> 30 minutes...
Can you explain what you're seeing in more detail? Your data only needs to
persist for 1 minute, but it's being evicted before the minute is up? You
made it sound like you had some data which never expired; is that true? If
your instance is 16GB and takes a week to fill up, but data that only
needs to persist for a minute isn't surviving that long, something else is
very broken. Or am I still misunderstanding you?

> Now I'm building a better solution, to limit locking as the linked list
> gets bigger.
>
> I explained the worst implications of unwanted evictions (or of losing
> all data in the cache) for my use case:
> 1. losing ~1 minute of non-significant data that's about to be stored in
> SQL
> 2. a "flat" distribution of load to workers (response times not taken
> into account, because the stats reset)
> 3. falling back to an alternative targeting algorithm (with global, not
> local, statistics)
>
> I never, ever said I'm going to write data that has to be persistent
> permanently. It's actually the same idea as a delayed write: if power
> fails you lose 5s of data, but you can do 100x more writes. So the data
> needs to be persistent in memory, and between writes the data **can't be
> lost**. However, you can lose it occasionally; that's a tradeoff some
> people can make and some can't. Obviously I can't keep losing this data
> every minute, because if I lose too much of it, it becomes meaningful.
>
> Maybe I wasn't clear on that point. I can lose all the data even 20
> times a day. Sensitive data is stored using bulk updates or
> transactions, bypassing that "delayed write" layer. "0 evictions" is the
> kind of "persistence" I'm going for: items stay alive for some very
> short period of time (1-5 minutes) without being killed. It's just a
> different use case.
> Running in production for 2 years, based on 1.4.13, tested for
> correctness, and monitored so that we have enough memory and 0 evictions
> (just reclaims).
>
> When I came here with the same idea ~2 years ago you just said it's very
> stupid; now you've even made me look like a moron :) I can understand
> why you don't want features that aren't perfectly ~O(1), but please
> don't get so personal about different ideas and use cases just because
> they won't work for you.
>
> On Thursday, April 10, 2014 at 8:53:12 PM UTC+2, Dormando wrote:
> You really really really really really *must* not put data in memcached
> which you can't lose.
>
> Seriously, really don't do it. If you need persistence, try using a
> redis instance for the persistent stuff, and use memcached for your
> cache stuff. I don't see why you feel like you need to write your own
> thing; there are a lot of persistent key/value stores
> (kyotocabinet/etc?). They have a much lower request ceiling and don't
> handle the LRU/cache pattern as well, but that's why you can use both.
>
> Again, please please don't do it. You are damaging your company. You are
> a *danger* to your company.
>
> On Thu, 10 Apr 2014, Slawomir Pryczek wrote:
>
> > Hi Dormando, thanks for the suggestions, a background thread would be
> > nice... The idea is that with 2-3GB I get plenty of evictions of items
> > that need to be fetched later. And with 16GB I still get evictions; I
> > could probably throw even more memory than 16G at it and it would only
> > result in more expired items sitting in the middle of slabs,
> > forever... Now I'm going for persistence. It probably sounds crazy,
> > but we have some data that we can't lose:
> > 1. Statistics: we aggregate writes to the DB using memcached (+ a list
> > implementation). If these items get evicted we're losing rows in the
> > DB. Losing data occasionally isn't a big problem; e.g. we restart
> > memcached once a week, so we lose 1 minute of data every week. But if
> > we have evictions we're losing data constantly (which we can't have).
> > 2. We drive a load balancer using data in memcached for statistics;
> > again, it's not nice to lose that data often, because workers can get
> > an incorrect amount of traffic.
> > 3. We're doing some ad-serving optimizations, e.g. computing
> > per-domain ad priority. For one domain it takes about 10 seconds to
> > analyze all the data and create the list of ads, so it can't be done
> > online... We put the result of this in memcached; if we lose too much
> > of it, the system starts serving suboptimal ads (because it has to
> > fall back to more general data, or to a much simpler algorithm that
> > can run instantly).
> >
> > It would probably be best to rewrite all this in C or golang and use
> > memcached just for caching, but that would take too much time, which
> > we don't currently have...
> >
> > I have seen the twitter and nk implementations that seem to do what I
> > need, but they look old (based on old code), so I'd rather modify the
> > code of a recent "official" memcached than be stuck with old code or
> > abandonware. There are actually many threads about the limitations of
> > the current eviction algorithm, and an option to enable a background
> > thread that reclaims based on statistics of the most-filled slabs
> > (with a parameter to choose a light or aggressive approach) would be
> > nice...
> >
> > As for the code... is that the slab_rebalance_move function in
> > slabs.c? It seems a little difficult to grasp without some docs on how
> > things work... could you please write a very short description of how
> > this "angry birds" mode works?
>
> Look at doc/protocol.txt for explanations of the slab move options. The
> names are greppable back to the source.
>
> > I have a quick question about these flags... ITEM_LINKED marks an item
> > that's placed on the linked list, but what do the other flags mean,
> > and why are the last 2 of them temporary?
> >
> > #define ITEM_LINKED 1
> > #define ITEM_CAS 2
> >
> > /* temp */
> > #define ITEM_SLABBED 4
> > #define ITEM_FETCHED 8
> >
> > This from slab_rebalance_move seems interesting:
> > refcount = refcount_incr(&it->refcount);
> > ...
> > if (refcount == 1) { /* item is unlinked, unused */
> > ...
> > } else if (refcount == 2) { /* item is linked but not busy */
> >
> > Are there any docs about refcounts, locks, and item states? Basically,
> > why is an item with refcount 2 not busy? Do you increase the refcount
> > by 1 on select, then again when reading data? Can the refcount ever be
> > higher than 2 (3 in the case above), meaning 2 threads are accessing
> > the same item?
>
> The comment on the same line explains exactly what it means.
>
> Unfortunately it's a bit of a crapshoot. I think I wrote a threads
> explanation somewhere (some release notes, or in a file in the tree, I
> can't quite remember offhand). Since scaling the thread code it got a
> lot more complicated. You have to be extremely careful under what
> circumstances you access items (you must hold an item lock + the
> refcount must be 2 if you want to unlink it).
>
> You'll just have to study it a bit, sorry. Grep around to see where the
> flags are used.
>
> > Thanks.
> >
> > On Thursday, April 10, 2014 at 6:05:30 AM UTC+2, Dormando wrote:
> > > Hi Guys,
> > > I'm running a specific case where I don't want (actually can't have)
> > > evicted items (ideally evictions = 0)... I have created a simple
> > > algorithm that locks the cache, goes through the linked list and
> > > evicts items... it causes some problems, like 10-20ms cache locks in
> > > some cases.
> > >
> > > Now I'm thinking about going through each slab's memory (slabs keep
> > > a list of allocated memory regions)... looking for items, and if an
> > > expired item is found, evicting it... this way I can process e.g.
> > > 10k items or 1MB of memory at a time, pick slabs with high
> > > utilization, and run this "additional" eviction only on them... so
> > > it will prevent allocating memory just because unneeded data with a
> > > short TTL is occupying the HEAD of the list.
> > >
> > > With this linked list eviction I'm able to run on 2-3GB of memory...
> > > Without it, 16GB of memory is exhausted in 1-2h and then memcached
> > > starts to kill "good" items (leaving expired ones wasting memory)...
> > >
> > > Any comments?
> > > Thanks.
> >
> > You're going a bit against the base algorithm. If stuff is falling out
> > of 16GB of memory without ever being utilized again, why is that
> > critical? It sounds like you're optimizing the numbers instead of
> > actually tuning anything useful.
> >
> > That said, you can probably just extend the slab rebalance code.
> > There's a hook in there (which I called "angry birds mode") that
> > drives a slab rebalance when it would otherwise run an eviction. That
> > code already safely walks the slab page for unlocked memory and frees
> > it; you could edit it slightly to check for expiration and then
> > freelist it into the slab class instead.
> >
> > Since it's already a background thread, you could further modify it to
> > just wake up and walk pages for stuff to evict.
> >
> > --
> >
> > ---
> > You received this message because you are subscribed to the Google
> > Groups "memcached" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to memcached+...@googlegroups.com.
> > For more options, visit https://groups.google.com/d/optout.