If you look at the new branch there's a commit explaining the new stats. You can watch slab_reassign_evictions vs slab_reassign_saves. You can also test automove=1 vs automove=2 (please also turn on lru_maintainer and lru_crawler).

The initial branch you were running didn't add any new stats. It just restored an old feature.
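
For eyeballing the two counters side by side, a rough poller like the sketch below can help. It assumes a local server on 127.0.0.1:11211 and the counter names described in the branch commit; it's a quick test aid, not part of the branch.

    /* stats_watch.c: poll slab_reassign_evictions vs slab_reassign_saves
     * over the text protocol. Build: cc -o stats_watch stats_watch.c */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in sa = { .sin_family = AF_INET, .sin_port = htons(11211) };
        inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0) {
            perror("connect");
            return 1;
        }
        char buf[65536];
        for (;;) {
            send(fd, "stats\r\n", 7, 0);
            size_t used = 0;
            /* read until the terminating END line arrives */
            while (used < sizeof(buf) - 1) {
                ssize_t n = recv(fd, buf + used, sizeof(buf) - 1 - used, 0);
                if (n <= 0) return 1;
                used += n;
                buf[used] = '\0';
                if (strstr(buf, "END\r\n") != NULL)
                    break;
            }
            /* print only the two counters being compared */
            for (char *line = strtok(buf, "\r\n"); line != NULL;
                 line = strtok(NULL, "\r\n")) {
                if (strstr(line, "slab_reassign_evictions") ||
                    strstr(line, "slab_reassign_saves"))
                    puts(line);
            }
            sleep(5);
        }
    }
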
On Tue, 29 Sep 2015, Scott Mansfield wrote:
> An unrelated prod problem meant I had to stop after about an hour. I'm turning it on again tomorrow morning.
>
> Are there any new metrics I should be looking at? Anything new in the stats output? I'm about to take a look at the diffs as well.

On Tuesday, September 29, 2015 at 12:37:45 PM UTC-7, Dormando wrote:
> Excellent. If automove=2 is too aggressive you'll see it show up as a hit ratio reduction.
>
> The new branch works with automove=2 as well, but it will attempt to rescue valid items in the old slab if possible. I'll still be working on it for another few hours today though. I'll mail again when I'm done.

On Tue, 29 Sep 2015, Scott Mansfield wrote:
> I have the first commit (slab_automove=2) running in prod right now. Later today will be a full-load production test of the latest code. I'll just let it run for a few days unless I spot any problems. We have good metrics for latency et al. from the client side, though network normally dwarfs memcached time.

On Tuesday, September 29, 2015 at 3:10:03 AM UTC-7, Dormando wrote:
> That's unfortunate.
>
> I've done some more work on the branch: https://github.com/memcached/memcached/pull/112
>
> It's not completely likely you would see enough of an improvement from the new default mode. However, if your item sizes change gradually, items are reclaimed during expiration, or items get overwritten (and thus freed in the old class), it should work just fine. I have another patch coming which should help though.
>
> Open to feedback from any interested party.

On Fri, 25 Sep 2015, Scott Mansfield wrote:
> I have it running internally, and it runs fine under normal load. It's difficult to put it into the line of fire for a production workload because of social reasons... as well, it's a degenerate case that we normally don't run into (and actively try to avoid). I'm going to run some heavier load tests on it today.

On Wednesday, September 9, 2015 at 10:23:32 AM UTC-7, Scott Mansfield wrote:
> I'm working on getting a test going internally. I'll let you know how it goes.
>
> Scott Mansfield

On Mon, Sep 7, 2015 at 2:33 PM, dormando wrote:
> Yo,
>
> https://github.com/dormando/memcached/commits/slab_rebal_next - would you mind playing around with the branch here? You can see the start options in the test.
>
> This is a dead simple modification (a restoration of a feature that was already there...). The test very aggressively writes and is able to shunt memory around appropriately.
>
> The work I'm exploring right now will allow saving items from the class being rebalanced away from, and increasing the aggression of page moving without being so brain-damaged about it.
>
> But while I'm poking around with that, I'd be interested in knowing if this simple branch is an improvement, and if so how much.
>
> I'll push more code to the branch, but the changes should be gated behind a feature flag.

On Tue, 18 Aug 2015, 'Scott Mansfield' via memcached wrote:
> No worries man, you're doing us a favor.
> Let me know if there's anything you need from us, and I promise I'll be quicker this time :)

On Aug 18, 2015 12:01 AM, "dormando" <dorm...@rydia.net> wrote:
> Hey,
>
> I'm still really interested in working on this. I'll be taking a careful look soon, I hope.

On Mon, 3 Aug 2015, Scott Mansfield wrote:
> I've tweaked the program slightly, so I'm adding a new version. It prints more stats as it goes and runs a bit faster.

On Monday, August 3, 2015 at 1:20:37 AM UTC-7, Scott Mansfield wrote:
> Total brain fart on my part. Apparently I had memcached 1.4.13 on my path (who knows how...). Using the actual one that I've built works. Sorry for the confusion... can't believe I didn't realize that before. I'm testing against the compiled one now to see how it behaves.

On Monday, August 3, 2015 at 1:15:06 AM UTC-7, Dormando wrote:
> You sure that's 1.4.24? None of those fail for me :(

On Mon, 3 Aug 2015, Scott Mansfield wrote:
> The command line I've used that will start is:
>
>     memcached -m 64 -o slab_reassign,slab_automove
>
> The ones that fail are:
>
>     memcached -m 64 -o slab_reassign,slab_automove,lru_crawler,lru_maintainer
>     memcached -o lru_crawler
>
> I'm sure I've missed something during compile, though I just used ./configure and make.

On Monday, August 3, 2015 at 12:22:33 AM UTC-7, Scott Mansfield wrote:
> I've attached a pretty simple program to connect, fill a slab with data, and then fill another slab slowly with data of a different size. I've been trying to get memcached to run with the lru_crawler and lru_maintainer flags, but I get 'Illegal suboption "(null)"' every time I try to start with either in any configuration.
>
> I haven't seen it start to move slabs automatically with a freshly installed 1.4.24.

On Tuesday, July 21, 2015 at 4:55:17 PM UTC-7, Scott Mansfield wrote:
> I realize I've not given you the tests to reproduce the behavior. I should be able to soon. Sorry about the delay here.
>
> In the meantime, I wanted to bring up a possible secondary use of the same logic that moves items on slab rebalancing. I think the system might benefit from using the same logic to crawl the pages in a slab and compact the data in the background. In the case where we have memory that is assigned to the slab but not being used (because of replaced or TTL'd-out data), returning the memory to a pool of free memory would allow a slab to grow with that memory first instead of waiting for an event where memory is needed at that instant.
>
> It's a change in approach, from reactive to proactive. What do you think?

On Monday, July 13, 2015 at 5:54:11 PM UTC-7, Dormando wrote:
> > First, more detail for you:
> >
> > We are running 1.4.24 in production and haven't noticed any bugs as of yet. The new LRUs seem to be working well, though we nearly always run memcached scaled to hold all data without evictions. Those with evictions are behaving well. Those without evictions haven't seen crashing or any other noticeable bad behavior.
>
> Neat.
>
> > OK, I think I see an area where I was speculating on functionality. If you have a key in slab 21 and then the same key is written again at a larger size in slab 23, I assumed that the space in 21 was not freed on the second write. With that assumption, the LRU crawler would not free up that space. Also, just by observation in the macro, the space is not freed fast enough to be effective, in our use case, at accepting the writes that are happening. Think in the hundreds of millions of "overwrites" in a 6-10 hour period across a cluster.
>
> Internally, "items" (a key/value pair) are generally immutable. The only time they're not is for INCR/DECR, and an item still becomes immutable if two INCR/DECRs collide.
>
> What this means is that the new item is staged in a piece of free memory while the "upload" stage of the SET happens. When memcached has all of the data in memory to replace the item, it does an internal swap under a lock. The old item is removed from the hash table and LRU, and the new item gets put in its place (at the head of the LRU).
>
> Since items are refcounted, this means that if other users are downloading an item which just got replaced, their memory doesn't get corrupted by the item changing out from underneath them. They can continue to read the old item until they're done. When the refcount reaches zero, the old memory is reclaimed.
>
> Most of the time the item replacement happens and then the old memory is immediately reclaimed.
>
> However, this does mean that you need *one* piece of free memory to replace the old one. Then the old memory gets freed after that set.
>
> So if you take a memcached instance with 0 free chunks and do a rolling replacement of all items (within the same slab class as before), the first one would cause an eviction from the tail of the LRU to get a free chunk. Every SET after that would use the chunk freed by the replacement of the previous item.
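
The swap described above, reduced to a sketch. Every type and helper here (item, hash_remove, lru_unlink, and so on) is a hypothetical stand-in; the real logic lives in memcached's items.c and assoc.c:

    #include <stdint.h>

    typedef struct _item {
        struct _item *h_next;    /* hash chain */
        unsigned short refcount; /* readers still holding this copy */
        /* ... key, flags, LRU links, data ... */
    } item;

    /* Hypothetical stand-ins for the hash/LRU/slab internals. */
    extern void hash_remove(item *it, uint32_t hv);
    extern void hash_insert(item *it, uint32_t hv);
    extern void lru_unlink(item *it);
    extern void lru_link_head(item *it);
    extern void slabs_free(item *it);

    /* Called with the lock for this key's hash bucket held. new_it was
     * already staged in a free chunk and fully uploaded, so the swap
     * itself is just pointer surgery. */
    static void replace_item(item *old_it, item *new_it, uint32_t hv) {
        hash_remove(old_it, hv);  /* old item leaves the hash table */
        lru_unlink(old_it);       /* and its LRU */
        hash_insert(new_it, hv);  /* new item takes its place */
        lru_link_head(new_it);    /* at the head of the LRU */

        /* Readers mid-download may still hold a reference to old_it; its
         * chunk is only reclaimed when the refcount drops to zero. */
        if (--old_it->refcount == 0)
            slabs_free(old_it);
    }
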
> > After that last sentence I realized I also may not have explained the access pattern well enough. The keys are all overwritten every day, but it takes some time to write them all (obviously). We see a huge increase in the bytes metric as if the new data for the old keys was being written for the first time. Since the "old" slab for the same key doesn't proactively release memory, it starts to fill up the cache and then starts evicting data in the new slab. Once that happens, we see evictions in the old slab because of the algorithm you mentioned (random picking / freeing of memory). Typically we don't see any use for "upgrading" an item, as the new data would be entirely new and should wholesale replace the old data for that key. More specifically, the operation is always a set, with different data each day.
>
> Right. Most of your problems will come from two areas. One is that if you write data aggressively into the new slab class (unless you set the rebalancer to always-replace mode), the mover will make memory available more slowly than you can insert, so you'll cause extra evictions in the new slab class.
>
> The secondary problem is the random evictions in the previous slab class as stuff is chucked on the floor to make memory moveable.
>
> > As for testing, we'll be able to put it under real production workload. I don't know what kind of data you mean you need for testing. The data stored in the caches is highly confidential. I can give you all kinds of metrics, since we collect most of the ones that are in the stats output and some from the stats slabs output. If you have some specific ones that need collecting, I'll double check and make sure we can get those. Alternatively, it might be most beneficial to see the metrics in person :)
>
> I just need stats snapshots here and there, and someone actually putting the thing under load. When I did the LRU work I had to beg for several months before anyone tested it with a production load. This slows things down and demotivates me from working on the project.
>
> Unfortunately my dayjob keeps me pretty busy, so ~internet~ would probably be best.
>
> > I can create a driver program to reproduce the behavior on a smaller scale. It would write e.g. 10k keys of 10k size, then rewrite the same keys with different-size data. I'll work on that and post it to this thread when I can reproduce the behavior locally.
>
> Ok. There're slab rebalance unit tests in the t/ directory which do things like this, and I've used mc-crusher to slam the rebalancer. It's pretty easy to run one config to load up 10k objects, then flip to the other using the same key namespace.
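
A driver along those lines can be quite small. The sketch below speaks the raw text protocol against a local server: one pass of 10k keys at ~10KB values (one slab class), then a rewrite of the same keys at ~15KB (a larger class). Key names, sizes, and the 7-day TTL are placeholders matching the workload described; error handling is minimal.

    /* driver.c: fill one slab class, then rewrite the same keys at a
     * larger size. Build: cc -o driver driver.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    static int connect_local(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in sa = { .sin_family = AF_INET, .sin_port = htons(11211) };
        inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0) {
            perror("connect");
            exit(1);
        }
        return fd;
    }

    static void write_pass(int fd, size_t vlen) {
        char *val = malloc(vlen);
        char head[64], resp[64];
        memset(val, 'x', vlen);
        for (int i = 0; i < 10000; i++) {
            /* set <key> <flags> <exptime> <bytes> -- 7-day TTL */
            int hlen = snprintf(head, sizeof(head),
                                "set key%d 0 604800 %zu\r\n", i, vlen);
            send(fd, head, hlen, 0);
            send(fd, val, vlen, 0);
            send(fd, "\r\n", 2, 0);
            recv(fd, resp, sizeof(resp), 0); /* expect STORED\r\n */
        }
        free(val);
    }

    int main(void) {
        int fd = connect_local();
        write_pass(fd, 10 * 1024); /* first pass: ~10KB values */
        write_pass(fd, 15 * 1024); /* rewrite: ~15KB, a larger class */
        close(fd);
        return 0;
    }
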
> > Thanks,
> > Scott

On Saturday, July 11, 2015 at 12:05:54 PM UTC-7, Dormando wrote:
> Hey,
>
> On Fri, 10 Jul 2015, Scott Mansfield wrote:
>
> > We've seen issues recently where we run a cluster that typically has the majority of items overwritten in the same slab every day, and a sudden change in data size evicts a ton of data, affecting downstream systems. To be clear, that is our problem, but I think there's a tweak in memcached that might be useful, and another possible feature that would be even better.
> >
> > The data that is written to this cache is overwritten every day, though the TTL is 7 days. One slab takes up the majority of the space in the cache. The application wrote e.g. 10KB (slab 21) every day for each key consistently. One day, a change occurred where it started writing 15KB (slab 23), causing a migration of data from one slab to another. We had -o slab_reassign,slab_automove=1 set on the server, causing large numbers of evictions on the initial slab. Let's say the cache could hold the data at 15KB per key, but the old data was not technically TTL'd out in its old slab. This means that memory was not being freed by the lru crawler thread (I think) because its expiry had not come around.
> >
> > Lines 1199 and 1200 in items.c:
> >
> >     if ((search->exptime != 0 && search->exptime < current_time) || is_flushed(search)) {
> >
> > If there was a check to see if this data was "orphaned," i.e. that the key, if accessed, would map to a different slab than the current one, then these orphans could be reclaimed as free memory. I am working on a patch to do this, though I have reservations about performing a hash on the key on the lru crawler thread (if the hash is not already available).
> >
> > I have very little experience in the memcached codebase, so I don't know the most efficient way to do this. Any help would be appreciated.
>
> There seems to be a misconception about how the slab classes work. A key, if already existing in a slab, will always map to the slab class it currently fits into. The slab classes always exist, but the amount of memory reserved for each of them will shift with slab_reassign. I.e.: 10 pages in slab class 21, then memory pressure on 23 causes one to move over.
>
> So if you examine a key that still exists in slab class 21, it has no reason to move up or down the slab classes.
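
To make that correction concrete: the class is chosen purely from the item's total size at store time (key + header + value), never from the key itself. A toy version of the lookup, with a made-up size table (the real table is built in slabs.c from the chunk growth factor):

    #include <stdio.h>

    /* Illustrative chunk sizes only; not memcached's actual table. */
    static const unsigned int class_size[] = {
        96, 120, 152, 192, 240, 304, 384, 480, 600, 752,
        944, 1184, 1480, 1856, 2320, 2904, 3632, 4544,
        5680, 7104, 8880, 11104, 13880, 17352,
    };
    #define NCLASSES (sizeof(class_size) / sizeof(class_size[0]))

    /* Return the first class whose chunks fit the item. */
    static unsigned int toy_slabs_clsid(size_t ntotal) {
        for (unsigned int id = 0; id < NCLASSES; id++)
            if (ntotal <= class_size[id])
                return id;
        return 0; /* too large for any class */
    }

    int main(void) {
        /* The same key at 10KB and then 15KB lands in different classes;
         * the old copy stays in its class until it is freed. */
        printf("10KB item -> class %u\n", toy_slabs_clsid(10 * 1024));
        printf("15KB item -> class %u\n", toy_slabs_clsid(15 * 1024));
        return 0;
    }
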
> > Alternatively, and possibly more beneficial, is compaction of data in a slab using the same set of criteria as lru crawling. Understandably, compaction is a very difficult problem to solve, since moving the data would be a pain in the ass. I saw a couple of discussions about this on the mailing list, though I didn't see any firm thoughts about it. I think it can probably be done in O(1) like the lru crawler by limiting the number of items it touches each time. Writing and reading are doable in O(1), so moving should be as well. Has anyone given more thought to compaction?
>
> I'd be interested in hacking this up for you folks if you can provide me testing and some data to work with. With all of the LRU work I did in 1.4.24, the next thing I wanted to do is a big improvement to the slab reassignment code.
>
> Currently it picks an essentially random slab page, empties it, and moves the slab page into the class under pressure.
>
> One thing we can do is first look for free memory in the existing slab, i.e.:
>
> - Take a page from slab 21
> - Scan the page for valid items which need to be moved
> - Pull free memory from slab 21, migrate the item (moderately complicated)
> - When the page is empty, move it (or give up if you run out of free chunks)
>
> The next step is to pull from the LRU on slab 21:
>
> - Take a page from slab 21
> - Scan the page for valid items
> - Pull free memory from slab 21, migrate the item
> - If no memory is free, evict the tail of slab 21 and use that chunk
> - When the page is empty, move it
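
The LRU-pulling variant, in rough C-shaped pseudocode. Every helper here is a hypothetical stand-in for memcached internals; the point is the order of operations, not the real API:

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct _item item;

    extern item *page_next_item(void *page, size_t *off); /* walk chunks in page */
    extern bool  item_is_valid(item *it);        /* linked, unexpired, refcount ok */
    extern item *free_chunk_in_class(int clsid); /* free chunk, NULL if none */
    extern void  lru_evict_tail(int clsid);      /* chuck the class's oldest item */
    extern void  item_migrate(item *src, item *dst); /* relink hash/LRU to dst */
    extern void  page_reassign(void *page, int dst_clsid);

    static bool move_page(void *page, int src_clsid, int dst_clsid) {
        size_t off = 0;
        item *it;
        while ((it = page_next_item(page, &off)) != NULL) {
            if (!item_is_valid(it))
                continue; /* dead chunk: nothing to rescue */
            item *dst = free_chunk_in_class(src_clsid);
            if (dst == NULL) {
                /* No free chunk left: evict the class's LRU tail to make one. */
                lru_evict_tail(src_clsid);
                dst = free_chunk_in_class(src_clsid);
                if (dst == NULL)
                    return false; /* still nothing; give up on this page */
            }
            item_migrate(it, dst); /* a "save" rather than an eviction */
        }
        page_reassign(page, dst_clsid); /* page is empty; hand it over */
        return true;
    }
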
> Then, when you hit this condition, your least-recently-used data gets culled as new data migrates pages away from your class. This should match the natural behavior you'd see if you were already evicting valid (but old) items to make room for new items.
>
> A bonus to using the free memory trick is that I can use the amount of free space in a slab class as a heuristic to more quickly move slab pages around.
>
> If it's still necessary from there, we can explore "upgrading" items to a new slab class, but that is much, much more complicated since the item has to shift LRUs. Do you put it at the head, the tail, the middle, etc.? It might be impossible to make a good generic decision there.
>
> What version are you currently on? If 1.4.24, have you seen any instability? I'm currently torn between fighting a few bugs and starting on improving the slab rebalancer.
>
> -Dormando
--

---
You received this message because you are subscribed to the Google Groups "memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.