Hello,

Thanks for the response! So the slab automover is not the culprit.

As for the exact server error: unfortunately I don't have that for now, 
as I use libmemcached (plus pylibmc for that matter). That said, I have 
used the plain text protocol (over telnet) when doing the "further get 
requests" (as in my original mail) to verify that set requests 
succeeded (and the item sizes shown there are exactly the same as what 
I've calculated in my Python code, FWIW).

I think I can set up a nice little netcat script that imitates those 
set requests directly over the text protocol, to capture the exact 
error message. I'm not sure how the intermittent nature of the failures 
will come into play here, but I'll try my best to reproduce it.
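
Here's a rough sketch of what I mean, in Python (raw socket) rather 
than netcat; the host/port, key name, and payload size below are just 
placeholders for my actual setup:

    # Send one raw "set" over the text protocol and print the first
    # response line, so the exact error (e.g. SERVER_ERROR ...) shows up.
    import socket

    def raw_set(key, payload, host="127.0.0.1", port=11211, exptime=3600):
        s = socket.create_connection((host, port))
        try:
            header = "set %s 0 %d %d\r\n" % (key, exptime, len(payload))
            s.sendall(header.encode() + payload + b"\r\n")
            return s.recv(4096).decode("utf-8", "replace").splitlines()[0]
        finally:
            s.close()

    # ~900KiB dummy value, in the size range that intermittently fails for me
    print(raw_set("repro:large-item", b"x" * (900 * 1024)))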

As for setting -o slab_chunk_size_max=1048576: I'll try that, but I 
need to schedule a maintenance window with my peers. Let me do the 
netcat script first; I'll probably have the instance relaunched (with 
the new setting) within a couple of days, and a few days after that 
I'll ping back on whether I'm still seeing the failures.
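
(For the record, assuming I keep my current flags, the relaunch would 
then be something like "memcached -m 2900 -f 1.16 -c 10240 -k -o modern 
-o slab_chunk_size_max=1048576"; please correct me if appending a 
second -o like that isn't the right syntax.)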

I'm attaching |stats items| here. I'm also attaching |stats| and |stats 
slabs| dumps taken at the same time, for consistency.
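
FWIW, a small script along these lines grabs all three dumps 
back-to-back on a single connection, so the snapshots stay consistent 
in time (host/port assumed; output just goes to stdout):

    # Dump "stats", "stats items" and "stats slabs" in one session so the
    # three snapshots are taken as close together in time as possible.
    import socket

    def dump_stats(host="127.0.0.1", port=11211,
                   cmds=("stats", "stats items", "stats slabs")):
        s = socket.create_connection((host, port))
        try:
            for cmd in cmds:
                s.sendall((cmd + "\r\n").encode())
                buf = b""
                while not buf.endswith(b"END\r\n"):
                    buf += s.recv(4096)
                print("== %s ==" % cmd)
                print(buf.decode("utf-8", "replace").rstrip())
        finally:
            s.close()

    dump_stats()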

Will come back with more info for the fun,
- Mnjul





On Tuesday, April 25, 2017 at 4:40:52 PM UTC-4, Dormando wrote:
>
> Hey! 
>
> Unfortunately you've summoned a dinosaur, as I am old now :P 
>
> My main question; do you have the exact server error returned by 
> memcached? If it is "SERVER_ERROR object too large for cache" - that error 
> has nothing to do with memory allocation, and is just reflecting that the 
> item attempted to store is too large (over 1MB). If it fails for that 
> reason it should always fail. 
>
> First off, unfortunately your assumption that the slab page mover is 
> synchronous isn't correct. It's a fully backgrounded process that doesn't 
> ever block anything. New memory allocations don't block for anything. 
>
> Also; can you include "stats items"? It has some possibly relevant info. 
>
> Especially in your instance, which isn't using all of the memory you've 
> assigned to it (about 1/3rd?). The slab page mover is simply moving memory 
> back into a free pool when there is too much memory free in any particular 
> slab class. 
>
> ie; 
> STAT slab_global_page_pool 308 
>
> When new memory is requested and none is available readily in a slab 
> class, first a new page is pulled from the global page pool if available. 
> After that, a new page is malloced. After that, items are pulled from the 
> LRU and evicted. If nothing can be evicted for some reason you would get 
> an allocation error. 
>
> So you really shouldn't be seeing any. "stats items" would tell me the 
> nature of any allocation problems (hopefully) that you're seeing. Also 
> getting the exact error being thrown to you is very helpful. Most errors 
> in the system are unique so I can trace them back to particular code. 
>
> It is possible there is a bug or weirdness with chunked allocation, which 
> happens for items > 512k and has gone through a couple revisions. You can 
> test this theory by adding: "-o slab_chunk_size_max=1048576" (the same as 
> item size max). Would be great to know if this makes the problem go away, 
> since it means I have some more stuff to tune there. 
>
> have fun, 
> -Dormando 
>
> On Mon, 24 Apr 2017, Min-Zhong "John" Lu wrote: 
>
> > Hi there, 
> > 
> > I've recently been investigating an intermittent & transient 
> > failure-to-set issue, in a long-running memcached instance. And I 
> > believe I could use some insight from you all. 
> > 
> > Let me list my configurations first. I have |stats| and |stats slabs| 
> > dumps available as Google Groups attachment. If they fail to go 
> > through just lemme re-post them on some pastebin service. 
> > 
> > Configuration: 
> > Command line arg: -m 2900 -f 1.16 -c 10240 -k -o modern 
> > 
> > Using 1.4.36 (compiled by myself) on Ubuntu 14.04.4 x64. 
> > 
> > The -k flag has been verified to be effective (I've got limits 
> > configured correctly). 
> > 
> > Growth factor of 1.16 is just an empirical value for my item sizes. 
> > 
> > 
> > Symptom of the issue: 
> > After running the memcached for around 10 days, there have been 
> > occasions where a set request of a large item (sized around 760KiBs 
> > to 930KiBs) would fail, where memcached returns 37 (item too big). 
> > However, when this happens, if I wait for around one minute, and 
> > then send the same set request again (with exactly the same 
> > key/item/expiration to store), memcached would gladly store it. 
> > Further get requests verify that the item is correctly stored. 
> > 
> > According to my logs, this happens intermittently, and I haven't 
> > been able to correlate those transient failures with my slab stats. 
> > 
> > 
> > Observation & Question 1: 
> > Q1: Does my issue arise because when the initial set request arrives 
> > at memcached, memcached has to run the slab automover to produce a 
> > slab (maybe two slabs, since the item is larger than 512KiB) to 
> > accommodate the set request? 
> > 
> > This is my hunch --- I am yet to do a quick |stats| dump at the 
> > exact moment of the set failure to confirm this. But I have seen 
> > [slab_reassign_busy_items = 10K] and [slabs_moved = 16.9K] in my 
> > |stats| dumps, which means the slab automover must have been 
> > triggered during memcached's entire lifetime. This leads to my next 
> > question: 
> > 
> > 
> > Observation & Question 2 & 3: 
> > Q2: When the slab automover is running, would it possibly block the 
> > large-item set request, as in my case above? 
> > 
> > Q3: Why would memcached favor triggering slab automover over 
> > allocating new memory, when there is still host memory available? 
> > 
> > According to the stats dumps, my memcached instance has 
> > [total_malloced = 793MiB], and a footprint of [bytes = 392.33MiB] 
> > --- both fall far short of [limit_maxbytes = 2900MiB]. Furthermore, 
> > nothing has been evicted, as I have got [evictions = 0]. 
> > 
> > (And the host system has extremely enough free physical memory, per 
> > |free -m|) 
> > 
> > I would expect that allocating memory would be faster (and *way* 
> > faster actually) than triggering slab automover to reassign slabs 
> > to accommodate the incoming set request, and that allocating memory 
> > would allow the initial set request to be served immediately. 
> > 
> > In addition, if the slab automover just happens to be running when 
> > the large-item set request arrives, and the answer to Q2 is 
> > "yes"... can we make it not block if there's still host memory 
> > available? 
> > 
> > 
> > I'm kinda out of clues here... and I might actually be on a wrong 
> > route in my investigation. 
> > 
> > Any insight is appreciated, and it'd be great if I can get rid of 
> > those set failures without having to summon a dinosaur. 
> > 
> > For example, would disabling slab automover be an acceptable 
> > band-aid fix? (and that I launch the manual mover (mc_slab_mover) 
> > when I know I have relatively lighter traffic) 
> > 
> > Thanks a lot. 
> > 
> > p.s. While 'retry this set request at a later time' will work 
> > (anecdotally), I don't want to implement a retry mechanism at 
> > client side, since 1) the 'later time' is probably 
> > non-deterministic, and 2) I don't have a readily available 
> > construct to decouple such retry from the rest of my task, and 
> > thus having to retry would unnecessarily block client side. 
