Re: MEMCACHED_SERVER_MEMORY_ALLOCATION_FAILURE (SERVER_ERROR out of memory storing object) error
> Dormando,
> Yes, I have to admit that we cache too aggressively (I just do not want to use a different, less polite word :)).
>
> I am going to run two test experiments: enable compression and auto reallocation. Before doing this:
> 1) Why is auto reallocation not enabled by default? What issues or disadvantages should I expect?

Because it pulls memory from other places and evicts those items regardless of whether they were still valid or expired. There's no way for it to reassign only the slab pages that hold "just expired memory". Some people would prefer to just let evictions fall from the tail (least used) rather than do this, so we didn't change the defaults after introducing the feature.

> 2) Why does memcached not have compression on the server side if the CPU is idle? Is it because of the ideology to keep it simple and fast? (just asking)

I said already: in the typical use case there are many more clients than servers, and a very high rate of usage. If you flipped where the compression happens, the server would run out of CPU very quickly and latency would be much higher. We could support it in the server, but it'd be a very low priority feature.

> On Tuesday, May 6, 2014 6:40:07 PM UTC-7, Dormando wrote:
> [snip]
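Since compression is expected to happen in the client, here is a minimal sketch of what that can look like. It is not taken from any particular client library: the flag value `FLAG_COMPRESSED` and the threshold `MIN_COMPRESS_LEN` are assumptions chosen for illustration, and real clients define their own.

```python
import zlib
from typing import Tuple

# Hypothetical flag bit marking a value as compressed. Real client libraries
# pick their own flag values, so treat this one as an assumption.
FLAG_COMPRESSED = 0x1
# Assumed threshold: very small values are rarely worth compressing.
MIN_COMPRESS_LEN = 1024


def encode_value(raw: bytes) -> Tuple[bytes, int]:
    """Return (payload, flags) to send with a set command."""
    if len(raw) >= MIN_COMPRESS_LEN:
        packed = zlib.compress(raw)
        if len(packed) < len(raw):  # keep it only if it actually saves space
            return packed, FLAG_COMPRESSED
    return raw, 0


def decode_value(payload: bytes, flags: int) -> bytes:
    """Undo encode_value using the flags returned by a get."""
    if flags & FLAG_COMPRESSED:
        return zlib.decompress(payload)
    return payload
```

The CPU cost is paid by whichever client performs the set or get, so it is spread across the many clients rather than concentrated on the cache server.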
Re: MEMCACHED_SERVER_MEMORY_ALLOCATION_FAILURE (SERVER_ERROR out of memory storing object) error
> Hi Dormando,
> Full Slabs and Items stats are below. The problem is that other slabs are full too, so rebalancing is not trivial. I will try to create a wrapper that will do some analysis and do slab rebalancing based on stats (the idea is to try to shrink slabs with low eviction, but I need to think more).
> But I see there is "Slabs Automove" in protocol.txt. Do you recommend it?

If it fits your needs. Otherwise, write an external daemon that controls the automover based on your own needs.

> > You either need to add more memory to the total system or rebalance them.
> We run many, many memcached servers with 30GB+ of memory on each box, and the problem occurs on some boxes periodically. So I am thinking about how to convert the manual restart into an automatic action.

I'm not sure why restarting will fix it, if above you say "rebalancing is not trivial". If restarting would fix it, rebalancing would also fix it.

From the stats below, you do have a fair amount of memory spread out among the higher order slab classes. Compression, or otherwise re-evaluating how you store those values, may make a big difference.

There's also a huge amount of stuff being evicted without ever being fetched again. Are you caching too aggressively, or is memory just way too small and they never get a chance to be fetched after being set?

I'm just eyeballing it, but evicted_time seems pretty short (a matter of hours). That's the last access time of the last object to be evicted... and it's like that across most of your slab classes.

So, shuffle and compress and whatnot, but I think you're out of ram dude.
> server
> stats
> STAT pid 15480
> STAT uptime 2476264
> STAT time 1399422427
> STAT version 1.4.15
> STAT libevent 1.4.13-stable
> STAT pointer_size 64
> STAT rusage_user 639012.117392
> STAT rusage_system 2076810.323840
> STAT curr_connections 5237
> STAT total_connections 122995977
> STAT connection_structures 23402
> STAT reserved_fds 40
> STAT cmd_get 91928675147
> STAT cmd_set 4358475896
> STAT cmd_flush 1
> STAT cmd_touch 0
> STAT get_hits 85005900667
> STAT get_misses 6922774480
> STAT delete_misses 4238049567
> STAT delete_hits 885535057
> STAT incr_misses 0
> STAT incr_hits 0
> STAT decr_misses 0
> STAT decr_hits 0
> STAT cas_misses 1074
> STAT cas_hits 4784930
> STAT cas_badval 14966
> STAT touch_hits 0
> STAT touch_misses 0
> STAT auth_cmds 0
> STAT auth_errors 0
> STAT bytes_read 32317259718167
> STAT bytes_written 221039272582722
> STAT limit_maxbytes 25769803776
> STAT accepting_conns 1
> STAT listen_disabled_num 0
> STAT threads 8
> STAT conn_yields 0
> STAT hash_power_level 25
> STAT hash_bytes 268435456
> STAT hash_is_expanding 0
> STAT slab_reassign_running 0
> STAT slabs_moved 0
> STAT bytes 23567307974
> STAT curr_items 32559669
> STAT total_items 61290586
> STAT expired_unfetched 6664504
> STAT evicted_unfetched 1244432758
> STAT evictions 2522683859
> STAT reclaimed 7626148
> END
>
> stats slabs
> STAT 1:chunk_size 96
> STAT 1:chunks_per_page 10922
> STAT 1:total_pages 1
> STAT 1:total_chunks 10922
> STAT 1:used_chunks 0
> STAT 1:free_chunks 10922
> STAT 1:free_chunks_end 0
> STAT 1:mem_requested 0
> STAT 1:get_hits 9905
> STAT 1:cmd_set 10362
> STAT 1:delete_hits 9582
> STAT 1:incr_hits 0
> STAT 1:decr_hits 0
> STAT 1:cas_hits 0
> STAT 1:cas_badval 0
> STAT 1:touch_hits 0
> STAT 2:chunk_size 120
> STAT 2:chunks_per_page 8738
> STAT 2:total_pages 1
> STAT 2:total_chunks 8738
> STAT 2:used_chunks 13
> STAT 2:free_chunks 8725
> STAT 2:free_chunks_end 0
> STAT 2:mem_requested 1350
> STAT 2:get_hits 1309125
> STAT 2:cmd_set 2963710
> STAT 2:delete_hits 199018
> STAT 2:incr_hits 0
> STAT 2:decr_hits 0
> STAT 2:cas_hits 770681
> STAT 2:cas_badval 3697
> STAT 2:touch_hits 0
> STAT 3:chunk_size 152
> STAT 3:chunks_per_page 6898
> STAT 3:total_pages 5
> STAT 3:total_chunks 34490
> STAT 3:used_chunks 34240
> STAT 3:free_chunks 250
> STAT 3:free_chunks_end 0
> STAT 3:mem_requested 483
> STAT 3:get_hits 2088979
> STAT 3:cmd_set 4355223
> STAT 3:delete_hits 3392
> STAT 3:incr_hits 0
> STAT 3:decr_hits 0
> STAT 3:cas_hits 0
> STAT 3:cas_badval 0
> STAT 3:touch_hits 0
> STAT 4:chunk_size 192
> STAT 4:chunks_per_page 5461
> STAT 4:total_pages 11
> STAT 4:total_chunks 60071
> STAT 4:used_chunks 60070
> STAT 4:free_chunks 1
> STAT 4:free_chunks_end 0
> STAT 4:mem_requested 10821971
> STAT 4:get_hits 65413752
> STAT 4:cmd_set 22935889
> STAT 4:delete_hits 6028
> STAT 4:incr_hits 0
> STAT 4:decr_hits 0
> STAT 4:cas_hits 0
> STAT 4:cas_badval 0
> STAT 4:touch_hits 0
> STAT 5:chunk_size 240
> STAT 5:chunks_per_page 4369
> STAT 5:total_pages 756
> STAT 5:total_chunks 3302964
> STAT 5:used_chunks 3302964
> STAT 5:free_chunks 0
> STAT 5:free_chunks_end 0
> STAT 5:mem_requested 766866823
> STAT 5:get_hits 2762768607
> STAT 5:cmd_set 445418784
> STAT 5:delete_hits 15806705
> STAT 5:incr_hits 0
> STAT 5:decr_hits 0
> STAT 5:cas_hits 0
> STAT 5:cas_badval 0
> STAT 5:touch_hits 0
> STAT 6:chunk_size 304
> STAT 6:chunks_per_pag
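The eyeballing of the general stats can be made concrete with a few lines of Python. This is a small sketch, not an official tool: the helper names `parse_stats` and `cache_health` are made up for illustration. Fed the numbers quoted in this message, it gives a hit ratio of roughly 92% and shows that about 49% of all evictions were of items that were never fetched.

```python
def parse_stats(text: str) -> dict:
    """Parse 'STAT name value' lines (optionally '>'-quoted) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.lstrip("> ").split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def cache_health(stats: dict) -> dict:
    """Compute the two ratios discussed in the thread."""
    gets = int(stats["cmd_get"])
    hits = int(stats["get_hits"])
    evictions = int(stats["evictions"])
    unfetched = int(stats["evicted_unfetched"])
    return {
        "hit_ratio": hits / gets if gets else 0.0,
        "evicted_unfetched_ratio": unfetched / evictions if evictions else 0.0,
    }


sample = """
> STAT cmd_get 91928675147
> STAT get_hits 85005900667
> STAT evicted_unfetched 1244432758
> STAT evictions 2522683859
"""
health = cache_health(parse_stats(sample))
print(f"hit ratio: {health['hit_ratio']:.3f}")
print(f"evicted without a fetch: {health['evicted_unfetched_ratio']:.3f}")
```

A high second ratio is the signal pointed at in the reply: items are being pushed out before anything reads them back.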
Re: MEMCACHED_SERVER_MEMORY_ALLOCATION_FAILURE (SERVER_ERROR out of memory storing object) error
> Hi,
> Does anybody know a good way to handle OOM during a set operation? The server is fully calcified :) (no new pages to allocate) and I have this issue for slab 17:
> STAT items:17:number 16128
> STAT items:17:age 90
> STAT items:17:evicted 246790897
> STAT items:17:evicted_nonzero 246790874
> STAT items:17:evicted_time 90
> STAT items:17:outofmemory 33098
> STAT items:17:tailrepairs 0
> STAT items:17:reclaimed 1183
> STAT items:17:expired_unfetched 196
> STAT items:17:evicted_unfetched 143699820
>
> running memcached : STAT version 1.4.15

"stats slabs"? Is memory unbalanced from other slabs?

> Nothing except rebooting periodically comes to my mind, but this solution does not make me happy :)

There's the slab rebalance feature. OOM errors only happen when there are truly very few pages free and all of the ones in the tail are locked, or there's a bug. It should always evict. The rebalance feature is documented in doc/protocol.txt.

However, your eviction seems to be very highly pressured. The evicted_unfetched stat is high compared to the total number of evictions. So they're not even staying in long enough to get fetched again.

There aren't that many OOM errors overall, so perhaps you are just hitting that slab way too hard and occasionally locking everything in the tail.

You either need to add more memory to the total system or rebalance them.

> Other option: enable compression to allow more items, but I need to experiment (why does memcached not provide server-side compression? As I see in the stats, memcached's CPU is not used, so it would be good to utilize it.)

Very high rate of access is expected and the ratio of clients to servers might be high, so compression is done in the client instead. It was also designed to let you run it wherever there's free memory (extra installed in webservers/etc) so it wants to avoid excess CPU usage.

It's a trivial switch either way.

Also consider upgrading to .17 or .19. There might be some good fixes.
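For the slab rebalance feature, doc/protocol.txt describes two text-protocol commands, `slabs reassign <source class> <dest class>` and `slabs automove <0|1>`. The sketch below only builds and sends those command lines. The helper names, the host and port, and the class numbers in the usage comment are assumptions for illustration, and as far as I know a 1.4.x server has to be started with `-o slab_reassign` for the commands to be accepted.

```python
import socket


def reassign_command(source_class: int, dest_class: int) -> bytes:
    """Build a 'slabs reassign' line that moves one page between classes."""
    return f"slabs reassign {source_class} {dest_class}\r\n".encode("ascii")


def automove_command(mode: int) -> bytes:
    """Build a 'slabs automove' line (0 disables, 1 enables the automover)."""
    return f"slabs automove {mode}\r\n".encode("ascii")


def send_command(host: str, port: int, command: bytes, timeout: float = 2.0) -> str:
    """Send one command and return the server's single-line reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        return sock.recv(4096).decode("ascii", "replace").strip()


# Hypothetical usage: move one page from class 5 to the pressured class 17.
# print(send_command("127.0.0.1", 11211, reassign_command(5, 17)))
```

Each reassign moves a single page, so an external daemon would typically issue it repeatedly while watching the per-class eviction counters.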
-- --- You received this message because you are subscribed to the Google Groups "memcached" group. To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
MEMCACHED_SERVER_MEMORY_ALLOCATION_FAILURE (SERVER_ERROR out of memory storing object) error
Hi,
Does anybody know a good way to handle OOM during a set operation? The server is fully calcified :) (no new pages to allocate) and I have this issue for slab 17:

STAT items:17:number 16128
STAT items:17:age 90
STAT items:17:evicted 246790897
STAT items:17:evicted_nonzero 246790874
STAT items:17:evicted_time 90
STAT items:17:outofmemory 33098
STAT items:17:tailrepairs 0
STAT items:17:reclaimed 1183
STAT items:17:expired_unfetched 196
STAT items:17:evicted_unfetched 143699820

running memcached : STAT version 1.4.15

Nothing except rebooting periodically comes to my mind, but this solution does not make me happy :)

Other option: enable compression to allow more items, but I need to experiment (why does memcached not provide server-side compression? As I see in the stats, memcached's CPU is not used, so it would be good to utilize it.)

Cheers,
Den
Re: SERVER_ERROR out of memory storing object
> I apologize for not understanding more. So, all pages have been allocated (i.e. Memcached cannot just grab a new page) and there's nothing in the tail of the queue for something to evict. And that's because all recent items are being written at the same moment?
>
> That is, there's no more memory and Memcached is too busy with active items to find something to throw out?
>
> If this is rate related, would one solution be to add more servers into the pool to spread out the load?
>
> Our Memcached isn't that busy -- if I can trust our Zabbix graphs, 800 gets/s and 250 sets/s and 1 eviction/s. About 40M objects and 2500 connections on a single Memcached server.
>
> We do have large objects (which is a problem we need to fix). Why is an OOM error more likely with large objects?

Look through 'stats slabs' and 'stats items' - your 1MB object space probably only has a handful of chunks on it (maybe even just one). That means it has very little memory to work with.

> > Were there issues with the latest version?
>
> No, I could not get the latest version to issue the OOM error, but that was in a dev environment. I don't think I could get our old production version 1.4.4 to issue the OOM error either under dev. But, I have a lot more testing to do.
>
> The timeouts on production are a much bigger concern at this time.

I haven't looked at your timeouts mail yet, sorry. The latest version should make those OOM errors less likely. You can also use slab rebalance to give more memory to the larger slab classes.
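To see why the large-object classes have so little to work with, the slab class table can be approximated from the settings shown in this thread (growth_factor 1.25, item_size_max 1048576). This is a rough sketch of how I understand 1.4.x to derive its classes, not the server's actual code: the starting size of 96 bytes is taken from the `1:chunk_size 96` line in the stats, and the 8-byte alignment is an assumption that happens to reproduce the chunk sizes listed there.

```python
def slab_classes(first_chunk=96, factor=1.25, item_size_max=1048576, align=8):
    """Return a list of (chunk_size, chunks_per_page), one entry per class."""
    classes = []
    size = first_chunk
    while size <= item_size_max / factor:
        if size % align:  # round up to the assumed alignment
            size += align - (size % align)
        classes.append((size, item_size_max // size))
        size = int(size * factor)
    # The final class holds items up to item_size_max: one chunk per page.
    classes.append((item_size_max, 1))
    return classes


for number, (chunk_size, per_page) in enumerate(slab_classes(), start=1):
    print(f"class {number}: chunk_size {chunk_size} chunks_per_page {per_page}")
```

With these inputs the first classes come out as 96, 120, 152, 192, 240, 304 and 384 bytes, matching the stats output, while the last classes fit exactly one chunk per 1MB page. A class that owns only a page or two therefore has only one or two chunks to hand out.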
Re: SERVER_ERROR out of memory storing object
On Wed, Sep 25, 2013 at 9:58 AM, dormando wrote:

> If you have a memory limit of 2MB, and start uploading 3 1MB objects, the third one will cause an out of memory error.
>
> During upload a free object is pulled to be written into. If you are actively writing to, or actively reading from + writing to, more objects than are available for it to reserve, it'll bail with an OOM error. It's only able to look at the tail for this, so it's more common with large objects.

I apologize for not understanding more. So, all pages have been allocated (i.e. Memcached cannot just grab a new page) and there's nothing in the tail of the queue for something to evict. And that's because all recent items are being written at the same moment?

That is, there's no more memory and Memcached is too busy with active items to find something to throw out?

If this is rate related, would one solution be to add more servers into the pool to spread out the load?

Our Memcached isn't that busy -- if I can trust our Zabbix graphs, 800 gets/s and 250 sets/s and 1 eviction/s. About 40M objects and 2500 connections on a single Memcached server.

We do have large objects (which is a problem we need to fix). Why is an OOM error more likely with large objects?

> Were there issues with the latest version?

No, I could not get the latest version to issue the OOM error, but that was in a dev environment. I don't think I could get our old production version 1.4.4 to issue the OOM error either under dev. But, I have a lot more testing to do.

The timeouts on production are a much bigger concern at this time.

Thanks,

--
Bill Moseley
mose...@hank.org
Re: SERVER_ERROR out of memory storing object
If you have a memory limit of 2MB, and start uploading 3 1MB objects, the third one will cause an out of memory error.

During upload a free object is pulled to be written into. If you are actively writing to, or actively reading from + writing to, more objects than are available for it to reserve, it'll bail with an OOM error. It's only able to look at the tail for this, so it's more common with large objects.

The latest version has some workarounds which improve the situation, but nothing can really "fix" it without allowing it to scan the full list or using temporary buffers.

Were there issues with the latest version?

On Wed, 25 Sep 2013, Bill Moseley wrote:

> What is Memcached doing that is causing it to generate SERVER_ERROR out of memory storing object? Is that a malloc error? What stats can I look at to understand why this is happening?
>
> I'm running 1.4.4 memcached -m 4096 -c 8192 on a machine with 8GB RAM. Top says it has 4GB resident on CentOS 6.2.
>
> I have a script that is forking and sending 1MB sets to unique keys and I see the out of memory errors.
>
> BTW -- I have another CentOS 6.2 with 64GB also 1.4.4 where I start with "memcached -m 4096" and cannot make it happen. (Although the md5sums for the two binaries are different for some reason.)
>
> I'm curious about this stat:
>
> STAT total_malloced 8,590,762,448 (my commas)
>
> Total amount of memory allocated to slab pages.
>
> Is that the problem? But -m is set to 4096.
>
> [snip]
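The 2MB example can be played through with a toy model. This is deliberately not memcached's allocator: `ToySlabClass` is a made-up class that only tracks how many chunks are free, how many are reserved by uploads still in progress, and how many hold finished items that could be evicted. It ignores the LRU tail search entirely.

```python
class ToySlabClass:
    """Toy model of one slab class: free, in-flight and stored chunk counts."""

    def __init__(self, chunks: int):
        self.free = chunks      # never-used chunks
        self.in_flight = 0      # reserved by uploads that have not finished
        self.stored = 0         # finished items, eligible for eviction

    def begin_write(self) -> bool:
        """Reserve a chunk for an upload; False models the OOM error."""
        if self.free > 0:
            self.free -= 1
        elif self.stored > 0:
            self.stored -= 1    # evict a finished item and reuse its chunk
        else:
            return False        # everything is busy: nothing to evict
        self.in_flight += 1
        return True

    def finish_write(self) -> None:
        """Mark one in-flight upload as complete."""
        self.in_flight -= 1
        self.stored += 1


slab = ToySlabClass(chunks=2)          # a 2MB limit with 1MB objects
print(slab.begin_write())              # first upload reserves a chunk
print(slab.begin_write())              # second upload reserves the other
print(slab.begin_write())              # third upload fails: both are in flight
slab.finish_write()                    # one upload completes
print(slab.begin_write())              # now there is something to evict
```

The fewer chunks a class has, the easier it is for concurrent uploads to hold all of them at once, which is why the error shows up mostly with large objects.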
SERVER_ERROR out of memory storing object
What is Memcached doing that is causing it to generate SERVER_ERROR out of memory storing object? Is that a malloc error? What stats can I look at to understand why this is happening?

I'm running 1.4.4 memcached -m 4096 -c 8192 on a machine with 8GB RAM. Top says it has 4GB resident on CentOS 6.2.

I have a script that is forking and sending 1MB sets to unique keys and I see the out of memory errors.

BTW -- I have another CentOS 6.2 with 64GB also 1.4.4 where I start with "memcached -m 4096" and cannot make it happen. (Although the md5sums for the two binaries are different for some reason.)

I'm curious about this stat:

STAT total_malloced 8,590,762,448 (my commas)

Total amount of memory allocated to slab pages.

Is that the problem? But -m is set to 4096.

stats settings
STAT maxbytes 0
STAT maxconns 8192
STAT tcpport 11211
STAT udpport 11211
STAT inter NULL
STAT verbosity 0
STAT oldest 0
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 4
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
STAT auth_enabled_sasl no
STAT item_size_max 1048576
END

$ (echo stats items; echo quit) | netcat 10.35.1.94 11211 | grep outofmemory
STAT items:1:outofmemory 0
STAT items:2:outofmemory 0
STAT items:3:outofmemory 0
STAT items:4:outofmemory 0
STAT items:5:outofmemory 0
STAT items:6:outofmemory 0
STAT items:7:outofmemory 0
STAT items:8:outofmemory 0
STAT items:9:outofmemory 0
STAT items:10:outofmemory 0
STAT items:11:outofmemory 0
STAT items:12:outofmemory 0
STAT items:13:outofmemory 0
STAT items:14:outofmemory 0
STAT items:15:outofmemory 0
STAT items:16:outofmemory 0
STAT items:17:outofmemory 0
STAT items:18:outofmemory 0
STAT items:19:outofmemory 0
STAT items:20:outofmemory 0
STAT items:21:outofmemory 0
STAT items:22:outofmemory 0
STAT items:23:outofmemory 0
STAT items:24:outofmemory 0
STAT items:25:outofmemory 0
STAT items:26:outofmemory 0
STAT items:27:outofmemory 0
STAT items:28:outofmemory 0
STAT items:29:outofmemory 0
STAT items:30:outofmemory 0
STAT items:31:outofmemory 0
STAT items:32:outofmemory 0
STAT items:33:outofmemory 0
STAT items:34:outofmemory 0
STAT items:35:outofmemory 0
STAT items:36:outofmemory 0
STAT items:37:outofmemory 0
STAT items:38:outofmemory 0
STAT items:39:outofmemory 0
STAT items:40:outofmemory 0
STAT items:41:outofmemory 0
STAT items:42:outofmemory 0

stats slabs
STAT 1:chunk_size 96
STAT 1:chunks_per_page 10922
STAT 1:total_pages 1
STAT 1:total_chunks 10922
STAT 1:used_chunks 7
STAT 1:free_chunks 8
STAT 1:free_chunks_end 10907
STAT 1:mem_requested 18446744073709550798
STAT 1:get_hits 48561
STAT 1:cmd_set 35887
STAT 1:delete_hits 0
STAT 1:incr_hits 0
STAT 1:decr_hits 0
STAT 1:cas_hits 0
STAT 1:cas_badval 0
STAT 2:chunk_size 120
STAT 2:chunks_per_page 8738
STAT 2:total_pages 625
STAT 2:total_chunks 5461250
STAT 2:used_chunks 5461249
STAT 2:free_chunks 1
STAT 2:free_chunks_end 0
STAT 2:mem_requested 2018891193
STAT 2:get_hits 22305549989
STAT 2:cmd_set 6257109186
STAT 2:delete_hits 1
STAT 2:incr_hits 0
STAT 2:decr_hits 0
STAT 2:cas_hits 0
STAT 2:cas_badval 0
STAT 3:chunk_size 152
STAT 3:chunks_per_page 6898
STAT 3:total_pages 5164
STAT 3:total_chunks 35621272
STAT 3:used_chunks 35615218
STAT 3:free_chunks 6054
STAT 3:free_chunks_end 0
STAT 3:mem_requested 5042960945
STAT 3:get_hits 52627536
STAT 3:cmd_set 832629765
STAT 3:delete_hits 20814
STAT 3:incr_hits 0
STAT 3:decr_hits 0
STAT 3:cas_hits 0
STAT 3:cas_badval 0
STAT 4:chunk_size 192
STAT 4:chunks_per_page 5461
STAT 4:total_pages 98
STAT 4:total_chunks 535178
STAT 4:used_chunks 535178
STAT 4:free_chunks 0
STAT 4:free_chunks_end 0
STAT 4:mem_requested 18446744073444110870
STAT 4:get_hits 3951361
STAT 4:cmd_set 426309882
STAT 4:delete_hits 301012
STAT 4:incr_hits 0
STAT 4:decr_hits 0
STAT 4:cas_hits 0
STAT 4:cas_badval 0
STAT 5:chunk_size 240
STAT 5:chunks_per_page 4369
STAT 5:total_pages 1
STAT 5:total_chunks 4369
STAT 5:used_chunks 4361
STAT 5:free_chunks 8
STAT 5:free_chunks_end 0
STAT 5:mem_requested 18446744073703660821
STAT 5:get_hits 208048
STAT 5:cmd_set 27841509
STAT 5:delete_hits 56459
STAT 5:incr_hits 0
STAT 5:decr_hits 0
STAT 5:cas_hits 0
STAT 5:cas_badval 0
STAT 6:chunk_size 304
STAT 6:chunks_per_page 3449
STAT 6:total_pages 1
STAT 6:total_chunks 3449
STAT 6:used_chunks 3442
STAT 6:free_chunks 7
STAT 6:free_chunks_end 0
STAT 6:mem_requested 12807641
STAT 6:get_hits 122481
STAT 6:cmd_set 81722774
STAT 6:delete_hits 261246
STAT 6:incr_hits 0
STAT 6:decr_hits 0
STAT 6:cas_hits 0
STAT 6:cas_badval 0
STAT 7:chunk_size 384
STAT 7:chunks_per_page 2730
STAT 7:total_pages 1
STAT 7:total_chunks 2730
STAT 7:used_chunks 235
STAT 7:free_chunks 2495
STAT 7:free_chunks_end 0
STAT 7:mem_requested 833939
STAT 7:get_hits 1675991
STAT 7:cmd_set 2394225
STAT 7:delete_hits 181239
STAT 7:incr_hits 0
STAT 7:decr_hits 0
STAT 7:cas_hits 0
STAT 7
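The `netcat | grep outofmemory` pipeline above can also be written as a small Python helper, which is easier to extend into a periodic check. This is a sketch under assumptions: the function names are made up, and the host and port in the usage comment are placeholders rather than anything from this thread.

```python
import socket


def fetch_stats(host: str, port: int, subcommand: str = "items",
                timeout: float = 2.0) -> str:
    """Send 'stats <subcommand>' and read the reply up to the END line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(f"stats {subcommand}\r\n".encode("ascii"))
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(65536)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", "replace")


def outofmemory_by_class(text: str) -> dict:
    """Map slab class id -> outofmemory counter from 'stats items' output."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            fields = parts[1].split(":")
            if len(fields) == 3 and fields[0] == "items" and fields[2] == "outofmemory":
                counts[int(fields[1])] = int(parts[2])
    return counts


# Hypothetical usage against a local server:
# counts = outofmemory_by_class(fetch_stats("127.0.0.1", 11211))
# print({cls: n for cls, n in counts.items() if n > 0})
```

Filtering for non-zero counters narrows the question down to the slab classes that actually refused a store.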
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> On 09/01/12 06:12, dormando wrote:
>
> > Hey, could you please try to reproduce the issue with 1.4.11-beta1:
> > http://code.google.com/p/memcached/wiki/ReleaseNotes1411beta1
> >
> > I've closed the logic issues and fixed a few other things besides. Would be very good to know if you're still able to bug it out.
> >
> > Thanks!
> > -Dormando
>
> Using 1.4.11-beta1 I can't reproduce the "out of memory storing object" error, thanks!

Awesome! Thanks for verifying so quickly. I'll try to wrap this up into an -rc1 today. Hoping some other people try it too :)
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 09/01/12 06:12, dormando wrote:

> Hey, could you please try to reproduce the issue with 1.4.11-beta1:
> http://code.google.com/p/memcached/wiki/ReleaseNotes1411beta1
>
> I've closed the logic issues and fixed a few other things besides. Would be very good to know if you're still able to bug it out.
>
> Thanks!
> -Dormando

Using 1.4.11-beta1 I can't reproduce the "out of memory storing object" error, thanks!

--Santi
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> dormando, with a new script setting a random exptime I can reproduce the problem in a fresh memcached 1.4.10 (it doesn't happen with earlier versions):
>
> https://gist.github.com/1564556
>
> With the first evictions memcached starts reporting "SERVER_ERROR out of memory storing object". Those are the -default- settings that I'm using:
>
> [snip]
>
> Making a diff with 1.4.9, it seems to be something related to do_item_alloc()..

Hey, could you please try to reproduce the issue with 1.4.11-beta1:
http://code.google.com/p/memcached/wiki/ReleaseNotes1411beta1

I've closed the logic issues and fixed a few other things besides. Would be very good to know if you're still able to bug it out.

Thanks!
-Dormando
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> On 05/01/12 12:00, Santi Saez wrote:
>
> > Making a diff with 1.4.9, it seems to be something related to do_item_alloc()..
>
> More information:
>
> - reverting the "remove the depth search from item_alloc" commit solves the problem:
>
> https://github.com/memcached/memcached/commit/ca5016c54111e062c771d20fcc4662400713c634
>
> - using exptime=0 it doesn't happen.

I probably fixed this a few weeks ago in my branch, but I'm still wringing the bugs from it. If you can hold on for a few days for the beta and test that when I post it, it should be better.

Thanks!
-Dormando
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 05/01/12 12:00, Santi Saez wrote:

> Making a diff with 1.4.9, it seems to be something related to do_item_alloc()..

More information:

- reverting the "remove the depth search from item_alloc" commit solves the problem:

https://github.com/memcached/memcached/commit/ca5016c54111e062c771d20fcc4662400713c634

- using exptime=0 it doesn't happen.

Regards,

--Santi
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 30/12/11 17:51, dormando wrote:

> Now that you've left it out for a while, can you try storing a few things again and snapshot the items/slabs stats? I'm curious to see if the tailrepairs counter goes up at all.

dormando, with a new script setting a random exptime I can reproduce the problem in a fresh memcached 1.4.10 (it doesn't happen with earlier versions):

https://gist.github.com/1564556

With the first evictions memcached starts reporting "SERVER_ERROR out of memory storing object". Those are the -default- settings that I'm using:

stats settings
STAT maxbytes 67108864
STAT maxconns 1024
STAT tcpport 11211
STAT udpport 11211
STAT inter NULL
STAT verbosity 0
STAT oldest 0
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 4
STAT num_threads_per_udp 4
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
STAT auth_enabled_sasl no
STAT item_size_max 1048576
STAT maxconns_fast no
STAT hashpower_init 0
END

Making a diff with 1.4.9, it seems to be something related to do_item_alloc()..

Regards,

--Santi
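For anyone who wants to try a similar workload, here is a hypothetical sketch of a generator of `set` commands with random expiration times, using the text protocol format `set <key> <flags> <exptime> <bytes>`. It is not the script from the gist above: the key naming, value size and exptime range are all assumptions, and sending the bytes to a server is left out.

```python
import random
from typing import Iterator, Optional


def build_set(key: str, value: bytes, exptime: int, flags: int = 0) -> bytes:
    """Build one text-protocol set command, including the data block."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def random_exptime_sets(count: int, max_exptime: int = 3600,
                        value_size: int = 1024,
                        seed: Optional[int] = None) -> Iterator[bytes]:
    """Yield set commands whose exptime is drawn from 1..max_exptime seconds."""
    rng = random.Random(seed)
    for i in range(count):
        yield build_set(f"key:{i}", b"x" * value_size, rng.randint(1, max_exptime))


for command in random_exptime_sets(3, seed=1):
    print(command[:40])
```

Passing `exptime=0` to `build_set` instead gives the non-expiring variant, which is the case reported elsewhere in this thread as not triggering the error.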
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 30/12/11 17:51, dormando wrote:

> What client is this script written for, exactly? By 6 different servers you mean you're running 6 copies of that script from 6 places, or even more?

It's a Python script I wrote to try to reproduce the error, but we're getting "out of memory" errors from PHP frontends. I run the same script from 6 different servers; using only one instance takes more time to reproduce the error, but it also happens.

> Any other startup options? (or just list `stats settings` as well)

stats settings
STAT maxbytes 1090519040
STAT maxconns 50
STAT tcpport 11211
STAT udpport 11211
STAT inter 0.0.0.0:11211,0.0.0.0:11212,0.0.0.0:11213,0.0.0.0:11214,0.0.0.0:11215,0.0.0.0:11216,0.0.0.0:11217,0.0.0.0:11218,0.0.0.0:11219,
STAT verbosity 0
STAT oldest 0
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 8
STAT num_threads_per_udp 1
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
STAT auth_enabled_sasl no
STAT item_size_max 1048576
STAT maxconns_fast no
STAT hashpower_init 0
END

> Now that you've left it out for a while, can you try storing a few things again and snapshot the items/slabs stats? I'm curious to see if the tailrepairs counter goes up at all.

tailrepairs are always = 0. We have detected the same behavior on another server running memcached 1.4.10 with the same configuration, and we've been running 1.4.2 for years without this problem on 100+ servers.

Regards,

--Santi
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> Hello,
>
> After 3 weeks with memcached 1.4.10 in production, today we have started
> randomly getting this error:
>
> SERVER_ERROR out of memory storing object with memcached
>
> I can reproduce it with a simple set+get loop. This is the Python
> script that I have used (running the script from 6 different servers,
> it takes ~20 minutes to occur again):
>
> http://pastie.org/3096017

What client is this script written for, exactly? By 6 different servers you mean you're running 6 copies of that script from 6 places, or even more?

> It runs with the standard libevent package from Debian (version 1.4.13),
> a custom kernel (Linux 2.6.32.41), multiport support (32 x -l flag),
> with -m 42000 (the server has 48G) and -t 8 threads (same # of
> cores).

Any other startup options? (or just list `stats settings` as well)

> SLAB, item, and general stats are also available at:
>
> http://pastie.org/3096085
>
> The server is out of production and we haven't flushed it yet, so if
> you want we can test whatever you want! :)

Now that you've left it out for a while, can you try storing a few things again and snapshot the items/slabs stats? I'm curious to see if the tailrepairs counter goes up at all.
SERVER_ERROR out of memory storing object with memcached 1.4.10
Hello,

After 3 weeks with memcached 1.4.10 in production, today we have started randomly getting this error:

SERVER_ERROR out of memory storing object with memcached

I can reproduce it with a simple set+get loop. This is the Python script that I have used (running the script from 6 different servers, it takes ~20 minutes to occur again):

http://pastie.org/3096017

It runs with the standard libevent package from Debian (version 1.4.13), a custom kernel (Linux 2.6.32.41), multiport support (32 x -l flag), with -m 42000 (the server has 48G) and -t 8 threads (same # of cores).

SLAB, item, and general stats are also available at:

http://pastie.org/3096085

The server is out of production and we haven't flushed it yet, so if you want we can test whatever you want! :)

Regards,

--Santi
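The pastie script is not reproduced in this archive. As a rough idea of what a set+get loop of this kind could look like, here is a minimal sketch that talks the memcached text protocol over a plain socket. The function names, the key pattern, the value sizes and the exptime range are all assumptions made for illustration and are not taken from the original script.

```python
import random
import socket


def build_set_command(key: str, value: bytes, exptime: int, flags: int = 0) -> bytes:
    """Encode one text-protocol `set` request: header line, data block, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def set_get_loop(host: str, port: int, iterations: int) -> int:
    """Run set+get pairs with random exptimes; return how many sets failed."""
    failures = 0
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        for i in range(iterations):
            key = f"test_{i}"  # hypothetical key pattern
            value = b"x" * random.randint(1, 4096)  # hypothetical value sizes
            exptime = random.randint(1, 60)  # hypothetical exptime range, in seconds
            sock.sendall(build_set_command(key, value, exptime))
            if reader.readline().startswith(b"SERVER_ERROR"):
                failures += 1
            sock.sendall(f"get {key}\r\n".encode("ascii"))
            # A get reply is an optional VALUE block followed by "END".
            while True:
                line = reader.readline()
                if not line or line.strip() == b"END":
                    break
    return failures
```

Against a local test instance this would be called as `set_get_loop("127.0.0.1", 11211, 100000)`; the return value counts the sets that came back with SERVER_ERROR.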
Re: SERVER_ERROR out of memory
Thanks for the answer, that was exactly what I was looking for. I just wanted to know that upgrading the server was a reasonable way to fix it.

We've put a lot of logging around the area where we had the problem, so we'll be able to see if it happens again on this version, but hopefully it won't. :-)

/Henrik

On Thu, Mar 31, 2011 at 21:37, dormando wrote:

> Hella old, badly ported, tons and tons of bugs fixed in the middle, no
> stats, so... you can only really guess as to what happened :)
>
> If I had to guess though, it'd be the old bug that was fixed with the
> "tail repairs" function. Very rarely refcounts wouldn't get all released
> on an item and it would get stuck in an unevictable form. Once you got 50
> of those sitting at the bottom of the LRU you wouldn't be able to store
> anything at all. Less than 50 and you'll get occasional errors.
>
> Seriously guys; running really old code then asking wtf happened is really
> hard to deal with. I'm sort of amazed how often this comes up. We've added
> gobs of stats to the newer versions so you can see exactly what goes
> right/wrong, but the further back you go the more guesswork is involved.
Re: SERVER_ERROR out of memory
> We were having some weird sporadic errors on our product, and after
> scratching our heads a lot and digging down into it, it turned out that
> we were getting the "SERVER_ERROR out of memory" error when storing items on
> our memcached cluster, but here's the weird part:
>
> We only got the error occasionally. Most writes went ok, but some of them
> just failed. We estimated the error rate at about 1 in 30.
> All memcached servers had grown to the memory limit we set (512MB).
> I ran stats slabs, and there were plenty of slabs of all sizes.
> The number of evictions ticked up slowly, but definitely not as fast as it
> should, given the rate at which we stored items.
> The items that failed were all very small, with an expiry of 5 seconds.
> And we were running version 1.2.5 for Windows.
> And we weren't running with the -M option.
>
> We have now upgraded to version 1.4.4 and restarted the servers. It'll take
> a week or two for the cache servers to get full again, and we're
> hoping the error won't come back.
>
> But what happened? How could we get that error, when the servers just should
> have evicted lots of objects instead? How come only a fraction
> of the writes failed that way? What does the error actually mean, since the
> servers obviously weren't out of memory? And how can we prevent
> it from happening again?

Hella old, badly ported, tons and tons of bugs fixed in the middle, no stats, so... you can only really guess as to what happened :)

If I had to guess though, it'd be the old bug that was fixed with the "tail repairs" function. Very rarely refcounts wouldn't get all released on an item and it would get stuck in an unevictable form. Once you got 50 of those sitting at the bottom of the LRU you wouldn't be able to store anything at all. Less than 50 and you'll get occasional errors.

Seriously guys; running really old code then asking wtf happened is really hard to deal with. I'm sort of amazed how often this comes up. We've added gobs of stats to the newer versions so you can see exactly what goes right/wrong, but the further back you go the more guesswork is involved.
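To make the "50 stuck items at the bottom of the LRU" explanation concrete, here is a toy model in Python. It is not memcached's code: the names `Item`, `try_evict` and `TAIL_SEARCH_LIMIT` are invented for this sketch, and the only thing taken from the reply is the idea that an allocation looks at a bounded number of items from the LRU tail and gives up if all of them are still referenced. The model is deterministic, so it does not try to reproduce the occasional failures reported with fewer than 50 stuck items.

```python
from dataclasses import dataclass

TAIL_SEARCH_LIMIT = 50  # the figure quoted in the reply; assumed to be a search bound


@dataclass
class Item:
    key: str
    refcount: int = 0  # non-zero means something still holds a reference


def try_evict(lru: list, search_limit: int = TAIL_SEARCH_LIMIT):
    """Examine up to `search_limit` items from the tail and evict the first free one.

    The tail is the end of the list. Returns the evicted item, or None,
    which stands in for "SERVER_ERROR out of memory" in this toy model.
    """
    for offset in range(min(search_limit, len(lru))):
        index = len(lru) - 1 - offset
        if lru[index].refcount == 0:
            return lru.pop(index)
    return None


free = [Item(f"free_{i}") for i in range(100)]
stuck = [Item(f"stuck_{i}", refcount=1) for i in range(50)]

# 50 stuck items at the tail: every candidate is referenced, so nothing is evicted.
print(try_evict(free + stuck))
# 49 stuck items at the tail: the 50th candidate is free and gets evicted.
print(try_evict(free + stuck[:49]))
```

The point of the sketch is only that plenty of evictable memory can exist further up the list while the store still fails, which matches the symptom of an "out of memory" error on a server that should simply be evicting.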
SERVER_ERROR out of memory
We were having some weird sporadic errors on our product, and after scratching our heads a lot and digging down into it, it turned out that we were getting the "SERVER_ERROR out of memory" error when storing items on our memcached cluster, but here's the weird part:

We only got the error occasionally. Most writes went ok, but some of them just failed. We estimated the error rate at about 1 in 30.
All memcached servers had grown to the memory limit we set (512MB).
I ran stats slabs, and there were plenty of slabs of all sizes.
The number of evictions ticked up slowly, but definitely not as fast as it should, given the rate at which we stored items.
The items that failed were all very small, with an expiry of 5 seconds.
And we were running version 1.2.5 for Windows.
And we weren't running with the -M option.

We have now upgraded to version 1.4.4 and restarted the servers. It'll take a week or two for the cache servers to get full again, and we're hoping the error won't come back.

But what happened? How could we get that error, when the servers just should have evicted lots of objects instead? How come only a fraction of the writes failed that way? What does the error actually mean, since the servers obviously weren't out of memory? And how can we prevent it from happening again?

/Henrik Schröder
Re: SERVER_ERROR out of memory
Strongly recommend you upgrade to a much newer version. There've been huge numbers of bug fixes and clearer responses in situations like this.

On Apr 19, 3:38 am, Deepan Chakravarthy wrote:

> Hi,
> I am getting SERVER_ERROR out of memory while I try to store one
> particular set of data. I have allocated 256mb for memcache. Memcache has
> only used 302100 bytes (less than 1mb) of memory so far. Below are outputs
> from memcache log and memcache stats.
>
> <7 get rankdatapre
> >7 END
> <8 new client connection
> <8 get 889483
> >8 sending key 889483
> >8 END
> <8 get 889483
> >8 sending key 889483
> >8 END
> <8 set 889483 1 36000 578
> >8 STORED
> <8 connection closed.
> <8 new client connection
> <8 get mytg690411576
> >8 END
> <8 set mytg690411576 1 1800 1969
> >8 STORED
> <8 connection closed.
> <7 set rankdatapre 1 3000 2528612
> >7 SERVER_ERROR out of memory
>
> I get errors while I try to set rankdatapre. It has 2528612 bytes (2.4MB)
> of data. Below is the output from memcache stats.
>
> Array
> (
>     [pid] => 24831
>     [uptime] => 1129
>     [time] => 1240134667
>     [version] => 1.1.12
>     [rusage_user] => 0.37
>     [rusage_system] => 0.84
>     [curr_items] => 484
>     [total_items] => 2254
>     [bytes] => 302100
>     [curr_connections] => 1
>     [total_connections] => 1567
>     [connection_structures] => 6
>     [cmd_get] => 5603
>     [cmd_set] => 2254
>     [get_hits] => 4686
>     [get_misses] => 917
>     [bytes_read] => 1007864320
>     [bytes_written] => 2650966
>     [limit_maxbytes] => 268435456
> )
Re: SERVER_ERROR out of memory
Hello!

On Sun, Apr 19, 2009 at 04:08:20PM +0530, Deepan Chakravarthy wrote:

> I am getting SERVER_ERROR out of memory while I try to store one
> particular set of data. I have allocated 256mb for memcache. Memcache has
> only used 302100 bytes (less than 1mb) of memory so far. Below are outputs
> from memcache log and memcache stats.

[...]

> I get errors while I try to set rankdatapre. It has 2528612 bytes (2.4MB)
> of data. Below is the output from memcache stats.

http://code.google.com/p/memcached/wiki/FAQ#Why_are_items_limited_to_1_megabyte_in_size?

Maxim Dounin
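The arithmetic behind that FAQ link can be checked on the client side before a set is attempted. A minimal sketch, assuming the 1 megabyte limit named in the FAQ title means 1024 * 1024 bytes and ignoring the per-item overhead (key and header) that the server adds; `fits_in_one_item` is a made-up helper name:

```python
ONE_MEGABYTE = 1024 * 1024  # assumed meaning of the "1 megabyte" item limit


def fits_in_one_item(value_length: int, limit: int = ONE_MEGABYTE) -> bool:
    """Return True if a value of `value_length` bytes is within the item limit.

    Hypothetical helper; it compares the raw value size only, so the real
    cut-off on a server is somewhat lower than `limit`.
    """
    return value_length <= limit


# Sizes taken from the log in the original post.
for key, size in [("889483", 578), ("mytg690411576", 1969), ("rankdatapre", 2528612)]:
    verdict = "ok" if fits_in_one_item(size) else "too large"
    print(f"{key}: {size} bytes ({size / ONE_MEGABYTE:.2f} MB) {verdict}")
```

So the failure is about the size of that single value, not about how much of the 256 MB the server has used so far.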
SERVER_ERROR out of memory
Hi,

I am getting SERVER_ERROR out of memory while I try to store one particular set of data. I have allocated 256mb for memcache. Memcache has only used 302100 bytes (less than 1mb) of memory so far. Below are outputs from memcache log and memcache stats.

<7 get rankdatapre
>7 END
<8 new client connection
<8 get 889483
>8 sending key 889483
>8 END
<8 get 889483
>8 sending key 889483
>8 END
<8 set 889483 1 36000 578
>8 STORED
<8 connection closed.
<8 new client connection
<8 get mytg690411576
>8 END
<8 set mytg690411576 1 1800 1969
>8 STORED
<8 connection closed.
<7 set rankdatapre 1 3000 2528612
>7 SERVER_ERROR out of memory

I get errors while I try to set rankdatapre. It has 2528612 bytes (2.4MB) of data. Below is the output from memcache stats.

Array
(
    [pid] => 24831
    [uptime] => 1129
    [time] => 1240134667
    [version] => 1.1.12
    [rusage_user] => 0.37
    [rusage_system] => 0.84
    [curr_items] => 484
    [total_items] => 2254
    [bytes] => 302100
    [curr_connections] => 1
    [total_connections] => 1567
    [connection_structures] => 6
    [cmd_get] => 5603
    [cmd_set] => 2254
    [get_hits] => 4686
    [get_misses] => 917
    [bytes_read] => 1007864320
    [bytes_written] => 2650966
    [limit_maxbytes] => 268435456
)