Re: SERVER_ERROR out of memory storing object
On Wed, Sep 25, 2013 at 9:58 AM, dormando dorma...@rydia.net wrote:

> If you have a memory limit of 2MB, and start uploading three 1MB objects,
> the third one will cause an out of memory error. During upload a free
> object is pulled to be written into. If you are actively writing to, or
> actively reading from + writing to, more objects than are available for it
> to reserve, it'll bail with an OOM error. It's only able to look at the
> tail for this, so it's more common with large objects.

I apologize for not understanding more. So, all pages have been allocated (i.e. memcached cannot just grab a new page) and there's nothing at the tail of the LRU for it to evict. And that's because all the recent items are being written at the same moment? That is, there's no more memory, and memcached is too busy with active items to find something to throw out?

If this is rate related, would one solution be to add more servers to the pool to spread out the load? Our memcached isn't that busy -- if I can trust our Zabbix graphs, about 800 gets/s, 250 sets/s, and 1 eviction/s, with about 40M objects and 2,500 connections on a single memcached server. We do have large objects (which is a problem we need to fix). Why is an OOM error more likely with large objects?

> Were there issues with the latest version?

No, I could not get the latest version to issue the OOM error, but that was in a dev environment. I don't think I could get our old production version 1.4.4 to issue the OOM error either under dev. But I have a lot more testing to do. The timeouts on production are a much bigger concern at this time.

Thanks,

--
Bill Moseley
mose...@hank.org

--

---
You received this message because you are subscribed to the Google Groups "memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
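[Editor's note: the tail-search behavior dormando describes above can be sketched as a toy model. This is NOT memcached's actual allocation code, just an illustration of why a set can fail with "out of memory" while the cache is nominally full of evictable items: allocation only inspects a few items at the LRU tail, and an item still referenced by an in-flight read/write cannot be reclaimed.]

```python
# Toy model (not memcached source): a slab class with a fixed number of
# chunks, where eviction can only look at a shallow window at the LRU tail
# and cannot reclaim items whose refcount is nonzero.

class SlabClass:
    def __init__(self, total_chunks):
        self.free_chunks = total_chunks
        self.lru = []  # head = newest, tail = oldest; entries are [key, refcount]

    def alloc(self, key, tail_search_depth=5):
        if self.free_chunks > 0:
            self.free_chunks -= 1
            self.lru.insert(0, [key, 1])  # refcount 1: reserved while being written
            return "STORED"
        # No free chunks: try to evict, but only from the tail window.
        # A referenced (actively read/written) item cannot be evicted.
        for item in self.lru[-tail_search_depth:]:
            if item[1] == 0:
                self.lru.remove(item)
                self.lru.insert(0, [key, 1])
                return "STORED (evicted)"
        return "SERVER_ERROR out of memory storing object"

# A 1MB slab class with only 2 chunks -- like a 2MB cache storing 1MB items:
sc = SlabClass(total_chunks=2)
print(sc.alloc("a"))  # STORED
print(sc.alloc("b"))  # STORED
# Both uploads are still in flight (refcount 1), so the third set finds
# nothing evictable at the tail:
print(sc.alloc("c"))  # SERVER_ERROR out of memory storing object
```

This also shows why large objects hit it first: a large-item slab class has very few chunks total, so a handful of concurrent uploads is enough to pin every chunk at once.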
Re: SERVER_ERROR out of memory storing object
> I apologize for not understanding more. So, all pages have been allocated
> (i.e. memcached cannot just grab a new page) and there's nothing at the
> tail of the LRU for it to evict. And that's because all the recent items
> are being written at the same moment? That is, there's no more memory, and
> memcached is too busy with active items to find something to throw out?
>
> If this is rate related, would one solution be to add more servers to the
> pool to spread out the load? Our memcached isn't that busy -- if I can
> trust our Zabbix graphs, about 800 gets/s, 250 sets/s, and 1 eviction/s,
> with about 40M objects and 2,500 connections on a single memcached server.
> We do have large objects (which is a problem we need to fix). Why is an
> OOM error more likely with large objects?

Look through 'stats slabs' and 'stats items' - your 1MB object space probably only has a handful of chunks in it (maybe even just one). That means it has very little memory to work with.

> > Were there issues with the latest version?
>
> No, I could not get the latest version to issue the OOM error, but that
> was in a dev environment. I don't think I could get our old production
> version 1.4.4 to issue the OOM error either under dev. But I have a lot
> more testing to do. The timeouts on production are a much bigger concern
> at this time.

I haven't looked at your timeouts mail yet, sorry. The latest version should make those OOM errors less likely. You can also use slab rebalance to give more memory to the larger slab classes.
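[Editor's note: a quick way to act on the 'stats slabs' suggestion is to count chunks per slab class. The sample output below is made up for illustration (it is not from Bill's server); in real use you would feed the function the text returned by something like `echo "stats slabs" | nc localhost 11211`.]

```python
# Sketch: summarize chunk counts per slab class from `stats slabs` output.
# SAMPLE is illustrative data, not captured from a real server.

SAMPLE = """\
STAT 1:chunk_size 96
STAT 1:total_chunks 10922
STAT 1:used_chunks 10300
STAT 42:chunk_size 1048576
STAT 42:total_chunks 3
STAT 42:used_chunks 3
STAT active_slabs 2
END
"""

def chunks_per_class(stats_text):
    """Return {class_id: {stat_name: value}} from `stats slabs` text."""
    classes = {}
    for line in stats_text.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[0] != "STAT" or ":" not in parts[1]:
            continue  # skip global stats (e.g. active_slabs) and END
        class_id, stat_name = parts[1].split(":", 1)
        classes.setdefault(int(class_id), {})[stat_name] = int(parts[2])
    return classes

for cid, info in sorted(chunks_per_class(SAMPLE).items()):
    print(f"class {cid}: chunk_size={info['chunk_size']} "
          f"total_chunks={info['total_chunks']}")
```

In this made-up example the 1MB class holds only 3 chunks: with three large uploads in flight at once, a fourth set has nothing to reserve and nothing evictable at the tail, which is exactly the situation dormando describes.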
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 09/01/12 06:12, dormando wrote:

> Hey, could you please try to reproduce the issue with 1.4.11-beta1:
> http://code.google.com/p/memcached/wiki/ReleaseNotes1411beta1
>
> I've closed the logic issues and fixed a few other things besides. Would
> be very good to know if you're still able to bug it out. Thanks!
>
> -Dormando

Using 1.4.11-beta1 I can't reproduce the "out of memory storing object" error, thanks!

--Santi
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> On 09/01/12 06:12, dormando wrote:
>
> > Hey, could you please try to reproduce the issue with 1.4.11-beta1:
> > http://code.google.com/p/memcached/wiki/ReleaseNotes1411beta1
> >
> > I've closed the logic issues and fixed a few other things besides. Would
> > be very good to know if you're still able to bug it out. Thanks!
>
> Using 1.4.11-beta1 I can't reproduce the "out of memory storing object"
> error, thanks!

Awesome! Thanks for verifying so quickly. I'll try to wrap this up into an -rc1 today. Hoping some other people try it too :)
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> dormando, with a new script setting a random exptime I can reproduce the
> problem in a fresh memcached 1.4.10 (it doesn't happen with earlier
> versions): https://gist.github.com/1564556
>
> With the first evictions memcached starts reporting "SERVER_ERROR out of
> memory storing object". These are the -default- settings that I'm using:
>
> [snip]
>
> Making a diff with 1.4.9, it seems to be something related to
> do_item_alloc()..

Hey, could you please try to reproduce the issue with 1.4.11-beta1:
http://code.google.com/p/memcached/wiki/ReleaseNotes1411beta1

I've closed the logic issues and fixed a few other things besides. Would be very good to know if you're still able to bug it out. Thanks!

-Dormando
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 30/12/11 17:51, dormando wrote:

> Now that you've left it out for a while, can you try storing a few things
> again and snapshot the items/slabs stats? I'm curious to see if the
> tailrepairs counter goes up at all.

dormando, with a new script setting a random exptime I can reproduce the problem in a fresh memcached 1.4.10 (it doesn't happen with earlier versions): https://gist.github.com/1564556

With the first evictions memcached starts reporting "SERVER_ERROR out of memory storing object". These are the -default- settings that I'm using:

stats settings
STAT maxbytes 67108864
STAT maxconns 1024
STAT tcpport 11211
STAT udpport 11211
STAT inter NULL
STAT verbosity 0
STAT oldest 0
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 4
STAT num_threads_per_udp 4
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
STAT auth_enabled_sasl no
STAT item_size_max 1048576
STAT maxconns_fast no
STAT hashpower_init 0
END

Making a diff with 1.4.9, it seems to be something related to do_item_alloc()..

Regards,

--Santi
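[Editor's note: the gist itself is not inlined in the archive, but the core of such a reproduce script is just a set loop over the memcached text protocol with a randomized exptime. The sketch below is a hypothetical reconstruction, not the actual gist; the network send is commented out so the snippet stands alone.]

```python
import random

# Hypothetical sketch of the reproduce approach described above: build a
# text-protocol `set` command with a random nonzero exptime. (With
# exptime=0 the bug reportedly does not trigger.)

def build_set(key, value, max_exptime=600, flags=0):
    """Build a memcached text-protocol `set` command with a random exptime."""
    exptime = random.randint(1, max_exptime)
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

cmd = build_set("key:123", b"x" * 64)
print(cmd.split(b"\r\n")[0])  # e.g. b'set key:123 0 412 64'

# In the real script this would run in a loop against the server, e.g.:
#   import socket
#   s = socket.create_connection(("localhost", 11211))
#   for i in range(1000000):
#       s.sendall(build_set(f"key:{i}", b"x" * random.randint(1, 65536)))
#       assert s.recv(1024).startswith(b"STORED")
```

Once evictions begin, a server exhibiting the bug starts answering some of these sets with "SERVER_ERROR out of memory storing object" instead of "STORED".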
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 05/01/12 12:00, Santi Saez wrote:

> Making a diff with 1.4.9, it seems to be something related to
> do_item_alloc()..

More information:

- reverting the "remove the depth search from item_alloc" commit solves the problem: https://github.com/memcached/memcached/commit/ca5016c54111e062c771d20fcc4662400713c634
- using exptime=0 it doesn't happen.

Regards,

--Santi
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> On 05/01/12 12:00, Santi Saez wrote:
>
> > Making a diff with 1.4.9, it seems to be something related to
> > do_item_alloc()..
>
> More information:
>
> - reverting the "remove the depth search from item_alloc" commit solves
>   the problem:
>   https://github.com/memcached/memcached/commit/ca5016c54111e062c771d20fcc4662400713c634
> - using exptime=0 it doesn't happen.

I probably fixed this a few weeks ago in my branch, but I'm still wringing the bugs out of it. If you can hold on for a few days for the beta and test it when I post it, it should be better. Thanks!

-Dormando
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
On 30/12/11 17:51, dormando wrote:

> What client is this script written for, exactly? By 6 different servers
> you mean you're running 6 copies of that script from 6 places, or even
> more?

It's a Python script I wrote to try to reproduce the error, but we're getting out of memory errors from the PHP frontends. I run the same script from 6 different servers; using only one instance takes more time to reproduce the error, but it also happens.

> Any other startup options? (or just list `stats settings` as well)

stats settings
STAT maxbytes 1090519040
STAT maxconns 50
STAT tcpport 11211
STAT udpport 11211
STAT inter 0.0.0.0:11211,0.0.0.0:11212,0.0.0.0:11213,0.0.0.0:11214,0.0.0.0:11215,0.0.0.0:11216,0.0.0.0:11217,0.0.0.0:11218,0.0.0.0:11219,
STAT verbosity 0
STAT oldest 0
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 8
STAT num_threads_per_udp 1
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
STAT auth_enabled_sasl no
STAT item_size_max 1048576
STAT maxconns_fast no
STAT hashpower_init 0
END

> Now that you've left it out for a while, can you try storing a few things
> again and snapshot the items/slabs stats? I'm curious to see if the
> tailrepairs counter goes up at all.

tailrepairs is always = 0. We have detected the same behavior on another server running memcached 1.4.10 with the same configuration, and we've been running 1.4.2 for years without this problem on 100+ servers.

Regards,

--Santi
Re: SERVER_ERROR out of memory storing object with memcached 1.4.10
> Hello,
>
> After 3 weeks with memcached 1.4.10 in production, today we have started
> randomly getting this error: "SERVER_ERROR out of memory storing object"
>
> I can reproduce it with a simple set+get loop; this is the Python script
> that I have used (running the script from 6 different servers it takes
> ~20 minutes to occur again): http://pastie.org/3096017

What client is this script written for, exactly? By 6 different servers you mean you're running 6 copies of that script from 6 places, or even more?

> It runs with the standard libevent package from Debian (version 1.4.13),
> a custom kernel (Linux 2.6.32.41), multiport support (32 x -l flag), with
> -m 42000 (the servers have 48G) and -t 8 threads (same # of cores).

Any other startup options? (or just list `stats settings` as well)

> SLAB, item, and general stats are also available at: http://pastie.org/3096085
>
> The server is out of production and we haven't flushed it yet, so if you
> want we can test whatever you want! :)

Now that you've left it out for a while, can you try storing a few things again and snapshot the items/slabs stats? I'm curious to see if the tailrepairs counter goes up at all.