Re: Memcached crashing under load
On Fri, Mar 13, 2009 at 02:19, meppum mmep...@gmail.com wrote:

I was load testing memcached and have been experiencing consistent crashing when approaching 11k total connections (not concurrent).

Load testing of course has its uses, but is your scenario even remotely likely to happen in your live system? What your script is actually testing is how fast you can recycle sockets, but if you use a client that supports connection pooling, this is never going to become an issue.

Here are some stats from one of our cache servers:

    uptime 449124
    time 1237196926
    version 1.2.5
    curr_items 168078
    curr_connections 24
    total_connections 334
    connection_structures 38
    cmd_get 642209176
    cmd_set 6052589
    get_hits 499909822
    get_misses 142299354

It's getting 1400 gets per second on average. Many of those are multi-gets, so the actual number of requests per second is maybe a tenth of that, say somewhere between 100 and 150 req/s. Because all clients use connection pooling, the number of concurrent connections is only 24, and in the five days it's been up it has gone through a total of 334 connections. Those are there because the connection pools vary in size according to load, so there's some recycling going on.

I cannot imagine what kind of load we would need to put on our systems to run into the problem you are testing for, but I'm sure that we'd encounter many, many other problems before we reached that point.

You didn't say much about your application, but doesn't the Python client support connection pooling? Why don't you make sure you use that, instead of lowering some TCP timeout values on your servers so that they can recycle sockets faster?

It seems to me that you are looking for a complex solution to a non-problem.

/Henrik
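To make the pooling point above concrete, here is a minimal sketch (Python 3, standard library only) that compares opening a new connection per request with reusing one connection. Everything in it is an assumption made for illustration: `CountingServer`, `reconnect_per_request` and `reuse_one_connection` are hypothetical names, and the server is a toy echo server standing in for memcached, with its `total_connections` counter playing the role of the stat of the same name.

```python
import socket
import threading


class CountingServer:
    """Toy echo server that counts accepted connections (stand-in for memcached)."""

    def __init__(self):
        self.total_connections = 0
        self._listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        self._listener.listen(16)
        self.port = self._listener.getsockname()[1]
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Handle one client at a time; enough for a sequential test client.
        while True:
            conn, _ = self._listener.accept()
            self.total_connections += 1
            with conn:
                try:
                    while True:
                        data = conn.recv(1024)
                        if not data:
                            break
                        conn.sendall(data)
                except OSError:
                    pass  # client went away mid-request; move on to the next one


def request(sock, payload):
    """Send payload and read back the full echo."""
    sock.sendall(payload)
    received = b""
    while len(received) < len(payload):
        chunk = sock.recv(1024)
        if not chunk:
            raise ConnectionError("server closed the connection")
        received += chunk
    return received


def reconnect_per_request(port, n):
    """What the test script does: a fresh TCP connection for every request."""
    for _ in range(n):
        with socket.create_connection(("127.0.0.1", port)) as sock:
            request(sock, b"get abc\r\n")


def reuse_one_connection(port, n):
    """What a pooled client does: many requests over one connection."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        for _ in range(n):
            request(sock, b"get abc\r\n")


if __name__ == "__main__":
    a, b = CountingServer(), CountingServer()
    reconnect_per_request(a.port, 25)
    reuse_one_connection(b.port, 25)
    print("reconnect per request:", a.total_connections, "connections")
    print("reuse one connection: ", b.total_connections, "connection")
```

The same number of requests costs one socket in the second case and one socket per request in the first, which is why a pooled deployment can show a very low total_connections even under steady traffic.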
Re: Memcached crashing under load
As a simple test, try raising memcached's listen backlog above the default (1024).

http://linux.die.net/man/2/listen

The backlog parameter defines the maximum length the queue of pending connections may grow to. If a connection request arrives with the queue full the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that retries succeed.

--
Chris Goffinet
MyBlogLog Senior Performance Engineer
Yahoo!
San Francisco, CA
United States

On Mar 15, 2009, at 5:51 PM, meppum wrote:

I've created a Python script that should trigger this error on any machine running python-memcached or cmemcache. Hopefully this will clear things up. It looks like neither the client nor the daemon crashes, but the daemon cannot respond in time, so an error is thrown. I find this odd because the curr_connections stat never exceeds the number of possible connections. If anyone can shed some light on this I'd appreciate it.

This is how I started memcached:

    memcached -m 8 -c 1024 -v -l 127.0.0.1 -d

The script:

    # Prefer cmemcache, fall back to python-memcached.
    try:
        import cmemcache as memcache
    except ImportError:
        import memcache

    c = memcache.Client(['127.0.0.1:11211'])
    c.set('abc', '123')
    c.disconnect_all()

    for i in range(2):
        if i % 1000 == 0:
            print "iteration: %s" % i
        # Open a new connection for every request, then close it again.
        c = memcache.Client(['127.0.0.1:11211'])
        c.get('abc')
        c.disconnect_all()

On Mar 13, 7:32 am, meppum mmep...@gmail.com wrote:

That's a good question. To be clear, it's not a hard crash; it's just that one or the other starts throwing errors about connection timeouts when there should be more than enough connections available. How can I tell if it's the client or the server throwing the errors?

On Mar 12, 9:33 pm, Dustin dsalli...@gmail.com wrote:

For clarity -- are you saying the server is crashing, or the client?

On Mar 12, 6:19 pm, meppum mmep...@gmail.com wrote:

I was load testing memcached and have been experiencing consistent crashing when approaching 11k total connections (not concurrent). Below is a sample of some Python code I have developed to isolate this problem, as well as my setup and the error I get. I searched Google and couldn't seem to find an answer.

Python Code:

    import cmemcache

    c = cmemcache.Client(['127.0.0.1:11211'])
    c.set('abc', '123')
    c.disconnect_all()

    for i in range(2):
        # Open a new connection for every request, then close it again.
        c = cmemcache.Client(['127.0.0.1:11211'])
        c.get('abc')
        c.disconnect_all()

Error:

    [w...@1236906533.149172] mcm_server_connect_next_avail():2338
    [not...@1236906533.149172] mcm_server_connect_next_avail():2328
    [w...@1236906537.889442] mcm_server_writable():3178: timeout: Operation now in progress: write select(2) call timed out
    [w...@1236906537.889442] mcm_server_connect():2295: select(2) failed: Operation now in progress: select(2) timed out on establishing connection connect(): -1
    [not...@1236906537.889442] mcm_server_connect():2302: Operation already in progress
    [not...@1236906537.889442] mcm_server_connect_next_avail():2333: Operation already in progress
    [w...@1236906537.889442] mcm_server_connect_next_avail():2338
    [not...@1236906537.889442] mcm_server_connect_next_avail():2328

Setup:

- Ubuntu Intrepid
- Libmemcached 1.4.0.rc2-1
- Cmemcache 0.95
- Memcached 1.2.6
- Python 2.5

-meppum
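One caveat on the backlog suggestion at the top of this message: on Linux, listen(2) silently truncates the requested backlog to the value in /proc/sys/net/core/somaxconn, so raising the application's backlog alone may have no effect. The sketch below (Python 3) shows that relationship; `effective_backlog` and `read_somaxconn` are hypothetical helper names invented for this illustration, not part of memcached or any client library.

```python
def effective_backlog(requested, somaxconn):
    """Backlog the kernel will actually use: the request capped at somaxconn."""
    return min(requested, somaxconn)


def read_somaxconn(path="/proc/sys/net/core/somaxconn"):
    """Return the current somaxconn value, or None if it cannot be read
    (for example on a non-Linux system)."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    limit = read_somaxconn()
    if limit is None:
        print("somaxconn is not available on this system")
    else:
        # 1024 is the memcached default backlog mentioned in the thread.
        print("requested 1024 -> effective", effective_backlog(1024, limit))
```

If the effective value comes out lower than the requested one, the kernel limit would need to be raised as well before a larger application backlog can make a difference.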
Re: Memcached crashing under load
On Sun, Mar 15, 2009 at 8:49 PM, meppum mmep...@gmail.com wrote:

The script:

    # Prefer cmemcache, fall back to python-memcached.
    try:
        import cmemcache as memcache
    except ImportError:
        import memcache

    c = memcache.Client(['127.0.0.1:11211'])
    c.set('abc', '123')
    c.disconnect_all()

    for i in range(2):
        if i % 1000 == 0:
            print "iteration: %s" % i
        # Open a new connection for every request, then close it again.
        c = memcache.Client(['127.0.0.1:11211'])
        c.get('abc')
        c.disconnect_all()

This script will not run multiple memcached requests in parallel. Is that what you were going for?

--
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek
Re: Memcached crashing under load
When I ran your script against the daemon, I noticed some timeouts. I adjusted these sysctls and was then able to run the script over and over without seeing any timeouts:

    sudo /sbin/sysctl -w net.ipv4.tcp_tw_recycle=1
    sudo /sbin/sysctl -w net.ipv4.tcp_tw_reuse=1
    sudo /sbin/sysctl -w net.ipv4.tcp_fin_timeout=10

On your system, what happens when you do the same? Do you see any improvement?

--
Chris Goffinet
MyBlogLog Senior Performance Engineer
Yahoo!
San Francisco, CA
United States

On Mar 15, 2009, at 6:39 PM, meppum wrote:

Yes, I went for the simplest script that caused the error on my machine.

On Mar 15, 9:29 pm, David Stanek dsta...@dstanek.com wrote:

On Sun, Mar 15, 2009 at 8:49 PM, meppum mmep...@gmail.com wrote:

The script:

    # Prefer cmemcache, fall back to python-memcached.
    try:
        import cmemcache as memcache
    except ImportError:
        import memcache

    c = memcache.Client(['127.0.0.1:11211'])
    c.set('abc', '123')
    c.disconnect_all()

    for i in range(2):
        if i % 1000 == 0:
            print "iteration: %s" % i
        # Open a new connection for every request, then close it again.
        c = memcache.Client(['127.0.0.1:11211'])
        c.get('abc')
        c.disconnect_all()

This script will not run multiple memcached requests in parallel. Is that what you were going for?

--
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek
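The sysctls at the top of this message all affect how quickly closed sockets are recycled. One way to check whether lingering sockets are actually piling up during the test is to count connections in the TIME_WAIT state, which /proc/net/tcp reports as state code 06. The sketch below (Python 3) does that for text in the /proc/net/tcp layout; `count_time_wait` is a hypothetical helper and the sample rows are made up for illustration, not taken from the original poster's machine. Note that net.ipv4.tcp_tw_recycle was removed in Linux 4.12, so that particular setting only exists on older kernels.

```python
TIME_WAIT = "06"  # state code for TIME_WAIT in /proc/net/tcp


def count_time_wait(proc_net_tcp_text, remote_port):
    """Count TIME_WAIT sockets whose remote port matches remote_port.

    Expects text in the /proc/net/tcp layout: a header line, then rows of
    "sl local_address rem_address st ...", with ports in hexadecimal.
    """
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        remote_address, state = fields[2], fields[3]
        port = int(remote_address.rsplit(":", 1)[1], 16)
        if state == TIME_WAIT and port == remote_port:
            count += 1
    return count


# Made-up sample: one listener on port 11211 (0x2BCB), two client sockets
# in TIME_WAIT and one still established.
SAMPLE = """\
  sl  local_address rem_address   st
   0: 0100007F:2BCB 00000000:0000 0A
   1: 0100007F:C350 0100007F:2BCB 06
   2: 0100007F:C351 0100007F:2BCB 06
   3: 0100007F:C352 0100007F:2BCB 01
"""

if __name__ == "__main__":
    print("sample TIME_WAIT count:", count_time_wait(SAMPLE, 11211))
    try:
        with open("/proc/net/tcp") as f:
            print("live TIME_WAIT count:", count_time_wait(f.read(), 11211))
    except OSError:
        print("/proc/net/tcp is not available on this system")
```

Running it before and after changing the sysctls, while the test script loops, would show whether the count of lingering sockets really drops.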
Re: Memcached crashing under load
For clarity -- are you saying the server is crashing, or the client?

On Mar 12, 6:19 pm, meppum mmep...@gmail.com wrote:

I was load testing memcached and have been experiencing consistent crashing when approaching 11k total connections (not concurrent). Below is a sample of some Python code I have developed to isolate this problem, as well as my setup and the error I get. I searched Google and couldn't seem to find an answer.

Python Code:

    import cmemcache

    c = cmemcache.Client(['127.0.0.1:11211'])
    c.set('abc', '123')
    c.disconnect_all()

    for i in range(2):
        # Open a new connection for every request, then close it again.
        c = cmemcache.Client(['127.0.0.1:11211'])
        c.get('abc')
        c.disconnect_all()

Error:

    [w...@1236906533.149172] mcm_server_connect_next_avail():2338
    [not...@1236906533.149172] mcm_server_connect_next_avail():2328
    [w...@1236906537.889442] mcm_server_writable():3178: timeout: Operation now in progress: write select(2) call timed out
    [w...@1236906537.889442] mcm_server_connect():2295: select(2) failed: Operation now in progress: select(2) timed out on establishing connection connect(): -1
    [not...@1236906537.889442] mcm_server_connect():2302: Operation already in progress
    [not...@1236906537.889442] mcm_server_connect_next_avail():2333: Operation already in progress
    [w...@1236906537.889442] mcm_server_connect_next_avail():2338
    [not...@1236906537.889442] mcm_server_connect_next_avail():2328

Setup:

- Ubuntu Intrepid
- Libmemcached 1.4.0.rc2-1
- Cmemcache 0.95
- Memcached 1.2.6
- Python 2.5

-meppum