On Fri, Mar 13, 2009 at 02:19, meppum <mmep...@gmail.com> wrote:
>
> I was load testing memcached and have been experiencing consistent
> crashing when approaching 11k total connections (not concurrent).
>

Load testing of course has its uses, but is your scenario even remotely likely to happen in your live system? What your script is actually testing is how fast you can recycle sockets, but if you use a client that supports connection pooling, this is never going to become an issue.
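To make the pooling point concrete, here is a minimal sketch of client-side connection pooling. It is an illustration of the technique, not a real memcached client: `PooledConnection` is a hypothetical stand-in, and the wire protocol is omitted. The key property is that a fixed set of connections is handed out and returned, so sockets are reused rather than opened and torn down per request.

```python
# A minimal connection-pool sketch (illustration only; PooledConnection
# is a hypothetical stand-in for a real memcached socket connection).
import queue
from contextlib import contextmanager


class PooledConnection:
    """Stand-in for a real socket connection to a memcached server."""
    def __init__(self, server):
        self.server = server  # e.g. ("127.0.0.1", 11211)


class ConnectionPool:
    """Hands out a fixed set of connections instead of opening a new
    socket per request, so total_connections on the server stays small."""
    def __init__(self, server, size=24):
        self._slots = queue.Queue()
        for _ in range(size):
            self._slots.put(PooledConnection(server))

    @contextmanager
    def reserve(self, timeout=5):
        # Block until a connection is free instead of opening a new one.
        conn = self._slots.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._slots.put(conn)  # return the connection, never close it


pool = ConnectionPool(("127.0.0.1", 11211), size=4)
with pool.reserve() as conn:
    pass  # issue gets/sets on conn here
```

With this pattern the server sees at most `size` connections from this client no matter how many requests are made, which is exactly why the socket-recycling limit in the test never comes into play.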
Here are some stats from one of our cache servers:

uptime 449124
time 1237196926
version 1.2.5
curr_items 168078
curr_connections 24
total_connections 334
connection_structures 38
cmd_get 642209176
cmd_set 6052589
get_hits 499909822
get_misses 142299354

It's getting 1400 gets per second on average. Many of those are multi-gets, so the actual number of requests per second is maybe a tenth of that; say it gets somewhere between 100-150 req/s. But because all clients use connection pooling, the number of concurrent connections is only 24, and in the five days it's been up it has gone through a total of 334 connections. Those exist because the connection pools vary in size according to load, so there is some recycling going on.

I cannot imagine what kind of load we would need to put on our systems to run into the problem you are testing for, but I'm sure that we'd encounter many other problems before we reached that point.

You didn't say much about your application, but doesn't the Python client support connection pooling? Why don't you make sure you use that, instead of lowering some TCP timeout values on your servers so that they can recycle sockets faster? It seems to me that you are looking for a complex solution to a non-problem.

/Henrik
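The rates quoted above follow directly from the counters in the stats dump; a quick sanity check:

```python
# Deriving the quoted rates from the server's own counters (values
# taken from the stats dump above).
uptime = 449124              # seconds the server has been running
cmd_get = 642209176          # total get commands served
total_connections = 334      # connections accepted over the whole uptime

gets_per_sec = cmd_get / uptime                   # ~1430, the "1400 gets/s"
days_up = uptime / 86400                          # ~5.2 days
new_conns_per_day = total_connections / days_up   # ~64: pool resizing only

print(round(gets_per_sec), round(days_up, 1), round(new_conns_per_day))
```

That is roughly 64 new connections per day across all clients, versus the thousands per minute a no-pooling load test generates.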