#11331: Memcached backend closes connection after every request
---------------------------------------+------------------------------------
          Reporter:  boo...@gmail.com  |          Owner:  nobody
            Status:  new               |      Milestone:
         Component:  Cache system      |        Version:  1.0
        Resolution:                    |       Keywords:
             Stage:  Unreviewed        |      Has_patch:  0
        Needs_docs:  0                 |    Needs_tests:  0
Needs_better_patch:  0                 |
---------------------------------------+------------------------------------

Comment (by boo...@gmail.com):
I've managed to trigger this on multiple systems so far. Since Python
really doesn't care where it runs, we develop on Windows and test in a VM
running Ubuntu Server (8.10 x64); deployment will be on Ubuntu 8.04 LTS
x64. We're using python-memcached.

I run memcached locally with -vv, so every event is printed to the
console. When I started using it, I already wondered about the
"<104 connection closed." / "<104 new client connection" spam, but I
didn't give it any thought because performance was already well past
"good enough".

The problem showed up when we added feeds: we made sure that a cache hit
means zero queries to the database. That, plus no template rendering (just
simplejson.dumps, pprint or xml.etree magic), gives huge throughput when
benchmarked in a loop - it probably costs about as much CPU as serving
"hello world" straight out of memcached. Example URL:
http://worldoflogs.com/feeds/guilds/7/raids/?t=plain (a sketch of the view
pattern is at the end of this comment).

I think the TIME_WAIT timeout on both the VM and Windows is one minute; it
takes about that long to recover from the out-of-sockets condition. When
netstat -an | grep TIME_WAIT | wc -l reaches around 10k, connections start
to fail, with increasing frequency the longer I let JMeter run. Oh, and
that netstat -an command sometimes returns 0 under Windows when there are
too many sockets open - I bet they never stress-tested that :P

Finally, no beefy hardware was involved in triggering this:

Local: manage.py runserver (/me hides), standard Core 2 Duo desktop, 1
request thread in JMeter.
VM: Apache 2.2 / mod_wsgi 2.5 / Python 2.5 / Django 1.0.2 in VirtualBox
with 1 CPU; 1 request thread, with JMeter running on the VM host.

In production we would, in theory, run out of sockets much quicker, with
2x quad-core Xeons (Core 2 generation) in each machine. In practice our
traffic isn't that high yet: during peak there were usually ~3k sockets
stuck in TIME_WAIT; that number is down by half now, with ~1.3k in
TIME_WAIT (I blame the lack of pconnect in the psycopg2 backend, plus port
80), 150 ESTABLISHED and 200 FIN_WAIT1/2. (Output from
netstat -an --tcp | awk '/tcp/ {print $6}' | sort | uniq -c)

Fun fact: during benchmarks the Python process, memcached and JMeter use
roughly equal amounts of resources - it's incredible how efficient the
Python side of the setup is; ~75 requests/second from a single thread
through that whole combination is just silly.

@tuning: preferably not - everything works fine without the
disconnect_all() call after every request, and I've seen a few horror
story threads about how that kind of TCP tuning messes up clients behind
NAT and breaks random things; I'd rather not find out about that in
production. (A sketch of a signal-level workaround is below.)
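For reference, the feed views follow roughly this pattern - a simplified
sketch, where the Raid model, the key format and the field names are made
up for illustration, not our real code:

    # Simplified sketch of a zero-query-on-hit feed view.
    # "Raid" and the key/field names are hypothetical.
    from django.core.cache import cache
    from django.http import HttpResponse
    from django.utils import simplejson

    from myapp.models import Raid  # hypothetical model

    def guild_raids_feed(request, guild_id):
        key = 'feeds:guild:%s:raids:plain' % guild_id
        body = cache.get(key)
        if body is None:
            # Cache miss: query once, serialize, store for 60 seconds.
            raids = list(Raid.objects.filter(guild__pk=guild_id)
                                     .values('id', 'name'))
            body = simplejson.dumps(raids)
            cache.set(key, body, 60)
        # Cache hit: zero database queries, zero template rendering.
        return HttpResponse(body, mimetype='text/plain')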
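As for where the per-request disconnect comes from, this is roughly the
relevant wiring - paraphrased from the Django 1.0 source, not a verbatim
copy:

    # Paraphrased from django/core/cache/backends/memcached.py:
    import memcache
    from django.core.cache.backends.base import BaseCache

    class CacheClass(BaseCache):
        def __init__(self, server, params):
            BaseCache.__init__(self, params)
            self._cache = memcache.Client(server.split(';'))

        def close(self, **kwargs):
            # The culprit: tears down every memcached connection at the
            # end of each request, so the next request reconnects and the
            # old socket sits in TIME_WAIT for ~60 seconds.
            self._cache.disconnect_all()

    # ...and django/core/cache/__init__.py hooks it up to run after every
    # request ("cache" being the backend instance built from
    # settings.CACHE_BACKEND):
    from django.core import signals
    if hasattr(cache, 'close'):
        signals.request_finished.connect(cache.close)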
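Until the backend itself is fixed, the workaround I have in mind is to
simply detach that cleanup hook at startup. An untested sketch, assuming
the signal was connected with the bound cache.close method as shown above:

    # Untested workaround sketch: keep memcached connections alive across
    # requests by detaching the per-request cleanup handler. Run this once
    # at startup (e.g. from an application's __init__).
    from django.core import signals
    from django.core.cache import cache

    if hasattr(cache, 'close'):
        signals.request_finished.disconnect(cache.close)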