#11331: Memcached backend closes connection after every request
---------------------------------------+------------------------------------
          Reporter:  boo...@gmail.com  |         Owner:  nobody
            Status:  new               |     Milestone:        
         Component:  Cache system      |       Version:  1.0   
        Resolution:                    |      Keywords:        
             Stage:  Unreviewed        |     Has_patch:  0     
        Needs_docs:  0                 |   Needs_tests:  0     
Needs_better_patch:  0                 |  
---------------------------------------+------------------------------------
Comment (by boo...@gmail.com):

 I've managed to trigger it on multiple systems so far. Since Python
 really doesn't care where it runs, we develop on Windows and test in a
 VM running Ubuntu Server (8.10 x64); deployment will be on Ubuntu 8.04
 LTS x64. We're using python-memcached.

 I run memcached locally with -vv, so every event is printed to the
 console. When I started using it, I was already wondering about the
 "<104 connection closed." / "<104 new client connection" spam, but I
 didn't give it any thought because it was already much faster than
 "good enough".

 The problem showed up when we added feeds: we made sure that a cache
 hit meant zero queries to the database. That, plus no template
 rendering (just simplejson.dumps, pprint or xml.etree magic), means you
 get huge throughput when benchmarking in a loop -- it probably costs
 about as much CPU as fetching "hello world" from memcached. Example
 URL:
 http://worldoflogs.com/feeds/guilds/7/raids/?t=plain
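
 The pattern is basically this (a simplified sketch, not our actual
 code; the app, model and field names below are made up):

    from django.core.cache import cache
    from django.http import HttpResponse
    from django.utils import simplejson

    from wol.raids.models import Raid  # hypothetical app/model

    def guild_raid_feed(request, guild_id):
        key = 'feeds:guild:%s:raids' % guild_id
        payload = cache.get(key)
        if payload is None:
            # Cache miss: one trip to the database, then store the
            # serialized result so the next request never touches the DB
            # or the template layer.
            raids = list(Raid.objects.filter(guild=guild_id)
                                     .values('id', 'name'))
            payload = simplejson.dumps(raids)
            cache.set(key, payload, 60 * 5)
        return HttpResponse(payload, mimetype='application/json')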

 I think the TIME_WAIT timeout on both the VM and Windows is about a
 minute; it takes roughly that long to recover from the out-of-sockets
 condition. Since the backend opens and closes a memcached connection on
 every request, each request leaves another socket behind in TIME_WAIT.
 When netstat -an | grep TIME_WAIT | wc -l reaches around 10k,
 connections start to fail, and the chance of failure grows the longer I
 let JMeter run. Oh, and that netstat -an command sometimes returns 0
 when there are too many sockets open under Windows -- I bet they never
 stress tested that :P
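
 Something like this counts sockets per state from netstat output on
 both platforms (just a rough sketch -- the column layout differs a bit
 between Windows and Linux, but the state is the last column on the TCP
 lines of both):

    import subprocess
    from collections import defaultdict

    def socket_states():
        """Count TCP sockets per state by parsing `netstat -an` output."""
        output = subprocess.Popen(['netstat', '-an'],
                                  stdout=subprocess.PIPE).communicate()[0]
        counts = defaultdict(int)
        for line in output.splitlines():
            parts = line.split()
            # TCP lines end with the connection state on both platforms.
            if parts and parts[0].lower().startswith('tcp'):
                counts[parts[-1]] += 1
        return dict(counts)

    if __name__ == '__main__':
        print socket_states()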


 Finally, there was no beefy hardware involved in triggering this:

 Local: manage.py runserver (/me hides), a standard Core 2 Duo desktop,
 1 request thread in JMeter.
 VM: Apache 2.2 / mod_wsgi 2.5 / Python 2.5 / Django 1.0.2, VirtualBox,
 1 CPU, 1 request thread, JMeter running on the VM host.

 In production we would probably run out of sockets much quicker, with
 2x quad-core Xeons (Core 2 generation) in each machine -- in theory. In
 practice our traffic isn't that high yet: at peak there were usually
 around 3k sockets stuck in TIME_WAIT. That number has since been cut
 roughly in half, to ~1.3k in TIME_WAIT (I blame the lack of persistent
 connections in the psycopg2 backend, plus port 80), 150 ESTABLISHED and
 200 FIN_WAIT1/2 (output from netstat -an --tcp | awk '/tcp/ {print $6}'
 | sort | uniq -c).

 Fun fact: during benchmarks, the Python process, memcached and JMeter
 use roughly equal amounts of resources -- it's incredible how efficient
 the Python side of the setup is; 75 requests from a single thread for
 the whole combination is just silly.


 @tuning: preferably not -- it works fine without the disconnect_all()
 call after every request. I've seen a few horror-story threads about
 how it messes up clients behind NAT and breaks random things, and I'd
 rather not find out about that in production.
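
 For reference, as far as I can tell the memcached backend's close()
 just calls self._cache.disconnect_all() and the cache framework hooks
 it up to the request_finished signal, which is why every request
 reconnects. A hypothetical, untested way to unhook that from project
 code (not a proper fix, just to illustrate where the teardown lives):

    # Untested sketch: unhook the end-of-request cache teardown so the
    # memcached connections persist between requests. Would need to run
    # once at startup.
    from django.core import signals
    from django.core.cache import cache

    if hasattr(cache, 'close'):
        signals.request_finished.disconnect(cache.close)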

-- 
Ticket URL: <http://code.djangoproject.com/ticket/11331#comment:2>