Thanks Graham. We'll try it with flock and get you a backtrace if it happens again.
On Oct 19, 4:49 pm, Graham Dumpleton <[email protected]> wrote:
> On 20 October 2010 10:39, Patrick Michael Kane <[email protected]> wrote:
> > Hey Graham:
> >
> > Replies inline:
> >
> > On Tue, Oct 19, 2010 at 4:21 PM, Graham Dumpleton <[email protected]> wrote:
> >
> >> On 20 October 2010 00:54, Patrick Michael Kane <[email protected]> wrote:
> >> > Hey all:
> >> >
> >> > We're running a Django site using Apache+mod_wsgi. The stack is
> >> > pretty typical: Apache 2.2.14, mod_wsgi 3.3, Python 2.6.1. We're
> >> > running single-threaded using DaemonProcess (WSGIDaemonProcess staging
> >> > threads=1 processes=8 maximum-requests=10 python-path=[path]
> >> > display-name=%{GROUP}).
> >> >
> >> > We're seeing an issue where the Apache server will stop responding to
> >> > all requests (even non-WSGI ones) after a fairly large number of
> >> > requests. When we strace the wsgi processes, they are all in poll(),
> >> > waiting to be talked to:
> >> >
> >> > poll([{fd=3, events=POLLIN}], 1, -1...
> >> >
> >> > When we look at the httpds, they are blocking on connect to the
> >> > mod_wsgi unix socket:
> >> >
> >> > connect(69, {sa_family=AF_FILE, path="/home/actionkit/releases/stable/apache/logs/.1873.28.7.sock"}, 110
> >> >
> >> > A restart fixes it and then the problem goes away for a few days to a
> >> > week. We're seeing the same behavior across 3 identical webservers.
> >> >
> >> > The problem happens once every few days.
> >> >
> >> > If folks have any ideas on the cause, I'm all ears. Alternatively, if
> >> > there's additional steps we can take to debug that would be helpful,
> >> > let me know.
> >>
> >> Can you provide the output of running -V option on Apache. Eg:
> >>
> >> /usr/sbin/httpd -V
> >>
> >> Want to verify what MPM you are using and some of the compilation options.
> >
> > Server version: Apache/2.2.3
> > Server built: Jan 21 2009 22:00:55
> > Server's Module Magic Number: 20051115:3
> > Server loaded: APR 1.2.7, APR-Util 1.2.7
> > Compiled using: APR 1.2.7, APR-Util 1.2.7
> > Architecture: 64-bit
> > Server MPM: Prefork
> > threaded: no
> > forked: yes (variable process count)
> > Server compiled with....
> > -D APACHE_MPM_DIR="server/mpm/prefork"
> > -D APR_HAS_SENDFILE
> > -D APR_HAS_MMAP
> > -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
> > -D APR_USE_SYSVSEM_SERIALIZE
> > -D APR_USE_PTHREAD_SERIALIZE
>
> The above two definitions mean that when 'default' accept mutex type
> is used that 'sysvsem' is used. This is the one I am worried about.
>
> In your Apache configuration set:
>
> WSGIAcceptMutex flock
>
> and see how things go.
>
> I'll be updating mod_wsgi code in trunk to deal with problem I saw and
> add extra debugging to try and highlight when problem occurs and work
> out why it might be happening.
>
> > -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
> > -D APR_HAS_OTHER_CHILD
> > -D AP_HAVE_RELIABLE_PIPED_LOGS
> > -D DYNAMIC_MODULE_LIMIT=128
> > -D HTTPD_ROOT="/etc/httpd"
> > -D SUEXEC_BIN="/usr/sbin/suexec"
> > -D DEFAULT_PIDLOG="logs/httpd.pid"
> > -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
> > -D DEFAULT_LOCKFILE="logs/accept.lock"
> > -D DEFAULT_ERRORLOG="logs/error_log"
> > -D AP_TYPES_CONFIG_FILE="conf/mime.types"
> > -D SERVER_CONFIG_FILE="conf/httpd.conf"
> >
> >> As to strace on WSGI processes, really need a gdb stack trace on all
> >> threads if you can manage to get one.
> >>
> >> I need to see if the actual request handler threads have exited. My
> >> latest theory is that they might have exited. The poll() may be the
> >> main thread which is just waiting for shutdown indicator to be sent
> >> via a socketpair from signal handler.
> >
> > I can get a backtrace, but we're not compiled with -g in our production
> > environment. Is a non-debugging backtrace going to be useful?
>
> So long as not -O compiled, should still show functions.
>
> As documented in:
>
> http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_C...
>
> use:
>
> thread apply all bt
>
> > It would be possible for us to compile mod_wsgi with -g, if that were
> > helpful, but redeploying our Apache with debugging would be a tall order.
> >
> > Since we're running single-threaded, is a single process ok?
>
> If you mean single daemon process, probably not as means requests are
> sequential and can't handle concurrent requests, which depending on
> application and amount of traffic may be bad.
>
> >> Can you also add to your Apache configuration files at global scope:
> >>
> >> WSGIAcceptMutex xxx
> >
> > Accept mutex lock mechanism 'xxx' is invalid. Valid accept mutex mechanisms
> > for this platform are: default, flock, fcntl, sysvsem, pthread.
> >
> >> Do the same thing for the directive:
> >>
> >> AcceptMutex xxx
> >
> > xxx is an invalid mutex mechanism; Valid accept mutexes for this platform
> > and MPM are: default, flock, fcntl, sysvsem, pthread.
> >
> >> BTW, any reason you are running maximum-requests so low? One of the
> >> observations in the past is that the problem is made worse when WSGI
> >> processes are being restarted a lot as would be case for small number
> >> of maximum requests.
> >
> > Memory leaks.
>
> Hmmm, pretty severe leaks if has to be that low.
>
> Do you know what is causing it and whether affects all URLs or only
> code used by some URLs?
>
> What one could do is create multiple daemon process groups and
> delegate just the subset of URLs which exhibit memory creep/leaks to
> daemon process group of their own and set maximum requests low on
> that, but for everything else in original daemon process group, eliminate
> maximum requests and allow process to stay resident all the time.
> Would give better overall performance as not restarting processes and
> having to reload application all the time.
>
> Graham
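
For reference, the multiple daemon process group suggestion quoted above might look roughly like the following. This is only a sketch under assumptions, not a tested configuration: the group name "staging-leaky", the URL prefix /reports and processes=2 are made up for illustration, and [path] is left elided as in the original setup. Only the group serving the leaky URLs keeps a low maximum-requests; the main group drops it so its processes stay resident.

```apache
# Sketch only: "staging-leaky", /reports and processes=2 are hypothetical;
# [path] is left elided as in the original configuration.

# Main group: no maximum-requests, so processes stay resident.
WSGIDaemonProcess staging threads=1 processes=8 python-path=[path] display-name=%{GROUP}

# Separate group for the leaky URLs only, recycled after a few requests.
WSGIDaemonProcess staging-leaky threads=1 processes=2 maximum-requests=10 python-path=[path] display-name=%{GROUP}

# Default delegation for the application.
WSGIProcessGroup staging

# Delegate just the leaky subset of URLs to the second group.
<Location /reports>
WSGIProcessGroup staging-leaky
</Location>
```

Which URLs belong under the second group depends on where the leak actually is, which the thread has not yet established.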
