On Nov 16, 5:15 pm, Graham Dumpleton <[email protected]>
wrote:

> > On the system that was exhibiting the problem, I moved the daemon
> > count from 4 to 8, and threads from 5 to 8, and bumped up the max
> > requests. This appears to have mitigated the problem, but as it
> > appears for now to be a bug either within mod_wsgi or apache I'm happy
> > to help any way I can in tracking it down.
>
> Are you in a position to be able to upgrade versions of Apache and mod_wsgi?
>
> Are you willing to run mod_wsgi from subversion trunk? The subversion
> version has additional logging to help gather more information about
> this issue.

Upgrading mod_wsgi I can do. However, since I made the aforementioned
changes, the web server cluster has run with 100% uptime for 5 days,
when previously we averaged one failure per day (across a 3 machine
farm). Whatever the problem is, having more daemon processes per group
seems to mitigate it.


> Can you try and get stack traces of stuck daemon process group using
> gdb script recipe right at end of:
>
>    http://code.google.com/p/modwsgi/wiki/DebuggingTechniques
>
> Please also post whether using prefork Apache MPM, plus mod_wsgi
> daemon process and related directives from mod_wsgi configuration.

We use prefork (haven't seen the need to switch) and run 43 vhosts, each
of which has the following configuration:

WSGIDaemonProcess <name> user=apache group=apache display-name=<name> processes=4 threads=5 maximum-requests=400

The vhosts then scriptalias the vhost root to a wsgi handler.
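
For anyone following along, a wsgi handler in this sense is just a
Python file exposing a callable that mod_wsgi looks up (by default
under the name "application"). The sketch below is a hypothetical
minimal handler, not our actual code; the response body and headers
are made up for illustration.

```python
# Hypothetical minimal WSGI handler; the real application behind each
# vhost is not shown in this thread.
def application(environ, start_response):
    # Build a small plain-text response body (bytes, as WSGI requires).
    body = b"Hello from the vhost root\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an iterable.
    start_response("200 OK", headers)
    return [body]
```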

We isolated the failures to one specific vhost which has higher
traffic than the others, and it's been changed to processes=8,
threads=8, maximum-requests=1024 instead of processes=4, threads=5,
maximum-requests=400.
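
To put rough numbers on that change: the three values are processes,
threads and maximum-requests. In daemon mode each process serves up to
"threads" requests at once, and maximum-requests is counted per
process. The sketch below is only back-of-envelope arithmetic; the
helper names are hypothetical and it assumes every thread can be busy
at the same time.

```python
# Back-of-envelope arithmetic only; helper names are hypothetical.
def group_capacity(processes: int, threads: int) -> int:
    """Upper bound on requests one daemon process group handles at once."""
    return processes * threads


def requests_per_recycle_cycle(processes: int, maximum_requests: int) -> int:
    """Requests served before every process in the group has been recycled once,
    assuming load is spread evenly (maximum-requests is a per-process limit)."""
    return processes * maximum_requests


old_capacity = group_capacity(processes=4, threads=5)
new_capacity = group_capacity(processes=8, threads=8)
print(old_capacity, new_capacity)  # 20 64

old_cycle = requests_per_recycle_cycle(processes=4, maximum_requests=400)
new_cycle = requests_per_recycle_cycle(processes=8, maximum_requests=1024)
print(old_cycle, new_cycle)  # 1600 8192
```

So the busy vhost went from at most 20 to at most 64 concurrent
requests, and its processes restart far less often.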

> I worked with that OP a fair bit off list to help sort this out, but
> put it aside at the moment as busy in last few weeks of a job before I
> leave. They weren't able to move to latest mod_wsgi source so can get
> debug output from extra code I added.

We have our web servers behind a layer 7 switch, and we've already
built the latest mod_wsgi. I'll keep in touch about the failures, and
once it trips again I'll upgrade mod_wsgi to get you the logging you
need.

A stack trace, on the other hand, is harder. I'd really need to bring an
additional machine into the farm for that, as the chance of a node
failure would seem to be larger. I can probably spare the staff
resources to make that happen over the Christmas period should this
issue remain unresolved by then.

Cheers,

Matt
