One thing you can do to try to work out what the process is doing
before you kill it off is to attach gdb to it and get a stack trace
of all active threads. If the issue is in a C extension module, that
may show which module is involved and where. See:

http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Debugging_Crashes_With_GDB

Another thing you can do is force a stack trace showing where in the
Python code the threads are running. As explained in:

https://mail.google.com/mail/#search/stack+trace+python/12b114f3ec8f295b

you can start a background thread which monitors for the existence of
a specific file; if that file exists, it generates the stack trace and
writes it to a file. So, when the issue occurs, you would touch the
monitored file to trigger it.

The actual generation of a Python stack trace for all threads is
detailed in the stacktraces() function in:

http://groups.google.com/group/modwsgi/browse_frm/thread/b0fced31f191df59

In that case the generation was triggered from a URL request, but if
the process is not responding to new requests because of the issue, or
you have more than one process and so don't know which will get the
request, you need to trigger it from a background thread instead.
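Put together, the background monitor thread plus the stack dump might
look something like the rough sketch below. This is not the exact code
from the linked post: the marker and output file names are made up for
illustration, and you would normally start the thread from your WSGI
script file. It relies on sys._current_frames(), which is available
from Python 2.5 onwards.

```python
import os
import sys
import threading
import time
import traceback

# Paths are made up for illustration; pick ones the daemon process
# user can write to.
MARKER = '/tmp/dump-python-stacks'
OUTPUT = '/tmp/python-stacks-%d.txt' % os.getpid()

def dump_stacks(output=OUTPUT):
    # sys._current_frames() maps each thread id to the frame that
    # thread is currently executing, for every thread in the process.
    fp = open(output, 'a')
    try:
        fp.write('===== %s =====\n' % time.asctime())
        for thread_id, frame in sys._current_frames().items():
            fp.write('Thread %d:\n' % thread_id)
            traceback.print_stack(frame, file=fp)
            fp.write('\n')
    finally:
        fp.close()

def monitor(interval=1.0):
    # Poll for the marker file; touching it triggers a stack dump.
    while True:
        if os.path.exists(MARKER):
            os.remove(MARKER)
            dump_stacks()
        time.sleep(interval)

# Start the monitor when the WSGI script file is first imported.
_thread = threading.Thread(target=monitor)
_thread.daemon = True
_thread.start()
```

With something like that in place, `touch /tmp/dump-python-stacks`
against the misbehaving process would append a snapshot of every
thread's Python stack to the output file.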

Graham

On 8 December 2010 20:59, Alessandro Pasotti <[email protected]> wrote:
> 2010/12/7 Graham Dumpleton <[email protected]>
>>
>> On 7 December 2010 18:59, elpaso <[email protected]> wrote:
>> > On 1 Dic, 11:55, Graham Dumpleton <[email protected]> wrote:
>> >> On 1 December 2010 21:45, Alessandro Pasotti <[email protected]>
>> >> wrote:
>> >>
>> >
>> >> > I thought I couldn't use worker MPM because the C/C++ libs used in
>> >> > GeoDjango are not thread safe and worker MPM uses threads, but if
>> >> > WSGI directives have full control I guess setting threads=1 will
>> >> > suffice.
>> >>
>> >> You aren't running GeoDjango in the Apache child processes, so it
>> >> doesn't matter that they are multithreaded with worker MPM, as they
>> >> only proxy requests. The daemon mode processes that GeoDjango runs in
>> >> are configured separately, so for them you can specify multiple
>> >> processes with a single thread each.
>> >>
>> >> Graham
>> >
>> > Thank you for the clarifications,
>> >
>> > I followed your advice and switched to worker MPM, but the problem
>> > became even worse and the server went down with out of memory errors
>> > last night.
>>
>> Have you determined whether the issue is an accumulation of a lot of
>> processes vs specific active processes using up lots of memory?
>
>
> Yes, the second:
> * normal process memory is about 300MB VIRT and less than 100MB RES, 4-5%
> memory
> * about once a day, one (sometimes two) processes start eating CPU (kernel)
> and grow to 1GB VIRT memory and 600MB RES; % memory goes to 40%.
> * after a few hours the same processes slowly go back down to normal %
> memory (4-5) and RES goes down to normal, but VIRT remains over 1GB
> the number of WSGI processes as seen by top remains 10.
>
>>
>> Have you done any checks on how many open files the processes in the
>> application have, to ensure they are cleaning up resources properly?
>
>
> No, how could I check it?
> $ lsof PID ?
>
>>
>> Have you done any monitoring of ongoing memory usage and tried to
>> match up the time when process memory started to increase with
>> specific URL requests which appear in the Apache access log in case it
>> is specific URLs that have memory problems?
>
> Yes, without any success: I cannot find a relationship between incoming URL
> requests and memory/CPU growth. I suspect this has something to do with
> heavy geographic buffer processing using the GEOS ctypes bindings within
> Django, but I cannot prove it.
>
>>
>> Have you set LimitRequestBody directive in Apache to some value known
>> to be a bit larger than largest expected POST size to avoid problems
>> with people doing malicious uploads of large files which are causing
>> you problems due to your code trying to read such POST content into
>> memory all at once?
>>
>
> No, but I cannot see any "strange/evil" request in the logs.
>
>>
>> > Now I went back to prefork and with this configuration memory seems
>> > under control and it's running well:
>> >
>> > WSGIDaemonProcess ml.eu user=www-data group=www-data processes=10
>> > threads=1 display-name=WSGI inactivity-timeout=240
>> > maximum-requests=100
>> >
>> > I also suspect that python 2.5 could be responsible, I cannot switch
>> > to 2.6 on that server.
>> >
>> > I will keep experimenting until I find a stable configuration.
>>
>> I would suggest not simply trying to fiddle with the configuration. In
>> all probability this is going to be caused by an issue with your
>> application, so you need to debug the application and what processes
>> are doing with respect to memory use to work out the cause rather than
>> trying to treat the symptoms. As such, the above are just a few things
>> you can try, but by no means all.
>
>
> I completely agree that with my "try and pray" approach I'm not going very
> far; unfortunately I'm not skilled enough to examine a running process. I
> tried strace on the "hungry" process, but it seems idle (strace prints
> nothing).
>
>
> --
> Alessandro Pasotti
> w3:   www.itopen.it
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/modwsgi?hl=en.
>
