Re: [modwsgi] nginx vs apache

Kent Bower Tue, 22 Mar 2016 04:14:39 -0700

A huge, grateful "thank you!"

Kent


On Mon, Mar 21, 2016 at 6:30 PM, Graham Dumpleton <
graham.dumple...@gmail.com> wrote:

>
> On 22 Mar 2016, at 4:01 AM, Kent Bower <k...@bowermail.net> wrote:
>
> In your recipe for a background monitoring thread watching memory
> consumption, after issuing the SIGUSR1, I'd probably just want the thread
> to exit instead of sleeping... do I just do "sys.exit()" to safely
> accomplish that?
>
>
> The code isn’t just sleeping. It waits on a queue object which has
> something placed on it when mod_wsgi is shutting down the process via
> atexit callback. When the thread gets that it will exit cleanly, with the
> main thread waiting on it to exit to ensure it isn’t running.
>
> If you just call sys.exit() that results in a SystemExit exception being
> raised which causes the thread to exit but leaves an exception in the error
> logs.
>
> The use of the queue is better as it ensures that threads are shut down
> properly when the process is shutting down; otherwise you risk that the
> thread could try to run while the interpreter is being destroyed, causing
> Python to crash the process.
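
A minimal sketch of the queue-based shutdown described above, assuming Python 3 (where the module is `queue`; in Python 2 it is `Queue`). The `Monitor` class, its `check` callable and the `interval` parameter are hypothetical names used for illustration; they are not part of mod_wsgi.

```python
import atexit
import queue
import threading


class Monitor:
    """Run `check` every `interval` seconds until asked to stop."""

    def __init__(self, check, interval=1.0):
        self._check = check          # hypothetical callable supplied by the caller
        self._interval = interval
        self._stop = queue.Queue()   # any item placed here means "exit now"
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while True:
            self._check()
            try:
                # This wait doubles as the sleep between checks. It returns
                # early as soon as stop() places an item on the queue.
                self._stop.get(timeout=self._interval)
                return
            except queue.Empty:
                pass

    def start(self):
        # Ask the interpreter to stop the thread during process shutdown.
        atexit.register(self.stop)
        self._thread.start()

    def stop(self):
        self._stop.put(True)
        # Wait for the thread so it cannot run during interpreter teardown.
        self._thread.join()
```

Because `stop()` both wakes the thread and joins it, the thread exits by returning from `_run()` rather than by raising `SystemExit`.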
>
> Also, regarding my observations of paster returning garbage-collected
> memory to the OS, was I just getting lucky while monitoring (the memory was
> at the very top of the allocated memory)?  This is a universal python issue?
>
>
> It is a universal issue with any programs running on a UNIX system.
>
> You may want to look up some articles on how memory allocation works in
> UNIX as well as in Python.
>
>
> Again, thanks for all your help!
>
> On Sat, Mar 19, 2016 at 11:22 PM, Graham Dumpleton <
> graham.dumple...@gmail.com> wrote:
>
>>
>> On 20 Mar 2016, at 1:10 AM, Kent Bower <k...@bowermail.net> wrote:
>>
>> Thanks Graham, few more items inline...
>>
>> On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton <
>> graham.dumple...@gmail.com> wrote:
>>
>>>
>>> On 17 Mar 2016, at 11:28 PM, Kent Bower <k...@bowermail.net> wrote:
>>>
>>> My answers are below, but before you peek, Graham, note that you and I
>>> have been through this memory discussion before & I've read the vertical
>>> partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On",
>>> considered maximum-requests, etc.
>>>
>>> After years of this, I'm resigned to the fact that python is memory
>>> hungry, especially built on many of these web-stack and database libraries,
>>> etc.  I'm Ok with that.   I'm fine with a high-water RAM mark imposed by
>>> running under Apache, mostly.  But, dang, it sure would be great if the 1
>>> or 2% of requests that really (and legitimately) hog a ton of RAM, like,
>>> say 500MB extra, didn't keep it when done.  I may revisit vertical
>>> partitioning again, but last time I did I think I found that the 1 or 2% in
>>> my case generally won't be divisible by url.  In most cases I wouldn't know
>>> whether the particular request is going to need lots of RAM until
>>> *after* the database queries return (which is far too late for vertical
>>> partitioning to be useful).
>>>
>>> So I was mostly just curious about the status of nginx running wsgi,
>>> which doesn't solve python's memory piggishness, but would at least
>>> relinquish the extra RAM once python garbage collected.
>>>
>>>
>>> Where have you got the idea that using nginx would result in memory
>>> being released back to the OS once garbage collected? It isn’t able to do
>>> that.
>>>
>>> The situations are very narrow as to when a process is able to give back
>>> memory to the operating system. It can only be done when the now free
>>> memory was at top of allocated memory. This generally only happens for
>>> large block allocations and not in normal circumstances for a running
>>> Python application.
>>>
>>
>>
>> At this point I'm not sure where I got that idea, but I'm surprised at
>> this.  For example, my previous observations of paster running wsgi were
>> that it is quite faithful at returning free memory to the OS.  Was I just
>> getting lucky, or would paster be different for some reason?
>>
>> In any case, if nginx won't solve that, then I can't see any reason to
>> even consider it over apache/mod_wsgi.  Thank you for answering that.
>>
>>
>>>
>>> (Have you considered a max-memory parameter to mod_wsgi that would
>>> gracefully stop taking requests and shutdown after the threshold is reached
>>> for platforms that would support it?  I recall -- maybe incorrectly -- you
>>> saying on Windows or certain platforms you wouldn't be able to support
>>> that.  What about the platforms that *could* support it?  It seems to
>>> me to be the very best way mod_wsgi could approach this Apache RAM nuance,
>>> so seems like it would be tremendously useful for the platforms that could
>>> support it.)
>>>
>>>
>>> You can do this yourself rather easily with more recent mod_wsgi version.
>>>
>>> If you create a background thread from a WSGI script file, in similar
>>> way as monitor for code changes does in:
>>>
>>>
>>> http://modwsgi.readthedocs.org/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces
>>>
>>> but instead of looking for code changes, inside the main loop of the
>>> background thread do:
>>>
>>>     import os
>>>     import signal
>>>     import mod_wsgi
>>>
>>>     metrics = mod_wsgi.process_metrics()
>>>
>>>     if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
>>>         os.kill(os.getpid(), signal.SIGUSR1)
>>>
>>> So mod_wsgi provides the way of determining the amount of memory without
>>> resorting to importing psutil, which is quite fat in itself, but how you
>>> use it is up to you.
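
The threshold check in this recipe can be sketched as a small testable function. This is only an illustration under assumptions: `check_memory`, `over_threshold`, `get_metrics`, `threshold_bytes` and `send_signal` are hypothetical names, and the sketch assumes `memory_rss` is reported in bytes. The metrics source is passed in as a callable so the logic can be exercised outside a mod_wsgi process.

```python
import os
import signal


def over_threshold(metrics, threshold_bytes):
    """Return True when the reported resident set size exceeds the threshold."""
    return metrics["memory_rss"] > threshold_bytes


def check_memory(get_metrics, threshold_bytes, send_signal=os.kill):
    """Send SIGUSR1 to this process if its RSS is over the threshold.

    Returns True when the signal was sent, False otherwise.
    """
    if over_threshold(get_metrics(), threshold_bytes):
        send_signal(os.getpid(), signal.SIGUSR1)
        return True
    return False


# Inside a mod_wsgi daemon process (the import only works there):
#     import mod_wsgi
#     check_memory(mod_wsgi.process_metrics, 500 * 1024 * 1024)
```

Keeping the signal sender injectable means the check can be tried with a stand-in before it is allowed to restart a real process.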
>>>
>>
>>
>> Right, that's an idea; (could even be a shell script that takes this
>> approach, I suppose, but I like your recipe.)
>>
>> Unfortunately, I don't want to *automate* bits that can feasibly clobber
>> blocked sessions.  SIGUSR1, after graceful-timeout & shutdown-timeout, can
>> result in ungraceful killing.  Our application shares a database with an
>> old legacy application which was poorly written to hold transactions while
>> waiting on user input (this was apparently common two decades ago).  So,
>> unfortunately, it isn't terribly uncommon that our application is blocked
>> at the database level waiting for someone using the legacy application who
>> has a record(s) locked and may not even be at their desk or may have gone
>> to lunch.  Sometimes our client's IT staff has to hunt down these people or
>> decide to kill their database session.  In any case, from a professional
>> point of view, our application should be the responsible one and wait
>> patiently, allowing our client's IT staff the choice of how to handle those
>> cases.  So, while the likelihood is pretty low, even with graceful-timeout
>> & shutdown-timeout set at a very high value like 5 minutes, *I still run
>> the risk of killing legitimate sessions with SIGUSR1*.  (I've brought
>> this up before and you didn't agree with my gripe and I do understand why,
>> but in my use case, I don't feel I can automate that route responsibly....
>> we do use SIGUSR1 manually sometimes, when we can monitor and react to
>> cases where a session is blocked at the database level.)
>>
>>
>> If we have discussed it previously, then I may not have anything more to
>> add.
>>
>> Did I previously suggest offloading these memory-consuming tasks behind a
>> job queue run under Celery or something else? That way they are out of the
>> web server processes at least.
>>
>> inactivity-timeout doesn't present this concern: it won't ever kill
>> anything, just silently restarts like a good boy when inactive.  I've
>> recently reconsidered dropping that way down from 30 minutes.  (When I
>> first implemented this, it was just to reclaim RAM at the end of the day,
>> so that's why it is 30 minutes.  I didn't like the idea of churning new
>> processes during busy periods, but I've been thinking 1 or 2 minutes may be
>> quite reasonable.)
>>
>> If I could signal processes to shutdown at their next opportunity
>> (meaning the next time they are handling no requests, like
>> inactivity-timeout), that would solve many issues in this regard for me
>> because I could signal these processes when their RAM consumption is high
>> and let them restart when "convenient," being the ultimate in
>> gracefulness.  SIGUSR2 could mean "the next time you are completely
>> idle," while SIGUSR1 continues to mean "initiate shutdown now."
>>
>>
>> That is what SIGUSR1 does if you set graceful-timeout large enough. It is
>> SIGINT or SIGTERM which is effectively initiate shutdown now. So there
>> shouldn’t be a need to have a SIGUSR2 as SIGUSR1 should already do what you
>> are hoping for with a reasonable setting of graceful-timeout.
>>
>>
>>> Do note that if using SIGUSR1 to restart the current process (which
>>> should only be done for daemon mode), you should also set the
>>> graceful-timeout option to WSGIDaemonProcess if you have long running
>>> requests. It is the maximum time the process will wait to shut down, while
>>> still accepting requests, when doing a SIGUSR1 graceful shutdown of the
>>> process, before going into forced shutdown mode where no requests will be
>>> accepted and requests can be interrupted.
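
The behaviour described here can be pictured with a deliberately simplified model. It is not mod_wsgi's implementation: the function `shutdown_outcome` and its inputs are hypothetical, and it only encodes the rule stated above (restart at the first idle moment inside the graceful-timeout window, otherwise fall through to forced shutdown).

```python
def shutdown_outcome(in_flight, graceful_timeout):
    """Model the outcome of a SIGUSR1 graceful shutdown.

    in_flight[i] is the number of requests being handled i seconds after
    the signal was received. graceful_timeout is in seconds.
    """
    for elapsed, active in enumerate(in_flight[:graceful_timeout]):
        if active == 0:
            # The process became idle within the window: restart cleanly.
            return "restarted when idle", elapsed
    # Never idle within the window: requests may now be interrupted.
    return "entered forced shutdown", graceful_timeout


print(shutdown_outcome([3, 2, 0, 1], 300))   # → ('restarted when idle', 2)
print(shutdown_outcome([1] * 10, 5))         # → ('entered forced shutdown', 5)
```

In this model a request that stays blocked for longer than the window is the only way to reach forced shutdown, which is the case the blocked-database-session concern is about.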
>>>
>>> Here (
>>> http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html)
>>> you discuss nginx's tendency to block requests that may otherwise be
>>> executing in a different process, depending on timing, etc.  Is this issue
>>> still the same?  (I thought I read a hint somewhere that there may be a
>>> workaround for that, so I ask.)
>>>
>>>
>>> That was related to someone’s attempt to embed a Python interpreter
>>> inside of nginx processes themselves. That project died a long time ago. No
>>> one embeds Python interpreters inside of nginx processes. It was a flawed
>>> design.
>>>
>>> I don’t know what you are reading to get all these strange ideas. :-)
>>>
>>
>>
>> Google, I suppose ;)   That's why I finally asked you when I couldn't
>> find anything more about it via Google.
>>
>>
>>
>>>
>>> And so I wanted your opinion on nginx...
>>>
>>> ====
>>> Here is what you asked for if it can still be useful.
>>>
>>> I'm on mod_wsgi-4.4.6 and the particular server that prompted me this
>>> time is running Apache 2.4 (prefork), though some of our clients use 2.2
>>> (prefork).
>>>
>>> Our typical wsgi conf setting is something like this, though threads and
>>> processes varies depending on server size:
>>>
>>> LoadModule wsgi_module modules/mod_wsgi.so
>>> WSGIPythonHome /home/rarch/tg2env
>>> # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10
>>> # concerning timeouts
>>> WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 \
>>>     display-name=%{GROUP} graceful-timeout=5 \
>>>     python-eggs=/home/rarch/tg2env/lib/python-egg-cache
>>>
>>>
>>> Is your web server really going to be idle for 30 minutes? I can’t see
>>> how that would have been doing anything.
>>>
>>> Also, in mod_wsgi 4.x when inactivity-timeout kicks in has changed.
>>>
>>> It used to apply when there were active requests and they were blocked,
>>> as well as when no requests were running.
>>>
>>> Now it only applies to case where there are no requests.
>>>
>>> The case for running but blocked requests is now handled by
>>> request-timeout.
>>>
>>> You may be better off setting request-timeout now to be a more reasonable
>>> value for your expected longest request, but set inactivity-timeout to
>>> something much shorter.
>>>
>>> So suggest you play with that.
>>>
>>> Also, are your request handlers I/O or CPU intensive, and how many requests?
>>>
>>> Such a high number of processes and threads always screams to me that
>>> half the performance problems are due to setting these too high, invoking
>>> pathological OS process swapping issues and Python GIL issues.
>>>
>>>
>>
>> Yes, the requests are I/O intensive (that is, database intensive, which
>> adds a huge overhead to our typical request).  Often requests finish in
>> under a second or two, but they also can take many seconds (not
>> *terrible* for the user, but sometimes they do a lot of processing with
>> many trips to the database).
>> We have several clients (companies), so the number of requests varies
>> widely, but can get pretty heavy on busy days (like Black Friday, since
>> they are in retail).   We've played with those numbers quite a bit and
>> without high numbers like that, responsiveness suffers because we backlog
>> due to requests often taking several seconds.
>>
>> Thanks for all your input, you've been tremendously helpful!
>> Kent
>>
>>
>>
>>
>>> WSGIProcessGroup rarch
>>> WSGISocketPrefix run/wsgi
>>> WSGIRestrictStdout Off
>>> WSGIRestrictStdin On
>>> # Memory tweak.
>>> # http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
>>> WSGIRestrictEmbedded On
>>> WSGIPassAuthorization On
>>>
>>> # we'll make the /tg/ directory resolve as the wsgi script
>>> WSGIScriptAlias /tg \
>>>     /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py \
>>>     process-group=rarch application-group=%{GLOBAL}
>>> WSGIScriptAlias /debug/tg \
>>>     /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py \
>>>     process-group=rarch application-group=%{GLOBAL}
>>>
>>> MaxRequestsPerChild  0
>>> <IfModule prefork.c>
>>> MaxClients       308
>>> ServerLimit      308
>>> </IfModule>
>>> <IfModule worker.c>
>>> ThreadsPerChild  25
>>> MaxClients       400
>>> ServerLimit      16
>>> </IfModule>
>>>
>>>
>>> Thanks for all your help and for excellent software!
>>> Kent
>>>
>>>
>>> On Wed, Mar 16, 2016 at 7:27 PM, Graham Dumpleton <
>>> graham.dumple...@gmail.com> wrote:
>>>
>>>> On the question of whether nginx will solve this problem, I can’t see
>>>> how.
>>>>
>>>> When one talks about nginx and Python web applications, it is only as a
>>>> proxy for HTTP requests to some backend WSGI server. The Python web
>>>> application doesn’t run in nginx itself. So memory issues and how to deal
>>>> with them are the province of the WSGI server used, whatever that is and
>>>> not nginx.
>>>>
>>>> Anyway, answer the questions below and can start with that.
>>>>
>>>> You really want to be using recent mod_wsgi version and not Apache 2.2.
>>>>
>>>> Apache 2.2 design has various issues and bad configuration defaults
>>>> which means it can gobble up more memory than you want. Recent mod_wsgi
>>>> versions have workarounds for Apache 2.2 issues and are much better at
>>>> eliminating those Apache 2.2 issues. Recent mod_wsgi versions also have
>>>> fixes for memory usage problems in some corner cases. As far as what I mean
>>>> by recent, I recommend 4.4.12 or later. The most recent version is 4.4.21.
>>>> If you are stuck with 3.4 or 3.5 from your Linux distro that is not good
>>>> and that may increase problems.
>>>>
>>>> So long as you have a recent mod_wsgi version, you can look at using
>>>> vertical partitioning to farm out memory hungry request handlers to their
>>>> own daemon process group and better configure those to handle that and
>>>> recycle processes based on activity or memory usage. A blog post related
>>>> to that is:
>>>>
>>>> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html
>>>>
>>>> Graham
>>>>
>>>> On 17 Mar 2016, at 7:15 AM, Graham Dumpleton <
>>>> graham.dumple...@gmail.com> wrote:
>>>>
>>>> What version of mod_wsgi and Apache are you using?
>>>>
>>>> Are you stuck with old versions of both?
>>>>
>>>> For memory tracking there are API calls mod_wsgi provides in recent
>>>> versions for getting memory usage which can be used as part of a scheme to
>>>> trigger a process restart. You can’t use sys.exit(), but can use signals to
>>>> trigger a clean shutdown of a process. Again, it is better to have a recent
>>>> mod_wsgi version, as you can then also set up some graceful timeout options
>>>> for a signal induced restart.
>>>>
>>>> Also, what is your mod_wsgi configuration, so we can make sure you are
>>>> doing all the typical things one would do to limit memory usage, or
>>>> quarantine particular handlers which are memory hungry?
>>>>
>>>> Graham
>>>>
>>>> On 17 Mar 2016, at 4:29 AM, Kent Bower <k...@bowermail.net> wrote:
>>>>
>>>> Interesting idea..  yes, we are using multiple threads and also other
>>>> stack frameworks, so that's not straightforward, but worth thinking
>>>> about... not sure how to approach that with the other threads.  Thank you
>>>> Bill.
>>>>
>>>> On Wed, Mar 16, 2016 at 1:11 PM, Bill Freeman <ke1g...@gmail.com>
>>>> wrote:
>>>>
>>>>> I don't know about nginx, but one possibility, if the large memory
>>>>> requests are infrequent, is to detect when you have completed one and
>>>>> trigger the exit/reload of the daemon process (calling sys.exit() is not
>>>>> the way, since there could be other threads in the middle of something,
>>>>> unless you run one thread per process).
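
One way to picture this suggestion is a WSGI middleware that checks memory once each response has finished and then requests a restart by signal instead of calling sys.exit(). This is a sketch under assumptions, not something posted in the thread: `RestartAfterLargeRequest`, `get_rss`, `threshold_bytes` and `request_restart` are hypothetical names, and the default action assumes the process runs in mod_wsgi daemon mode, where SIGUSR1 triggers a graceful restart.

```python
import os
import signal


class RestartAfterLargeRequest:
    """WSGI middleware: after each response, check RSS and request a restart."""

    def __init__(self, app, get_rss, threshold_bytes, request_restart=None):
        self.app = app
        self.get_rss = get_rss                  # hypothetical callable returning bytes
        self.threshold_bytes = threshold_bytes
        self.request_restart = request_restart or self._sigusr1

    @staticmethod
    def _sigusr1():
        # Signal our own process rather than exiting, so other threads
        # are not cut off in the middle of their requests.
        os.kill(os.getpid(), signal.SIGUSR1)

    def __call__(self, environ, start_response):
        result = self.app(environ, start_response)
        try:
            for chunk in result:
                yield chunk
        finally:
            # Runs when the body is exhausted or the server closes it early.
            if hasattr(result, "close"):
                result.close()
            if self.get_rss() > self.threshold_bytes:
                self.request_restart()
```

Under mod_wsgi, `get_rss` could be something like `lambda: mod_wsgi.process_metrics()['memory_rss']`. The check runs in whichever thread handled the request, so it does not need a separate monitoring thread.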
>>>>>
>>>>> On Wed, Mar 16, 2016 at 7:50 AM, Kent <jkentbo...@gmail.com> wrote:
>>>>>
>>>>>> I'm looking for a very brief high-level pros vs. cons of wsgi under
>>>>>> *apache* vs. under *nginx* and then to be pointed to more details I
>>>>>> can study myself (or at least the latter).
>>>>>>
>>>>>> Our application occasionally allows requests that consume a large
>>>>>> amount of RAM (no obvious way around that, they are valid requests) and
>>>>>> occasionally this causes problems since we can't reclaim the RAM readily
>>>>>> from apache.  (We have already tweaked and do use
>>>>>> "inactivity-timeout".   This helps, but still now and then we hit
>>>>>> problems where we run into swapping to disk.)
>>>>>>
>>>>>> I'm wondering if nginx may solve this problem.  I've read much of
>>>>>> what you (Graham) have had to say about the memory strategies with apache
>>>>>> and mod_wsgi, but wonder what your opinion of nginx is and where you've
>>>>>> already discussed this.  I've read articles I could find you've written
>>>>>> on nginx, such as "Blocking requests and nginx version of mod_wsgi,"  but
>>>>>> wonder if the same weaknesses are still applicable today, 7 years later?
>>>>>>
>>>>>>
>>>>>> Thank you very much in advance!
>>>>>> Kent
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "modwsgi" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to modwsgi+unsubscr...@googlegroups.com.
>>>>>> To post to this group, send email to modwsgi@googlegroups.com.
>>>>>> Visit this group at https://groups.google.com/group/modwsgi.
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
