On May 18, 7:54 pm, Ken J <k...@filewave.com> wrote:
> I'm quite curious as to what persists in memory across requests in
> terms of django application variables. I'm currently running a django
> app in mod_wsgi daemon mode.
>
> Because of performance concerns when dealing with large numbers of
> concurrent requests, I wanted to modify django to keep persistent DB
> connections to Postgres using a connection pool.
>
> This in turn got me wondering, how can I persist a thread pool, or
> even a simple DB connection, across requests? I realize anything that
> is global to the wsgi entry point script will persist. The current
> wsgi entry point I'm using is something like:
>
> import django.core.handlers.wsgi
>
> _application = django.core.handlers.wsgi.WSGIHandler()
>
> def application(environ, start_response):
>     return _application(environ, start_response)
>
> Obviously _application will remain, but since code modules are
> dynamically loaded based on URL resolvers, would the view/model/db
> connection not be destroyed once the variables referencing said
> objects go out of scope?
>
> From logging statements, it has become apparent I can in fact make DB
> connections persistent simply by not closing the DB connection after
> the request has finished. Unfortunately, I also found this to slowly
> leak socket connections to the DB, eventually making it so that I
> can't log into the DB, hence why I was looking into a connection pool.
>
> Anyways, I was hoping someone could shed some light as to the
> internals of python/django on why django/db/__init__.py is able to
> reference persistent connections.
>
> My best guess is that because
>
> connections = ConnectionHandler(settings.DATABASES)
>
> is at the top level of a module, it remains held within the python
> interpreter after being imported, thus holding a reference.
>
> Any insight would be greatly appreciated.
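Your guess is right: anything bound at module level in the WSGI entry point (or in any imported module, such as django.db) is created once per daemon process and stays alive in that process's interpreter for as long as the process runs. A minimal, purely illustrative sketch of that behaviour, reusing the entry point quoted above (the counter is hypothetical, just to make the persistence visible in logs):

    import threading

    import django.core.handlers.wsgi

    _application = django.core.handlers.wsgi.WSGIHandler()

    # Module-level state: created once when mod_wsgi imports this script,
    # then shared by every request this daemon process handles.
    _request_count = 0
    _lock = threading.Lock()

    def application(environ, start_response):
        global _request_count
        with _lock:
            _request_count += 1   # keeps growing across requests in this process
        return _application(environ, start_response)

Each daemon process has its own copy of this state, which is also why a pool held in Python can never see more than one process.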
First, you should look into external connection poolers. PgBouncer is excellent if you need just a connection pool. Pgpool-II is another option, but it provides much more than a connection pool. Implementing a connection pool in Python code isn't the easiest thing to do.

If you are using Django 1.4, the connections object is thread local - that is, it provides a different connection for each thread. Note that if you are using multiple processes (instead of threads), the processes share nothing with each other. This is the reason you should look at external poolers: they see all connection attempts at once, while a Python pooler is restricted to seeing one process at a time.

Still, if you want to experiment with poolers in Python code, I have written a couple of connection pool attempts you can use as a basis for your own work: https://github.com/akaariai/django-psycopg-pooled and https://github.com/akaariai/django_pooled.

The short answer from my experiments: you do not want to implement connection pooling in Python, at least if performance is the reason for pooling. You could, however, do some other interesting things, like rewriting queries to use prepared statements.

 - Anssi
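For concreteness, a hypothetical settings sketch of the external-pooler route: point DATABASES at a local PgBouncer instance instead of PostgreSQL itself, so Django's per-request connect/disconnect only reaches PgBouncer, which keeps the real server connections open. The host, credentials, and database name below are placeholders; 6432 is PgBouncer's default listen_port.

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'mydb',         # must match a database entry in pgbouncer.ini
            'USER': 'myuser',
            'PASSWORD': 'secret',
            'HOST': '127.0.0.1',    # PgBouncer, not PostgreSQL itself
            'PORT': '6432',         # PgBouncer's default listen_port
        }
    }

How aggressively server connections are reused (session, transaction, or statement pooling) is set with pool_mode in pgbouncer.ini, not in Django.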
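If you do want to play with pooling in Python anyway, here is a minimal standalone sketch of the mechanism itself - this is illustrative only, not the code from the repositories above. psycopg2 ships a thread-safe ThreadedConnectionPool that hands out existing connections and reclaims them instead of opening a new socket each time; the DSN and the run_query helper are placeholders.

    import psycopg2.pool

    # Placeholder DSN - adjust to your own database.
    _pool = psycopg2.pool.ThreadedConnectionPool(
        minconn=1,
        maxconn=10,
        dsn="dbname=mydb user=myuser password=secret host=127.0.0.1 port=5432",
    )

    def run_query(sql, params=None):
        conn = _pool.getconn()          # borrow a connection from the pool
        try:
            cur = conn.cursor()
            try:
                cur.execute(sql, params)
                rows = cur.fetchall()
                conn.commit()           # finish the transaction before the
                                        # connection goes back into the pool
                return rows
            finally:
                cur.close()
        finally:
            _pool.putconn(conn)         # return it instead of closing the socket

Usage is just run_query("SELECT 1"). The point of getconn()/putconn() is that the socket outlives the request; wiring this into Django's request cycle is exactly the part that the linked experiments (and external poolers) handle for you.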