Re: Problem with performance on a django site

motard Fri, 29 May 2009 01:48:39 -0700

Hi!

Thanks for all your suggestions. I've been able to solve the issue.
The page now reports roughly 80-90 SQL queries and is performing
well.

The problem was mainly in two places in my code where I was looping
through all the results of a related set for every row in a paged
object. It did not become evident earlier because, being an intranet
application, the site does not receive much traffic.

Once the related sets started to grow, the problem emerged.

I was doing something like this:

{% for element in mypagedlist.object_list %}
    {% for related_element in element.related_element_set.all %}
        {# Do something #}
    {% endfor %}
{% endfor %}
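
A common way to avoid one query per row in a loop like this (not
necessarily what was done here) is to fetch the related rows for the
whole page in a single query in the view and group them by parent
before rendering. A minimal sketch in plain Python; the row layout, the
"element_id" key and the model name in the comment are hypothetical:

```python
from collections import defaultdict


def group_by_parent(related_rows, parent_key="element_id"):
    """Group related rows (plain dicts here) by their parent id in one pass."""
    grouped = defaultdict(list)
    for row in related_rows:
        grouped[row[parent_key]].append(row)
    return dict(grouped)


# Hypothetical data standing in for the result of ONE query, for example
# something like RelatedElement.objects.filter(element__in=page_elements)
# (model and field names are assumptions, not from the original code).
related_rows = [
    {"id": 1, "element_id": 10},
    {"id": 2, "element_id": 10},
    {"id": 3, "element_id": 11},
]

by_element = group_by_parent(related_rows)
```

The template can then loop over the pre-grouped lists instead of
calling element.related_element_set.all for every row.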

Regarding the snippet 461, I am using the Django Debug Toolbar, which
has helped me a lot: 
http://github.com/robhudson/django-debug-toolbar/tree/master

Regarding select_related(): at first I had trouble because it wasn't
following my optional foreign keys (by default it skips foreign keys
declared with null=True). After specifying the fields to follow, it
started to work. But although the number of SQL queries dropped to
around 10-15, the joined query itself took much longer to execute. The
page performs much better with 90 simple SQL queries than with 10-15
where one of them is complex.
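
To make the trade-off concrete, here is a rough sketch of the two query
shapes using the standard sqlite3 module rather than Django. The tables
and data are hypothetical, and the SQL that Django generates for
select_related() will differ in detail; the point is only that both
shapes return the same rows, so which one is faster has to be measured:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'A', 1), (2, 'B', 2), (3, 'C', NULL);
""")

# Shape 1: many simple queries. One query for the books, then one
# lookup per book to resolve its foreign key.
simple = []
books = conn.execute("SELECT title, author_id FROM book ORDER BY id").fetchall()
for title, author_id in books:
    row = conn.execute(
        "SELECT name FROM author WHERE id = ?", (author_id,)
    ).fetchone()
    simple.append((title, row[0] if row else None))

# Shape 2: one joined query. LEFT JOIN keeps books whose foreign key
# is NULL, which an INNER JOIN would silently drop.
joined = conn.execute(
    "SELECT book.title, author.name FROM book "
    "LEFT JOIN author ON author.id = book.author_id ORDER BY book.id"
).fetchall()
```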

Since at first I didn't know what was happening, I introduced caching
via the simple local-memory cache.
This also let me eliminate some queries that were being repeated in
different parts of the site.
What I am not so sure about is whether this brings any significant
benefit, since the SQL part was never the bottleneck, so keeping
some querysets or objects in the cache instead of fetching them from
MySQL doesn't yield a big performance gain. (The cache still has to
pickle and unpickle things, and you need more logic to make sure the
cache gets updated as it should and contains the keys you expect.)
Caching a whole page or page fragment seems much more interesting,
since the template is already rendered when it is cached, so this can
potentially improve performance a lot.
The problem with that approach on pages that show a lot of calculated
data is that the cached page is likely to become outdated very
quickly.

One thing I'm not sure about (it may be a mistake on my side) is that
the cache.add() method raises a NotImplementedError, even though the
documentation says it's available...
Now I've set up a function that will handle the get and set of the
cache like this:

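# Assumes Group is the one from django.contrib.auth (it has user_set).
from django.contrib.auth.models import Group
from django.core.cache import cache

def obtain_from_cache(key):
    key_list = key.split('_')  # Small convention for knowing what to call
    if key_list[0] == 'MYGROUPOFKEYS':
        result = cache.get(key)
        if result is None:  # Cache miss; an empty list is a valid hit
            # cache.set() returns None, so build the value first.
            # list() evaluates the lazy queryset before it is cached.
            result = list(Group.objects.get(
                id__exact=key_list[1]).user_set.all())  # This is just an example
            cache.set(key, result)
        return result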

This code works well for keys that you simply will never want to
change before they expire.
If someone has a better way of working with the cache, please comment.
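
One possible shape for a more general helper, as a sketch only: a
get-or-compute function that takes the loader as an argument and uses a
sentinel so that cached empty or falsy values are not mistaken for a
miss. DictCache is a hypothetical in-memory stand-in, included only so
the example runs without Django; it mimics the get(key, default) and
set(key, value, timeout) calls of Django's cache object:

```python
_MISSING = object()  # sentinel: tells "not cached" apart from falsy values


class DictCache:
    """Hypothetical stand-in exposing get/set like Django's cache object."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # timeout is accepted for signature parity but ignored here.
        self._data[key] = value


def get_or_compute(cache, key, compute, timeout=None):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key, _MISSING)
    if value is _MISSING:
        value = compute()
        cache.set(key, value, timeout)
    return value


# Usage: the loader runs once, even though it returns an empty list.
calls = []


def load_members():
    calls.append(1)
    return []


demo_cache = DictCache()
first = get_or_compute(demo_cache, "MYGROUPOFKEYS_1", load_members)
second = get_or_compute(demo_cache, "MYGROUPOFKEYS_1", load_members)
```

This still does nothing about invalidation; it only suits values that
can safely live until they expire.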

My conclusion in the end is that you have to pay close attention to
what information you expose in the templates. Being able to call
object methods in the template is very powerful indeed, but it can
lead to serious problems if you don't pay attention to what you are
doing.

Another thing I learned is that having querysets which are evaluated
lazily doesn't mean you can also start being lazy with your code ;)

Regarding the server setup:

Does anyone have experience with mod_wsgi on Windows? I've read that
mod_wsgi can't run in daemon mode on Windows, which means the
performance gain over mod_python won't be that large.

Again, thanks for the support.

Stefan

On 28 May, 17:39, Steve Howell <showel...@yahoo.com> wrote:
> On May 28, 8:19 am, jrs_66 <jrs...@yahoo.com> wrote:
>
> > 250 queries on one page seems to me to be dangerously high.  I would
> > have to guess you could reduce that significantly.  Turn on query
> > output to see if the Django ORM isn't creating sloppy queries in
> > loops.  My guess is that with some code alterations you could help....
>
> This seems like a good place to start to me.
>
> As far as debugging, there might be some tools on djangosnippets that
> might help the original poster:
>
> http://www.djangosnippets.org/snippets/461/
>
> As somebody else suggested on the thread, I'm not entirely sure about
> the diagnosis that the template rendering is really the bottleneck,
> since queries are lazily executed.
>
> For debugging purposes, you can force queries to be executed earlier
> with little impact on the template by turning them into lists, which
> helps you to more easily identify bottlenecks (but not good for prod,
> it uses more memory).
>
>     objs = list(objs)
>
> To reduce the sheer number of queries, I recommend experimenting with
> the select_related() feature.  I haven't used it in Django, but I've used
> equivalent features in Rails, and it can lead to pretty drastic
> speedups, but I'll give the caveat that I was usually trying to reduce
> delays from latency.  Here are docs on select_related():
>
> http://docs.djangoproject.com/en/dev/ref/models/querysets/#id4
>
> Also, as always, it might make good sense to simply divide and
> conquer, refactor as you go, etc