Re: Optimizing Django ORM SQL queries when in a loop
Hi guys, I tried Join.Together's approach using the "in" filter parameter and it worked great. It turns out that the other query I was running was actually producing the wrong result set! Not Django's fault, rather I was pulling in the wrong data. The code looks nicer and is definitely more maintainable now. I'm quite looking forward to reducing more overhead with tips like this. Thanks a lot Join and Malcolm! Mike. On Aug 28, 5:43 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > I'm going to give this a try. I'll let you know how I get on! > > Thanks guys! > > Mike. > > On Aug 27, 6:18 pm, "[EMAIL PROTECTED]" > > <[EMAIL PROTECTED]> wrote: > > user_id_list=User.objects.filter(whatever).values_list('id', > > flat=True) > > posts=Post.objects.filter(user__id__in=user_id_list) > > > Would that work? --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---
Re: Optimizing Django ORM SQL queries when in a loop
I'm going to give this a try. I'll let you know how I get on! Thanks guys! Mike. On Aug 27, 6:18 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > user_id_list=User.objects.filter(whatever).values_list('id', > flat=True) > posts=Post.objects.filter(user__id__in=user_id_list) > > Would that work? --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---
Re: Optimizing Django ORM SQL queries when in a loop
user_id_list=User.objects.filter(whatever).values_list('id', flat=True) posts=Post.objects.filter(user__id__in=user_id_list) Would that work? --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---
Re: Optimizing Django ORM SQL queries when in a loop
Hi Malcolm, Thanks very much for your help! I was (up until a few weeks ago) running off of the trunk, but I've stopped at an earlier revision (7922, I think). I kept updating but things kept breaking so I thought I'd just finalise the revision. I did actually give it a go using Q objects yesterday, but it still seems to be generating the extra queries. I'll try updating to the latest Django release, then I will make sure for certain that there are only two queries being run for that particular aspect of the code. I'll let you know how I get on! Thanks again for your help. Regards, Mike. On Aug 26, 6:36 pm, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote: > On Tue, 2008-08-26 at 08:05 -0700, [EMAIL PROTECTED] wrote: > > Hi guys, > > > I'm trying to optimize one of my views that is doing a rather (naive?) > > query like this: > > > ... > > post_stream = Post.objects.filter(poster=some_user) > > for other_user in other_user_queryset: > > post_stream = post_stream | Post.objects.filter(poster=other_user) > > ... > > [...] > > > The ORM generates a SQL query for each of the additional piped OR > > queries (this becomes a large number very quickly!). > > Which version of Django are you using? Because this isn't true for the > current subversion code. How are you determining that a query is run > esach time? > > Querysets are only evaluated when you first access their results and > "or"-ing together two querysets doesn't access the results, so it > constructs a single SQL query. The above loop should execute two > queries: one for the other_user_queryset (since you're accessing the > results there) and then one when you access the results of post_stream. > > A more concise, and probably clearer, way to write the above loop is to > use Q() objects: > > post_filter = Q(poster=some_user) > for user in other_queryset: > post_filter = post_filter | Q(poster=user) > post_stream = Post.objects.filter(post_filter) > > That's a little bit more lightweight at the Python level, since it's not > constructing and copying a QuerySet object each time around. From a > style point of view it's also focusing on constructing the filter, which > is the significant thing the loop does, rather than the queryset, which > is just a way to get at the results which are filtered. > > Regards, > Malcolm --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---
Re: Optimizing Django ORM SQL queries when in a loop
On Tue, 2008-08-26 at 08:05 -0700, [EMAIL PROTECTED] wrote: > Hi guys, > > I'm trying to optimize one of my views that is doing a rather (naive?) > query like this: > > ... > post_stream = Post.objects.filter(poster=some_user) > for other_user in other_user_queryset: > post_stream = post_stream | Post.objects.filter(poster=other_user) > ... [...] > The ORM generates a SQL query for each of the additional piped OR > queries (this becomes a large number very quickly!). Which version of Django are you using? Because this isn't true for the current subversion code. How are you determining that a query is run esach time? Querysets are only evaluated when you first access their results and "or"-ing together two querysets doesn't access the results, so it constructs a single SQL query. The above loop should execute two queries: one for the other_user_queryset (since you're accessing the results there) and then one when you access the results of post_stream. A more concise, and probably clearer, way to write the above loop is to use Q() objects: post_filter = Q(poster=some_user) for user in other_queryset: post_filter = post_filter | Q(poster=user) post_stream = Post.objects.filter(post_filter) That's a little bit more lightweight at the Python level, since it's not constructing and copying a QuerySet object each time around. From a style point of view it's also focusing on constructing the filter, which is the significant thing the loop does, rather than the queryset, which is just a way to get at the results which are filtered. Regards, Malcolm --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---
Optimizing Django ORM SQL queries when in a loop
Hi guys, I'm trying to optimize one of my views that is doing a rather (naive?) query like this: ... post_stream = Post.objects.filter(poster=some_user) for other_user in other_user_queryset: post_stream = post_stream | Post.objects.filter(poster=other_user) ... The idea is that I want to display a post stream of lots of different user's aggregated posts in one place, intermixed. by date (I've left out the order_by filter). The ORM generates a SQL query for each of the additional piped OR queries (this becomes a large number very quickly!). I was wondering if there were any methods available to combine this into a single query so as to stop hitting the database? I don't really want to resort to writing raw SQL, as I imagine this is a very common scenario, but if this is the only way, then I'll give it a try! I've added select_related to both of the object managers which helps somewhat, but not totally. Also, I'm a bit of a beginner at SQL optimisation, so if I've made an obvious newbie error - please let me know! Thanks for your help, Mike. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---