Re: Count is different when using custom SQL?

Malcolm Tredinnick Wed, 10 Dec 2008 17:56:37 -0800


On Wed, 2008-12-10 at 02:11 -0800, Darthmahon wrote:
[...]
> Now what I've noticed is that the number of results (count) is
> different between the results that have not been group and those that
> have. Hopefully the code below makes some sense:
> 
> ###############################################
> 
> // get results
> friends_albums = Albums.objects.filter(users__in=friend_list,
> date__gte=today).extra(select={'count': 'count(1)'}).select_related()
> 
> // count results
> count['friends1'] = friends_albums.count()
> 
> // now group by
> friends_albums.query.group_by = ['album_id']
> 
> // count results again once grouped
> count['friends2'] = friends_albums.count()
> 
> ###############################################
> 
> So count['friends1'] counts the results before being grouped, but once
> I group them in the line after, count['friends2'] returns a different
> number.


Details below, but the first one is wrong and the second one is right.

> The count in the extra parameter in the query is the correct number I
> want, but I can't figure out how to access this within the View.pm
> file?
> 
> Any ideas? I'll try and provide some examples of data when I get home
> if this isn't enough information.

I found it a bit hard to follow what you were doing, but I think the
explanation is the same regardless. To understand what's going on here,
if I were you, I'd look at the SQL generated in both cases (look at
friends_album.query.as_sql()) and run that SQL directly in an SQL shell
to see what the database returns.

I'm kind of surprised your first attempt works at all, since it almost
certainly is generating invalid SQL. However when I tested a similar
situation, PostgreSQL raised the sort of error I was expecting (you
can't do counting on a specific select column unless it's grouped), but
SQLite returned an answer. The answer in the latter case was almost
totally bogus, in the sense of not meaning much, but it didn't raise an
error.

So I think you should throw away your first query and only use the
grouped version, as that's the correct SQL and going to do what you
want. By way of example (using SQLite), here's a similar case, using a
Tag model that has a parent_id field (it's just the tagging model and
test data for my blog):

(1) This is the normal sort of situation, grouping by the independent
column(s):

        sqlite> select parent_id, count(1) from weblog_tag group by parent_id;
        parent_id   count(1)  
        ----------  ----------
                    5         
        2           3         
        8           2         
        9           2         
        10          1         

(2) This is the nonsensical case. No grouping, but still attempting to
count by the first column:

        sqlite> select parent_id, count(1) from weblog_tag;
        parent_id   count(1)  
        ----------  ----------
        8           13        

The "8" for the parent_id column here is odd (well, any answer is as
valid as any other in this "doesn't make sense" situation). It turns out
to be the result of count(parent_id) -- in other words, the number of
non-NULL occurrences of parent_id.

Note that if I try to run the second query on a stricter database
(PostgreSQL), I get this:

        website=# select parent_id, count(1) from weblog_tag;
        ERROR:  column "weblog_tag.parent_id" must appear in the GROUP BY 
clause or be used in an aggregate function

Hopefully that explains why you're seeing different results and which
one to trust. But, again, play around with the queries in an SQL shell
and see what's going on

Good eye, picking up the difference, though, and at least wondering what
on earth was going on.

Regards,
Malcolm


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: Count is different when using custom SQL?

Reply via email to