Hi all,

I think I've found a strange case where QuerySet.count() does not match 
len(queryset), and the behaviour is certainly unexpected.
But I'm not sure whether this is a bug, some mistake on my part, or a known 
limitation of combining .extra(), .distinct() and .count().
I am aware of the default ordering interfering with .distinct(), and 
already add .order_by() to get rid of it.

I have a model called RadioSignal, with an integer field "rssi". I'm 
interested in finding out how many distinct values for "rssi / 10" there 
are (-10, -20, -30, etc).

Here is a commented "./manage.py shell" record:

>>> RadioSignal.objects.count()
523 
>>> RadioSignal.objects.order_by().values('rssi').distinct().count() 
49
>>> connection.queries[-1]['sql']
'SELECT COUNT(DISTINCT "maps_radiosignal"."rssi") FROM "maps_radiosignal"'

Looks okay so far. But I'm interested in each distinct tens of RSSI values, 
not every single value. I can compute these with .extra():

>>> len(RadioSignal.objects.order_by().extra({'tens': 'rssi / 
10'}).values('tens').distinct())
6
>>> RadioSignal.objects.order_by().extra({'tens': 'rssi / 
10'}).values('tens').distinct()
[{'tens': -8}, {'tens': -4}, {'tens': -5}, {'tens': -9}, {'tens': -6}, 
{'tens': -7}]
>>> connection.queries[-1]['sql']
'SELECT DISTINCT (rssi / 10) AS "tens" FROM "maps_radiosignal" LIMIT 21'

Also looks good so far. But running len() on a queryset is unnecessary if I 
only need the count.

>>> RadioSignal.objects.order_by().extra({'tens': 'rssi / 
10'}).values('tens').distinct().count()
523
>>> connection.queries[-1]['sql']
'SELECT COUNT(DISTINCT "maps_radiosignal"."id") FROM "maps_radiosignal"'

Uhoh. Somehow .count() keeps the .distinct() part, but replaces the 
.extra() and .values() parts with counting primary keys?

I tried it with values('tens'), values_list('tens'), and 
values_list('tens', flat=True), as well no change.
So far I've tested Django 1.6 with Python 2.7 and PostgreSQL 9.3, and 
Django 1.7 with Python 3.4 and PostgreSQL 9.1, all seem to behave the same.

I can work around this, since the possible resulting querysets are pretty 
small, and evaluating them with len() isn't too slow. But I'm wondering if 
it's a bug in Django, or something else I've missed?

Mattias


-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/21085de5-a899-437c-b13a-8584104df65f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to