Hi all,
I think I've found a strange case where QuerySet.count() does not match
len(queryset), and the behaviour is certainly unexpected.
But I'm not sure whether this is a bug, some mistake on my part, or a known
limitation of combining .extra(), .distinct() and .count().
I am aware of the default ordering interfering with .distinct(), and
already add .order_by() to get rid of it.
I have a model called RadioSignal, with an integer field "rssi". I'm
interested in finding out how many distinct values for "rssi / 10" there
are (-10, -20, -30, etc).
Here is a commented "./manage.py shell" record:
>>> RadioSignal.objects.count()
523
>>> RadioSignal.objects.order_by().values('rssi').distinct().count()
49
>>> connection.queries[-1]['sql']
'SELECT COUNT(DISTINCT "maps_radiosignal"."rssi") FROM "maps_radiosignal"'
Looks okay so far. But I'm interested in each distinct tens of RSSI values,
not every single value. I can compute these with .extra():
>>> len(RadioSignal.objects.order_by().extra({'tens': 'rssi /
10'}).values('tens').distinct())
6
>>> RadioSignal.objects.order_by().extra({'tens': 'rssi /
10'}).values('tens').distinct()
[{'tens': -8}, {'tens': -4}, {'tens': -5}, {'tens': -9}, {'tens': -6},
{'tens': -7}]
>>> connection.queries[-1]['sql']
'SELECT DISTINCT (rssi / 10) AS "tens" FROM "maps_radiosignal" LIMIT 21'
Also looks good so far. But running len() on a queryset is unnecessary if I
only need the count.
>>> RadioSignal.objects.order_by().extra({'tens': 'rssi /
10'}).values('tens').distinct().count()
523
>>> connection.queries[-1]['sql']
'SELECT COUNT(DISTINCT "maps_radiosignal"."id") FROM "maps_radiosignal"'
Uhoh. Somehow .count() keeps the .distinct() part, but replaces the
.extra() and .values() parts with counting primary keys?
I tried it with values('tens'), values_list('tens'), and
values_list('tens', flat=True), as well no change.
So far I've tested Django 1.6 with Python 2.7 and PostgreSQL 9.3, and
Django 1.7 with Python 3.4 and PostgreSQL 9.1, all seem to behave the same.
I can work around this, since the possible resulting querysets are pretty
small, and evaluating them with len() isn't too slow. But I'm wondering if
it's a bug in Django, or something else I've missed?
Mattias
--
You received this message because you are subscribed to the Google Groups
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-users/21085de5-a899-437c-b13a-8584104df65f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.