Michael Radjiez quite reasonably raised a problem with a recent change I
made to fix a bug with QuerySet.count() (it was previously ignoring any
"distinct" modifiers). The problem is that if you are using a version of
SQLite prior to version 3.2.6 (released in September 2005),
QuerySet.distinct().count() generates a query that is not understood by
your version of SQLite. :-(

So what is our policy on working around arguably broken databases? This
is a little trickier than "some MySQL installs don't do transactions",
since it isn't just ignored -- it crashes Django.

A  couple of solutions (from my favourite to absolutely least
favourite):

(1) If you are using a sufficiently old version of SQLite, the SQL
produced by distinct().count() does not respect the "distinct" modifier.

        GOOD:
        - Code runs to completion. You always get an answer. It is, in
        fact, the same as you got with Django last week. If you really
        need distinct counting, upgrade SQLite.
        
        - Analogous to the way we treat versions of MySQL without
        transactions (and probably a reasonable approach).
        
        BAD:
        - You get different answers to other people in some queries. And
        your answers are wrong (see the final point for why this might
        be a problem).
        
        - The tests are going to be a real pain to make work
        conditionally like this (again, maybe the MYSql-transaction
        approach might be reasonable: the test will always fail and you
        will know it's because of your SQLite?!)

(2) Inside the SQLite backend, try to work out if the version is
sufficiently old and do the "count(distinct(..))" bit by hand by doing a
"SELECT DISTINCT..." and then counting the number of results (sadly, in
pysqlite2, cursor.rowcount is always -1 after a query, so can't cheat).

        GOOD: 
        - you get the same (right) answer under all versions and so your
        applications work the same, no matter which database is
        installed.
        
        BAD:
        - it's going to be a little inefficient and very memory hungry
        if you have a "bad" version of SQLite. Counting the results is
        going to require calling len() on the queryset (or equivalent).
        Very large result sets on a shared server will run out of
        memory.
        
(3) We don't support QuerySet.distinct().count() on older versions of
SQLite. We can make the tests conditional, so that if you get the
"unsupported" error, it just informs you (raises an exception?). We
might even be able to make the tests always pass.

        GOOD:
        - no more accidental huge memory consumption. You are explicitly
        told when you're in trouble.
        
        - upgrading SQLite if you want things to work properly is really
        not that big a deal (as compared to, say, PostgreSQL or MySQL),
        since it's such a tiny, binary (36K when stripped on my machine)
        linked to some very common libraries. Even a static version is
        pretty small. It was originally designed to be built into
        distributed packaged. So we are still not making it that hard
        for people to use SQLite + Django.
        
        BAD:
        - if you have an older version of SQLite and want to run an app
        that uses distinct().count(), you have to do some sysadmin work
        (that is why the first version may be better -- at least the
        code runs, if not necessarily very efficiently).
        
(4) We back out this change.

        GOOD:
        - Works the same for everybody again.
        
        BAD:
        - QuerySets now give wrong results and behave
        counter-intuitively.
        
        - Some not uncommon situations, such as pagination, become
        problematic without manually coding SQL. A typical example is as
        in the test suite: articles can be assigned to multiple
        publications. You want to display all articles from a set of
        publications. So you need distinct(). Trying to work out the
        pagination, you obviously need distinct().count(), since count()
        over-counts the number of articles (every article that is in
        multiple magazines is counted multiple times) and would result
        in page breaks coming too early.

Cheers,
Malcolm


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers
-~----------~----~----~----~------~----~------~--~---

Reply via email to