On Wed, Mar 24, 2010 at 4:56 AM, Phlip <phlip2...@gmail.com> wrote:
> Djangoists:
>
> Suppose I want a test to fail if the database would respond to a given
> QuerySet without using my favorite index.
>
> This test case would attempt, at developer test time, to solve the
> same problem as a "soak test" would, at acceptance testing time. So
> the earlier a test fails - without soaking! - the better. Even if my
> favorite index works, today, an innocent-looking refactor, tomorrow,
> might somehow prevent the database from using it.
>
> To solve that problem (for tests that use MySQL), add an assertion
> like this:
>
>        self.assert_performant(Order.objects.all(),
> must_have='my_favorite_index')
>
> When that fails, it emits all this:
>
> AssertionError: SELECT `orders_order`.`id`, ... ORDER BY
> `orders_order`.`purchasedate` DESC
>
>  should contain `my_favorite_index` in EXPLAIN
>
> ((1L, u'SIMPLE', u'orders_order', u'ALL', None, None, None, None, 5L,
> u'Using filesort'),)
>
> The assertion converted Order.objects.all() into its SQL SELECT
> statement, and then ran EXPLAIN on this. Because 'my_favorite_index'
> did not appear anywhere in the results, the assertion faults, and
> reports all its constituent details.
>
> To deny that an EXPLAIN may contain a given string, such as the
> dreaded 'ALL' (indicating a query that hits every record in a table,
> no matter how long), use without='ALL'. The default behavior, with no
> arguments, is without='ALL'. And the assertion returns both the query
> and the explanation (as a tuple of tuples of tuples), for more
> arbitrarily complex assertions.
>
> The sauce:
>
>    def assert_performant(self, qs, without=None, must_have=None):
>        query = str(qs.query)
>        from django.db import connection, transaction
>        cursor = connection.cursor()
>        cursor.execute('EXPLAIN ' + query)
>        report = cursor.fetchall()
>        results = '` in EXPLAIN\n\n' + repr(report)
>        if not without and not must_have:  without = 'ALL'
>
>        if must_have:
>            self.assertTrue( must_have in repr(report), query + '\n\n
> should contain `' +
>                              must_have + results)
>        if without:
>            self.assertFalse( without in repr(report), query + '\n\n
> should not contain `' +
>                               without + results )
>
>        return (query, report)
>
> Warning: Developer tests should run with short, suite fixtures.
> However, a database may see fit to skip and index and read a whole
> table, if it fits in memory. Oh, and even worse, developer tests
> should use sqlite3, to permit TDD speed, and its EXPLAIN command
> returns ... something different.
>
> Other than that, I have used this technique, copiously, on other
> projects, and whenever that assertion failed it indeed saved us from
> accidentally deploying a slow SQL statement to our server!

Hi Phlip,

Interesting idea! Thanks for sharing.

I've just spent a bunch of time hunting down a performance regression
that was caused by someone tweaking a database index, so I'd love to
see something like this in trunk.

A couple of quick comments:

 * You're using str(qs.query) to get the contents of the query. This
will work fine when you are only using a single database, but in the
general case, you should be using
qs.query.get_compiler(qs.db).as_sql(). This will ensure that the SQL
that is EXPLAINed is exactly the same as the query that is issued,
including database specific modifications.

 * Similarly, connection.cursor() will only work in the single
database case. connections[qs.db].cursor() will guarantee you explain
the query on the right database.

 * The code you provide will work for MySQL, but it won't work for
other databases. Postgres uses a different format for describing query
plans; SQLite parses explain statements, but doesn't actually do
anything with the request. I don't know what Oracle does, but I
guarantee it won't be pleasant ;-) I appreciate that MySQL is your
personal priority, but it's worth noting for readers who might be
hoping to use this technique on other databases.

 * An assertion named "performant" will get into trunk over my cold,
dead body :-) People: performant is not a word. Or, at least, it is,
but not one that means what you think it means.

[1] http://boulter.com/blog/2004/08/19/performant-is-not-a-word/

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Reply via email to