subject:"assert_performant\(query

assert_performant(query_set)

2010-03-23 Thread Phlip

Djangoists:

Suppose I want a test to fail if the database would respond to a given
QuerySet without using my favorite index.

This test case would attempt, at developer test time, to solve the
same problem as a "soak test" would, at acceptance testing time. So
the earlier a test fails - without soaking! - the better. Even if my
favorite index works, today, an innocent-looking refactor, tomorrow,
might somehow prevent the database from using it.

To solve that problem (for tests that use MySQL), add an assertion
like this:

self.assert_performant(Order.objects.all(),
must_have='my_favorite_index')

When that fails, it emits all this:

AssertionError: SELECT `orders_order`.`id`, ... ORDER BY
`orders_order`.`purchasedate` DESC

 should contain `my_favorite_index` in EXPLAIN

((1L, u'SIMPLE', u'orders_order', u'ALL', None, None, None, None, 5L,
u'Using filesort'),)

The assertion converted Order.objects.all() into its SQL SELECT
statement, and then ran EXPLAIN on this. Because 'my_favorite_index'
did not appear anywhere in the results, the assertion faults, and
reports all its constituent details.

To deny that an EXPLAIN may contain a given string, such as the
dreaded 'ALL' (indicating a query that hits every record in a table,
no matter how long), use without='ALL'. The default behavior, with no
arguments, is without='ALL'. And the assertion returns both the query
and the explanation (as a tuple of tuples of tuples), for more
arbitrarily complex assertions.

The sauce:

def assert_performant(self, qs, without=None, must_have=None):
query = str(qs.query)
from django.db import connection, transaction
cursor = connection.cursor()
cursor.execute('EXPLAIN ' + query)
report = cursor.fetchall()
results = '` in EXPLAIN\n\n' + repr(report)
if not without and not must_have:  without = 'ALL'

if must_have:
self.assertTrue( must_have in repr(report), query + '\n\n
should contain `' +
  must_have + results)
if without:
self.assertFalse( without in repr(report), query + '\n\n
should not contain `' +
   without + results )

return (query, report)

Warning: Developer tests should run with short, suite fixtures.
However, a database may see fit to skip and index and read a whole
table, if it fits in memory. Oh, and even worse, developer tests
should use sqlite3, to permit TDD speed, and its EXPLAIN command
returns ... something different.

Other than that, I have used this technique, copiously, on other
projects, and whenever that assertion failed it indeed saved us from
accidentally deploying a slow SQL statement to our server!

--
  Phlip
  http://zeekland.zeroplayer.com/Pigleg_Too/1

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-23 Thread Russell Keith-Magee

On Wed, Mar 24, 2010 at 4:56 AM, Phlip  wrote:
> Djangoists:
>
> Suppose I want a test to fail if the database would respond to a given
> QuerySet without using my favorite index.
>
> This test case would attempt, at developer test time, to solve the
> same problem as a "soak test" would, at acceptance testing time. So
> the earlier a test fails - without soaking! - the better. Even if my
> favorite index works, today, an innocent-looking refactor, tomorrow,
> might somehow prevent the database from using it.
>
> To solve that problem (for tests that use MySQL), add an assertion
> like this:
>
>        self.assert_performant(Order.objects.all(),
> must_have='my_favorite_index')
>
> When that fails, it emits all this:
>
> AssertionError: SELECT `orders_order`.`id`, ... ORDER BY
> `orders_order`.`purchasedate` DESC
>
>  should contain `my_favorite_index` in EXPLAIN
>
> ((1L, u'SIMPLE', u'orders_order', u'ALL', None, None, None, None, 5L,
> u'Using filesort'),)
>
> The assertion converted Order.objects.all() into its SQL SELECT
> statement, and then ran EXPLAIN on this. Because 'my_favorite_index'
> did not appear anywhere in the results, the assertion faults, and
> reports all its constituent details.
>
> To deny that an EXPLAIN may contain a given string, such as the
> dreaded 'ALL' (indicating a query that hits every record in a table,
> no matter how long), use without='ALL'. The default behavior, with no
> arguments, is without='ALL'. And the assertion returns both the query
> and the explanation (as a tuple of tuples of tuples), for more
> arbitrarily complex assertions.
>
> The sauce:
>
>    def assert_performant(self, qs, without=None, must_have=None):
>        query = str(qs.query)
>        from django.db import connection, transaction
>        cursor = connection.cursor()
>        cursor.execute('EXPLAIN ' + query)
>        report = cursor.fetchall()
>        results = '` in EXPLAIN\n\n' + repr(report)
>        if not without and not must_have:  without = 'ALL'
>
>        if must_have:
>            self.assertTrue( must_have in repr(report), query + '\n\n
> should contain `' +
>                              must_have + results)
>        if without:
>            self.assertFalse( without in repr(report), query + '\n\n
> should not contain `' +
>                               without + results )
>
>        return (query, report)
>
> Warning: Developer tests should run with short, suite fixtures.
> However, a database may see fit to skip and index and read a whole
> table, if it fits in memory. Oh, and even worse, developer tests
> should use sqlite3, to permit TDD speed, and its EXPLAIN command
> returns ... something different.
>
> Other than that, I have used this technique, copiously, on other
> projects, and whenever that assertion failed it indeed saved us from
> accidentally deploying a slow SQL statement to our server!

Hi Phlip,

Interesting idea! Thanks for sharing.

I've just spent a bunch of time hunting down a performance regression
that was caused by someone tweaking a database index, so I'd love to
see something like this in trunk.

A couple of quick comments:

 * You're using str(qs.query) to get the contents of the query. This
will work fine when you are only using a single database, but in the
general case, you should be using
qs.query.get_compiler(qs.db).as_sql(). This will ensure that the SQL
that is EXPLAINed is exactly the same as the query that is issued,
including database specific modifications.

 * Similarly, connection.cursor() will only work in the single
database case. connections[qs.db].cursor() will guarantee you explain
the query on the right database.

 * The code you provide will work for MySQL, but it won't work for
other databases. Postgres uses a different format for describing query
plans; SQLite parses explain statements, but doesn't actually do
anything with the request. I don't know what Oracle does, but I
guarantee it won't be pleasant ;-) I appreciate that MySQL is your
personal priority, but it's worth noting for readers who might be
hoping to use this technique on other databases.

 * An assertion named "performant" will get into trunk over my cold,
dead body :-) People: performant is not a word. Or, at least, it is,
but not one that means what you think it means.

[1] http://boulter.com/blog/2004/08/19/performant-is-not-a-word/

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-24 Thread Phlip

Russell Keith-Magee wrote:

> Interesting idea! Thanks for sharing.

http://www.oreillynet.com/ruby/blog/2008/01/assert_efficient_sql.html

> I've just spent a bunch of time hunting down a performance regression
> that was caused by someone tweaking a database index, so I'd love to
> see something like this in trunk.

Thanks for having 1 point more enthusiasm than me! The assertion only
works for MySQL (or anything with a similar EXPLAIN result), and all
TDD tests "should" run in SQLite3, for speed. (At my day gig I really
ought to get off my lazy butt and configure our integration test
server to run in MySQL instead of SQLite3, and then any MySQL-specific
tests would run inside if statements. The closer to the production
environment the better.)

> A couple of quick comments:
>
>  * You're using str(qs.query) to get the contents of the query. This
> will work fine when you are only using a single database, but in the
> general case, you should be using
> qs.query.get_compiler(qs.db).as_sql(). This will ensure that the SQL
> that is EXPLAINed is exactly the same as the query that is issued,
> including database specific modifications.

Ah, that's why I asked that in the prequel post, then gave up the
instant I discovered qs.query. And you are indeed correct, because,
for example, a harmless-looking filter like .filter(status='P') was
coming out SELECT ... status = P, without quotes on the P. I was about
to ask about that; tx for the answer.

>  * An assertion named "performant" will get into trunk over my cold,
> dead body :-) People: performant is not a word. Or, at least, it is,
> but not one that means what you think it means.

Ah, hence the word is free for anyone to use on any code that won't be
submitting to the trunk - thanks! And the pseudo-word "Performant"
fits into a dead-spot where there's no other word, so we are back to
the classic dispute between the Prescriptive and the Descriptive
dictionary. Swell.

(Inside joke: "Swell" was formerly proscribed...)

More importantly, the assertion does _not_ check we are perform...ance-
oriented, it checks we are _compliant_ to some DBA standard.
Performance checks are for soak tests. So try this:

  query, report = self.assert_explain_sql( qs,
mysql=dict(index='frob'),

sqlite=dict(whatever='42'),

bamboozql=dict(without='ALL') )

  self.assert_contains('ORDER BY foot_size', query)

Now gaze in awe at that last assertion. If we wanted to test that we
have a correct .order() builder, we might write a test that pushes
records out-of-order, then queries the database and checks the records
are in the right order. That kind of test smacks of brute-force, and
the database might accidentally fool the test, and put the records
into order for some other reason.

Because we are really testing if our calls to the QuerySet builder
were correct, we will pin down their behavior if we parse back the
intermediate value - the actual SELECT statement, and then interrogate
it.

Oh, yeah, and such a test is also perform...ing. Well. Because it
doesn't build records or read them.

I will see when I have time to code that up...

--
  Phlip
  http://c2.com/cgi/wiki?MoreliaViridis

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-29 Thread Phlip

> qs.query.get_compiler(qs.db).as_sql()

That gives this error message with the usual zero Google hits for that
error message:

  AttributeError: 'BaseQuery' object has no attribute 'get_compiler'

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-29 Thread Phlip

>   AttributeError: 'BaseQuery' object has no attribute 'get_compiler'

qs._as_sql() returns a tuple of a SELECT statement, and a (),
presumably with wildcards; I will start there.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-29 Thread Phlip

>  * Similarly, connection.cursor() will only work in the single
> database case. connections[qs.db].cursor()

Aaaand that's not working either. (No connections, no qs.db.)

I must take a break until someone unravels all this, because I don't
know enough about the architecture inside QuerySet...

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-29 Thread Russell Keith-Magee

On Tue, Mar 30, 2010 at 12:05 AM, Phlip  wrote:
>> qs.query.get_compiler(qs.db).as_sql()
>
> That gives this error message with the usual zero Google hits for that
> error message:
>
>  AttributeError: 'BaseQuery' object has no attribute 'get_compiler'

Sorry - a point of clarification. the technique I gave you is 1.2
specific. There's no analog under 1.1, because 1.1. doesn't support
multiple databases.

I presumed you were using 1.2 because your original post mentioned
using str(qs.query), rather than qs.as_sql().

Yours,
Russ Magee %_)

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Re: assert_performant(query_set)

2010-03-29 Thread Phlip

Russell Keith-Magee wrote:

> Sorry - a point of clarification. the technique I gave you is 1.2
> specific. There's no analog under 1.1, because 1.1. doesn't support
> multiple databases.

That explains a lot; tx.

My current impediment - probably nothing to do with Django - is ...

cursor.execute('EXPLAIN ' + query)

...seems to scramble a sqlite3 database. If I take the EXPLAIN out I
don't get unrelated test failures (tho the new assertion naturally
does not work!).

I will make it work for MySQL!

--
  Phlip

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

assert_performant(query_set)

Re: assert_performant(query_set)

Re: assert_performant(query_set)

Re: assert_performant(query_set)

Re: assert_performant(query_set)

Re: assert_performant(query_set)

Re: assert_performant(query_set)

Re: assert_performant(query_set)

8 matches

Site Navigation

Mail list logo

Footer information