assert_performant(query_set)
Djangoists: Suppose I want a test to fail if the database would respond to a given QuerySet without using my favorite index. This test case would attempt, at developer test time, to solve the same problem as a "soak test" would, at acceptance testing time. So the earlier a test fails - without soaking! - the better. Even if my favorite index works, today, an innocent-looking refactor, tomorrow, might somehow prevent the database from using it. To solve that problem (for tests that use MySQL), add an assertion like this: self.assert_performant(Order.objects.all(), must_have='my_favorite_index') When that fails, it emits all this: AssertionError: SELECT `orders_order`.`id`, ... ORDER BY `orders_order`.`purchasedate` DESC should contain `my_favorite_index` in EXPLAIN ((1L, u'SIMPLE', u'orders_order', u'ALL', None, None, None, None, 5L, u'Using filesort'),) The assertion converted Order.objects.all() into its SQL SELECT statement, and then ran EXPLAIN on this. Because 'my_favorite_index' did not appear anywhere in the results, the assertion faults, and reports all its constituent details. To deny that an EXPLAIN may contain a given string, such as the dreaded 'ALL' (indicating a query that hits every record in a table, no matter how long), use without='ALL'. The default behavior, with no arguments, is without='ALL'. And the assertion returns both the query and the explanation (as a tuple of tuples of tuples), for more arbitrarily complex assertions. The sauce: def assert_performant(self, qs, without=None, must_have=None): query = str(qs.query) from django.db import connection, transaction cursor = connection.cursor() cursor.execute('EXPLAIN ' + query) report = cursor.fetchall() results = '` in EXPLAIN\n\n' + repr(report) if not without and not must_have: without = 'ALL' if must_have: self.assertTrue( must_have in repr(report), query + '\n\n should contain `' + must_have + results) if without: self.assertFalse( without in repr(report), query + '\n\n should not contain `' + without + results ) return (query, report) Warning: Developer tests should run with short, suite fixtures. However, a database may see fit to skip and index and read a whole table, if it fits in memory. Oh, and even worse, developer tests should use sqlite3, to permit TDD speed, and its EXPLAIN command returns ... something different. Other than that, I have used this technique, copiously, on other projects, and whenever that assertion failed it indeed saved us from accidentally deploying a slow SQL statement to our server! -- Phlip http://zeekland.zeroplayer.com/Pigleg_Too/1 -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
On Wed, Mar 24, 2010 at 4:56 AM, Phlip wrote: > Djangoists: > > Suppose I want a test to fail if the database would respond to a given > QuerySet without using my favorite index. > > This test case would attempt, at developer test time, to solve the > same problem as a "soak test" would, at acceptance testing time. So > the earlier a test fails - without soaking! - the better. Even if my > favorite index works, today, an innocent-looking refactor, tomorrow, > might somehow prevent the database from using it. > > To solve that problem (for tests that use MySQL), add an assertion > like this: > > self.assert_performant(Order.objects.all(), > must_have='my_favorite_index') > > When that fails, it emits all this: > > AssertionError: SELECT `orders_order`.`id`, ... ORDER BY > `orders_order`.`purchasedate` DESC > > should contain `my_favorite_index` in EXPLAIN > > ((1L, u'SIMPLE', u'orders_order', u'ALL', None, None, None, None, 5L, > u'Using filesort'),) > > The assertion converted Order.objects.all() into its SQL SELECT > statement, and then ran EXPLAIN on this. Because 'my_favorite_index' > did not appear anywhere in the results, the assertion faults, and > reports all its constituent details. > > To deny that an EXPLAIN may contain a given string, such as the > dreaded 'ALL' (indicating a query that hits every record in a table, > no matter how long), use without='ALL'. The default behavior, with no > arguments, is without='ALL'. And the assertion returns both the query > and the explanation (as a tuple of tuples of tuples), for more > arbitrarily complex assertions. > > The sauce: > > def assert_performant(self, qs, without=None, must_have=None): > query = str(qs.query) > from django.db import connection, transaction > cursor = connection.cursor() > cursor.execute('EXPLAIN ' + query) > report = cursor.fetchall() > results = '` in EXPLAIN\n\n' + repr(report) > if not without and not must_have: without = 'ALL' > > if must_have: > self.assertTrue( must_have in repr(report), query + '\n\n > should contain `' + > must_have + results) > if without: > self.assertFalse( without in repr(report), query + '\n\n > should not contain `' + > without + results ) > > return (query, report) > > Warning: Developer tests should run with short, suite fixtures. > However, a database may see fit to skip and index and read a whole > table, if it fits in memory. Oh, and even worse, developer tests > should use sqlite3, to permit TDD speed, and its EXPLAIN command > returns ... something different. > > Other than that, I have used this technique, copiously, on other > projects, and whenever that assertion failed it indeed saved us from > accidentally deploying a slow SQL statement to our server! Hi Phlip, Interesting idea! Thanks for sharing. I've just spent a bunch of time hunting down a performance regression that was caused by someone tweaking a database index, so I'd love to see something like this in trunk. A couple of quick comments: * You're using str(qs.query) to get the contents of the query. This will work fine when you are only using a single database, but in the general case, you should be using qs.query.get_compiler(qs.db).as_sql(). This will ensure that the SQL that is EXPLAINed is exactly the same as the query that is issued, including database specific modifications. * Similarly, connection.cursor() will only work in the single database case. connections[qs.db].cursor() will guarantee you explain the query on the right database. * The code you provide will work for MySQL, but it won't work for other databases. Postgres uses a different format for describing query plans; SQLite parses explain statements, but doesn't actually do anything with the request. I don't know what Oracle does, but I guarantee it won't be pleasant ;-) I appreciate that MySQL is your personal priority, but it's worth noting for readers who might be hoping to use this technique on other databases. * An assertion named "performant" will get into trunk over my cold, dead body :-) People: performant is not a word. Or, at least, it is, but not one that means what you think it means. [1] http://boulter.com/blog/2004/08/19/performant-is-not-a-word/ Yours, Russ Magee %-) -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
Russell Keith-Magee wrote: > Interesting idea! Thanks for sharing. http://www.oreillynet.com/ruby/blog/2008/01/assert_efficient_sql.html > I've just spent a bunch of time hunting down a performance regression > that was caused by someone tweaking a database index, so I'd love to > see something like this in trunk. Thanks for having 1 point more enthusiasm than me! The assertion only works for MySQL (or anything with a similar EXPLAIN result), and all TDD tests "should" run in SQLite3, for speed. (At my day gig I really ought to get off my lazy butt and configure our integration test server to run in MySQL instead of SQLite3, and then any MySQL-specific tests would run inside if statements. The closer to the production environment the better.) > A couple of quick comments: > > * You're using str(qs.query) to get the contents of the query. This > will work fine when you are only using a single database, but in the > general case, you should be using > qs.query.get_compiler(qs.db).as_sql(). This will ensure that the SQL > that is EXPLAINed is exactly the same as the query that is issued, > including database specific modifications. Ah, that's why I asked that in the prequel post, then gave up the instant I discovered qs.query. And you are indeed correct, because, for example, a harmless-looking filter like .filter(status='P') was coming out SELECT ... status = P, without quotes on the P. I was about to ask about that; tx for the answer. > * An assertion named "performant" will get into trunk over my cold, > dead body :-) People: performant is not a word. Or, at least, it is, > but not one that means what you think it means. Ah, hence the word is free for anyone to use on any code that won't be submitting to the trunk - thanks! And the pseudo-word "Performant" fits into a dead-spot where there's no other word, so we are back to the classic dispute between the Prescriptive and the Descriptive dictionary. Swell. (Inside joke: "Swell" was formerly proscribed...) More importantly, the assertion does _not_ check we are perform...ance- oriented, it checks we are _compliant_ to some DBA standard. Performance checks are for soak tests. So try this: query, report = self.assert_explain_sql( qs, mysql=dict(index='frob'), sqlite=dict(whatever='42'), bamboozql=dict(without='ALL') ) self.assert_contains('ORDER BY foot_size', query) Now gaze in awe at that last assertion. If we wanted to test that we have a correct .order() builder, we might write a test that pushes records out-of-order, then queries the database and checks the records are in the right order. That kind of test smacks of brute-force, and the database might accidentally fool the test, and put the records into order for some other reason. Because we are really testing if our calls to the QuerySet builder were correct, we will pin down their behavior if we parse back the intermediate value - the actual SELECT statement, and then interrogate it. Oh, yeah, and such a test is also perform...ing. Well. Because it doesn't build records or read them. I will see when I have time to code that up... -- Phlip http://c2.com/cgi/wiki?MoreliaViridis -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
> qs.query.get_compiler(qs.db).as_sql() That gives this error message with the usual zero Google hits for that error message: AttributeError: 'BaseQuery' object has no attribute 'get_compiler' -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
> AttributeError: 'BaseQuery' object has no attribute 'get_compiler' qs._as_sql() returns a tuple of a SELECT statement, and a (), presumably with wildcards; I will start there. -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
> * Similarly, connection.cursor() will only work in the single > database case. connections[qs.db].cursor() Aaaand that's not working either. (No connections, no qs.db.) I must take a break until someone unravels all this, because I don't know enough about the architecture inside QuerySet... -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
On Tue, Mar 30, 2010 at 12:05 AM, Phlip wrote: >> qs.query.get_compiler(qs.db).as_sql() > > That gives this error message with the usual zero Google hits for that > error message: > > AttributeError: 'BaseQuery' object has no attribute 'get_compiler' Sorry - a point of clarification. the technique I gave you is 1.2 specific. There's no analog under 1.1, because 1.1. doesn't support multiple databases. I presumed you were using 1.2 because your original post mentioned using str(qs.query), rather than qs.as_sql(). Yours, Russ Magee %_) -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: assert_performant(query_set)
Russell Keith-Magee wrote: > Sorry - a point of clarification. the technique I gave you is 1.2 > specific. There's no analog under 1.1, because 1.1. doesn't support > multiple databases. That explains a lot; tx. My current impediment - probably nothing to do with Django - is ... cursor.execute('EXPLAIN ' + query) ...seems to scramble a sqlite3 database. If I take the EXPLAIN out I don't get unrelated test failures (tho the new assertion naturally does not work!). I will make it work for MySQL! -- Phlip -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.