[sqlalchemy] Re: Optimizing a slow query
On Jun 8, 5:09 am, beewee [EMAIL PROTECTED] wrote: Hi, thanks for your answers. Other improvements would include (as previously stated by Michael) would be to make sure you have indexed all of the items in your WHERE, ORDER BY, and ON clauses. I created this index: create index viewforum on forum_post (topic_id, id); Is this right? As a quick check, try reducing the 140,000 offset to 0 If I do so, the query is really fast (0.01 sec), but with a big offset it still takes more than 10 seconds :/ This is what EXPLAIN says now:http://paste.pocoo.org/show/64838/ I executed OPTIMIZE TABLE forum_post but it didn't speed up the queries. That narrows the problem down to the sub-select then, since this select will only return 15 rows, but it has to run through a minimum of 140,015 rows to get those 15. To verify this is the issue, try doing the sub-select by itself and doing and explain on it to verify that is the issue. Simplified query: explain SELECT ... FROM (SELECT ... FROM forum_post WHERE 9250 = forum_post.topic_id ORDER BY forum_post.id LIMIT 14, 15) AS anon_1 LEFT OUTER JOIN ... ON anon_1.forum_post_id = forum_attachment_1.post_id LEFT OUTER JOIN ... ON anon_1.forum_post_author_id = portal_user_1.id ORDER BY anon_1.forum_post_id, forum_attachment_1.id, portal_user_1.id; Try to do just the sub-select: SELECT ... FROM forum_post WHERE 9250 = forum_post.topic_id ORDER BY forum_post.id LIMIT 14, 15; and an explain on it: SELECT ... FROM forum_post WHERE 9250 = forum_post.topic_id ORDER BY forum_post.id LIMIT 14, 15 Doing a quick sanity check on my system, a similar query to this subselect with a LIMIT 14, 15 takes about 2 seconds after a server restart and about 0.750 seconds once the indexes have been cached in memory. This is on an old 2.8 GHz Xeon with Hyperthreading with a software RAID filesystem that runs about 150 MB/s and 4 GB of memory for mysql with a dabase index size of 8 GB. So really, your 10-second query time is only 5 times slower which could be explained by not having enough memory devoted to the correct MySQL buffers, so MySQL starts having to wait on the hard drive. Take a look at some tuning parameters that depend upon the database engine that you are using. If you're still out of ideas, consider putting a timestamp on the records and indexing the timestamp. You can then do a query based upon a time range which I know is fast since I do it all the time on a 750-million record stock database. Can you explain this a little bit more? The forum_post table already has a time stamp (pub_date = the date when the post was written), but I have no idea how to filter the records using this column to just get the posts of a specific page of a specific topic. If you index the timestamp, then you can provide this as a filter. For example, in your sub-select, if you limit it to posts to the topic within the last 30 days, then your sub-select query could easily only have to deal with 5,000 records instead of 140,000. For example: From: SELECT ... FROM forum_post WHERE 9250 = forum_post.topic_id ORDER BY forum_post.id LIMIT 14, 15 To: SELECT ... FROM forum_post WHERE 9250 = forum_post.topic_id AND pub_date = DATE_SUB(NOW(), INTERVAL 31 DAY) ORDER BY forum_post.id LIMIT 15 This should greatly limit the number of rows that the database must deal with internally. Also, keep in mind that it is very important that you make sure that the DATE_SUB calculation can be done just once, and not for every row. For example, pub_date = DATE_SUB(NOW(), INTERVAL 31 DAY) will be very fast, but TO_DAYS(pub_date) - TO_DAYS(NOW()) 31 will be extremely slow since the database will have to do the calculation for every row. Regards, Eric --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
Hi, thanks for your answers. Other improvements would include (as previously stated by Michael) would be to make sure you have indexed all of the items in your WHERE, ORDER BY, and ON clauses. I created this index: create index viewforum on forum_post (topic_id, id); Is this right? As a quick check, try reducing the 140,000 offset to 0 If I do so, the query is really fast (0.01 sec), but with a big offset it still takes more than 10 seconds :/ This is what EXPLAIN says now: http://paste.pocoo.org/show/64838/ I executed OPTIMIZE TABLE forum_post but it didn't speed up the queries. If you're still out of ideas, consider putting a timestamp on the records and indexing the timestamp. You can then do a query based upon a time range which I know is fast since I do it all the time on a 750-million record stock database. Can you explain this a little bit more? The forum_post table already has a time stamp (pub_date = the date when the post was written), but I have no idea how to filter the records using this column to just get the posts of a specific page of a specific topic. Benjamin Wiegand On 7 Jun., 16:22, EricHolmberg [EMAIL PROTECTED] wrote: It looks like your sub-select (before the joins) is processing up to 140,015 records, so that will slow things down since the database may not optimize that sub-selection based upon your outer joins. As a quick check, try reducing the 140,000 offset to 0 (I know this won't work for your application, but do this as a sanity check) in your limit statement. If that works, then try refactoring your database (potentially with a timestamp like I explain later). subselect: (... WHERE forum_post.topic_id = 9250 ORDER BY forum_post.id LIMIT 14, 15) Other improvements would include (as previously stated by Michael) would be to make sure you have indexed all of the items in your WHERE, ORDER BY, and ON clauses. Also, check to see if you can do arrange your web page such that you can reduce the amount of information required initially which will reduce your query time. Really, 198,398 records is pretty small, so even with a filesort, I wouldn't expect this to take very long. Have you double-checked that your server's file system is running okay, that your DB tables aren't fragemented, etc? You can handle most of this optimization by doing an OPTIMIZE table_name; query in MySQL. If you're still out of ideas, consider putting a timestamp on the records and indexing the timestamp. You can then do a query based upon a time range which I know is fast since I do it all the time on a 750-million record stock database. -Eric --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
Hi, create index viewforum on forum_post (topic_id, id); You probably want: create index forum_post_topic_id on forum_post (topic_id); Paul --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
create index viewforum on forum_post (topic_id, id); You probably want: create index forum_post_topic_id on forum_post (topic_id); are you sure? A key on topic_id and id makes my query much faster for low offsets, while without it the query takes even for low offsets 10 seconds. Benjamin Wiegand --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
Hi, create index forum_post_topic_id on forum_post (topic_id); are you sure? A key on topic_id and id makes my query much faster for low offsets, while without it the query takes even for low offsets 10 seconds. I didn't follow the beginning of this thread, so I'm not sure exactly your query, but if you're searching on topic_id, I'd expect the index I suggested to help (although I'm not sure how much), and the combined index you mentioned to make almost no difference. Did you try the one I suggested? But from your results, I wonder if the slowness is caused by something else, not particularly by the lack of an index. Paul --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
It looks like your sub-select (before the joins) is processing up to 140,015 records, so that will slow things down since the database may not optimize that sub-selection based upon your outer joins. As a quick check, try reducing the 140,000 offset to 0 (I know this won't work for your application, but do this as a sanity check) in your limit statement. If that works, then try refactoring your database (potentially with a timestamp like I explain later). subselect: (... WHERE forum_post.topic_id = 9250 ORDER BY forum_post.id LIMIT 14, 15) Other improvements would include (as previously stated by Michael) would be to make sure you have indexed all of the items in your WHERE, ORDER BY, and ON clauses. Also, check to see if you can do arrange your web page such that you can reduce the amount of information required initially which will reduce your query time. Really, 198,398 records is pretty small, so even with a filesort, I wouldn't expect this to take very long. Have you double-checked that your server's file system is running okay, that your DB tables aren't fragemented, etc? You can handle most of this optimization by doing an OPTIMIZE table_name; query in MySQL. If you're still out of ideas, consider putting a timestamp on the records and indexing the timestamp. You can then do a query based upon a time range which I know is fast since I do it all the time on a 750-million record stock database. -Eric --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
Hi, I merged the two tables into a single one, changed the mapping but the query still needs more than 10 seconds :/ That's how EXPLAIN looks like now: http://paste.pocoo.org/show/63542/ I think the problem is this one: | 2 | DERIVED | forum_post | ALL| forum_post_topic_id | forum_post_topic_id | 4 | NULL| 198398 | Using filesort | do you have any idea, why mysql is using filesort for this and how I can get mysql not to search through 198398 rows? These are the indexes of forum_post: http://paste.pocoo.org/show/63543/ Benjamin Wiegand On 5 Jun., 20:33, Michael Bayer [EMAIL PROTECTED] wrote: for starters I'd combine post_table and post_text_table into onenot much is accomplished there by having two tables. Also make sure forum_post.topic_id is indexed. On Jun 5, 2008, at 1:37 PM, beewee wrote: Hi, we're writing a bulleting board using sqlalchemy at the moment, but we have the problem, that the database query for viewing a topic is quite slow for big topics. These are the relevant table definitions and mappings: http://paste.pocoo.org/show/62703/ This is the query that is slow:http://paste.pocoo.org/show/62706/ This is what EXPLAIN says:http://paste.pocoo.org/show/62708/ Executing this query needs up to 25 seconds on our test server that's just idling. On our productive servers (which aren't idling, of course) phpbb is able to execute an adequate query much faster. May you can help us speeding up this query? I don't know what to improve, since all parts of the query already use a key (except the derived one), but unfortunately I have a quite small knowledge of improving database queries. Thank you very much, Benjamin Wiegand --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Optimizing a slow query
for starters I'd combine post_table and post_text_table into onenot much is accomplished there by having two tables. Also make sure forum_post.topic_id is indexed. On Jun 5, 2008, at 1:37 PM, beewee wrote: Hi, we're writing a bulleting board using sqlalchemy at the moment, but we have the problem, that the database query for viewing a topic is quite slow for big topics. These are the relevant table definitions and mappings: http://paste.pocoo.org/show/62703/ This is the query that is slow: http://paste.pocoo.org/show/62706/ This is what EXPLAIN says: http://paste.pocoo.org/show/62708/ Executing this query needs up to 25 seconds on our test server that's just idling. On our productive servers (which aren't idling, of course) phpbb is able to execute an adequate query much faster. May you can help us speeding up this query? I don't know what to improve, since all parts of the query already use a key (except the derived one), but unfortunately I have a quite small knowledge of improving database queries. Thank you very much, Benjamin Wiegand --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---