On Wed, Jul 11, 2012 at 5:46 AM, Reindl Harald <h.rei...@thelounge.net> wrote:
> > mysql> show profiles;
> > +----------+------------+------------------------------------------------------------------------------------------+
> > | Query_ID | Duration   | Query                                                                                    |
> > +----------+------------+------------------------------------------------------------------------------------------+
> > |        1 | 0.32261700 | SELECT SQL_NO_CACHE * FROM cms1_quickbar_groups force key (qbq_key) ORDER BY qg_sort ASC |
> > |        2 | 0.24592100 | SELECT SQL_NO_CACHE * FROM cms1_quickbar_groups ORDER BY qg_sort ASC                     |
> > +----------+------------+------------------------------------------------------------------------------------------+
> > 2 rows in set (0.00 sec)
>
> this may be true for small data where it does not matter at all
> but if this would be a large table it would cause a lot of I/O

While I agree with you that Ewen's microbenchmark is not the most convincing, I do agree with his reasoning, and I encourage you to benchmark both options on your own dataset. Remember to flush both the OS disk cache and the mysql buffer pool between runs, so that your numbers actually reflect cold runs rather than partially warmed-up ones (there's a sketch of one way to do this at the end of this mail).

So why do I believe no index is faster for your particular query? A secondary index (qbq_key in your case) is essentially a list of [index key, page ID pointing to the tuple] pairs. If we answer your query through qbq_key, we avoid the sort, *but* we pay roughly one disk seek per index entry (I'm assuming here that qg_sort values have no strong correlation with qg_id, and that the table's pages aren't already in the buffer pool or OS disk cache).

Compare that with the plan that uses no index and therefore must do a filesort. mysql has an optimization (https://dev.mysql.com/doc/refman/5.5/en/order-by-optimization.html - read the "modified filesort algorithm" section) that sorts entire tuples rather than just [sort key, page ID] pairs, as long as the tuples fall under a threshold controlled by max_length_for_sort_data (1024 bytes by default, a limit your schema seems to fit under). With whole tuples in the sort, the filesort avoids most of the random disk seeks, since mergesort is a very sequential-I/O-heavy algorithm. Under this plan you end up doing several sequential passes over the entire table, whereas the index plan reads the table only once, albeit randomly. Most likely the optimizer has calculated that several sequential scans over the table are much faster than a pile of random disk seeks that read it once. In fact, I believe the general rule of thumb for DB optimizers is that if you need to read more than about 10% of a table through an index, you are better off with a sequential scan.

Like I said, I encourage you to measure so you can convince yourself that mysql is actually doing the right thing. A few concrete things to try:
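
First, a minimal sketch of getting a genuinely cold run on Linux. I'm assuming root access and a SysV-style init script named mysql here (adjust for your distro), and "your_db" is just a placeholder for your schema name:

  # stop mysqld so the InnoDB buffer pool (and MyISAM key cache) is discarded
  sudo /etc/init.d/mysql stop

  # write out dirty pages, then drop the OS page cache
  sync
  echo 3 | sudo tee /proc/sys/vm/drop_caches

  sudo /etc/init.d/mysql start

  # time a single cold run ("your_db" is a placeholder)
  time mysql your_db -e \
    "SELECT SQL_NO_CACHE * FROM cms1_quickbar_groups ORDER BY qg_sort ASC" \
    > /dev/null

Repeat the stop/drop/start cycle before each timed query, otherwise the second query always looks faster than it deserves to.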
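
Second, to confirm which plan the optimizer actually picks, and that your rows really do fall under the sort threshold (table, index, and column names taken from your earlier mails):

  -- 1024 by default on 5.5; tuples at or under this are sorted whole
  SHOW VARIABLES LIKE 'max_length_for_sort_data';

  -- no hint: expect "Using filesort" in the Extra column
  EXPLAIN SELECT * FROM cms1_quickbar_groups ORDER BY qg_sort ASC;

  -- forced index: expect key = qbq_key and no filesort in Extra
  EXPLAIN SELECT * FROM cms1_quickbar_groups FORCE KEY (qbq_key) ORDER BY qg_sort ASC;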
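
And if you want per-query timings like the ones Ewen pasted above, the 5.5 profiler gives you those plus a per-stage breakdown; something like:

  SET profiling = 1;

  SELECT SQL_NO_CACHE * FROM cms1_quickbar_groups FORCE KEY (qbq_key) ORDER BY qg_sort ASC;
  SELECT SQL_NO_CACHE * FROM cms1_quickbar_groups ORDER BY qg_sort ASC;

  -- wall-clock time per query, as in the table quoted above
  SHOW PROFILES;

  -- per-stage breakdown; "Sorting result" is the filesort step
  SHOW PROFILE FOR QUERY 2;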