From: "Scott Gifford" > >> Apparently MySQL's optimizer sees that it can use the primary key for > >> mirealsource_home_supplemental to do the query, but for some reason > >> decides not to. > > This is often the case when the query will probably return more than 30% of > > the records in that table. In such cases it is more efficient to do a full > > table scan (which is indicated here by the 'ALL' type in the explain > > output). > > Right, ALL would be a great plan if it weren't for the LIMIT 1.
The LIMIT 1 will be performed *after* the recordset is sorted :-( > > Furthermore MySQL can only use an index for sorting if all columns in the > > ORDER BY clause are from the first table in the explain output that doesn't > > have a 'const' join type. This is why setting the ORDER BY to > > mirealsource_homes_supplemental.mls_num will remove the 'Using filesort' and > > result in faster sorting. > > I'm a little surprised MySQL can't figure out on its own that it can > get the same effect by using mls_num from either table, since the > tables are joined on it, so the values will always be identical. You have two LEFT JOINs, so the values of mls_num might be something (the identical value you refer to) or NULL. So, you expect MySQL to evaluate the JOIN conditions, figure out that the ORDER BY column actually is the same as columns from other tables and see if one of these columns is actually the first non-const table in the execution path? I agree with you that 'using filesort' should be considered in the optimizer, but this requires quite a bit of knowledge on the part of the optimizer. (E.g. how many more rows should I allow in the estimated evaluation set in favour of sorting with an index? How should this relate to the sizes of various buffers, etc.? What about different engines (MyISAM and InnoDB store and handle indexes quite differently)?). > > MySQL tries to optimize queries by (among others) guestimating which path > > will result in the smallest number of records. It appears that this path > > with 100 * 8 * 8 * 8 (51200) records is the minimum size it can achieve. > > It looks to me like it's actually 100 * 1 * 1 * 1 = 100 (8 is the key > length), You're right, I looked in the wrong column. > which is the same as what I get when I force a STRAIGHT_JOIN. > So with two plans that will use the same number of records, I'm > surprised MySQL doesn't choose the one that will allow it to use an > index. There is no limiting condition whatsoever in the WHERE clause (except the JOIN condition), so it will try to estimate based on the cardinality and the number of records in the table which excecution path will require it to evaluate the smallest number of records. It will try to estimate whether the use of an index is appropriate or not. But I can imagine that it will favour a full table scan of a small table that will result in a small recordset, especially when it knows that the other, bigger tables are JOINed with only 1 or a few records per join. I do think that improvements on the optimizer should have a high priority for the product, but I guess that features such as stored procedures, triggers, MySQL Cluster will be more important for the larger, paying customers. With a bigger share in the paying market there will be more resources available to improve other parts of the product. > I know there are a couple of tricks like that to fix this one query. > What makes this hard is that that query is one of many that can be > generated by a CGI-to-SQL search gateway. If I put in a FORCE INDEX > (mls_num) and the user searches by price, for example, query > performance will be very bad. In order to know the right indexes to > force, as far as I can tell I'd have to implement my own optimizer, > which seems somewhat excessive. I'm about to do the same myself, although it's not a very high priority project right now. We see performance differences in queries with InnoDB tables which can be explained by variations in the cardinality estimates in InnoDB resulting in different execution paths. We will look into doing an EXPLAIN for certain generated queries and adding some USE INDEX clues to make sure the right execution path is chosen. This is part of a routine that generates many different queries based on a wide variety of search forms, predefined selections, etc. Most of the time MySQL does a great job with finding the 'right' execution path, but sometimes it goes horribly wrong. > > In this query you want the data where mls_num is as small as possible. Is > > there a way you can limit the number of records by using an extra where > > condition? This way you may change the order of the tables and make the > > query faster. > > I tried that using mirealsource_homes.mls_num in the WHERE clause and > it didn't make a difference. I tried to say that you could/should try to limit the number of records that are selected in the first place. E.g. we had a list of the ten most recently subscribed members. This was queried by selecting the active members, sort them on subscription date and limit it to ten. As more and more members are added the resulting recordset increases and sorting becomes very slow. When we were asked to optimize the query we figured out that each day a lot of new members were added to the site, so we would select the active members who joined in the last ten days, sort these and limit that to ten. This was a lot faster! (Yes, we checked whether it would result in at least ten records and if not, increase the number of days we would look into the past, but this is not likely to happen.) So, can you think of a limitation that would preselect a part of the mirealsource_homes.mls_num table? > Are there any other global things I can try, to tell MySQL to avoid > table scans? The queries almost always use at most LIMIT 10, so a > table scan is generally not the right idea A full table scan is the right idea as long as you select more than 30% of the table, sort it and after that apply the limit 10 (which is the only solution for such a query). Regards, Jigal. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]