"Robert DiFalco" <[EMAIL PROTECTED]> wrote on 03/09/2006 12:32:44 PM:
> I have some questions regarding InnoDB indices. > > Say I have a table with millions of records. On of the fields is a type > field that has a possible value of 1,2,3, or 4. I sometimes query by the > type field and may at other times order on it. > > Do queries benefit from an index with this low of a selectivity? > > If through the query browser I find that they do, should I increase the > selectivity of the index by making it a compound index and adding my > Primary Key to the TYPE index? > > If I make it a compound index, does it have any positive impact on > INSERT, UPDATE, or maybe just DELETE? > > R. > > Hi Robert, I assume you have already read through the entire optimization section in the manual: http://dev.mysql.com/doc/refman/4.1/en/optimization.html -or- http://dev.mysql.com/doc/refman/5.0/en/optimization.html (as appropriate) as this entire chapter deals with the finer details of the questions you are asking. So I will try to just answer you in the general sense so that perhaps you can make better sense of what you already read. One of the most frequent bottlenecks to query performance is related to physically retrieving data from the hard disks. Indexes, though very useful, actually slow down performance if you need to pull more than about 1/3 of any table's rows off of the disks. Columns of data not already included in the indexes themselves must be retrieved from the disk before their values can be used as part of a result or a part of a comparison or as part of a formula. What happens to the performance is that it begins to take longer (mostly due to the random access disk seek operations) to pick lots of individual records (based on an index hit) that it would have taken if you had just found the beginning of the table data and streamed the whole table through memory in one big burst of data. Now, because the indexes to a table are loaded into memory before query evaluation the optimizer can estimate how many rows of a table it would find if it used one index over another (and starting with 5.0 how many it would find based on certain combinations of indexes). If all of the data you need from a table is actually part of an index then the entire "read data from disk" portion of the query can also be skipped (under the right conditions). So having what is known as a "covering index" (a multi-column index where some of the right-most columns are listed mostly to avoid actual table reads) can seriously improve the performance of certain queries while providing a "normal" boost to the performance of a bunch of others. How you execute your queries should not matter. By the time the database server sees it, one query looks just like every other. So I don't see how running a query in the Query Browser would be any faster than if you had issued the same query from the CLI or via any of the other connection APIs. Declaring compound indexes make sense if your query patterns frequently use those columns or if you are trying to create a covering index. Improving the cardinality of any index can only help its performance. The up-side to indexes (keys) is that they improve the chances of quickly finding the data you are looking for thereby improving your overall query performance. The down-side is that the addition of every new record has to create the appropriate additions to every index on the table. The same goes for UPDATES and DELETES. You have to benchmark your performance in your environment to figure out when enough indexes becomes too many for your comfort. Each new index also takes up space in memory and room on the disk so too many indexes can starve your system for resources, too. Like I said above, these are just some general guidelines. The nitty gritty can be found in the manual. Shawn Green Database Administrator Unimin Corporation - Spruce Pine