> On Sunday, 25 November, 2012, 11:58, Jay A. Kreibich wrote: > each column is usually undesirable. A given SELECT can usually only > use one index per query (or sub-query), so it rarely makes sense to > stack up the indexes... adding unused indexes only slows down > insert/update/deletes, as well as makes the file size much larger.
Generally speaking indexes should be treated as a performance trade-off. When you create an index (other than a UNIQUE index used to enforce a constraint, or an index on a parent or child key in a foreign-key relationship, where such an index may greatly increase INSERT or UPDATE performance) you are "moving" execution time from the retrieval processing to the maintenance processing of your application. When you add an index, you are (usually) optimizing retrieval and query operations. The execution time saved during such query operations does not disappear (it is not recovered). These processor cycles and I/O operations are "removed" from retrieval operations and "spent" when you perform updates to the database to maintain the indexes. In other words, optimization does not make all operations faster -- the time required for your application to perform its functions can be viewed as a fixed size bag of water. When you optimize some operation by adding an index you are "pushing in" the bag of water in some specific place. It bulges out somewhere else, and some other operation becomes slower because now you have to perform additional operations elsewhere to maintain the indexes. The objective is to save a "huge" amount of processing time in one operation and spend that savings in "wee bits" distributed over many other operations where it is not noticeable. For example, adding a particular index may decrease the time to perform a certain query from 5 minutes to 5 seconds. However, it will add a few hundred milliseconds to each update or insert operation. If this trade-off is acceptable, then you have made a good trade-off. On the other hand, if the indexes you added to optimize the query increases each update/insert by a few seconds, the trade-off may not be acceptable. You may need to find an acceptable middle ground ... > Additionally, indexing any column (even one used heavily in a query > filter) is not useful if the index doesn't reduce the working data > set to ~10% of the rows. In other words, having an index on a column > that has all one value (be it NULL or anything else) accomplishes > nothing but significantly slowing down queries that use the index. > Same is true, for example, of a True/False column with an even > distribution. Generally, unless the index can be used to eliminate > the vast majority of rows, it will slow down, rather than speed up, a > query. The exception to this is, of course, where the index created is a covering index because using a covering index, while it may not necessarily reduce the number of rows significantly, eliminates the accesses to the base table to retrieve data that might be being used in further operations. Care needs to be taken to not prematurely add indexes that will add maintenance cost but not significantly improve query performance (ie, be careful not to just add water into the bag -- the objective is to poke it around, not just add more water). Adding indexes required to enforce or optimize constraint and foreign key operations are almost always required -- but other indexes should not be added until you are sure that they will actually be required because overuse of indexes can severely hamper overall performance. --- () ascii ribbon campaign against html e-mail /\ www.asciiribbon.org _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users