At 8:33 PM -0400 6/21/04, D. Richard Hipp wrote: <snip>
Hey, your reply was sent exactly 2 seconds before mine (assuming synchronized clocks); not bad.
Let N be the total number of entries in the table and let K be the number of entries with a matching key. A full table scan requires O(N) time. To use an index, you first consult the index to get a list of K rowids of potential matches, which takes O(log N + K) time. Then for each rowid you have to look up the corresponding row in the btree, an O(log N) operation. So the total time for using an index is O(K log N).
If there is an upper bound on the number of duplicate key entries in a table (in other words, if there are lots of different keys) then K will be a constant and O(K log N) simplifies to O(log N), which is clearly less than O(N). In that case, it pays to use the index.
But suppose there are only a fixed number of distinct key values. As N grows, the number of duplicate keys increases: K becomes a linear function of N. So O(K log N) becomes O(N log N), which is clearly more than O(N). In that case, it is best to use a full table scan when N is large.
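The two regimes can be sketched numerically. This is a back-of-the-envelope model (unit cost per row touched, log2 N per btree probe), not a measurement of SQLite itself:

```python
import math

def scan_cost(n):
    # Full table scan touches every row: O(N).
    return n

def index_cost(n, k):
    # K rowids from the index, each needing an O(log N) btree lookup.
    return k * math.log2(n)

n = 1_000_000

# Regime 1: bounded duplicates (say K = 10 no matter how big N gets)
# -> O(log N) beats O(N), the index wins.
print(index_cost(n, 10) < scan_cost(n))       # True

# Regime 2: only 3 distinct keys, so K = N/3
# -> O(N log N) loses to O(N), the full scan wins.
print(index_cost(n, n // 3) > scan_cost(n))   # True
```

Real engines also pay different constant factors (an index walk is random I/O, a scan is sequential), which pushes the crossover even further in favor of the scan.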
Good explanation. Basically what I was getting at, though I didn't give any math formulas.
And remember folks, these formulas simply give the time differences for *fetches* (not data changes) when using an index or not. But that's the most important consideration.
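One way to watch the planner weigh these costs on fetches is EXPLAIN QUERY PLAN. A small sketch using Python's stdlib sqlite3 module (the table name `t` and index name `t_key` are made up for illustration, and the plan chosen can vary with SQLite version and collected statistics):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t(key INTEGER, payload TEXT)")

# Only 3 distinct key values, so K grows linearly with N.
cur.executemany("INSERT INTO t VALUES (?, ?)",
                [(i % 3, "x") for i in range(3000)])
cur.execute("CREATE INDEX t_key ON t(key)")
cur.execute("ANALYZE")  # gather stats so the planner can estimate K

# Show how SQLite plans to satisfy this low-selectivity lookup.
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE key = 1").fetchall()
print(plan)

# The answer is the same either way; only the access path differs.
rows = cur.execute("SELECT count(*) FROM t WHERE key = 1").fetchone()[0]
print(rows)  # 1000
```

Whether the plan shows an index search or a scan here depends on the SQLite build and its cost model; the point is that ANALYZE gives the planner the K-vs-N information the formulas above are about.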
-- Darren Duncan