At 8:33 PM -0400 6/21/04, D. Richard Hipp wrote:
<snip>

Hey, your reply was sent exactly 2 seconds before mine (assuming synchronized clocks); not bad.

Let N be the total number of entries in the table and let
K be the number of entries with a matching key.  Then to
do a full table scan requires O(N) time.  To use an index,
you first consult the index to get a list of K rowids of
potential matches.  This takes time O(K).  Then for each
rowid you have to look up the corresponding row in the
btree, an O(log N) operation.  So the total time for using
an index is O(K log N).

If there is an upper bound on the number of duplicate key
entries in a table (in other words, if there are lots of
different keys) then K will be a constant and O(K log N)
simplifies to O(log N), which is clearly less than O(N).
In that case, it pays to use the index.

But suppose there are only a fixed number of different
key values.  As N grows, the number of duplicate keys
will increase.  K will be a linear function of N, so
O(K log N) becomes O(N log N), which is clearly more than
O(N).  In that case, it is best to use a full table scan
when N is large.

Good explanation. That's basically what I was getting at, though I didn't give any of the math.
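
For anyone who wants to see the trade-off firsthand, here is a rough
sketch (my own illustration, nothing from Richard's post) using Python's
built-in sqlite3 module. The table, column, and index names are invented,
and it assumes a SQLite recent enough to support the INDEXED BY / NOT
INDEXED qualifiers, which force the planner down one path or the other.
Absolute timings will vary by machine, but the shape of the comparison
should match the math above.

import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (key INTEGER, payload TEXT)")
db.execute("CREATE INDEX t_key ON t(key)")

def timed(sql):
    """Run a query and return (seconds elapsed, rows fetched)."""
    start = time.perf_counter()
    rows = db.execute(sql).fetchall()
    return time.perf_counter() - start, len(rows)

def load(n, distinct_keys):
    """Fill the table with n rows spread over distinct_keys key values."""
    db.execute("DELETE FROM t")
    db.executemany("INSERT INTO t VALUES (?, 'x')",
                   ((i % distinct_keys,) for i in range(n)))
    db.commit()

N = 200_000
for distinct in (N, 3):   # many keys (small K) vs. few keys (K ~ N/3)
    load(N, distinct)
    # INDEXED BY / NOT INDEXED pin the planner to one strategy each.
    t_idx, k = timed("SELECT * FROM t INDEXED BY t_key WHERE key = 1")
    t_scan, _ = timed("SELECT * FROM t NOT INDEXED WHERE key = 1")
    print(f"{distinct} distinct keys: K={k}, "
          f"index {t_idx:.4f}s vs scan {t_scan:.4f}s")

With many distinct keys the indexed lookup should win handily; with only
three key values, K grows linearly with N and the scan becomes competitive
or better, just as the O(K log N) vs. O(N) analysis predicts.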


And remember folks, these formulas only describe the time differences for *fetches* (not data changes) with and without an index. But that's usually the most important consideration.
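
There is also a cost on the write side that those formulas don't capture:
every INSERT, UPDATE, or DELETE must maintain each index on the table. A
companion sketch, under the same assumptions as the one above (invented
names, timings vary by machine):

import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t_plain (key INTEGER, payload TEXT)")
db.execute("CREATE TABLE t_indexed (key INTEGER, payload TEXT)")
db.execute("CREATE INDEX t_indexed_key ON t_indexed(key)")

rows = [(i, "x") for i in range(200_000)]
for table in ("t_plain", "t_indexed"):
    # Identical bulk load; the second table also has to update its index.
    start = time.perf_counter()
    db.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    db.commit()
    print(f"{table}: {time.perf_counter() - start:.4f}s")

The indexed table should take noticeably longer to load, which is why it
can pay to drop indexes before a big bulk load and recreate them after.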

-- Darren Duncan
