Hannes Ricklefs wrote:
Hello,
does anyone have any experience in the amount of time increase for an UPDATE,
INSERT, SELECT in relation to the size of the actual database?
Is it proportional or exponential for example...
Regards,
Hannes
Hannes,
The execution time for inserts depends upon the number of existing
records and the number of indexes. For each insert the engine must
locate the correct btree block to add a new record to in both the table
and each index. This is an O(log N) operation for both indexes and
tables. The base of the log depends upon the fanout of the btree blocks,
which in turn depends on the size of the records and index keys, but
usually isn't important. So for inserts you have a time T that is
proportional to:
T ~= (number of indexes + 1) O(log number of records)
Note that this depends primarily upon the number of records in the
table, not the size of the database. The size of the database depends
upon the number of records in all the tables. The database size may have
a minor impact on the insert speed by causing the btree blocks to be
spread out further across the disk, therefore requiring additional disk
seeks to read the blocks if they aren't already in the disk cache.
Similar arguments apply to updates. Each update needs to locate the
existing record, and then store the modified values. This may move the
records in the table (if it changes an integer primary key) and in each
of the indexes that were affected by the update. Updates often use
indexes to locate the records to be changed, or are performing a full
table scan if all records are being modified or there is no index. For a
table scan the amortized cost of locating the record is constant, for an
indexed lookup the cost is O(log N), for an unindexed lookup the time is
O(N). After the record is changed it is written back and then all
affected indexes have to be updated. This requires locating and deleting
the existing index record, and inserting a new index record. Both of
these are O(log N) operations on each index. So we have a time
proportional to:
T~= choose_one_of(O(1), O(log N), O(N)) + if_int_pk(O(log N)) + (2 *
number of indexes)O(log N)
where N is the number of records in the table.
Selects are generally too complicated to be analyzed in this manner
especially considering joins, grouping and sorting. But the same general
principles apply. A full table scan lookup has a constant amortized
cost, an indexed lookup has a O(log N) cost, and a nunindexed lookup has
an O(N) cost. Joins generally multiply these values together. For
example a full table scan joined to a second table using an index
requires a time proportional to
T ~= O(1) + O(log number of records in the second table) = O(log N2)
for each record returned. For the entire select the time is proportional to
T~= O(N1) * O(log N2)
Sorting a select generally takes O(N log N) where N is the number of
result records, but if an appropriate index is available that can be
used to retrieve the records in the correct order and eliminate the need
for a sort.
As you can see, the short answer is "it depends...".
HTH
Dennis Cote