Hannes Ricklefs wrote:

Hello,

does anyone have any experience in the amount of time increase for an UPDATE, INSERT, SELECT in relation to the size of the actual database?

Is it proportional or exponential for example...
Regards,
Hannes

Hannes,

The execution time for inserts depends upon the number of existing records and the number of indexes. For each insert the engine must locate the correct btree block to add a new record to in both the table and each index. This is an O(log N) operation for both indexes and tables. The base of the log depends upon the fanout of the btree blocks, which in turn depends on the size of the records and index keys, but usually isn't important. So for inserts you have a time T that is proportional to:

T ~= (number of indexes + 1) O(log number of records)

Note that this depends primarily upon the number of records in the table, not the size of the database. The size of the database depends upon the number of records in all the tables. The database size may have a minor impact on the insert speed by causing the btree blocks to be spread out further across the disk, therefore requiring additional disk seeks to read the blocks if they aren't already in the disk cache.

Similar arguments apply to updates. Each update needs to locate the existing record, and then store the modified values. This may move the records in the table (if it changes an integer primary key) and in each of the indexes that were affected by the update. Updates often use indexes to locate the records to be changed, or are performing a full table scan if all records are being modified or there is no index. For a table scan the amortized cost of locating the record is constant, for an indexed lookup the cost is O(log N), for an unindexed lookup the time is O(N). After the record is changed it is written back and then all affected indexes have to be updated. This requires locating and deleting the existing index record, and inserting a new index record. Both of these are O(log N) operations on each index. So we have a time proportional to:

T~= choose_one_of(O(1), O(log N), O(N)) + if_int_pk(O(log N)) + (2 * number of indexes)O(log N) where N is the number of records in the table.

Selects are generally too complicated to be analyzed in this manner especially considering joins, grouping and sorting. But the same general principles apply. A full table scan lookup has a constant amortized cost, an indexed lookup has a O(log N) cost, and a nunindexed lookup has an O(N) cost. Joins generally multiply these values together. For example a full table scan joined to a second table using an index requires a time proportional to

T ~= O(1) + O(log number of records in the second table) = O(log N2)

for each record returned. For the entire select the time is proportional to

T~= O(N1) * O(log N2)

Sorting a select generally takes O(N log N) where N is the number of result records, but if an appropriate index is available that can be used to retrieve the records in the correct order and eliminate the need for a sort.

As you can see, the short answer is "it depends...".

HTH
Dennis Cote


Reply via email to