* Sebastian Bergmann > I am currently trying to optimize the indexes / keys for the MySQL > schema of phpOpenTracker.
I don't know phpOpenTracker, but I have a few comments. > I'm wondering if multiple keys, like > > CREATE TABLE pot_accesslog ( > accesslog_id int(10) unsigned NOT NULL, > client_id int(10) unsigned NOT NULL, > document_id int(11) NOT NULL, > timestamp int(10) unsigned NOT NULL, > > PRIMARY KEY (accesslog_id, client_id, document_id, timestamp) > ); This primary key makes it possible to have multiple records with the same accesslog_id, multiple records with the same accesslog_id/client_id combination, and multiple records with the same accesslog_id/client_id/document_id combination... it does however not allow two records with the same accesslog_id/client_id/document_id combination at the exact same time. Is this what you want? > would be better than single keys for each column, like > > CREATE TABLE pot_accesslog ( > accesslog_id int(10) unsigned NOT NULL, > client_id int(10) unsigned NOT NULL, > document_id int(11) NOT NULL, > timestamp int(10) unsigned NOT NULL, > > KEY accesslog_id (accesslog_id), > KEY client_id (client_id), > KEY document_id (document_id), > KEY timestamp (timestamp) > ); This statement does not define a primary key. Any combination and duplication is allowed. The restriction on two records with the same accesslog_id/client_id/document_id combination at the same time does not apply. It is normally best to define a primary key on a table, and it is very often used to restrict what kind of combinations must be unique. However, if the behaviour with no restrictions on duplicates is correct, you may be better of without a primary key in this case. Also, the second create statement defines indexes on all fields, making almost any query reasonably fast. The first statement only defined one index: the one on accesslog_id/client_id/document_id/timestamp. Any query _not_ providing an accesslog_id can not use the index, which is probably a bad thing. (I have not studied your queries, maybe you allways use the accesslog_id?) You can almost always improve the performance of queries by adding an index. The trick is to add the right index... :) If you for instance often did a "SELECT * FROM pot_accesslog WHERE client_id=$id AND timestamp > NOW() - INTERVAL 24 HOUR", you could add an index with the involved fields: KEY client_time (client_id,timestamp). You should not create too many indexes, though. Inserts, updates and deletes are slower when you have more indexes. You need to find indexes that may be used by different queries: the client_time (client_id,timestamp) index can also be used when you have a client_id, but not the timestamp. This is because client_id is the first part of the index. > Have a look at > > http://www.sebastian-bergmann.de/single_keys.htm CREATE TABLE pot_add_data ( accesslog_id int(10) unsigned NOT NULL, data_field varchar(32) NOT NULL, data_value varchar(255) NOT NULL, PRIMARY KEY (accesslog_id), KEY data_field (data_field), KEY data_value (data_value) ); Very often it is better to define this last index like this: KEY data_value (data_value(20)) This will save a lot of resources, and most likely improve the performance of your server...! this is however not allways true: if the data_value is equal for the first 20 chars this index is useless... It is almost never a god thing to have an index on the full 255 bytes of the varchar. The trick here is to find the correct value to use, 20 is of course not always the best. The point is that each of these characters of an index must be read into the memory of the server, if your tables are big a couple of characters can become megabytes of ram... when a predefined amount (key_buffer_size) is used, old keys is overwritten by new ones as they are read from disc. When keys are in ram, the server performs much faster than when it has to read them from disc. This means that if you have smaller keys, the overall preformance of the server is improved (unless the server have a lot of memory and key_buffer_size set at a value higher than the combined size of all used indexes for all used tables for all used databases on the server...) You can check the key_buffer_size of your surver with the statement: show variables like "key_buffer_size"; I think the default (on 3.23.x) is 8388600, but I'm not sure about this, it seems to be pretty low... up to half the amount of total memory is suggested in the documentation. <URL: http://www.mysql.com/doc/S/H/SHOW_VARIABLES.html > > for a version of the schema that uses single keys, together with > the output of EXPLAIN(query) for some of phpOpenTracker most commonly > used queries. > > Have a look at > > http://www.sebastian-bergmann.de/multi_keys.htm > > for a version of the schema that uses multiple keys, together with > the output of EXPLAIN(query) for some of phpOpenTracker most commonly > used queries. > > It'd be really great if some kind soul could have a look at this and > tell me what indexing strategy is suited best for this problem. I hope the above suggestions was usefull :) -- Roger --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php