alter table and 'something wrong in indexes'?
Hi!

I use MySQL/InnoDB tables on AWS. In a small table (about 2M records) I changed some column datatypes: from unsigned int to decimal, and from float to decimal. I didn't change anything about the primary key or the other indexes.

After the change (which completed without trouble), all my queries were really slow and the EXPLAIN output was different; it was as if the primary key index were corrupted or no longer used (even though CHECK TABLE reported the table as fine). I had to drop and create the table again with the new datatypes, and then all was fine again.

Any idea what was wrong? I guess it is not an AWS problem, even though this is the first time I have seen a similar error.

Thank you in advance, bye bye

MAS!

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
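The symptoms above are consistent with stale InnoDB index statistics rather than real corruption: a type-changing ALTER can leave the persisted cardinality estimates badly off, and the optimizer then skips otherwise-usable indexes. A lighter-weight fix than dropping and recreating the table is usually to refresh the statistics. A minimal sketch, using a hypothetical table name since the post does not give one:

```sql
-- Hypothetical table name; the original post does not name the table.
-- Refresh the optimizer statistics after the type-changing ALTER:
ANALYZE TABLE my_table;

-- Then re-check the plan the optimizer chooses:
EXPLAIN SELECT * FROM my_table WHERE id = 42;
```

If ANALYZE restores the old plan, the index itself was never damaged, only the statistics the optimizer costs its plans with.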
Indexes strangeness
Hi All,

Can someone please fill me in on what I am seeing here... I have two identical tables, with identical indexes, holding different records. Both tables have +- 15M records in them...

mysql> EXPLAIN SELECT ArticleID, DateObtained, DateAccessed, TimesAccessed FROM IDXa ORDER BY DateAccessed LIMIT 10;
+----+-------------+-------+-------+---------------+-----------------+---------+------+------+-------+
| id | select_type | table | type  | possible_keys | key             | key_len | ref  | rows | Extra |
+----+-------------+-------+-------+---------------+-----------------+---------+------+------+-------+
|  1 | SIMPLE      | IDXa  | index | NULL          | idxDateAccessed | 5       | NULL |   10 | NULL  |
+----+-------------+-------+-------+---------------+-----------------+---------+------+------+-------+
1 row in set (0,00 sec)

mysql> EXPLAIN SELECT ArticleID, DateObtained, DateAccessed, TimesAccessed FROM IDXb ORDER BY DateAccessed LIMIT 10;
+----+-------------+-------+------+---------------+------+---------+------+----------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows     | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+----------+----------------+
|  1 | SIMPLE      | IDXb  | ALL  | NULL          | NULL | NULL    | NULL | 15004858 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+----------+----------------+
1 row in set (0,00 sec)

Tables:

mysql> SHOW CREATE TABLE IDXa\G
*************************** 1. row ***************************
       Table: IDXa
Create Table: CREATE TABLE `IDXa` (
  `ArticleID` varchar(32) NOT NULL,
  `DateObtained` datetime NOT NULL,
  `DateAccessed` datetime NOT NULL,
  `TimesAccessed` int(5) unsigned NOT NULL,
  PRIMARY KEY (`ArticleID`),
  KEY `idxDateAccessed` (`DateAccessed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0,00 sec)

mysql> SHOW INDEXES FROM IDXa;
+-------+------------+-----------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name        | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| IDXa  |          0 | PRIMARY         |            1 | ArticleID    | A         |    14086444 | NULL     | NULL   |      | BTREE      |         |               |
| IDXa  |          1 | idxDateAccessed |            1 | DateAccessed | A         |     1408644 | NULL     | NULL   |      | BTREE      |         |               |
+-------+------------+-----------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0,00 sec)

mysql> SHOW CREATE TABLE IDXb\G
*************************** 1. row ***************************
       Table: IDXb
Create Table: CREATE TABLE `IDXb` (
  `ArticleID` varchar(32) NOT NULL,
  `DateObtained` datetime NOT NULL,
  `DateAccessed` datetime NOT NULL,
  `TimesAccessed` int(5) unsigned NOT NULL,
  PRIMARY KEY (`ArticleID`),
  KEY `idxDateAccessed` (`DateAccessed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0,00 sec)

mysql> SHOW INDEXES FROM IDXb;
+-------+------------+-----------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name        | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| IDXb  |          0 | PRIMARY         |            1 | ArticleID    | A         |    15007345 | NULL     | NULL   |      | BTREE      |         |               |
| IDXb  |          1 | idxDateAccessed |            1 | DateAccessed | A         |     1250612 | NULL     | NULL   |      | BTREE      |         |               |
+-------+------------+-----------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0,00 sec)

Thnx.

--
Regards,
Chris Knipe
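A likely explanation (not confirmed in the thread) is that IDXb's sampled statistics are off, or the optimizer's cost estimate tips toward a full scan plus filesort for that table. Two quick diagnostics, sketched here, are to refresh the statistics and to compare plans with a forced index:

```sql
-- Refresh InnoDB's sampled statistics for the misbehaving table:
ANALYZE TABLE IDXb;

-- If the optimizer still refuses the index, compare plans with a hint:
EXPLAIN SELECT ArticleID, DateObtained, DateAccessed, TimesAccessed
FROM IDXb FORCE INDEX (idxDateAccessed)
ORDER BY DateAccessed LIMIT 10;
```

If the forced plan is fast while the default plan is slow, the issue is plan choice (statistics/costing), not index health.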
Indexes issue importing tablespaces
Hi,

I have this problem in several different instances of 5.6.20. I take all the steps as stated in http://dev.mysql.com/doc/refman/5.6/en/tablespace-copying.html and get no errors or warnings, either in the CLI or in the MySQL log.

(13:23:02) [borrame] alter table creditLine discard tablespace;
Query OK, 0 rows affected (0.30 sec)

(copy operation of the .cfg and .ibd files from the origin server)

(13:23:19) [borrame] alter table creditLine import tablespace;
Query OK, 0 rows affected (44.35 sec)

2014-10-10 13:26:42 1657 [Note] InnoDB: Importing tablespace for table 'letsbonus/creditLine' that was exported from host 'dualla'
2014-10-10 13:26:42 1657 [Note] InnoDB: Phase I - Update all pages
2014-10-10 13:27:04 1657 [Note] InnoDB: Sync to disk
2014-10-10 13:27:25 1657 [Note] InnoDB: Sync to disk - done!
2014-10-10 13:27:26 1657 [Note] InnoDB: Phase III - Flush changes to disk
2014-10-10 13:27:26 1657 [Note] InnoDB: Phase IV - Flush complete
2014-10-10 13:27:26 1657 [Note] InnoDB: borrame.creditLine autoinc value set to 87313435

After this, the indexes look empty:

(13:27:26) [borrame] show index from creditLine;
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name                       | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| creditLine |          0 | PRIMARY                        |            1 | id               | A         |           0 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idClient            |            1 | idClient         | A         |           0 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idCreditSubTypology |            1 | idCreditTypology | A         |           0 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idCountry           |            1 | idCountry        | A         |           0 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idAffiliate         |            1 | idAffiliate      | A         |           0 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | endDate                        |            1 | endDate          | A         |           0 | NULL     | NULL   | YES  | BTREE      |         |               |
| creditLine |          1 | status                         |            1 | status           | A         |           0 | NULL     | NULL   |      | BTREE      |         |               |
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.00 sec)

I have to optimize or null-alter the table to get them working:

(13:27:34) [borrame] alter table creditLine engine = InnoDB;
Query OK, 0 rows affected (12 min 57.41 sec)
Records: 0  Duplicates: 0  Warnings: 0

(13:51:17) [borrame] show index from creditLine;
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name                       | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| creditLine |          0 | PRIMARY                        |            1 | id               | A         |    32237680 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idClient            |            1 | idClient         | A         |    16118840 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idCreditSubTypology |            1 | idCreditTypology | A         |        1792 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idCountry           |            1 | idCountry        | A         |        8967 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idAffiliate         |            1 | idAffiliate      | A         |           2 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | endDate                        |            1 | endDate          | A         |      293069 | NULL     | NULL   | YES  | BTREE      |         |               |
| creditLine |          1 | status                         |            1 | status           | A         |           4 | NULL     | NULL   |      | BTREE      |         |               |
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.00 sec)

Is this a known issue? Or something I'm missing? I've checked the docs and saw nothing related to this.

Thanks,
Rubén.
Re: Indexes issue importing tablespaces
Did you check if an ANALYZE TABLE is enough in this case?

--
Wagner Bianchi
Mobile: +55.31.8654.9510

On 10/10/2014, at 09:06, Ruben Cardenal <my...@ruben.cn> wrote:

> Hi, I have this problem in several different instances of 5.6.20. I take all the steps as stated in http://dev.mysql.com/doc/refman/5.6/en/tablespace-copying.html and get no errors or warnings, either in the CLI or in the MySQL log. [...]
Re: Indexes issue importing tablespaces
Hi Wagner,

Yes! ANALYZE solves the situation in a moment.

(14:21:09) [borrame] alter table creditLine discard tablespace;
Query OK, 0 rows affected (0.41 sec)

(14:21:21) [borrame] alter table creditLine import tablespace;
Query OK, 0 rows affected (23.48 sec)

(14:24:55) [borrame] analyze table creditLine;
+--------------------+---------+----------+----------+
| Table              | Op      | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| borrame.creditLine | analyze | status   | OK       |
+--------------------+---------+----------+----------+
1 row in set (0.16 sec)

(14:25:09) [borrame] show index from creditLine;
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name                       | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| creditLine |          0 | PRIMARY                        |            1 | id               | A         |    32237680 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idClient            |            1 | idClient         | A         |    16118840 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idCreditSubTypology |            1 | idCreditTypology | A         |        5050 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idCountry           |            1 | idCountry        | A         |        8161 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | creditLine_idAffiliate         |            1 | idAffiliate      | A         |        1794 | NULL     | NULL   |      | BTREE      |         |               |
| creditLine |          1 | endDate                        |            1 | endDate          | A         |       64995 | NULL     | NULL   | YES  | BTREE      |         |               |
| creditLine |          1 | status                         |            1 | status           | A         |           4 | NULL     | NULL   |      | BTREE      |         |               |
+------------+------------+--------------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.00 sec)

(14:25:14) [borrame]

Thanks,
Rubén.

On 2014-10-10 14:19, Wagner Bianchi wrote:
> Did you check if an ANALYZE TABLE is enough in this case? [...]
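The whole transportable-tablespace procedure from this thread can be condensed into one sketch, assuming source and destination tables have identical definitions:

```sql
-- On the destination server: detach the (empty) tablespace.
ALTER TABLE creditLine DISCARD TABLESPACE;

-- (Outside SQL: copy the .ibd and .cfg files exported on the source
--  server into the destination datadir.)

-- Re-attach the copied tablespace.
ALTER TABLE creditLine IMPORT TABLESPACE;

-- Cardinality reads 0 until statistics are recomputed; ANALYZE is far
-- cheaper than a full ALTER TABLE ... ENGINE=InnoDB rebuild.
ANALYZE TABLE creditLine;
SHOW INDEX FROM creditLine;
```

The indexes themselves are intact after IMPORT; only the optimizer statistics need regenerating, which is why the sub-second ANALYZE gives the same result as the 13-minute rebuild.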
Troubles with creating indexes on float columns on MyISAM tables in MySQL 5.6.15 and MySQL 5.6.14 running on FreeBSD 8.4
I have an interesting problem with the creation of indexes on MyISAM tables in MySQL 5.6.15 and MySQL 5.6.14 running on FreeBSD 8.4: for float columns, I am not able to create indexes at all. Indexes on all other columns work just fine.

The problem occurred while I was loading data from a MySQL dump into a database. Loads would fail on the ENABLE KEYS line in the dump with:

ERROR 2013 (HY000): Lost connection to MySQL server during query

The problem was recreated in many different scenarios and can be reconstructed with a simple test. I have a table:

mysql> show create table LEGAL_REGISTRATION_TWO\G
*************************** 1. row ***************************
       Table: LEGAL_REGISTRATION_TWO
Create Table: CREATE TABLE `LEGAL_REGISTRATION_TWO` (
  `legal_registration_key` int(10) unsigned NOT NULL DEFAULT '0',
  `company_fkey` varchar(10) NOT NULL DEFAULT '',
  `law_firm_fkey` varchar(10) NOT NULL DEFAULT '',
  `registrant_is_guarantor` int(1) NOT NULL DEFAULT '0',
  `plan_name` text NOT NULL,
  `copy_sent_to_firm` int(1) NOT NULL DEFAULT '0',
  `copy_sent_to_firm_name_address_text` text NOT NULL,
  `law_firm_opinion` int(1) NOT NULL DEFAULT '0',
  `law_firm_opinion_type` varchar(10) NOT NULL DEFAULT '',
  `law_firm_opinion_text` text NOT NULL,
  `law_firm_opinion_text_url` varchar(200) NOT NULL DEFAULT '',
  `law_firm_relationship` varchar(20) NOT NULL DEFAULT '',
  `legal_fees` float NOT NULL DEFAULT '0',
  `accounting_fees` float(10,2) NOT NULL DEFAULT '0.00',   <-- I am attempting to create an index on this field
  `ftp_file_name_fkey` varchar(80) NOT NULL DEFAULT '',
  `form_fkey` varchar(20) NOT NULL DEFAULT '',
  `file_date` varchar(10) NOT NULL DEFAULT '',
  `file_accepted` varchar(20) NOT NULL DEFAULT '',
  `file_size` varchar(10) NOT NULL DEFAULT '',
  `http_file_name_html` varchar(100) NOT NULL DEFAULT '',
  `http_file_name_text` varchar(100) NOT NULL DEFAULT '',
  `qc_check_1` int(1) NOT NULL DEFAULT '0',
  `qc_check_2` int(1) NOT NULL DEFAULT '0',
  `create_date` varchar(10) NOT NULL DEFAULT '',
  `change_date` varchar(10) NOT NULL DEFAULT ''
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

With a single row in it:

mysql> select count(*) from LEGAL_REGISTRATION_TWO;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

When I attempt to alter the table to create an index on a float column, I get the error:

mysql> alter table LEGAL_REGISTRATION_TWO add key test1dx (`accounting_fees`);
ERROR 2013 (HY000): Lost connection to MySQL server during query

I have made a number of changes in /etc/my.cnf trying to resolve this problem; currently the following entries are in my.cnf:

net_read_timeout=2400
net_write_timeout=2400
big-tables=on
connect_timeout=40
myisam_sort_buffer_size=1073741824
max_allowed_packet = 128M

I am not finding any discussion on the Internet of this being a problem for anyone else. Any ideas on how to solve this problem are greatly appreciated.

--
Mikhail Berman
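"Lost connection during query" on DDL usually means the server process itself crashed, so the first place to look is the server error log for a stack trace. As a hedged, untested workaround (assuming exact decimal semantics are acceptable for the data, and noting the column is already float(10,2)), one could convert the column to DECIMAL before adding the key, sidestepping the float-index code path entirely:

```sql
-- Hypothetical workaround, not verified on the affected FreeBSD builds:
-- convert the float column to DECIMAL, then index it.
ALTER TABLE LEGAL_REGISTRATION_TWO
  MODIFY `accounting_fees` DECIMAL(10,2) NOT NULL DEFAULT '0.00';

ALTER TABLE LEGAL_REGISTRATION_TWO ADD KEY test1dx (`accounting_fees`);
```

DECIMAL also avoids the approximate-equality surprises that make FLOAT a poor key type in general, so this may be worth doing regardless of the crash.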
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
On 10/15/2012 7:15 PM, spameden wrote:
> Thanks a lot for all your comments! I did disable the query cache before testing with set query_cache_type=OFF for the current session. I will report this to the MySQL bugs site later.

First: what are all of your logging settings?

SHOW GLOBAL VARIABLES LIKE '%log%';

Next: when you physically look in the slow query log, how long does it say this command took to execute?

And last: before you can ask MySQL to fix a bug, you must first ensure it's a MySQL bug. Please try to reproduce your results using official binaries, not those constructed by a third party. If the problem exists in our packages, do tell us about it. If the problem is not reproducible using official MySQL products, then please report it to the appropriate channel for the product you are using.

MySQL Bugs - http://bugs.mysql.com/

Thanks!

--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
Will do.

mysql> SHOW GLOBAL VARIABLES LIKE '%log%';
+-----------------------------------------+---------------------------------+
| Variable_name                           | Value                           |
+-----------------------------------------+---------------------------------+
| back_log                                | 50                              |
| binlog_cache_size                       | 32768                           |
| binlog_direct_non_transactional_updates | OFF                             |
| binlog_format                           | MIXED                           |
| expire_logs_days                        | 5                               |
| general_log                             | OFF                             |
| general_log_file                        | /var/run/mysqld/mysqld.log      |
| innodb_flush_log_at_trx_commit          | 2                               |
| innodb_flush_log_at_trx_commit_session  | 3                               |
| innodb_locks_unsafe_for_binlog          | OFF                             |
| innodb_log_block_size                   | 512                             |
| innodb_log_buffer_size                  | 8388608                         |
| innodb_log_file_size                    | 2145386496                      |
| innodb_log_files_in_group               | 2                               |
| innodb_log_group_home_dir               | ./                              |
| innodb_mirrored_log_groups              | 1                               |
| innodb_overwrite_relay_log_info         | OFF                             |
| log                                     | OFF                             |
| log_bin                                 | ON                              |
| log_bin_trust_function_creators         | ON                              |
| log_bin_trust_routine_creators          | ON                              |
| log_error                               | /var/log/mysql-error.log        |
| log_output                              | FILE                            |
| log_queries_not_using_indexes           | ON                              |
| log_slave_updates                       | OFF                             |
| log_slow_admin_statements               | OFF                             |
| log_slow_filter                         |                                 |
| log_slow_queries                        | ON                              |
| log_slow_rate_limit                     | 1                               |
| log_slow_slave_statements               | OFF                             |
| log_slow_sp_statements                  | ON                              |
| log_slow_timestamp_every                | OFF                             |
| log_slow_verbosity                      | microtime                       |
| log_warnings                            | 1                               |
| max_binlog_cache_size                   | 18446744073709547520            |
| max_binlog_size                         | 104857600                       |
| max_relay_log_size                      | 0                               |
| relay_log                               | /var/log/mysql/mysqld-relay-bin |
| relay_log_index                         |                                 |
| relay_log_info_file                     | relay-log.info                  |
| relay_log_purge                         | ON                              |
| relay_log_space_limit                   | 0                               |
| slow_query_log                          | ON                              |
| slow_query_log_file                     | /var/log/mysql/mysql-slow.log   |
| slow_query_log_microseconds_timestamp   | OFF                             |
| sql_log_bin                             | ON                              |
| sql_log_off                             | OFF                             |
| sql_log_update                          | ON                              |
| suppress_log_warning_1592               | OFF                             |
| sync_binlog                             | 0                               |
| use_global_log_slow_control             | none                            |
+-----------------------------------------+---------------------------------+
51 rows in set (0.01 sec)

Here is the full output; writing happens ONLY if log_queries_not_using_indexes is turned ON.

The query takes:

# Query_time: 0.291280  Lock_time: 0.50  Rows_sent: 0  Rows_examined: 133876  Rows_affected: 0  Rows_read: 1
# Bytes_sent: 1775  Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
# InnoDB_trx_id: F229398
SET timestamp=1350389078;
SELECT sql_id, momt, sender, receiver, udhdata, msgdata, time, smsc_id, service, account, id, sms_type, mclass, mwi, coding, compress, validity, deferred, dlr_mask, dlr_url, pid, alt_dcs, rpi, charset, boxc_id, binfo, meta_data, task_id, msgid FROM send_sms_test FORCE INDEX (priority_time) WHERE time <= UNIX_TIMESTAMP(NOW()) ORDER by priority LIMIT 0,50;

2012/10/16 Shawn Green <shawn.l.gr...@oracle.com>
> On 10/15/2012 7:15 PM, spameden wrote: Thanks a lot for all your comments! [...]
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
Your now() statement is getting executed for every row of the select. Try putting the phrase up front, as in:

set @ut = unix_timestamp(now());

and then use that in your statement.

On 2012-10-16 8:42 AM, spameden <spame...@gmail.com> wrote:
> Will do.
> mysql> SHOW GLOBAL VARIABLES LIKE '%log%'; [...]
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
Interesting thought, but I get the same result.

# Query_time: 0.001769  Lock_time: 0.001236  Rows_sent: 0  Rows_examined: 0
use kannel;
SET timestamp=1350413592;
select * from send_sms FORCE INDEX (priority_time) where time <= @ut order by priority limit 0,11;

The MySQL I'm using is 5.5.28 from dotdeb.org; I'm pretty sure it's close to the original except for the packaging scripts. I will check this with the release from the MySQL site and report back.

Thanks to all.

2012/10/16 Michael Dykman <mdyk...@gmail.com>
> Your now() statement is getting executed for every row of the select. Try putting the phrase up front, as in: set @ut = unix_timestamp(now()) and then use that in your statement. [...]
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
2012/10/16 12:57 -0400, Michael Dykman:
> your now() statement is getting executed for every row on the select. try putting the phrase up front as in: set @ut= unix_timestamp(now()) and then use that in your statement.

Quote (from the MySQL 5.5 manual):

"Functions that return the current date or time each are evaluated only once per query, at the start of query execution. This means that multiple references to a function such as NOW() within a single query always produce the same result. (For our purposes, a single query also includes a call to a stored program (stored routine, trigger, or event) and all subprograms called by that program.) This principle also applies to CURDATE(), CURTIME(), UTC_DATE(), UTC_TIME(), UTC_TIMESTAMP(), and to any of their synonyms."
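The manual passage quoted above is easy to verify interactively: NOW() is fixed at statement start, so two calls separated by a SLEEP() inside one statement return the same value, while SYSDATE() (which the manual contrasts with NOW()) reflects the moment it executes:

```sql
-- t1 and t2 are identical even though the statement takes ~2 seconds;
-- s (SYSDATE) is about 2 seconds later than t1.
SELECT NOW() AS t1, SLEEP(2) AS pause, NOW() AS t2, SYSDATE() AS s;
```

This is why hoisting UNIX_TIMESTAMP(NOW()) into a user variable cannot change the result: it was already evaluated exactly once.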
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
That's exactly what I thought when reading Michael's email, but I tried it anyway. Thanks for the clarification :)

2012/10/16 <h...@tbbs.net>
> Quote: "Functions that return the current date or time each are evaluated only once per query, at the start of query execution. [...]"
mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
| SIMPLE | send_sms_test | index | NULL | priority_time | 12 | NULL | 11 | Using where |
1 row in set (0.00 sec)

But if I issue the query, I see it in mysql-slow.log:

select * from send_sms_test where time <= UNIX_TIMESTAMP(NOW()) order by priority limit 0,11;

If I create the index as (time, priority) (reverse order instead of priority, time), I still get the same usage of the priority_time key with the same key length, but rows are now doubled:

mysql> create index time_priority ON send_sms_test (time,priority);
Query OK, 0 rows affected (0.67 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> desc select * from send_sms_test where time <= UNIX_TIMESTAMP(NOW()) order by priority limit 0,11;
+----+-------------+---------------+-------+---------------+---------------+---------+------+------+-------------+
| id | select_type | table         | type  | possible_keys | key           | key_len | ref  | rows | Extra       |
+----+-------------+---------------+-------+---------------+---------------+---------+------+------+-------------+
|  1 | SIMPLE      | send_sms_test | index | time_priority | priority_time | 12      | NULL |   22 | Using where |
+----+-------------+---------------+-------+---------------+---------------+---------+------+------+-------------+

And with both indexes created, I no longer see this query in the slow log. Of course, if I disable log_queries_not_using_indexes I get none of the queries.

So is it a bug in Percona's implementation, or is it general MySQL behavior?

Thanks
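A plausible reading (not confirmed in this thread) is that log_queries_not_using_indexes also catches full index scans: in the EXPLAIN output the access type is `index` (a walk of the entire priority_time index) rather than `range` or `ref`, and possible_keys is NULL, so for logging purposes the server treats the query as not using an index to filter rows. Comparing against an index led by the range column shows the difference:

```sql
-- With an index led by `time`, the WHERE clause can use a range scan
-- (type=range); ORDER BY priority then needs a filesort, which may
-- still be cheaper than scanning the whole priority_time index.
EXPLAIN SELECT * FROM send_sms_test FORCE INDEX (time_priority)
WHERE time <= UNIX_TIMESTAMP(NOW())
ORDER BY priority LIMIT 0,11;
```

If that plan shows type=range, it would also explain why the slow-log entry disappears once the (time, priority) index exists.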
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
`mclass` bigint(20) DEFAULT NULL, `rpi` bigint(20) DEFAULT NULL, `charset` varchar(255) DEFAULT NULL, `boxc_id` varchar(255) DEFAULT NULL, `binfo` varchar(255) DEFAULT NULL, `meta_data` text, `task_id` bigint(20) DEFAULT NULL, `msgid` bigint(20) DEFAULT NULL, `priority` int(3) unsigned NOT NULL DEFAULT '500', PRIMARY KEY (`sql_id`), KEY `task_id` (`task_id`), KEY `receiver` (`receiver`), KEY `msgid` (`msgid`), KEY `priority_time` (`priority`,`time`) ) ENGINE=InnoDB AUTO_INCREMENT=7806318 DEFAULT CHARSET=utf8

Slow-query logging is turned on with the option:

| log_queries_not_using_indexes | ON |

mysqld --version
mysqld Ver 5.1.65-rel14.0 for debian-linux-gnu on x86_64 ((Percona Server (GPL), 14.0, Revision 475))

If I check with EXPLAIN, MySQL says it would use the index:

mysql> desc select * from send_sms_test where time <= UNIX_TIMESTAMP(NOW()) order by priority limit 0,11;

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | send_sms_test | index | NULL | priority_time | 12 | NULL | 11 | Using where |

1 row in set (0.00 sec)

But if I issue the query, I see it in mysql-slow.log:

select * from send_sms_test where time <= UNIX_TIMESTAMP(NOW()) order by priority limit 0,11;

If I create the index as (time, priority) instead (the reverse of (priority, time)), I still get the same priority_time key with the same key length, but the row estimate doubles:

mysql> create index time_priority ON send_sms_test (time, priority);
Query OK, 0 rows affected (0.67 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> desc select * from send_sms_test where time <= UNIX_TIMESTAMP(NOW()) order by priority limit 0,11;

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | send_sms_test | index | time_priority | priority_time | 12 | NULL | 22 | Using where |

With both indexes created, the query no longer appears in the slow log. And of course, if I disable log_queries_not_using_indexes, none of the queries are logged. So is this a bug in Percona's implementation, or general MySQL behavior? Thanks
RE: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
* Rows = 11 / 22 -- don't take the numbers too seriously; they are crude approximations based on estimated cardinality.
* The 11 comes from the LIMIT -- therefore useless for judging efficiency. (The 22 may be 2*11; I don't know.)
* Run the EXPLAINs without LIMIT -- that will avoid the bogus 11/22.
* If the CREATE INDEX took only 0.67 sec, I surmise that you have very few rows in the table?? So this discussion is not necessarily valid in general cases.
* What percentage of `time` values meet the WHERE? This has a big impact on the choice of plan, and on performance.
* SET long_query_time = 0; to get the query into the slow log even if it is fast. Then look at the various extra values (such as filesort, on disk, temp table used, etc.).
* Do this (with each index):
  SHOW SESSION STATUS LIKE 'Handler_read%';
  SELECT ... FORCE INDEX(...) ...;
  SHOW SESSION STATUS LIKE 'Handler_read%';
  Then take the diffs of the handler counts. This will give you a pretty detailed idea of what is going on; better than the slow log.
* INT(3) is not a 3-digit integer; it is a full 32-bit integer (4 bytes). Perhaps you should have SMALLINT UNSIGNED (2 bytes).
* BIGINT takes 8 bytes -- usually over-sized.

-Original Message-
From: spameden [mailto:spame...@gmail.com]
Sent: Monday, October 15, 2012 1:42 PM
To: mysql@lists.mysql.com
Subject: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order

Hi, list.

Sorry for the long subject, but I'm really interested in solving this and need some help. I've got a table:

mysql> show create table send_sms_test;

CREATE TABLE `send_sms_test` (
  `sql_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `momt` enum('MO','MT') DEFAULT NULL,
  `sender` varchar(20) DEFAULT NULL,
  `receiver` varchar(20) DEFAULT NULL,
  `udhdata` blob,
  `msgdata` text,
  `time` bigint(20) NOT NULL,
  `smsc_id` varchar(255) DEFAULT 'main',
  `service` varchar(255) DEFAULT NULL,
  `account` varchar(255) DEFAULT NULL,
  `id` bigint(20) DEFAULT NULL,
  `sms_type` tinyint(1) DEFAULT '2',
  `mclass` bigint(20) DEFAULT NULL,
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
| Handler_read_next | 576090 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 126 |
| Handler_read_rnd_next | 223 |

6 rows in set (0.00 sec)

mysql> select * from send_sms_test FORCE INDEX (time_priority) where time <= UNIX_TIMESTAMP(NOW()) order by priority LIMIT 0,100;
100 rows in set (0.09 sec)

mysql> SHOW SESSION STATUS LIKE 'Handler_read%';

| Variable_name | Value |
| Handler_read_first | 18 |
| Handler_read_key | 244 |
| Handler_read_next | 719969 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 226 |
| Handler_read_rnd_next | 223 |

6 rows in set (0.00 sec)

I don't understand much about the Handler counters; could you please explain more, based on the results I've posted? In which case does it work better, and how does it use the index? About BIGINT(20) and INT(3): I will look further into this later. I understand it might be oversized, but my main question is about the index and why it is being used so strangely. Many thanks for your quick answer!
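Rick's "take the diffs of the handler counts" step can be sketched in a few lines. The snapshot values below are the ones posted in this thread (before and after the FORCE INDEX (time_priority) SELECT), limited to the counters visible in the message; `diff_counters` is a hypothetical helper for illustration, not a MySQL feature.

```python
# Diff two SHOW SESSION STATUS LIKE 'Handler_read%' snapshots, as Rick
# suggests. The values are taken from the output posted in this thread.

def diff_counters(before, after):
    """Per-counter delta between two SHOW SESSION STATUS snapshots."""
    return {name: after[name] - before[name] for name in before}

before = {
    "Handler_read_next": 576090,
    "Handler_read_prev": 0,
    "Handler_read_rnd": 126,
    "Handler_read_rnd_next": 223,
}
after = {
    "Handler_read_next": 719969,
    "Handler_read_prev": 0,
    "Handler_read_rnd": 226,
    "Handler_read_rnd_next": 223,
}

delta = diff_counters(before, after)
# Handler_read_next grew by 143879 (one sequential step per row of the
# 143879-row table: a full index scan), and Handler_read_rnd by 100
# (the 100 rows fetched back after the sort).
print(delta["Handler_read_next"], delta["Handler_read_rnd"])  # 143879 100
```

The large Handler_read_next delta is the tell: the time_priority plan walked every index entry, while the 100 Handler_read_rnd reads correspond to the LIMIT 0,100 result rows.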
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
Sorry, forgot to say:

mysql> show variables like 'long_query_time%';

| Variable_name | Value |
| long_query_time | 10.00 |

1 row in set (0.00 sec)

It's getting into the log only due to:

mysql> show variables like '%indexes%';

| Variable_name | Value |
| log_queries_not_using_indexes | ON |

1 row in set (0.00 sec)

If I turn it off, all is fine. My initial question was: why does MySQL log the query in the slow log if the query uses an INDEX? And why does it stop logging if I create an INDEX (time, priority), when the query has FORCE INDEX (priority_time) specified, so MySQL shouldn't use the newly created INDEX (time, priority) at all?

2012/10/16 spameden spame...@gmail.com

Sorry, my previous e-mail was a test on MySQL-5.5.28 on an empty table. Here is the MySQL-5.1 Percona testing table:

mysql> select count(*) from send_sms_test;

| count(*) |
| 143879 |

1 row in set (0.03 sec)

Without LIMIT:

mysql> desc select * from send_sms_test FORCE INDEX (time_priority) where time <= UNIX_TIMESTAMP(NOW()) order by priority;

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | send_sms_test | range | time_priority | time_priority | 8 | NULL | 73920 | Using where; Using filesort |

1 row in set (0.00 sec)

mysql> desc select * from send_sms_test FORCE INDEX (priority_time) where time <= UNIX_TIMESTAMP(NOW()) order by priority;

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | send_sms_test | index | NULL | priority_time | 12 | NULL | 147840 | Using where |

1 row in set (0.00 sec)

But I actually need to use LIMIT, because the client uses it to limit the number of records returned for processing.

mysql> select * from send_sms_test FORCE INDEX (priority_time) where time <= UNIX_TIMESTAMP(NOW()) order by priority LIMIT 0,100;
100 rows in set (0.00 sec)

mysql> show profile;

| Status | Duration |
| starting | 0.53 |
| Opening tables | 0.09 |
| System lock | 0.05 |
| Table lock | 0.04 |
| init | 0.37 |
| optimizing | 0.05 |
| statistics | 0.07 |
| preparing | 0.05 |
| executing | 0.01 |
| Sorting result | 0.03 |
| Sending data | 0.000856 |
| end | 0.03 |
| query end | 0.01 |
| freeing items | 0.15 |
| logging slow query | 0.01 |
| logging slow query | 0.47 |
| cleaning up | 0.02 |

17 rows in set (0.00 sec)

mysql> select * from send_sms_test FORCE INDEX (time_priority) where time <= UNIX_TIMESTAMP(NOW()) order by priority LIMIT 0,100;
100 rows in set (0.08 sec)

mysql> show profile;

| Status | Duration |
| starting | 0.48 |
| Opening tables | 0.09 |
| System lock | 0.02 |
| Table lock | 0.04 |
| init | 0.47 |
| optimizing | 0.06 |
| statistics | 0.43 |
| preparing | 0.18 |
| executing | 0.01 |
| Sorting result | 0.076725 |
| Sending data | 0.001406 |
| end | 0.03 |
| query end | 0.01 |
| freeing items | 0.12 |
| logging slow query | 0.01 |
| cleaning up | 0.02 |

16 rows in set (0.00 sec)

As you can see, the latter query takes more time, because it is using filesort as well.

Now, the handler counters:

mysql> SHOW SESSION STATUS LIKE 'Handler_read%'; select * from send_sms_test FORCE INDEX (priority_time) where time <= UNIX_TIMESTAMP(NOW()) order by priority LIMIT 0,100; SHOW SESSION STATUS LIKE
RE: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
I don't fully understand the Handler numbers, either. But note the vast difference in Handler_read_next, as if the second test had to read (sequentially scan) a lot more stuff (in the index or the data).

Summary:
INDEX(time, priority) -- slower; bigger Handler numbers; shorter key_len; filesort.
INDEX(priority, time) -- faster; smaller; uses both columns of the index (key_len=12); avoids the filesort (because INDEX(priority, ...) agrees with ORDER BY priority).

The Optimizer has (at some level) two choices:
* Start with the WHERE
* Start with the ORDER BY

Since the ORDER BY matches one of the indexes, it can avoid the sort and stop at the LIMIT. However, if most of the rows failed the WHERE clause, this could be the wrong choice. That is, it is hard for the optimizer to get a query like this right every time. To see what I mean, flip the inequality in the WHERE time <= ... around; I think the results will be disappointing.

If you had more than a million rows, I would bring up PARTITIONing as an assist to this 2-dimensional type of problem.

From: spameden [mailto:spame...@gmail.com]
Sent: Monday, October 15, 2012 3:23 PM
To: Rick James
Cc: mysql@lists.mysql.com
Subject: Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order

Sorry, my previous e-mail was a test on MySQL-5.5.28 on an empty table.
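The summary above (an index leading with the ORDER BY column avoids the sort; the reversed column order forces one) can be reproduced in a self-contained sketch. This uses SQLite rather than MySQL purely so it runs standalone: the table and data are made up, INDEXED BY plays the role of FORCE INDEX, and SQLite's "USE TEMP B-TREE FOR ORDER BY" plan step is its analogue of MySQL's "Using filesort".

```python
# Illustration (SQLite, not MySQL) of why column order in a compound
# index matters for ORDER BY ... LIMIT: (priority, time) can deliver rows
# already in priority order; (time, priority) needs a separate sort.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE send_sms (sql_id INTEGER PRIMARY KEY, priority INT, time INT)")
con.executemany("INSERT INTO send_sms (priority, time) VALUES (?, ?)",
                [(i % 7, i) for i in range(1000)])
con.execute("CREATE INDEX priority_time ON send_sms (priority, time)")
con.execute("CREATE INDEX time_priority ON send_sms (time, priority)")

query = ("SELECT * FROM send_sms INDEXED BY {} "
         "WHERE time <= 500 ORDER BY priority LIMIT 10")

plan_pt = con.execute("EXPLAIN QUERY PLAN " + query.format("priority_time")).fetchall()
plan_tp = con.execute("EXPLAIN QUERY PLAN " + query.format("time_priority")).fetchall()

def needs_sort(plan):
    # SQLite reports "USE TEMP B-TREE FOR ORDER BY" when it must sort,
    # roughly what MySQL's EXPLAIN calls "Using filesort".
    return any("TEMP B-TREE" in row[-1] for row in plan)

print("priority_time sorts:", needs_sort(plan_pt))
print("time_priority sorts:", needs_sort(plan_tp))
```

The same trade-off Rick describes applies: the (priority, time) scan reads more entries but emits them pre-sorted, so the LIMIT can stop early, while (time, priority) narrows the range first and then pays for a sort.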
RE: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
> My initial question was why MySQL logs it in the slow log if the query uses an INDEX?

That _may_ be worth a bug report. A _possible_ answer: EXPLAIN presents what the optimizer is in the mood for at that moment. It does not necessarily reflect what it was in the mood for when it actually ran the query.

When timing things, run them twice (and be sure not to hit the Query cache). The first time warms the caches (buffer_pool, etc.); the second time gives you a 'reproducible' time. I believe (without proof) that the cache contents can affect the optimizer's choice.

From: spameden [mailto:spame...@gmail.com]
Sent: Monday, October 15, 2012 3:29 PM
To: Rick James
Cc: mysql@lists.mysql.com
Subject: Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
Re: mysql logs query with indexes used to the slow-log and not logging if there is index in reverse order
Thanks a lot for all your comments! I did disable the Query cache before testing, with SET query_cache_type=OFF for the current session. I will report this to the MySQL bugs site later.
RE: Are Single Column Indexes are sufficient
WHERE (t0.job_id = '006-120613043532587-o-C') AND t0.bean_type = 'ActionItems';
Begs for INDEX(job_id, bean_type) -- in either order.

where job_id = '0043189-120805203721153-o-C' and nominal_time >= '2012-09-07 07:16:00' and nominal_time < '2012-09-07 08:06:00' group by status;
Begs for INDEX(job_id, nominal_time) -- in THIS order. It cannot make effective use of `status` in an index.

WHERE (t0.pending > 0 AND (t0.status = 'SUSPENDED' OR t0.status = 'KILLED' OR t0.status = 'RUNNING') AND t0.last_modified_time <= '2012-09-07 08:08:34') AND t0.bean_type = 'ActionItems';
Change the `status` check to AND t0.status IN ('SUSPENDED', 'KILLED', 'RUNNING'). Other compound indexes might work, but I guess this is the best: INDEX(bean_type, status, last_modified_time) -- in THIS order.

Note that I put the '=' field(s) first in the INDEX. Then I put _one_ range field next. (IN is sort of in between '=' and a range.)

Single-field indexes _might_ use the index-merge feature -- but rarely. And almost always an appropriate compound index will out-perform it.

Usually I ignore cardinality when picking an index. A single-field index on a 'flag' or other low-cardinality field is almost never chosen by the optimizer. Don't bother having such indexes.

When asking performance questions, please provide:
SHOW CREATE TABLE (not DESCRIBE)
SHOW TABLE STATUS (for size info)
EXPLAIN SELECT ...

To figure out how many fields of the index are being used, look at the key_len field of EXPLAIN, then look at the sizes of the fields in the chosen index. (Add 1 for NULLable fields.)

-Original Message-
From: Adarsh Sharma [mailto:eddy.ada...@gmail.com]
Sent: Monday, September 17, 2012 10:06 PM
To: mysql@lists.mysql.com
Subject: Are Single Column Indexes are sufficient
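Rick's "equality columns first" rule can be seen in a tiny runnable demo. SQLite is used here instead of MySQL so the example is self-contained; the index name `idx_job_bean` is made up for the sketch, and MySQL's EXPLAIN would show the analogous single-index `ref` lookup for the first query in this thread.

```python
# Illustration (SQLite, not MySQL) of the compound index Rick suggests
# for WHERE job_id = ? AND bean_type = ?: both equality predicates are
# resolved inside one index probe instead of two single-column indexes.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE scd_table (
    id INTEGER PRIMARY KEY,
    job_id TEXT,
    bean_type TEXT,
    nominal_time TEXT,
    status TEXT)""")
# Equality columns first, as Rick recommends (either order works here
# since both predicates are '=').
con.execute("CREATE INDEX idx_job_bean ON scd_table (job_id, bean_type)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM scd_table "
    "WHERE job_id = '006-120613043532587-o-C' AND bean_type = 'ActionItems'"
).fetchall()

# The plan detail names the compound index and shows both columns used
# as lookup terms in a single search.
print(plan[0][-1])
```

Against a 7 GB table, turning the WHERE clause into one index search like this (rather than a scan guided by a low-cardinality single-column index) is exactly the change Rick is recommending.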
Are Single Column Indexes are sufficient
Hi all,

Currently I am doing performance tuning of some queries that are running very slowly per my slow-query log. Below are a sample of the queries and the cardinality of the indexes. These queries take more than 15 minutes to complete on a table scd_table of size 7 GB:

SELECT t0.id, t0.bean_type, t0.action_number, t0.action_xml, t0.console_url, t0.created_conf, t0.error_code, t0.error_message, t0.external_status, t0.missing_dependencies, t0.run_conf, t0.time_out, t0.tracker_uri, t0.job_type, t0.created_time, t0.external_id, t0.job_id, t0.last_modified_time, t0.nominal_time, t0.pending, t0.rerun_time, t0.sla_xml, t0.status FROM scd_table t0 WHERE (t0.job_id = '006-120613043532587-o-C') AND t0.bean_type = 'ActionItems';

select status, count(*) as cnt from scd_table where job_id = '0043189-120805203721153-o-C' and nominal_time >= '2012-09-07 07:16:00' and nominal_time < '2012-09-07 08:06:00' group by status;

SELECT t0.id, t0.bean_type, t0.action_number, t0.action_xml, t0.console_url, t0.created_conf, t0.error_code, t0.error_message, t0.external_status, t0.missing_dependencies, t0.run_conf, t0.time_out, t0.tracker_uri, t0.job_type, t0.created_time, t0.external_id, t0.job_id, t0.last_modified_time, t0.nominal_time, t0.pending, t0.rerun_time, t0.sla_xml, t0.status FROM scd_table t0 WHERE (t0.pending > 0 AND (t0.status = 'SUSPENDED' OR t0.status = 'KILLED' OR t0.status = 'RUNNING') AND t0.last_modified_time <= '2012-09-07 08:08:34') AND t0.bean_type = 'ActionItems';

mysql> show indexes from scd_table;

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Null | Index_type |
| scd_table | 0 | PRIMARY | 1 | id | A | 188908 | | BTREE |
| scd_table | 1 | I_CRD_TNS_CREATED_TIME | 1 | created_time | A | 188908 | YES | BTREE |
| scd_table | 1 | I_CRD_TNS_DTYPE | 1 | bean_type | A | 14 | YES | BTREE |
| scd_table | 1 | I_CRD_TNS_EXTERNAL_ID | 1 | external_id | A | 188908 | YES | BTREE |
| scd_table | 1 | I_CRD_TNS_JOB_ID | 1 | job_id | A | 365 | YES | BTREE |
| scd_table | 1 | I_CRD_TNS_LAST_MODIFIED_TIME | 1 | last_modified_time | A | 188908 | YES | BTREE |
| scd_table | 1 | I_CRD_TNS_RERUN_TIME | 1 | rerun_time | A | 14 | YES | BTREE |
| scd_table | 1 | I_CRD_TNS_STATUS | 1 | status | A | 14 | YES | BTREE |

Whenever I EXPLAIN a query, it picks the index with low cardinality. Can I remove all the indexes and create only 1-2 multi-column indexes, or is there any other tuning I can do for the above queries? Please let me know if any other info is required. (The table schema has the same columns mentioned in the select clause.) Thanks
Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
On 05/07/2012 12:30 PM, Zhangzhigang wrote: Thanks, I thought about this answer in the past, and I appreciate your reply.

How about the omelet? What's your method? -- RMA.
RE: Re: Re: Why is creating indexes faster after inserting massive data rows?
One more wrinkle... When adding a UNIQUE index, MySQL must build the BTree incrementally, and cannot do a sortmerge. (I think this is a true statement, but I am not positive.) The reason: a UNIQUE index is two things: an INDEX, and a UNIQUEness constraint. In order to enforce the constraint, it must check each record as it is inserted into the table. This means that the index must exist, and contain all the rows that have been added so far.

Conclusion: Do not make an INDEX UNIQUE unless you really need it. Example:
PRIMARY KEY (a,b)
UNIQUE (b,a) -- The UNIQUEness constraint here adds nothing useful; make it just an INDEX.

There are still more wrinkles...
* InnoDB benefits from inserting rows in PRIMARY KEY order.
* INSERT ... (SELECT ... ORDER BY ...) -- Sometimes it is useful to do a sortmerge in the SELECT in order to make the INSERT more efficient.
* ALTER TABLE ... ORDER BY ... -- For MyISAM (not InnoDB), this can cluster the rows to make certain SELECTs do fewer disk hits.

Massive data-warehouse fact tables often should have nothing but a PRIMARY KEY. All big SELECTs should hit summary tables that aggregate the data to make reports more efficient. (I have seen 10x to 1000x performance improvements.) Should we discuss this?

-Original Message-
From: Karen Abgarian [mailto:a...@apple.com]
Sent: Monday, May 07, 2012 8:37 PM
To: mysql@lists.mysql.com
Subject: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?

Honestly, I did not understand that. I did not say anything about being complicated. What does mysql not use, caching?? Judging by experience, creating a unique index on, say, a 200G table could be a bitter one.

On 07.05.2012, at 19:26, Zhangzhigang wrote:

Karen... MySQL does not use this approach you describe, which is complicated. I agree with Johan De Meersman.

From: Karen Abgarian a...@apple.com
To: mysql@lists.mysql.com
Date: Tuesday, May 8, 2012, 1:30 AM
Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
Hi, a couple cents to this. There isn't really a million block writes. The record gets added to the block, but the block gets modified in the OS cache if we assume MyISAM tables, and in the InnoDB buffer if we assume InnoDB tables. In both cases, the actual writing does not take place immediately and does not slow down the process. What does happen for each operation, however, is processing the statement, locating the entries to update in the index, index block splits and, for good reason, committing. When it comes to creating an index, what needs to happen is to read the whole table and to sort all rows by the index key. The latter process will be the most determining factor in answering the original question, because for large tables the sort will have to do a lot of disk I/O. The point I am trying to make is that there will be situations when creating the indexes and then inserting the rows will be faster than creating an index afterwards. If we try to determine such situations, we can notice that the likelihood of the sort going to disk increases with the number of distinct values to be sorted. For this reason, my choice would be to create things like primary/unique keys beforehand, unless I am certain that everything will fit in the available memory. Peace, Karen

On May 7, 2012, at 8:05 AM, Johan De Meersman wrote:

- Original Message -
From: Zhangzhigang zzgang_2...@yahoo.com.cn

Ok, creating the index *after* the inserts, the index gets created in a single operation. But the indexes have to be updated row by row after the data rows have all been inserted. Does it work this way?

No: when you create an index on an existing table (like after a mass insert), what happens is that the engine does a single full tablescan and builds the index in a single pass, which is a lot more performant than updating a single disk block for every record, for the simple reason that a single disk block can contain dozens of index entries.
Imagine that you insert one million rows, and you have 100 index entries in a disk block (random numbers, to make a point. Real numbers will depend on storage, file system, index, et cetera). Obviously there's no way to write less than a single block to disk - that's how it works. You can update your index for each record in turn. That means you will need to do 1 million index - and thus block - writes; plus additional reads for those blocks you don't have in memory - that's the index cache. Now, if you create a new index on an existing table, you are first of all bypassing any index read operations - there *is* no index to read, yet. Then the system is going to do a full tablescan - considered slow, but you need all the data, so there's no better way anyway. The index will be built - in-memory as much as possible
RE: Re: Why is creating indexes faster after inserting massive data rows?
A BTree that is small enough to be cached in RAM can be quickly maintained. Even the “block splits” are not too costly without the I/O. A big file that needs sorting – bigger than can be cached in RAM – is more efficiently done with a dedicated “sort merge” program. A “big” INDEX on a table may be big enough to fall into this category. I/O is the most costly part of any of these operations. My rule of thumb for MySQL SQL statements is: if everything is cached, the query will run ten times as fast as it would if things have to be fetched from disk. Sortmerge works this way: 1. Sort as much of the file as you can in RAM. Write that sorted piece to disk. 2. Repeat for the next chunk of the file. Repeat until the input file is broken into sorted chunks. 3. Now, “merge” those chunks together – take the first row from each, decide which is the “smallest”, send it to the output. 4. Repeat until finished with all the pieces. For a really big task, there may have to be more than one “merge” pass. Note how sort merge reads the input sequentially once, writes the output sequentially once, and has sequential I/O for each merge chunk. “Sequential” I/O is faster than “random” I/O – no arm motion on traditional disks. (SSDs are a different matter; I won’t go into that.) The “output” from the sortmerge is fed into code that builds the BTree for the table. This building of the BTree is sequential – fill the first block, move on to the next block, and never have to go back. BTrees (when built randomly), if they need to spill to disk, will involve random I/O. (And we are talking about an INDEX that is so big that it needs to spill to disk.) When a block “splits”, one full block becomes two half-full blocks. Randomly filling a BTree leads, on average, to the index being 69% full. This is not a big factor in the overall issue, but perhaps worth noting. How bad can it get? Here’s an example. · You have an INDEX on some random value, such as a GUID or MD5.
· The INDEX will be 5 times as big as you can fit in RAM. · MySQL is adding to the BTree one row at a time (the non-sortmerge way). When it is nearly finished, only 1 of 5 updates to the BTree can be done immediately in RAM; 4 out of 5 updates to the BTree will have to hit disk. If you are using normal disks, that is on the order of 125 rows per second that you can insert – terrible! Sortmerge is likely to average over 10,000. From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn] Sent: Tuesday, May 08, 2012 9:13 PM To: Rick James Cc: mysql@lists.mysql.com Subject: Re: Why is creating indexes faster after inserting massive data rows? James... * By doing all the indexes after building the table (or at least all the non-UNIQUE indexes), sort merge can be used. This technique has been highly optimized over the past half-century, and is more efficient. I have a question about sort merge: Why does it do the whole sort merge? In my opinion, it could just maintain the B tree and insert each key into a B tree node that has fewer sorted keys, which would perform well. If it only does the sort merge, the B tree data structure has to be created separately, which wastes some performance. Doesn't it? From: Rick James rja...@yahoo-inc.com To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn Cc: mysql@lists.mysql.com Date: Tuesday, May 8, 2012, 12:35 AM Subject: RE: Why is creating indexes faster after inserting massive data rows? * Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to INDEX updating speed. * The cache size is quite important to dealing with indexing during INSERT; see http://mysql.rjweb.org/doc.php/memory * Note that mysqldump sets up for an efficient creation of indexes after loading the data.
This is not practical (or necessarily efficient) when incrementally INSERTing into a table. As for the original question... * Updating the index(es) for one row often involves random BTree traversals. When the index(es) are too big to be cached, this can involve disk hit(s) for each row inserted. * By doing all the indexes after building the table (or at least all the non-UNIQUE indexes), sort merge can be used. This technique has been highly optimized over the past half-century, and is more efficient. -Original Message- From: Johan De Meersman [mailto:vegiv...@tuxera.be] Sent: Monday, May 07, 2012 1:29 AM To: Zhangzhigang Cc: mysql@lists.mysql.com Subject: Re: Why is creating indexes faster after inserting massive data rows? - Original Message - From: Zhangzhigang zzgang_2...@yahoo.com.cn
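The four sortmerge steps described above can be sketched in Python (an illustrative simulation, not MySQL's code; in-memory lists stand in for the sorted chunks written to disk, and `heapq.merge` plays the role of the merge pass):

```python
import heapq

# Sketch of external sort merge: sort RAM-sized chunks, then merge them,
# always emitting the smallest of the chunk heads. The chunk size and
# input data are made up for illustration.

def sort_merge(values, ram_capacity=3):
    # Steps 1-2: sort as much as fits "in RAM"; each sorted chunk would
    # be one sequential write to disk.
    chunks = [sorted(values[i:i + ram_capacity])
              for i in range(0, len(values), ram_capacity)]
    # Steps 3-4: merge the chunks; each chunk is read back sequentially.
    return list(heapq.merge(*chunks))

print(sort_merge([9, 4, 7, 1, 8, 2, 6, 3, 5]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Note that every phase touches data sequentially, which is exactly the property Rick highlights: sequential chunk writes, then a sequential merge, with no random seeking.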
Re: Re: Why is creating indexes faster after inserting massive data rows?
This thread is going on and on and on and on, does anyone have time to actually measure I/O? Let's make numbers talk. Claudio
RE: Re: Why is creating indexes faster after inserting massive data rows?
Disagree all the way, numbers are numbers, and better than words, always. Claudio On May 9, 2012 7:22 PM, Rick James rja...@yahoo-inc.com wrote: Numbers can be misleading – one benchmark will show no difference; another will show 10x difference. Recommend you benchmark _*your*_ case. From: Claudio Nanni [mailto:claudio.na...@gmail.com] Sent: Wednesday, May 09, 2012 8:34 AM To: Rick James Cc: Zhangzhigang; mysql@lists.mysql.com Subject: Re: Re: Why is creating indexes faster after inserting massive data rows? This thread is going on and on and on and on, does anyone have time to actually measure I/O? Let's make numbers talk. Claudio
Re: Re: Why is creating indexes faster after inserting massive data rows?
The “output” from the sortmerge is fed into code that builds the BTree for the table. This building of the BTree is sequential – fill the first block, move on to the next block, and never have to go back. James... Thanks for your answer, so clearly put. Firstly: before accepting this answer, I thought that the block splits for building the BTree had to be done with random I/O. Now I know that MySQL optimizes away the block splits by using sort merge to build the BTree, so it does not do more random I/O. Secondly: it bypasses BTree traversals, which, when the index is too big to be cached, involve disk hit(s) for each row inserted. Thank you very much. Sincerely yours, Zhigang Zhang From: Rick James rja...@yahoo-inc.com To: Zhangzhigang zzgang_2...@yahoo.com.cn Cc: mysql@lists.mysql.com Date: Wednesday, May 9, 2012, 11:21 PM Subject: RE: Re: Why is creating indexes faster after inserting massive data rows?
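The claim that randomly filling a BTree leaves the index about 69% (≈ ln 2) full can be checked with a small simulation (a sketch, not MySQL's BTree: a list of fixed-capacity "leaf blocks" that split when full stands in for the tree, and the capacity and key count are arbitrary):

```python
import bisect
import random

# Simulate random insertion into fixed-capacity leaf blocks that split
# when full, then measure the average fill factor across all blocks.

def average_fill(n_keys=20000, capacity=100, seed=1):
    random.seed(seed)
    blocks = [[]]    # each block holds sorted keys, at most `capacity`
    bounds = [0.0]   # lower bound of each block, for routing new keys
    for _ in range(n_keys):
        key = random.random()
        i = bisect.bisect_right(bounds, key) - 1   # find the target block
        bisect.insort(blocks[i], key)
        if len(blocks[i]) > capacity:              # block "split":
            half = len(blocks[i]) // 2             # one full block becomes
            right = blocks[i][half:]               # two half-full blocks
            blocks[i] = blocks[i][:half]
            blocks.insert(i + 1, right)
            bounds.insert(i + 1, right[0])
    return sum(len(b) for b in blocks) / (len(blocks) * capacity)

print(round(average_fill(), 2))   # typically close to 0.69
```

With random keys the measured fill hovers near ln 2 ≈ 0.69, matching the figure quoted in the thread; keys inserted in sorted order (the sortmerge build) would instead pack blocks completely.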
Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Ok, OS cache. There isn't really a million block writes. The record gets added to the block, but that block gets modified in the OS cache if we assume MyISAM tables, and in the InnoDB buffer if we assume InnoDB tables. As I understood it, MySQL writes the data to disk directly and does not use the OS cache when the table is being updated. If it wrote to the OS cache, that would lead to massive system calls when a lot of rows are inserted into the table one by one. From: Karen Abgarian a...@apple.com To: mysql@lists.mysql.com Date: Tuesday, May 8, 2012, 11:37 AM Subject: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
The index will be built - in-memory as much as possible - and the system will automatically prefer to write only complete blocks - 10.000 of them. That's the exact same number of index blocks, but you only write each block once, so that's only 10.000 writes instead of 1.000.000. Now there's a lot more at play, things like B-tree balancing and whatnot, but that's the basic picture. -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql
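Johan's arithmetic (one million rows, 100 index entries per disk block) can be written out as a tiny calculation; as he notes, the numbers are illustrative, and real values depend on storage, filesystem, and index:

```python
# Illustrative numbers from the discussion above.
rows = 1_000_000
entries_per_block = 100

# Updating the index row by row: one block write per inserted row
# (plus additional reads for blocks not already cached).
row_by_row_writes = rows

# Building the index afterwards: each block is filled in memory
# and written exactly once.
bulk_build_writes = rows // entries_per_block

print(row_by_row_writes, bulk_build_writes)   # 1000000 10000
```

Same number of final index blocks either way; the post-load build just writes each of them once instead of a million times.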
Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
- Original Message - From: Zhangzhigang zzgang_2...@yahoo.com.cn MySQL does not use this approach you described, which is complicated. I agree with Johan De Meersman. Umm... It's not a matter of who you agree with :-) Karen's technical detail is quite correct; I merely presented a simplified picture for easier understanding of the basics.
Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
- Original Message - From: Zhangzhigang zzgang_2...@yahoo.com.cn As I understood it, MySQL writes the data to disk directly and does not use the OS cache when the table is being updated. If it were to use the OS cache for reading but not writing, then the OS cache would be inconsistent with the underlying filesystem as soon as you wrote a block, and you'd need some complicated logic to figure out which of the two was correct. No, the MyISAM engine will simply yield to whatever the kernel/VFS wants to do with the blocks; whereas InnoDB explicitly opens the files with O_SYNC and bypasses the OS cache entirely, because it manages its own buffer cache. If it wrote to the OS cache, that would lead to massive system calls when a lot of rows are inserted one by one. From the code's point of view, you simply request a read or a write. Whether or not the OS cache gets in between is entirely a matter for the kernel to decide, assuming you specified no specific options at file open time.
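The open-time flags Johan mentions can be illustrated from Python (a sketch only: `os.O_SYNC` is POSIX, so this assumes a Unix-like system, and it shows merely how the flag is requested at open time, not InnoDB's actual I/O path):

```python
import os
import tempfile

# Open a file with O_SYNC: each write() returns only after the data has
# been handed to the device, rather than sitting in the OS page cache.
path = os.path.join(tempfile.mkdtemp(), "sync_demo.dat")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
os.write(fd, b"hello")    # synchronous write
os.close(fd)

# A plain open() with no special flags leaves caching entirely to the
# kernel, as described above for MyISAM.
with open(path, "rb") as f:
    print(f.read())       # b'hello'
```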
Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Ok, thanks for your help. From: Johan De Meersman vegiv...@tuxera.be To: Zhangzhigang zzgang_2...@yahoo.com.cn Cc: mysql@lists.mysql.com; Karen Abgarian a...@apple.com Date: Tuesday, May 8, 2012, 6:07 PM Subject: Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Hi, If MyISAM tables were being written directly to disk, MyISAM tables would be so slow that nobody would ever use them. That's the cornerstone of their performance: the writes do not wait for the physical I/O to complete!
Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Oh... I thought that it uses its own buffer cache, the same as InnoDB. I was mistaken about this, thanks! From: Karen Abgarian a...@apple.com To: mysql@lists.mysql.com Date: Wednesday, May 9, 2012, 2:51 AM Subject: Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Re: Why is creating indexes faster after inserting massive data rows?
James... * By doing all the indexes after building the table (or at least all the non-UNIQUE indexes), sort merge can be used. This technique has been highly optimized over the past half-century, and is more efficient. I have a question about sort merge: why does it do the whole sort merge at all? In my opinion, it could just maintain the B-tree and insert each key into a B-tree node that holds a few sorted keys, which also performs well. If it only does the sort merge, the B-tree data structure has to be created separately, which wastes some performance. Doesn't it? From: Rick James rja...@yahoo-inc.com To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn Cc: mysql@lists.mysql.com mysql@lists.mysql.com Date: Tuesday, May 8, 2012, 12:35 AM Subject: RE: Why is creating indexes faster after inserting massive data rows?
Re: Why is creating indexes faster after inserting massive data rows?
Which version of MySQL are you using? Is this a secondary index? On Mon, May 7, 2012 at 12:07 PM, Zhangzhigang zzgang_2...@yahoo.com.cn wrote: hi all: I have a question: creating indexes after inserting massive data rows is faster than creating them before inserting the data rows. Please tell me why.
Re: Why is creating indexes faster after inserting massive data rows?
Version: MySQL 5.1. Engine: MyISAM. The indexes are ordinary ones, neither primary keys nor unique keys. I should describe the question clearly. When inserting massive data rows into a table that needs indexes, I can create the indexes before inserting the data rows; the other way is to insert all the data rows first and then create the indexes. Normally, the total time (inserting data rows plus creating indexes) of the first way is longer than that of the second way. Please tell me why. From: Ananda Kumar anan...@gmail.com To: Zhangzhigang zzgang_2...@yahoo.com.cn Cc: mysql@lists.mysql.com mysql@lists.mysql.com Date: Monday, May 7, 2012, 3:31 PM Subject: Re: Why is creating indexes faster after inserting massive data rows? Which version of MySQL are you using? Is this a secondary index? On Mon, May 7, 2012 at 12:07 PM, Zhangzhigang zzgang_2...@yahoo.com.cn wrote: hi all: I have a question: creating indexes after inserting massive data rows is faster than creating them before inserting the data rows. Please tell me why.
Re: Why is creating indexes faster after inserting massive data rows?
- Original Message - From: Zhangzhigang zzgang_2...@yahoo.com.cn Creating indexes after inserting massive data rows is faster than before inserting data rows. Please tell me why. Plain and simple: the indices get updated after every insert statement, whereas if you only create the index *after* the inserts, the index gets created in a single operation, which is a lot more efficient. I seem to recall that inside of a transaction (thus, InnoDB or so) the difference is markedly less; I might be wrong, though. -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
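The difference Johan describes can be tried end to end with SQLite from the Python standard library. This only illustrates the principle: SQLite is not MyISAM, and the table name, column names, and row count are made up for the demo.

```python
import random
import sqlite3
import time

def load(index_first, rows):
    """Load rows with the index either maintained per insert or built once at the end."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
    if index_first:
        db.execute("CREATE INDEX idx_id ON t(id)")   # updated on every insert
    db.executemany("INSERT INTO t VALUES (?, ?)", rows)
    if not index_first:
        db.execute("CREATE INDEX idx_id ON t(id)")   # one bulk build after the load
    db.commit()
    count = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    db.close()
    return count

rows = [(random.randrange(10**9), "x" * 20) for _ in range(100_000)]

t0 = time.perf_counter()
n_first = load(True, rows)
dt_first = time.perf_counter() - t0

t0 = time.perf_counter()
n_after = load(False, rows)
dt_after = time.perf_counter() - t0

print(f"index before load: {dt_first:.2f}s   index after load: {dt_after:.2f}s")
```

On most runs the index-after-load variant wins, for the single-pass reasons given above, though with everything in memory the gap is far smaller than on disk-bound MySQL tables.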
Re: Why is creating indexes faster after inserting massive data rows?
johan Plain and simple: the indices get updated after every insert statement, whereas if you only create the index *after* the inserts, the index gets created in a single operation, which is a lot more efficient.. OK, creating the index *after* the inserts, the index gets created in a single operation. But doesn't the index still have to be updated row by row after all the data rows have been inserted? Does it work that way? So I cannot see the difference in overhead between the two ways. From: Johan De Meersman vegiv...@tuxera.be To: Zhangzhigang zzgang_2...@yahoo.com.cn Cc: mysql@lists.mysql.com Date: Monday, May 7, 2012, 4:28 PM Subject: Re: Why is creating indexes faster after inserting massive data rows?
Re: Re: Why is creating indexes faster after inserting massive data rows?
On 2012/05/07 10:53, Zhangzhigang wrote: johan Plain and simple: the indices get updated after every insert statement, whereas if you only create the index *after* the inserts, the index gets created in a single operation, which is a lot more efficient.. OK, creating the index *after* the inserts, the index gets created in a single operation. But doesn't the index still have to be updated row by row after all the data rows have been inserted? Does it work that way? So I cannot see the difference in overhead between the two ways. My simplified 2c. When inserting rows one by one with active indexes, MySQL has to 1) lock the space for the data to be added, 2) write the data, 3) lock the index, 4) write the index key(s), 5) unlock the index, 6) unlock the data. This happens for each row. When first loading all the data without an index, only 1, 2, and 6 happen. When you then create an index, it can lock the index, read all the data, write all index keys in one go, and then unlock the index. If you make an omelet, do you fetch your eggs from the fridge one by one, or all at the same time? :) HTH, Alex
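Alex's per-row versus one-go contrast shows up even in a toy model where a plain sorted Python list stands in for the index. This is not MySQL's B-tree, just the same shape of cost: repeated ordered inserts versus one bulk sort.

```python
import bisect
import random
import time

keys = [random.random() for _ in range(100_000)]

# Row-by-row: keep the "index" ordered after every single insert.
t0 = time.perf_counter()
index = []
for k in keys:
    bisect.insort(index, k)          # shifts up to len(index) elements each time
row_by_row = time.perf_counter() - t0

# Bulk build: collect everything first, then sort once.
t0 = time.perf_counter()
bulk = sorted(keys)
one_pass = time.perf_counter() - t0

print(f"row-by-row: {row_by_row:.2f}s   one bulk sort: {one_pass:.2f}s")
```

Both end up with the same ordered structure; only the amount of rearranging work differs.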
Re: Re: Why is creating indexes faster after inserting massive data rows?
Creating the index in one go is a single macro sort operation; updating the index at every row does that operation over and over again. If you do not understand the difference, I recommend you read some basics about sorting algorithms; a very interesting read anyway. Claudio 2012/5/7 Zhangzhigang zzgang_2...@yahoo.com.cn johan Plain and simple: the indices get updated after every insert statement, whereas if you only create the index *after* the inserts, the index gets created in a single operation, which is a lot more efficient.. OK, creating the index *after* the inserts, the index gets created in a single operation. But doesn't the index still have to be updated row by row after all the data rows have been inserted? Does it work that way? So I cannot see the difference in overhead between the two ways. -- Claudio
Re: Re: Why is creating indexes faster after inserting massive data rows?
OK, but in my opinion the sorting algorithm does not account for this difference; both ways end up doing B+ tree inserts. From: Claudio Nanni claudio.na...@gmail.com To: Zhangzhigang zzgang_2...@yahoo.com.cn Cc: Johan De Meersman vegiv...@tuxera.be; mysql@lists.mysql.com mysql@lists.mysql.com Date: Monday, May 7, 2012, 5:01 PM Subject: Re: Re: Why is creating indexes faster after inserting massive data rows? Creating the index in one go is a single macro sort operation; updating the index at every row does that operation over and over again. If you do not understand the difference, I recommend you read some basics about sorting algorithms; a very interesting read anyway. Claudio
Re: Re: Why is creating indexes faster after inserting massive data rows?
Thanks, I had thought about this answer before, and I appreciate your reply. From: Alex Schaft al...@quicksoftware.co.za To: mysql@lists.mysql.com Date: Monday, May 7, 2012, 4:59 PM Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Too nice not to share it! http://www.youtube.com/watch?v=INHF_5RIxTE 2012/5/7 Zhangzhigang zzgang_2...@yahoo.com.cn Thanks, I had thought about this answer before, and I appreciate your reply. -- Claudio
Re: Re: Why is creating indexes faster after inserting massive data rows?
- Original Message - From: Zhangzhigang zzgang_2...@yahoo.com.cn Ok, Creating the index *after* the inserts, the index gets created in a single operation. But the indexes has to be updating row by row after the data rows has all been inserted. Does it work in this way? No, when you create an index on an existing table (like after a mass insert), what happens is that the engine does a single full tablescan and builds the index in a single pass, which is a lot more performant than updating a single disk block for every record, for the simple reason that a single disk block can contain dozens of index entries. Imagine that you insert one million rows, and you have 100 index entries in a disk block (random numbers, to make a point. Real numbers will depend on storage, file system, index, et cetera). Obviously there's no way to write less than a single block to disk - that's how it works. You can update your index for each record in turn. That means you will need to do 1 million index - and thus block - writes; plus additional reads for those blocks you don't have in memory - that's the index cache. Now, if you create a new index on an existing table, you are first of all bypassing any index read operations - there *is* no index to read, yet. Then the system is going to do a full tablescan - considered slow, but you need all the data, so there's no better way anyway. The index will be built - in-memory as much as possible - and the system will automatically prefer to write only complete blocks - 10.000 of them. That's the exact same number of index blocks, but you only write each block once, so that's only 10.000 writes instead of 1.000.000. Now there's a lot more at play, things like B-tree balancing and whatnot, but that's the basic picture. -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
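Johan's back-of-envelope numbers, written out. The 100-entries-per-block figure is his illustrative guess, not a real MySQL constant, as he notes himself.

```python
rows = 1_000_000
entries_per_block = 100              # illustrative figure from the post

index_blocks = rows // entries_per_block     # blocks the finished index occupies

writes_per_row_updates = rows        # worst case: one block write per inserted row
writes_bulk_build = index_blocks     # each completed block written exactly once

print(index_blocks, writes_per_row_updates // writes_bulk_build)
```

That is the 10,000 versus 1,000,000 writes in the text, a factor of 100, before even counting the extra reads for index blocks that have fallen out of the cache.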
RE: Why is creating indexes faster after inserting massive data rows?
* Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to INDEX updating speed. * The cache size is quite important to dealing with indexing during INSERT; see http://mysql.rjweb.org/doc.php/memory * Note that mysqldump sets up for an efficient creation of indexes after loading the data. This is not practical (or necessarily efficient) when incrementally INSERTing into a table. As for the original question... * Updating the index(es) for one row often involves random BTree traversals. When the index(es) are too big to be cached, this can involve disk hit(s) for each row inserted. * By doing all the indexes after building the table (or at least all the non-UNIQUE indexes), sort merge can be used. This technique has been highly optimized over the past half-century, and is more efficient. -Original Message- From: Johan De Meersman [mailto:vegiv...@tuxera.be] Sent: Monday, May 07, 2012 1:29 AM To: Zhangzhigang Cc: mysql@lists.mysql.com Subject: Re: Why is creating indexes faster after inserting massive data rows?
RE: Re: Why is creating indexes faster after inserting massive data rows?
As a side note, TokuDB uses what it calls fractal technology to somewhat improve the performance of incremental INDEXing. It delays some of the BTree work so that it can batch things better; while waiting for that to finish, queries are smart enough to look in more than one place for the index info. InnoDB does something similar, but it is limited to the size of the buffer_pool. -Original Message- From: Johan De Meersman [mailto:vegiv...@tuxera.be] Sent: Monday, May 07, 2012 8:06 AM To: Zhangzhigang Cc: mysql@lists.mysql.com Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
Re: Re: Why is creating indexes faster after inserting massive data rows?
Hi, A couple of cents on this. There isn't really a million block writes. The record gets added to the block, but that block gets modified in the OS cache if we assume MyISAM tables, and in the InnoDB buffer if we assume InnoDB tables. In both cases the actual writing does not take place and does not slow down the process. What does happen for each operation, however, is processing the statement, locating the entries to update in the index, index block splits and, for good reason, committing. When it comes to creating an index, what needs to happen is to read the whole table and to sort all rows by the index key. The latter process will be the most determining factor in answering the original question, because for large tables the sort will have to do a lot of disk I/O. The point I am trying to make is that there will be situations when creating the indexes first and then inserting the rows will be faster than creating an index afterwards. If we try to determine such situations, we could notice that the likelihood of the sort going to disk increases with the number of distinct values to be sorted. For this reason, my choice would be to create things like primary/unique keys beforehand unless I am certain that everything will fit in the available memory. Peace Karen
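The disk-bound sort Karen mentions is classically handled by an external merge sort: sort memory-sized runs, spill each to disk, then do a k-way merge of the runs. A minimal sketch with the standard library; the run size and the use of temp files are made up for the demo, not MySQL's actual filesort.

```python
import heapq
import os
import random
import tempfile

def external_sort(values, run_size=10_000):
    """Sort a list as if it did not fit in memory: sorted runs on disk, then merge."""
    run_files = []
    for i in range(0, len(values), run_size):
        run = sorted(values[i:i + run_size])         # in-memory sort of one run
        f = tempfile.NamedTemporaryFile("w+", delete=False)
        f.writelines(f"{v}\n" for v in run)
        f.seek(0)
        run_files.append(f)
    # heapq.merge streams the runs, holding only one line per run in memory.
    streams = (map(str.strip, f) for f in run_files)
    merged = [int(line) for line in heapq.merge(*streams, key=int)]
    for f in run_files:
        f.close()
        os.unlink(f.name)
    return merged

data = [random.randrange(10**6) for _ in range(50_000)]
assert external_sort(data) == sorted(data)
```

The merge phase reads each run sequentially, which is exactly why a bulk index build turns random I/O into streaming I/O.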
Re: Re: Why is creating indexes faster after inserting massive data rows?
Good point about the key buffer. I was only thinking about the table updates for MyISAM, not the indexes. Getting stuck waiting for a buffer flush could also happen. However, for the table blocks this would be the same issue as with a load followed by an index rebuild, and for the indexes it would have to be compared, performance-wise, with the expense of sorting an equally sized index. On May 7, 2012, at 10:40 AM, Rick James wrote: (Correction to Karen's comments) * MyISAM does all its index operations in the key_buffer, similar to InnoDB and its buffer_pool. * Yes, writes are delayed (in both engines), but not forever. If the table is huge, you will eventually be stuck waiting for blocks to be flushed from cache. * If the table is small enough, all the I/O can be delayed, and done only once. So yes, the in-memory cache may be faster. Based on this discussion, you should note that random indexes, such as GUIDs, MD5s, etc, tend to -Original Message- From: Karen Abgarian [mailto:a...@apple.com] Sent: Monday, May 07, 2012 10:31 AM To: mysql@lists.mysql.com Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
Re: Re: Why is creating indexes faster after inserting massive data rows?
Karen... MySQL does not use the approach you described, which is complicated. I agree with Johan De Meersman. From: Karen Abgarian a...@apple.com To: mysql@lists.mysql.com Date: Tuesday, May 8, 2012, 1:30 AM Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
Re: Re: Re: Why is creating indexes faster after inserting massive data rows?
Honestly, I did not understand that. I did not say anything about being complicated. What does MySQL not use - caching? Judging by experience, creating a unique index on, say, a 200 GB table can be a bitter experience.
Re: MySQL Indexes
Is it normal practice for a heavily queried MySQL table to have an index file bigger than the data file? On Fri, Oct 7, 2011 at 12:22 AM, Michael Dykman mdyk...@gmail.com wrote: Only one index at a time can be used per query, so neither strategy is optimal. You need to look at the queries you intend to run against the system and construct indexes which support them. - md On Thu, Oct 6, 2011 at 2:35 PM, Neil Tompkins neil.tompk...@googlemail.com wrote: Maybe that was a bad example. If the query was name = 'Red', what index should I create? Should I create an index of all columns used in each query, or have an index on each individual column? On 6 Oct 2011, at 17:28, Michael Dykman mdyk...@gmail.com wrote: For the first query, the obvious index on score will give you optimal results. The second query hinges on this phrase: LIKE '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regard to MySQL indexes? Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table? For example, should I have an index on all fields that will be used in a WHERE statement? Should the indexes be created with multiple fields?
An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries? Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you.
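The two example queries can be tried out directly. This is a minimal sketch using SQLite's `EXPLAIN QUERY PLAN` in place of MySQL's `EXPLAIN`, with illustrative sample rows; the point it demonstrates (an index on score serves the range and the sort, while an infix LIKE cannot use a B-tree index) matches the advice in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table "
    "(auto_id INTEGER PRIMARY KEY, name TEXT, score INTEGER)"
)
conn.execute("CREATE INDEX idx_score ON test_table (score)")
conn.executemany(
    "INSERT INTO test_table (name, score) VALUES (?, ?)",
    [("Red Apple", 15), ("Blue Sky", 8), ("Dark Red", 22)],
)

# Query 1: range on score plus ORDER BY score - a single index on score
# can satisfy both the filter and the sort.
plan1 = conn.execute(
    "EXPLAIN QUERY PLAN SELECT auto_id, name, score FROM test_table "
    "WHERE score > 10 ORDER BY score DESC"
).fetchall()
print(plan1)

# Query 2: the infix pattern '%Red%' cannot use a B-tree index; every row
# surviving the score filter must still be tested against the pattern.
rows = conn.execute(
    "SELECT name FROM test_table "
    "WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC"
).fetchall()
print(rows)
```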
Re: MySQL Indexes
This thread has sparked my interest. What is the difference between an index on (field_a, field_b) and an index on (field_b, field_a)? On 10/06/2011 07:43 PM, Nuno Tavares wrote: Neil, whenever you see multiple fields you'd like to index, you should consider, at least: * The frequency of each query; * The occurrences of the same field in multiple queries; * The cardinality of each field. There is a tool, Index Analyzer, that may give you some hints, and I think it's maatkit that has a tool to run over a query log to find good candidates - I've seen it somewhere, I believe. Just remember that idx_a(field_a,field_b) is not the same as, and is not considered for use the same way as, idx_b(field_b,field_a). -NT
Re: MySQL Indexes
When a query selects on both field_a and field_b, that index can be used. If querying on field_a alone, the index is again useful. Query on field_b alone, however, and that first index is of no use to you. On Fri, Oct 7, 2011 at 10:49 AM, Brandon Phelps bphe...@gls.com wrote: This thread has sparked my interest. What is the difference between an index on (field_a, field_b) and an index on (field_b, field_a)?
Re: MySQL Indexes
How heavily a given table is queried does not directly affect the index size; only the number and depth of the indexes do. No, it is not that unusual for the index file to be bigger. Just make sure that every index you have is justified by the queries you are making against the table. - md On Fri, Oct 7, 2011 at 4:26 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Is it normal practice for a heavily queried MySQL table to have an index file bigger than the data file?
Re: MySQL Indexes
But could this not be called a bug? On 07.10.2011 18:08, Michael Dykman wrote: When a query selects on both field_a and field_b, that index can be used. If querying on field_a alone, the index is again useful. Query on field_b alone, however, and that first index is of no use to you.
-- Best regards, Reindl Harald the lounge interactive design GmbH A-1060 Vienna, Hofmühlgasse 17 CTO / software-development / cms-solutions p: +43 (1) 595 3999 33, m: +43 (676) 40 221 40 icq: 154546673, http://www.thelounge.net/
Re: MySQL Indexes
No, I don't think it can be called a bug. It is a direct consequence of the relational paradigm; any implementation of an RDBMS has the same characteristic. - md On Fri, Oct 7, 2011 at 12:20 PM, Reindl Harald h.rei...@thelounge.net wrote: But could this not be called a bug?
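An intuition for why this is not a bug: a composite B-tree index is, in effect, a list of key tuples kept in sorted order, so only a leading prefix can be binary-searched. The toy model below uses a sorted Python list in place of a real B-tree; all names and values are illustrative.

```python
from bisect import bisect_left

# A "(field_a, field_b)" index modelled as a sorted list of key tuples.
index = sorted((a, b) for a in range(5) for b in range(5))

# Lookup on the leading column: all entries with field_a == 3 sit together
# in the sorted order, so binary search finds the whole range cheaply.
lo = bisect_left(index, (3,))
hi = bisect_left(index, (4,))
a_matches = index[lo:hi]
print(a_matches)  # the five (3, b) entries, located in O(log n)

# Lookup on field_b alone: entries with b == 2 are scattered throughout
# the sorted order, so there is nothing to binary-search on - every entry
# must be examined, which is exactly a full scan.
b_matches = [t for t in index if t[1] == 2]
print(len(b_matches))
```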
Re: MySQL Indexes
Can you give more information as to why the second index would be of no use? On 7 Oct 2011, at 18:24, Michael Dykman mdyk...@gmail.com wrote: No, I don't think it can be called a bug. It is a direct consequence of the relational paradigm; any implementation of an RDBMS has the same characteristic. - md
Re: MySQL Indexes
Do you have any good documentation with regard to creating indexes? Also, information on the EXPLAIN statement, and what the desired result of an EXPLAIN would look like? On 7 Oct 2011, at 17:10, Michael Dykman mdyk...@gmail.com wrote: How heavily a given table is queried does not directly affect the index size; only the number and depth of the indexes do. No, it is not that unusual for the index file to be bigger. Just make sure that every index you have is justified by the queries you are making against the table. - md
Re: MySQL Indexes
The second index you specified, (field_b, field_a), would be usable when querying on field_b alone, or on both fields in conjunction. That particular index is of no value should you be querying field_a alone; then the first index, (field_a, field_b), would apply. - md On Fri, Oct 7, 2011 at 2:55 PM, Neil Tompkins neil.tompk...@googlemail.com wrote: Can you give more information as to why the second index would be of no use?
You need to look at the queries you intend to run against the system and construct indexes which support them. - md On Thu, Oct 6, 2011 at 2:35 PM, Neil Tompkins neil.tompk...@googlemail.com wrote: Maybe that was a bad example. If the query was name = 'Red' what index should I create ? Should I create an index of all columns used in each query or have an index on individual columns ? On 6 Oct 2011, at 17:28, Michael Dykman mdyk...@gmail.com wrote: For the first query, the obvious index on score will give you optimal results. The second query is founded on this phrase: Like '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ?
An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ? Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you. -- Mit besten Grüßen, Reindl Harald the lounge interactive design GmbH A-1060 Vienna, Hofmühlgasse 17 CTO / software-development / cms-solutions p: +43 (1) 595 3999 33, m: +43 (676) 40 221 40 icq: 154546673, http://www.thelounge.net/
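The leftmost-prefix rule debated in this thread can be demonstrated with a self-contained script. SQLite is used here purely as a stand-in for MySQL's planner (the principle is the same for both); the table and index names are made up for illustration:

```python
# Leftmost-prefix rule: a composite index (field_a, field_b) helps queries
# on field_a (alone or with field_b), but not queries on field_b alone.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t (field_a INTEGER, field_b INTEGER, payload TEXT);
    CREATE INDEX idx_ab ON t (field_a, field_b);
""")

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
    # the detail string names the index used, if any.
    return [row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]

# Leading column alone: the composite index is usable.
print(plan("SELECT * FROM t WHERE field_a = 1"))
# Trailing column alone: the index cannot be used; full table scan.
print(plan("SELECT * FROM t WHERE field_b = 1"))
```

The same experiment against MySQL with EXPLAIN would show `idxDateAccessed`-style key usage for the first query and `ALL` (a full scan) for the second.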
FW: MySQL Indexes
-Original Message- From: Reindl Harald [mailto:h.rei...@thelounge.net] Sent: Friday, October 07, 2011 12:21 PM To: mysql@lists.mysql.com Subject: Re: MySQL Indexes but could this not be called a bug? [JS] No. Think of two telephone books: one is sorted by first name, last name and the other is sorted by last name, first name. (Those are like your two keys, f1/f2 and f2/f1.) If you want to find someone by their first name, you use the first book. If you want to find somebody by their last name, you use the second book. If you want to find someone by their last name, the first book (key f1/f2) is useless. If you want to find someone by their first name, the second book (f2/f1) is useless. Does that help explain it? Regards, Jerry Schwartz Global Information Incorporated 195 Farmington Ave. Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 E-mail: je...@gii.co.jp Web site: www.giiresearch.com Am 07.10.2011 18:08, schrieb Michael Dykman: When a query selects on field_a and field_b, that index can be used. If querying on field_a alone, the index again is useful. Query on field_b alone however, that first index is of no use to you. On Fri, Oct 7, 2011 at 10:49 AM, Brandon Phelps bphe...@gls.com wrote: This thread has sparked my interest. What is the difference between an index on (field_a, field_b) and an index on (field_b, field_a)? On 10/06/2011 07:43 PM, Nuno Tavares wrote: Neil, whenever you see multiple fields you'd like to index, you should consider, at least: * The frequency of each query; * The occurrences of the same field in multiple queries; * The cardinality of each field; There is a tool Index Analyzer that may give you some hints, and I think it's maatkit that has a tool to run a query log to find good candidates - I've seen it somewhere, I believe Just remember that idx_a(field_a,field_b) is not the same, and is not considered for use, the same way as idx_b(field_b,field_a). 
-NT Em 07-10-2011 00:22, Michael Dykman escreveu: Only one index at a time can be used per query, so neither strategy is optimal. You need to look at the queries you intend to run against the system and construct indexes which support them. - md On Thu, Oct 6, 2011 at 2:35 PM, Neil Tompkins neil.tompk...@googlemail.com wrote: Maybe that was a bad example. If the query was name = 'Red' what index should I create ? Should I create an index of all columns used in each query or have an index on individual columns ? On 6 Oct 2011, at 17:28, Michael Dykman mdyk...@gmail.com wrote: For the first query, the obvious index on score will give you optimal results. The second query is founded on this phrase: Like '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ?
An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ? Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you. -- Mit besten Grüßen, Reindl Harald the lounge interactive design GmbH A-1060 Vienna, Hofmühlgasse 17 CTO / software-development / cms-solutions p: +43 (1) 595 3999 33, m: +43 (676) 40 221 40 icq: 154546673, http://www.thelounge.net/
Re: FW: MySQL Indexes
That cleared it up for me. Thanks! On 10/07/2011 03:06 PM, Jerry Schwartz wrote: [JS] No. Think of two telephone books: one is sorted by first name, last name and the other is sorted by last name, first name. (Those are like your two keys, f1/f2 and f2/f1.) If you want to find someone by their last name, the first book (key f1/f2) is useless. If you want to find someone by their first name, the second book (f2/f1) is useless. Does that help explain it? Regards, Jerry Schwartz
Re: MySQL Indexes
At 01:58 PM 10/7/2011, you wrote: Do you have any good documentation with regards to creating indexes? Also information for the explain statement and what would be the desired result of the explain statement? This might help: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html http://www.sitepoint.com/optimizing-mysql-application/ http://hackmysql.com/case2 There is one more advantage to compound indexes like an index on (field_a, field_b). If you are retrieving just field_a and field_b in a select statement: select field_a, field_b from table1 where field_a='abc' even though you only reference field_a in the Where clause, it of course uses that index to find rows with field_a='abc', but it also retrieves field_b from the SAME index, so MySQL doesn't have to go to the data file to get field_b. This can dramatically reduce disk I/O in heavily used queries, and occasionally you may want to create a compound index even though the second field won't be used in a Where clause. There is a yin and yang approach to creating indexes. Newbies will try and index all possible columns that are used in a Where clause, which results in a huge index file and very slow table updates. The more indexes you have on a table, the longer it takes to add or update a row. You really only want to index the columns of the most frequent queries. As to which fields to index, on a test database I would remove all indexes from the table except for the primary keys and have the slow query log turned on. Run your queries for an hour and examine the slow query log to see which queries are slow. Copy and paste the slow Select query into a MySQL administrator like SqlYog Community Edition v9.2 (http://code.google.com/p/sqlyog/downloads/list) and do an explain on the query to see what indexes it is (not) using. Then alter the table and add an index to try and speed up the query. You may have to repeat this several times to finally get the proper index defined. Remember to Reset Query Cache between tests.
Only by judiciously adding indexes one by one and testing the performance, will you have the proper yin and yang so your tables are in harmony. Mike (If you can't achieve harmony, then buy more hardware.) -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
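The covering-index effect Mike describes (the second field is retrieved from the index itself, skipping the data file) can be seen directly in a query plan. A small sketch using SQLite's planner, which labels the case explicitly; InnoDB secondary indexes behave analogously. The names follow Mike's example and are otherwise hypothetical:

```python
# Covering index: when every selected column lives in the index, the
# engine never has to touch the table rows at all.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE table1 (field_a TEXT, field_b TEXT, other TEXT);
    CREATE INDEX idx_ab ON table1 (field_a, field_b);
""")

def detail(sql):
    # First EXPLAIN QUERY PLAN row; column 3 is the human-readable detail.
    return con.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[3]

# Both selected columns live in the index: no table lookup is needed.
covered = detail("SELECT field_a, field_b FROM table1 WHERE field_a = 'abc'")
# Selecting a column outside the index forces a lookup back into the table.
not_covered = detail("SELECT field_a, other FROM table1 WHERE field_a = 'abc'")
print(covered)
print(not_covered)
```

In MySQL's EXPLAIN output the covered case shows up as "Using index" in the Extra column.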
MySQL Indexes
Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ? An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ? Thanks, Neil
Re: MySQL Indexes
For the first query, the obvious index on score will give you optimal results. The second query is founded on this phrase: Like '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ? An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ? Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you.
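Both of Michael's points (an index on score serves the range filter and the ORDER BY, while an infix LIKE '%Red%' defeats any index) can be checked with a quick experiment. SQLite stands in for MySQL here as a self-contained planner; the table and column names follow Neil's example queries:

```python
# An index on score serves WHERE score > 10 ORDER BY score DESC with no
# separate sort step; a LIKE '%Red%' predicate has no index to use.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE test_table (auto_id INTEGER PRIMARY KEY,
                             name TEXT, score INTEGER);
    CREATE INDEX idx_score ON test_table (score);
""")

def plan(sql):
    # Collect all detail strings of the query plan.
    return [row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]

# Range filter + matching ORDER BY: the index supplies both.
ranged = plan("SELECT auto_id, name, score FROM test_table "
              "WHERE score > 10 ORDER BY score DESC")
# Infix LIKE: no index applies, so every row must be examined.
infix = plan("SELECT auto_id, name, score FROM test_table "
             "WHERE name LIKE '%Red%'")
print(ranged)
print(infix)
```

The MySQL EXPLAIN equivalents would show the key column populated for the first query and type ALL (with "Using filesort" if an ORDER BY is added) for the second.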
Re: MySQL Indexes
Maybe that was a bad example. If the query was name = 'Red' what index should I create ? Should I create an index of all columns used in each query or have an index on individual columns ? On 6 Oct 2011, at 17:28, Michael Dykman mdyk...@gmail.com wrote: For the first query, the obvious index on score will give you optimal results. The second query is founded on this phrase: Like '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ? An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ? Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you.
Re: MySQL Indexes
Only one index at a time can be used per query, so neither strategy is optimal. You need to look at the queries you intend to run against the system and construct indexes which support them. - md On Thu, Oct 6, 2011 at 2:35 PM, Neil Tompkins neil.tompk...@googlemail.com wrote: Maybe that was a bad example. If the query was name = 'Red' what index should I create ? Should I create an index of all columns used in each query or have an index on individual columns ? On 6 Oct 2011, at 17:28, Michael Dykman mdyk...@gmail.com wrote: For the first query, the obvious index on score will give you optimal results. The second query is founded on this phrase: Like '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys. We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ? An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ?
Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you.
Re: MySQL Indexes
Neil, whenever you see multiple fields you'd like to index, you should consider, at least: * The frequency of each query; * The occurrences of the same field in multiple queries; * The cardinality of each field; There is a tool "Index Analyzer" that may give you some hints, and I think it's maatkit that has a tool to run over a query log to find good candidates - I've seen it somewhere, I believe. Just remember that idx_a(field_a,field_b) is not the same as, and is not considered for use the same way as, idx_b(field_b,field_a). -NT Em 07-10-2011 00:22, Michael Dykman escreveu: Only one index at a time can be used per query, so neither strategy is optimal. You need to look at the queries you intend to run against the system and construct indexes which support them. - md On Thu, Oct 6, 2011 at 2:35 PM, Neil Tompkins neil.tompk...@googlemail.com wrote: Maybe that was a bad example. If the query was name = 'Red' what index should I create ? Should I create an index of all columns used in each query or have an index on individual columns ? On 6 Oct 2011, at 17:28, Michael Dykman mdyk...@gmail.com wrote: For the first query, the obvious index on score will give you optimal results. The second query is founded on this phrase: Like '%Red%', and no index will help you there. This is an anti-pattern, I am afraid. The only way your database can satisfy that expression is to test each and every record in the database (the test itself being expensive, as infix matching is iterative). Perhaps you should consider this approach instead: http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html On Thu, Oct 6, 2011 at 10:59 AM, Tompkins Neil neil.tompk...@googlemail.com wrote: Hi, Can anyone help and offer some advice with regards to MySQL indexes. Basically we have a number of different tables, all of which have the obvious primary keys.
We then have some queries using JOIN statements that run more slowly than we wanted. How many indexes are recommended per table ? For example should I have an index on all fields that will be used in a WHERE statement ? Should the indexes be created with multiple fields ? An example of two basic queries: SELECT auto_id, name, score FROM test_table WHERE score > 10 ORDER BY score DESC SELECT auto_id, name, score FROM test_table WHERE score > 10 AND name LIKE '%Red%' ORDER BY score DESC How many indexes should be created for these two queries ? Thanks, Neil -- - michael dykman - mdyk...@gmail.com May the Source be with you.
Re: Dropping ALL indexes from a database / not just a table?
How does one do that? How do you drop the auto-increment attribute, then drop the index, then restart the auto-increment value where it was before you dropped it?? I did not know you could do that. The reason I ask is because the dbf_UID is a unique id tag the coders use to identify a product by manufacturer, kinda like a UPC code for their internal db. Can't have dups and don't want to have non-used id's in the db. Any help, direction is much appreciated. TIA... Nunzio From: Michael Dykman mdyk...@gmail.com To: Nunzio Daveri nunziodav...@yahoo.com Cc: Anirudh Sundar sundar.anir...@gmail.com; mysql@lists.mysql.com Sent: Tue, August 10, 2010 5:03:44 PM Subject: Re: Dropping ALL indexes from a database / not just a table? auto_increment is only allowed on primary-keyed columns. I expect it is not allowing you to drop the primary key because that column has the auto_increment attribute. Drop that manually, and the primary key should be able to let go. - md On Tue, Aug 10, 2010 at 5:58 PM, Nunzio Daveri nunziodav...@yahoo.com wrote: Hi Micheal and all, ok so I did some digging around and I still can't find why I can't drop the last few indexes.
mysql> SELECT COUNT(1) FROM INFORMATION_SCHEMA.STATISTICS WHERE table_schema = 'db_Market' AND table_name = 'dbt_Fruit' AND index_name = 'PRIMARY';
+----------+
| COUNT(1) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

mysql> DESCRIBE dbt_Fruit;
+------------+------------------+------+-----+---------+----------------+
| Field      | Type             | Null | Key | Default | Extra          |
+------------+------------------+------+-----+---------+----------------+
| dbf_UID    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| dbf_Vendor | varchar(30)      | NO   |     |         |                |
| dbf_Code   | varchar(30)      | NO   |     |         |                |
| dbf_Notes  | text             | YES  |     | NULL    |                |
+------------+------------------+------+-----+---------+----------------+

mysql> ALTER TABLE dbt_Fruit DROP FOREIGN KEY dbf_UID;
Query OK, 2947 rows affected (0.05 sec)
Records: 2947  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE dbt_Fruit DROP PRIMARY KEY;
ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

mysql> ALTER TABLE dbt_Fruit DROP PRIMARY;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1

Any ideas??? I am wondering if it has something to do with the fact that dbf_UID is a primary AND auto_increment? TIA... Nunzio From: Michael Dykman mdyk...@gmail.com To: Nunzio Daveri nunziodav...@yahoo.com Cc: Anirudh Sundar sundar.anir...@gmail.com; mysql@lists.mysql.com Sent: Tue, August 10, 2010 4:10:37 PM Subject: Re: Dropping ALL indexes from a database / not just a table? It's not a complete solution and will need some tweaking. You might have to run the PRIMARY KEYS distinctly from the rest. - michael dykman On Tue, Aug 10, 2010 at 4:43 PM, Nunzio Daveri nunziodav...@yahoo.com wrote: Hello Michael, thanks for the one liner. I ran it BUT I started to get errors after I ran it the first time; this is what I got the 2nd time I ran it (the first time I ran it I had 63 rows in the query, the 2nd time I have 9). I ran it twice to make sure it got rid of the indexes. I verified the index size dropped from 850 MB to 65 MB.
+--------------------------------------------------------------------------+
| concat('ALTER TABLE ', TABLE_NAME, ' DROP INDEX ', CONSTRAINT_NAME,';')  |
+--------------------------------------------------------------------------+
| ALTER TABLE dbt_Fruits DROP INDEX PRIMARY;                               |
| ALTER TABLE dbt_Veggies DROP INDEX PRIMARY;                              |
. . .
| ALTER TABLE dbt_Logs DROP INDEX PRIMARY;                                 |
+--------------------------------------------------------------------------+
9 rows in set (0.01 sec)

mysql> ALTER TABLE dbt_Fruits DROP INDEX PRIMARY;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PRIMARY' at line 1

mysql> ALTER TABLE dbt_Logs DROP INDEX PRIMARY;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PRIMARY' at line 1

Thanks again... Nunzio From: Michael Dykman mdyk...@gmail.com To: Nunzio Daveri nunziodav...@yahoo.com Cc: Anirudh Sundar sundar.anir...@gmail.com; mysql@lists.mysql.com Sent: Tue, August 10, 2010 3:17:48 PM Subject: Re: Dropping ALL indexes from a database / not just a table? This should give
Re: Dropping ALL indexes from a database / not just a table?
Hello Nunzio, Instead of dropping an index, you can disable the indexes, get the work done, and then re-enable them. If you are ok with this then run the below as a shell script:

MUSER=username
MPASS=password
DATABASE=dbname
for db in $DATABASE
do
  echo "starting disabling indexes for database -- $db"
  echo "--"
  TABLES=`mysql -u $MUSER -p$MPASS $db -e "show tables"`
  for table in $TABLES
  do
    mysql -u $MUSER -p$MPASS $db -e "Alter table $table disable keys"
  done
  echo "completed disabling indexes for database -- $db"
done

Cheers, Anirudh Sundar On Tue, Aug 10, 2010 at 1:33 AM, Nunzio Daveri nunziodav...@yahoo.com wrote: Hello Gurus, is there a way / script that will let me DROP ALL the indexes in a single database? For example, let's say my database is called db_Animals, and inside db_Animals there are 97 tables. Is there a SINGLE command or a perl script of some kind that can read all the MYI files, remove the .MYI from the file name, then proceed to delete whatever indexes it finds? I am doing this to debug a server that seems to be slow and sluggish. After I am done deleting I will review the slow query logs and then re-index to get the best performance? TIA... Nunzio
Re: Dropping ALL indexes from a database / not just a table?
At 01:06 AM 8/10/2010, you wrote: Hello Nunzio, Instead of dropping an index, you can disable the indexes, get the work done, and re-enable them. Disabling keys will NOT disable Primary or Unique keys. They will still be active. Mike
Re: Dropping ALL indexes from a database / not just a table?
Thanks for the feedback. What I am trying to do is two things: 1. Remove all indexes and make the database smaller to copy and move to another prod box. Currently my indexes are in the double digit GB! Yikes ;-) 2. Remove all indexes so I can find out which ones are needed, then tell mysql to recreate them; apparently it lessens data fragmentation if it starts from scratch vs. turning them on and off. I was hoping to just remove all and then start from scratch so I know the data is not fragmented on the drives. Thanks again... Nunzio From: Anirudh Sundar sundar.anir...@gmail.com To: Nunzio Daveri nunziodav...@yahoo.com Cc: mysql@lists.mysql.com Sent: Tue, August 10, 2010 1:06:41 AM Subject: Re: Dropping ALL indexes from a database / not just a table?
Re: Dropping ALL indexes from a database / not just a table?
This should give you a good starting point (not tested):

    select distinct
           concat('ALTER TABLE ', TABLE_NAME, ' DROP INDEX ', CONSTRAINT_NAME, ';')
    from information_schema.key_column_usage
    where TABLE_SCHEMA='mydatabase';

 - md

-- 
 - michael dykman
 - mdyk...@gmail.com
May the Source be with you.
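Michael's query only prints the ALTER statements; they still have to be filtered and executed. A rough, untested sketch of that second step (credentials and the schema name 'mydatabase' are placeholders, and the PRIMARY filter anticipates the errors reported later in this thread):

```shell
#!/bin/bash
# Sketch only: turn "table<TAB>index" pairs into DROP INDEX statements,
# skipping PRIMARY entries, which cannot be dropped via DROP INDEX.
gen_drop_statements() {
  while IFS=$'\t' read -r tbl idx; do
    if [ "$idx" != "PRIMARY" ]; then
      printf 'ALTER TABLE `%s` DROP INDEX `%s`;\n' "$tbl" "$idx"
    fi
  done
}

# In practice the pairs would come from something like (untested):
#   mysql -N -u "$MUSER" -p"$MPASS" -e "SELECT DISTINCT TABLE_NAME, INDEX_NAME
#     FROM information_schema.STATISTICS WHERE TABLE_SCHEMA='mydatabase'"
# and the generated statements would be reviewed before being piped back in.
```

Reviewing the generated SQL before executing it is the point of keeping this a two-step process.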
Re: Dropping ALL indexes from a database / not just a table?
Hello Michael, thanks for the one liner. I ran it BUT I started to get errors after I ran it the first time. This is what I got the 2nd time I ran it (the first time I ran it I had 63 rows in the query, the 2nd time I have 9). I ran it twice to make sure it got rid of the indexes. I verified the index size dropped from 850 MB to 65 MB.

    | concat('ALTER TABLE ', TABLE_NAME, ' DROP INDEX ', CONSTRAINT_NAME,';') |
    | ALTER TABLE dbt_Fruits DROP INDEX PRIMARY;                              |
    | ALTER TABLE dbt_Veggies DROP INDEX PRIMARY;                             |
    . . .
    | ALTER TABLE dbt_Logs DROP INDEX PRIMARY;                                |
    9 rows in set (0.01 sec)

    mysql> ALTER TABLE dbt_Fruits DROP INDEX PRIMARY;
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual
    that corresponds to your MySQL server version for the right syntax to use
    near 'PRIMARY' at line 1

    mysql> ALTER TABLE dbt_Logs DROP INDEX PRIMARY;
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual
    that corresponds to your MySQL server version for the right syntax to use
    near 'PRIMARY' at line 1

Thanks again... Nunzio

From: Michael Dykman mdyk...@gmail.com
To: Nunzio Daveri nunziodav...@yahoo.com
Cc: Anirudh Sundar sundar.anir...@gmail.com; mysql@lists.mysql.com
Sent: Tue, August 10, 2010 3:17:48 PM
Subject: Re: Dropping ALL indexes from a database / not just a table?
Re: Dropping ALL indexes from a database / not just a table?
It's not a complete solution and will need some tweaking. You might have to run the PRIMARY KEYs distinctly from the rest.

-- 
 - michael dykman
 - mdyk...@gmail.com
May the Source be with you.
Re: Dropping ALL indexes from a database / not just a table?
Hi Michael and all, ok so I did some digging around and I still can't find why I can't drop the last few indexes.

    mysql> SELECT COUNT(1) FROM INFORMATION_SCHEMA.STATISTICS
        -> WHERE table_schema = 'db_Market' AND table_name = 'dbt_Fruit'
        -> AND index_name = 'PRIMARY';
    COUNT(1): 1
    1 row in set (0.00 sec)

    mysql> DESCRIBE dbt_Fruit;
    | Field      | Type             | Null | Key | Default | Extra          |
    | dbf_UID    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
    | dbf_Vendor | varchar(30)      | NO   |     |         |                |
    | dbf_Code   | varchar(30)      | NO   |     |         |                |
    | dbf_Notes  | text             | YES  |     | NULL    |                |

    mysql> ALTER TABLE dbt_Fruit DROP FOREIGN KEY dbf_UID;
    Query OK, 2947 rows affected (0.05 sec)
    Records: 2947  Duplicates: 0  Warnings: 0

    mysql> ALTER TABLE dbt_Fruit DROP PRIMARY KEY;
    ERROR 1075 (42000): Incorrect table definition; there can be only one auto
    column and it must be defined as a key

    mysql> ALTER TABLE dbt_Fruit DROP PRIMARY;
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual
    that corresponds to your MySQL server version for the right syntax to use
    near '' at line 1

Any ideas??? I am wondering if it has something to do with the fact that dbf_UID is a primary AND auto_increment? TIA... Nunzio

From: Michael Dykman mdyk...@gmail.com
To: Nunzio Daveri nunziodav...@yahoo.com
Cc: Anirudh Sundar sundar.anir...@gmail.com; mysql@lists.mysql.com
Sent: Tue, August 10, 2010 4:10:37 PM
Subject: Re: Dropping ALL indexes from a database / not just a table?
Re: Dropping ALL indexes from a database / not just a table?
auto_increment is only allowed on a column that is defined as a key (the ERROR 1075 message says as much). I expect it is not allowing you to drop the primary key because that column has the auto_increment attribute. Drop that attribute manually, and the primary key should be able to let go.

 - md
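Michael's suggestion can be sketched as two statements against the table from the thread: first MODIFY the column to drop the auto_increment attribute (restating the rest of its definition from the DESCRIBE output above), then the primary key can be dropped. Untested sketch, not run against a live server:

```shell
#!/bin/bash
# Statements to drop a primary key that sits on an auto_increment column.
# The column definition is copied from the thread's DESCRIBE dbt_Fruit output.
sql='ALTER TABLE dbt_Fruit MODIFY dbf_UID int(10) unsigned NOT NULL;
ALTER TABLE dbt_Fruit DROP PRIMARY KEY;'
printf '%s\n' "$sql"
# These would be fed to the mysql client, e.g.:  echo "$sql" | mysql db_Market
```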
Dropping ALL indexes from a database / not just a table?
Hello Gurus, is there a way / script that will let me DROP ALL the indexes in a single database? For example, let's say my database is called db_Animals, and inside db_Animals there are 97 tables. Is there a SINGLE command or a perl script of some kind that can read all the MYI files, remove the .MYI from the file name, then proceed to deleting whatever indexes it finds? I am doing this to debug a server that seems to be slow and sluggish. After I am done deleting I will review the slow query logs and then re-index to get the best performance. TIA... Nunzio
Re: Indexes larger than RAM (was: Do you know who can answer this question I posted yesterday please?)
Hi!

I am no InnoDB and tuning expert, so I had intended to stay away from this question. Ok, I'll give some general remarks. Nunzio Daveri schrieb:

  [[...]] All, I was running slamdb against one of our QA boxes and noticed
  that the innodb database is 190Gb in size BUT the worrying issue is that
  the indexes are 30GB in size!!! When I hit this server hard, it tanks on
  memory but still performs, slower of course ;-)

Having indexes which are larger than RAM is (in itself) not critical. IMO, it becomes bad only when accesses to these indexes are spread so wide that even the index pages become subject to frequent IO.

  Any suggestions on what I should do? I am thinking of doing one of these:

Whether any action is needed, and which, depends on the problem you experience:

- If the system as a whole (both CPU and disk) has a significant idle percentage, it isn't yet maxed out, and I don't expect that adding resources would improve performance significantly.

- If your CPUs have a significant "waiting for IO" percentage, then data accesses need speedup. This could be done by faster disks, but I would expect more results from adding RAM for larger caches. This holds especially if your disk throughput is close to the possible maximum. (Assuming your bulk work is read/select. If it is insert/update, then *removing* indexes might reduce the workload, as there are fewer indexes to maintain.)

- If your CPUs are busy, then I don't expect any increase of caching would help.

  1. Remove all queries, run for a few days, look at the slow query logs and
  then find those queries that really need them and index those specifically
  for performance.

Makes sense (only) if you have indexes which aren't really helpful for accesses, so they just add maintenance load. If you do few inserts/updates, an unused index should be paged out and not do much harm. Comes with the cost of reduced performance during that test time, and the need to rebuild the essential indexes afterwards. Has the benefit of getting rid of unused indexes (which just cause maintenance load).

  2. Split the single server into two servers both with 16 GB and 2 quad
  core CPUs. One master, the other a slave.

Makes sense if your CPUs are busy, *and* you can distribute the read accesses to the two servers (= most accesses are select). If most load is insert/update, I don't expect a real improvement. Biggest cost in hardware and admin effort, so I would do this only after a decent analysis. OTOH, it gives you some (hardware) fault tolerance; this could be an important argument depending on your requirements.

  3. Just add another 16 GB (32 GB total) and that should take care of the
  indexing issue.

Makes sense if the disks are the bottleneck (CPUs are in "waiting for IO"), so that larger caches will avoid disk accesses. Assumes your machine supports that amount of RAM (many mainboards have a limit at 16 GB, AIUI).

  Anyone had this problem before??? Oh this is a single box, 100% mysql only
  and it talks to 3 front end iPlanet web servers that hit it with a few
  hundred queries per second.

For a specific answer, the distribution of accesses between read and write is needed, as well as information about which resource is close to the limit.

HTH,
Jörg

-- 
Joerg Bruehe, MySQL Build Team, joerg.bru...@oracle.com
(+49 30) 417 01 487
ORACLE Deutschland B.V. & Co. KG, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven
Amtsgericht Muenchen: HRA 95603
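Joerg's "waiting for IO" check can be approximated from vmstat output. A rough helper, with the caveat that the position of the "wa" column varies between vmstat versions (field 16 here is an assumption; verify against your vmstat's header):

```shell
#!/bin/bash
# Average the iowait ("wa") column over vmstat data lines to judge whether
# the CPUs are mostly stalled on disk (high value) or genuinely busy (low).
avg_iowait() {
  # Field 16 is "wa" in classic procps vmstat output -- check on your system.
  awk '{ sum += $16; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Typical use (untested sketch): sample 12 times at 5s intervals, skip headers:
#   vmstat 5 12 | tail -n +3 | avg_iowait
```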
Re: Indexes larger than RAM (was: Do you know who can answer this question I posted yesterday please?)
Thanks again :-) Nunzio

From: Joerg Bruehe joerg.bru...@oracle.com
To: Nunzio Daveri nunziodav...@yahoo.com; mysQL General List mysql@lists.mysql.com
Sent: Fri, July 30, 2010 1:31:54 PM
Subject: Re: Indexes larger than RAM (was: Do you know who can answer this question I posted yesterday please?)
Re: Indexes larger than RAM (was: Do you know who can answer this question I posted yesterday please?)
Nunzio Daveri, Joerg Bruehe gave you a lot of good tips to try and speed things up. A few hundred queries per second seems to be a relatively small number to cause the server to crawl. I don't have the rest of your thread, but can you publish some of the slow queries (see Slow Query Log) and the table structure?

Mike
Re: combined or single indexes?
On 7/21/2010 1:02 PM, Tompkins Neil wrote:

  Hi, so just running a basic query I get returned the following:

      table,type,possible_keys,key,key_len,ref,rows,Extra,
      Products,ALL,9884,where used,

  Therefore, I assume *ALL* is the worst possible type and should look at
  adding an index to this particular field?

Why assume when the manual is right there to remove all doubt?

http://dev.mysql.com/doc/refman/5.1/en/explain-output.html

  ALL: A full table scan is done for each combination of rows from the
  previous tables. This is normally not good if the table is the first table
  not marked const, and usually very bad in all other cases. Normally, you
  can avoid ALL by adding indexes that enable row retrieval from the table
  based on constant values or column values from earlier tables.

-- 
Shawn Green
MySQL Principle Technical Support Engineer
Oracle USA, Inc.
Office: Blountville, TN
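Shawn's point in miniature: an index on the filtered column is what moves EXPLAIN from ALL to a keyed access type. A hypothetical illustration (the column name Price and index name idxPrice are invented; only the table name Products echoes the thread):

```shell
#!/bin/bash
# Illustration only: add an index so the optimizer can avoid a full scan,
# then re-run EXPLAIN to confirm the access type is no longer ALL.
sql='ALTER TABLE Products ADD INDEX idxPrice (Price);
EXPLAIN SELECT * FROM Products WHERE Price < 100\G'
printf '%s\n' "$sql"
```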
Re: combined or single indexes?
Thanks for your reply, and sorry for not verifying in the manual. Another couple of questions I have: If I run an EXPLAIN query with a SELECT against a primary key and SELECT fields which are not indexed, I assume that the returned EXPLAIN statement as below means I don't need to index additional fields, provided the PRIMARY KEY is included in the SELECT statement? table,type,possible_keys,key,key_len,ref,rows,Extra, Products,const,PRIMARY,PRIMARY,8,const,1,, Also, if I want to add an index to an existing table containing 9000 records, how long should I expect this to take? Is it instant? Cheers, Neil
RE: combined or single indexes?
-----Original Message----- From: Tompkins Neil [mailto:neil.tompk...@googlemail.com] Sent: Thursday, July 22, 2010 3:39 PM To: Shawn Green (MySQL) Cc: mysql@lists.mysql.com Subject: Re: combined or single indexes?

Thanks for your reply, and sorry for not verifying in the manual. Another couple of questions I have: If I run an EXPLAIN query with a SELECT against a primary key and SELECT fields which are not indexed, I assume that the returned EXPLAIN statement as below means I don't need to index additional fields, provided the PRIMARY KEY is included in the SELECT statement? table,type,possible_keys,key,key_len,ref,rows,Extra, Products,const,PRIMARY,PRIMARY,8,const,1,,

[JS] Your posts will be more legible if you use \G instead of ; at the end of an EXPLAIN. As for the indexing, if you only need one key then you only need one key. Just remember that when you test things with sample data, MySQL might make surprising decisions based upon the amount of data. You'll only really know what will happen if you have a substantial data set.

Also, if I want to add an index to an existing table containing 9000 records, how long should I expect this to take? Is it instant?

[JS] Faster than you can type, I should think.

Regards, Jerry Schwartz, Global Information Incorporated, 195 Farmington Ave., Farmington, CT 06032, 860.674.8796 / FAX: 860.674.8341, E-mail: je...@gii.co.jp, Web site: www.the-infoshop.com
Re: combined or single indexes?
Thanks for the information, Jerry. Just to confirm, you mentioned *if you only need one key then you only need one key*. My question was that this particular query was using a SELECT against a primary key and other fields which are NOT indexed. The EXPLAIN result was table,type,possible_keys,key,key_len,ref,rows,Extra, Products,const,PRIMARY,PRIMARY,8,const,1,, So from this do I assume that if I'm always searching on the PRIMARY KEY, I don't need to index the other fields? Cheers, Neil
RE: combined or single indexes?
From: Tompkins Neil [mailto:neil.tompk...@googlemail.com] Sent: Thursday, July 22, 2010 4:50 PM To: Jerry Schwartz Cc: Shawn Green (MySQL); mysql@lists.mysql.com Subject: Re: combined or single indexes?

Thanks for the information, Jerry. Just to confirm, you mentioned "if you only need one key then you only need one key". My question was that this particular query was using a SELECT against a primary key and other fields which are NOT indexed. The EXPLAIN result was table,type,possible_keys,key,key_len,ref,rows,Extra, Products,const,PRIMARY,PRIMARY,8,const,1,, So from this do I assume that if I'm always searching on the PRIMARY KEY, I don't need to index the other fields?

[JS] I think I must have missed the start of this thread, because I don't remember seeing the original query. The answer lies in your WHERE clause, and in the number of records that would potentially qualify. MySQL will ignore keys and do a full table scan if it decides that none of the keys would eliminate a big portion of the records. (This is why I warned about small sample datasets.) If your query looks like … WHERE `account_num` = 17 … and account numbers are unique, then an index on `account_num` should be enough. If you are always and ONLY searching on the primary key, then the primary key is all you need. That's usually not the case, though. You're probably going to want to search on other things, sooner or later. I'm not an expert on optimizing queries in MySQL, and there are probably differences between the storage engines, but I hope this helps.

Regards, Jerry Schwartz, Global Information Incorporated, 195 Farmington Ave.
Farmington, CT 06032 860.674.8796 / FAX: 860.674.8341 E-mail: je...@gii.co.jp Web site: www.the-infoshop.com
Re: combined or single indexes?
Thanks for the useful information. This is the answer I was looking for. Neil

On 22 Jul 2010, at 22:25, Jerry Schwartz je...@gii.co.jp wrote:
Re: combined or single indexes?
Hi, So just running a basic query I get returned the following: table,type,possible_keys,key,key_len,ref,rows,Extra, Products,ALL,9884,where used, Therefore, I assume *ALL* is the worst possible type and should look at adding an index to this particular field? Cheers, Neil
Re: combined or single indexes?
At 12:43 AM 7/9/2010, you wrote:

Should we be looking to create an index for all fields that we might perform a select condition in a WHERE clause on?

Neil, keep in mind that the more indexes you add to a table, the slower your inserts will become, because it needs to maintain the indexes. I would only consider adding indexes for the slow queries. You can use EXPLAIN on your SELECT statement to see which indexes are being used. See http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm for an explanation of how to use EXPLAIN. BTW, if you are indexing text columns, you may want to look at FULLTEXT indexing. Mike
Re: combined or single indexes?
How many indexes are recommended per table?
Re: combined or single indexes?
As many as you need, but no more :-) The right indexes give you a boost in select performance, but every index also needs to be updated when your data changes.
Re: combined or single indexes?
Should we be looking to create an index for all fields that we might perform a select condition in a WHERE clause on?
Re: combined or single indexes?
On Fri, Jul 9, 2010 at 7:30 AM, Neil Tompkins neil.tompk...@googlemail.com wrote:

Should we be looking to create an index for all fields that we might perform a select condition in a WHERE clause on?

It's a bit of trial and error: you have to weigh the number of times you select based on a potential index against the impact that index might have for the amount of updates you do on the table. Generally speaking, though, it's indeed a good idea to find the most-frequently used WHERE clauses and index them. -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
combined or single indexes?
Is there a benefit to a combined index on a table? Or are multiple single-column indexes better? If I have table 'foo' with columns a, b, and c, I will have a query like: select c from foo where a in (1,2,3) and b > 12345; Is an index on (a,b) better in any way than an a index and a b index? An explain with one index sees it but doesn't use it (only the where), and having 2 indexes sees both and uses the one on b. Am I right to think that 2 indexes are better than one combined one? thx, Bryancan
Re: combined or single indexes?
Hi Bryan, all!

Bryan Cantwell wrote: Is there a benefit to a combined index on a table? Or are multiple single-column indexes better?

This is a FAQ, but I'm not aware of a place to point you for the answer.

If I have table 'foo' with columns a, b, and c, I will have a query like: select c from foo where a in (1,2,3) and b > 12345; Is an index on (a,b) better in any way than an a index and a b index?

Any multi-column index can only be used when the values for the leading column(s) is/are known (in your example, they are). My standard example is a phone book: it is sorted by last name, first name; you cannot use this order when the last name is unknown (you have to sequentially scan it).

An explain with one index sees it but doesn't use it (only the where), and having 2 indexes sees both and uses the one on b.

Testing select strategies requires that you have a meaningful amount of data, and a close-to-real distribution of values: if your tables hold too few rows, the system will notice that it is wasteful to access them via the index; a scan is faster. And if your value distribution differs too much from later real data, the strategy selected will also differ.

Am I right to think that 2 indexes are better than one combined one?

It depends: AFAIK, MySQL will not yet combine several indexes, but evaluate only one per table access. If you have a usable multi-column index, it will provide better selectivity than a single-column index, so it is better if all the leading values are given. I cannot specifically comment on conditions using IN and >.

HTH, Jörg -- Joerg Bruehe, MySQL Build Team, joerg.bru...@sun.com ORACLE Deutschland B.V. Co. KG, Komturstrasse 18a, D-12099 Berlin Geschaeftsfuehrer: Juergen Kunz, Marcel v.d. Molen, Alexander v.d. Ven Amtsgericht Muenchen: HRA 95603
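Jörg's phone-book analogy can be checked mechanically. A minimal sketch with Python's sqlite3 (a different engine than MySQL, and a table invented for this illustration), showing that a (last_name, first_name) index is usable only when the leading column is constrained:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE phone_book (last_name TEXT, first_name TEXT, number TEXT)")
con.execute("CREATE INDEX idx_name ON phone_book (last_name, first_name)")
con.executemany("INSERT INTO phone_book VALUES (?, ?, ?)",
                [("Green", "Shawn", "555-0001"), ("Schwartz", "Jerry", "555-0002")])

def plan(sql):
    # Join the detail column of EXPLAIN QUERY PLAN into one string.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

# Leading column known: the composite index is usable, like looking up a
# last name in a phone book.
by_last = plan("SELECT number FROM phone_book WHERE last_name = 'Green'")

# Only the second column known: the index order is useless, so the whole
# table is scanned, like hunting for a first name in the phone book.
by_first = plan("SELECT number FROM phone_book WHERE first_name = 'Shawn'")

print(by_last)
print(by_first)
```

The same leftmost-prefix rule applies to MySQL's multi-column keys, which is exactly why the phone-book ordering is such a good mental model.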
Re: combined or single indexes?
Hi, MySQL can use a single index in a query, as you've seen in the result of EXPLAIN. Of course it is better to have an index made of 2 or more columns, because it will match the query better. But if I remember well, the IN() function can't use an index. And I think it also can't use an index if you use OR operators, like: select foo from table where a=1 or a=2; So for your query the single-column index on the second column is enough. I've seen some tricks for using a faster method by using UNION and a 2-column index, something like: select foo from table where a=1 and b > 1234 union select foo from table where a=2 and b > 1234 union select foo from table where a=3 and b > 1234 This might be faster in some cases because the query would be able to use the 2-column index, and especially if the content of those columns is made only of numbers, because in that case the query will use only the index, without getting data from the table. -- Octavian
Re: The query doesn't use the specified indexes
If cardinality is high (i.e., a large number of rows is returned in the set for your query), then MySQL may need to resort to a filesort.

- Ashish

2010/6/21 Octavian Rasnita <octavian.rasn...@ssifbroker.ro>
> Hi,
> I have made an InnoDB table and I am trying to search using some keys,
> but they are not used, and the query takes a very long time.
> [[...]]
> So, no matter what I do, the query doesn't want to use the specified
> index. Please tell me what I am doing wrong. Or is it a MySQL/InnoDB bug?
> The current table I am testing has no records. I have also tried this on
> a table that has more than 10 million records, with exactly the same
> results. Please tell me what can I do. Thanks.
>
> -- 
> Octavian
The query doesn't use the specified indexes
Hi,

I have made an InnoDB table and I am trying to search using some keys, but they are not used, and the query takes a very long time. Here is a test table:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `symbol` varchar(20) NOT NULL,
  `market` varchar(20) NOT NULL,
  `id_symbol` int(10) unsigned NOT NULL,
  `id_market` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `symbol` (`symbol`),
  KEY `market` (`market`),
  KEY `id_symbol` (`id_symbol`),
  KEY `id_market` (`id_market`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The search query is:

mysql> explain select * from test where symbol='etc' order by market limit 20\G
*** 1. row ***
           id: 1
  select_type: SIMPLE
        table: test
         type: ref
possible_keys: symbol
          key: symbol
      key_len: 62
          ref: const
         rows: 1
        Extra: Using where; Using filesort

The bad part is "Using filesort", and I thought that this is because it doesn't like varchar or char columns for indexes, so I tried to use columns that contain integers:

mysql> explain select * from test where id_symbol=2 order by id_market limit 20\G
*** 1. row ***
           id: 1
  select_type: SIMPLE
        table: test
         type: ref
possible_keys: id_symbol
          key: id_symbol
      key_len: 4
          ref: const
         rows: 1
        Extra: Using where; Using filesort

It still uses "Using filesort" and it doesn't use the index id_market in the query. So I tried to force using the indexes:

mysql> explain select * from test force index(symbol, market) where symbol='etc' order by market limit 20\G
*** 1. row ***
           id: 1
  select_type: SIMPLE
        table: test
         type: ref
possible_keys: symbol
          key: symbol
      key_len: 62
          ref: const
         rows: 1
        Extra: Using where; Using filesort

So, no matter what I do, the query doesn't want to use the specified index. Please tell me what I am doing wrong. Or is it a MySQL/InnoDB bug?

The current table I am testing has no records. I have also tried this on a table that has more than 10 million records, with exactly the same results.

Please tell me what can I do. Thanks.

-- 
Octavian
Re: The query doesn't use the specified indexes
Hi Octavian, all!

Octavian Rasnita wrote:
> Hi,
> I have made an InnoDB table and I am trying to search using some keys, but
> they are not used, and the query takes a very long time. Here is a test table:
>
> CREATE TABLE `test` (
>   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
>   `symbol` varchar(20) NOT NULL,
>   `market` varchar(20) NOT NULL,
>   `id_symbol` int(10) unsigned NOT NULL,
>   `id_market` int(10) unsigned NOT NULL,
>   PRIMARY KEY (`id`),
>   KEY `symbol` (`symbol`),
>   KEY `market` (`market`),
>   KEY `id_symbol` (`id_symbol`),
>   KEY `id_market` (`id_market`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8

So you have a table with 5 columns, one being the primary key, and separate single-column indexes on the other 4 columns.

> The search query is:
>
> mysql> explain select * from test where symbol='etc' order by market limit 20\G
> *** 1. row ***
> [[...]]
>           key: symbol
>         Extra: Using where; Using filesort
>
> The bad part is "Using filesort",

No, it works as designed: what I take from the output is that it will use the index (key) on column symbol (to find all rows that contain the constant value 'etc' in that column), and then it will sort those rows (order by market) to return the first 20.

> and I thought that this is because it doesn't like varchar or char columns
> for indexes, so I tried to use columns that contain integers:
>
> mysql> explain select * from test where id_symbol=2 order by id_market limit 20\G
> *** 1. row ***
> [[...]]
>           key: id_symbol
>         Extra: Using where; Using filesort
>
> It still uses "Using filesort" and it doesn't use the index id_market in the query.

This query cannot use the index on id_market, because using that index would ignore the condition id_symbol=2.

> So I tried to force using the indexes:
>
> mysql> explain select * from test force index(symbol, market) where symbol='etc' order by market limit 20\G

Unless you changed your table definition, there is no index combining these two fields - you didn't create any.

[[...]]

> So, no matter what I do, the query doesn't want to use the specified index.
> Please tell me what I am doing wrong. Or is it a MySQL/InnoDB bug?

See above: if you expect the system to use an index on two columns, you should first create it.

> The current table I am testing has no records.

You cannot test execution strategies on empty tables - it doesn't make any sense. The moment the optimizer uses statistical information (cost estimates, aka a cost-based optimizer), it will detect that the table is empty, so there is no use in going through an index, because that will not reduce the number of rows (to check) any further.

If you want to test execution strategies, you should first make sure that your test tables contain data which is roughly realistic, with a distribution of values that is roughly realistic, and that your indexes will provide decent selectivity (I'd guess you need, at the very least, 20 different values per column).

> I have also tried this on a table that has more than 10 million records,
> with exactly the same results.

It is a separate question whether that sorting is critical: you mention 10 million records, but you don't tell us the distribution of values. If there are 10,000 different values of symbol, on average such a query would have to sort 1,000 records, which shouldn't be too bad.

> Please tell me what can I do.

Apart from the hints above: make your mail client break long lines.
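[Editor's note: the two-column index Jörg says is missing can be created as below; for this query shape (equality on the leading column, ORDER BY on the second), the index stores the matching rows already in market order, so the filesort goes away. A sketch using the table from the thread; the index name is assumed.]

```sql
-- Create the combined index covering both the WHERE column and the
-- ORDER BY column, in that order.
ALTER TABLE test ADD KEY idx_symbol_market (symbol, market);

-- With the index in place, EXPLAIN should show key=idx_symbol_market
-- and no "Using filesort" in Extra: rows for symbol='etc' are read
-- directly in market order, and LIMIT 20 stops after 20 index entries.
EXPLAIN SELECT * FROM test
WHERE symbol = 'etc'
ORDER BY market
LIMIT 20;
```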
HTH,
Jörg

-- 
Joerg Bruehe, MySQL Build Team, joerg.bru...@sun.com
Sun Microsystems GmbH, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz
Amtsgericht Muenchen: HRB161028
Re: Any faster building primary/unique indexes after Load Data Infile?
Hi,

On Sun, Feb 21, 2010 at 1:42 PM, mos <mo...@fastmail.fm> wrote:
> I am loading 35 million rows of data into an empty MyISAM table. This table
> has 1 primary key (AutoInc), 1 unique index, and 2 non-unique indexes.
>
> Is it going to be any faster if I remove the indexes from the table before
> loading the data, load the data, then do an Alter Table .. Add Index for all
> of the indexes? Or is it faster to just leave the indexes in place prior to
> loading the data?
>
> I know that if the table is empty and optimized, the non-unique indexes will
> be built AFTER the data is loaded using Load Data Infile, but the unique and
> primary indexes will be built as the data is being loaded, and this is going
> to slow down the import. There is no point doing a Disable Keys on the table,
> because this only affects non-unique indexes, and that is already taken care
> of since the table is already empty and optimized.
>
> But if I remove the indexes from the empty table, then load the data, then
> execute the Alter Table Add Index ... for all 4 indexes at one time, isn't
> the Alter Table going to create a copy of the table, so it is just going to
> reload the data all over again?

Yes. It is going to create a new table, copy the rows into it, and then delete the old one.

> Is there any way to add a primary or unique index without copying the data
> all over again?

Create Index ... can't be used to create a primary index. Dirty hacks with .frm files and REPAIR TABLE have sometimes been known to help in cases like this, but it's not for the faint of heart.
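[Editor's note: the general load-then-index pattern under discussion, sketched for a MyISAM table. As the original poster notes, DISABLE KEYS is a no-op on an empty, optimized table and only defers non-unique indexes; it is shown here for the general case. Table, column, and file names are assumed.]

```sql
-- Defer non-unique index maintenance during the bulk load
-- (has no effect on unique/primary keys, which are checked per row).
ALTER TABLE big_table DISABLE KEYS;

-- Bulk-load the rows; hypothetical file path and delimiter.
LOAD DATA INFILE '/tmp/rows.csv'
INTO TABLE big_table
FIELDS TERMINATED BY ',';

-- Rebuild the non-unique indexes in one pass over the full table.
ALTER TABLE big_table ENABLE KEYS;

-- Adding primary/unique keys afterwards rewrites the table, as the
-- reply explains: a new table is built, rows are copied, old one dropped.
ALTER TABLE big_table
  ADD PRIMARY KEY (id),
  ADD UNIQUE KEY uk_code (code);
```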