Re: implicit cast forces table scan?
Dear, Michael! Did you read my earlier reply? This is neither an optimizer nor version issue. I've sent my e-mail before I've read your. The problem is that I was testing my guessing on the table with a bit different structure than your (mine didn't have a primary key field 'id'). And optimizer has shown that it uses indexes! There is a verified bug: http://bugs.mysql.com/?id=14220 Thank you for good explanations! Michael Stassen wrote: Gleb Paharenko wrote: Hello. I suggest you to check you query with MySQL 4.1.14 which might has a bit clever optimizer. Did you read my earlier reply? This is neither an optimizer nor version issue. Consider: mysql SELECT VERSION(); +---+ | VERSION() | +---+ | 4.1.15| +---+ 1 row in set (0.01 sec) CREATE TABLE ict (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, vcf VARCHAR(30), INDEX (vcf) ); INSERT INTO ict (vcf) VALUES ('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('10'), ('11'),('12'),('13'),('14'),('15'),('16'),('17'),('18'),('19'),('20'); EXPLAIN SELECT * FROM ict WHERE vcf = '17'; ++-+---+--++-+---+--+-+ | id | select_type | table | type | p_keys | key | ref | rows | Extra | ++-+---+--++-+---+--+-+ | 1 | SIMPLE | ict | ref | vcf| vcf | const |1 | Using where | ++-+---+--++-+---+--+-+ 1 row in set (0.01 sec) EXPLAIN SELECT * FROM ict WHERE vcf = 17; ++-+---+--++--+--+--+-+ | id | select_type | table | type | p_keys | key | ref | rows | Extra | ++-+---+--++--+--+--+-+ | 1 | SIMPLE | ict | ALL | vcf| NULL | NULL | 20 | Using where | ++-+---+--++--+--+--+-+ 1 row in set (0.01 sec) Why? Because there are many strings which evaluate to a given number. For example: INSERT INTO ict (vcf) VALUES (' 17'), ('+17'), ('17.0'), ('17,34'), ('17 dogs'); SELECT * FROM ict WHERE vcf = '17'; ++--+ | id | vcf | ++--+ | 17 | 17 | ++--+ 1 row in set (0.00 sec) SELECT * FROM ict WHERE vcf = 17; ++-+ | id | vcf | ++-+ | 17 | 17 | | 21 | 17 | | 22 | +17 | | 23 | 17.0| | 24 | 17,34 | | 25 | 17 dogs | ++-+ 6 rows in set (0.00 sec) You see? The two WHERE clauses are actually different, so the optimizer must treat them differently. No amount of optimizer cleverness can change that. The moral of the story: Use numeric types to store numbers, not string types. Michael -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: implicit cast forces table scan?
Gleb Paharenko wrote: Dear, Michael! Did you read my earlier reply? This is neither an optimizer nor version issue. I've sent my e-mail before I've read your. The problem is that I was testing my guessing on the table with a bit different structure than your (mine didn't have a primary key field 'id'). And optimizer has shown that it uses indexes! There is a verified bug: http://bugs.mysql.com/?id=14220 Thank you for good explanations! I've added a comment to bug 14220 explaining why I do not believe this is a bug. The short version is that without the id column, SELECT * is the same as SELECT vcf. In that case, the index on vcf is a covering index, so mysql reads the data from the index, rather than the table. You will note in the EXPLAIN output that it is still doing a full scan. It's just that it's a full index scan rather than a full table scan. Michael -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: implicit cast forces table scan?
Hello. I suggest you to check you query with MySQL 4.1.14 which might has a bit clever optimizer. Olaf Faaland wrote: Hi, I am currently using mysql 4.0.18 as distributed with red hat Linux. I find when I perform a select on a varchar(30) field, the index is used only if I have quoted the value in the where clause. Otherwise, mysql performs a table scan. The queries in question are: This query uses the index: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This query performs a table scan: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; My question is this: is the issue here that mysql is converting every single itran_log_actionid value, from all 1.5 million rows, and hence the index is not useful and not used? My initial assumption was that the constant value 170807 in the second query, would be converted to text before the query was executed, and so the index could be used. This does not seem to be the case. I ask both for my own edification, and also because it seems to me this should be mentioned in the manual for newbies like myself. thanks, Olaf Details on versions, table structures, indexes, etc. below == == $ rpm -qa | grep -i mysql MySQL-shared-compat-4.0.15-0 MySQL-client-4.0.18-0 php-mysql-4.1.2-7.2.6 MySQL-server-4.0.18-0 $ /usr/bin/mysql -V /usr/bin/mysql Ver 12.22 Distrib 4.0.18, for pc-linux (i686) == mysql describe itran_log; ++--+--+-++- ---+ | Field | Type | Null | Key | Default| Extra | ++--+--+-++- ---+ | itran_user_id | varchar(100) | | || | | itran_log_date | date | | MUL | -00-00 | | | itran_log_time | time | | | 00:00:00 | | | itran_log_filename | varchar(100) | | || | | itran_log_action | varchar(25) | | MUL || | | itran_log_actionid | varchar(30) | | MUL || | | itran_site_id | varchar(100) | YES | MUL | NULL | | | itran_log_instructions | text | | || | | itran_log_id | bigint(20) | | PRI | NULL | auto_increment | ++--+--+-++- ---+ mysql show indexes from itran_log; +---++---+--+--- -+---+-+--++--++ -+ | Table | Non_unique | Key_name | Seq_in_index | Column_name| Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +---++---+--+--- -+---+-+--++--++ -+ | itran_log | 0 | PRIMARY |1 | itran_log_id | A | 1500793 | NULL | NULL | | BTREE | | | itran_log | 1 | itran_site_id_ix |1 | itran_site_id | A |NULL | 15 | NULL | YES | BTREE | | | itran_log | 1 | itran_log_action_ix |1 | itran_log_action | A |NULL |3 | NULL | | BTREE | | | itran_log | 1 | itran_log_actionid_ix |1 | itran_log_actionid | A |NULL | NULL | NULL | | BTREE | | | itran_log | 1 | itran_log_date_ix |1 | itran_log_date | A |NULL | NULL | NULL | | BTREE | | +---++---+--+--- -+---+-+--++--++ -+ mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; +---+--+---+--+-+--+ -+-+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---+--+---+--+-+--+ -+-+ | itran_log | ALL | itran_log_actionid_ix | NULL |NULL | NULL | 1500775 | Using where | +---+--+---+--+-+--+ -+-+ 1 row in set (0.02 sec) mysql explain - select itran_log_date, itran_log_actionid from itran_log
RE: implicit cast forces table scan?
Hi, You are correct; it was an error that the column was defined as a VARCHAR. I started out without quotes in the SELECT because the application that uses this db has been running successfully, so I believed the column types to be OK. Now I know better. Once I realized the problem was with the data types, I verified the data was all numeric, and changed the column to an INT. I was still mystified by the cast being done on each row, instead of being done once on the value given in the select statement. Thank you very much for the careful explanation. -Olaf -Original Message- From: Michael Stassen [mailto:[EMAIL PROTECTED] Sent: Thursday, October 20, 2005 9:42 PM To: Olaf Faaland Cc: 'mysql@lists.mysql.com' Subject: Re: implicit cast forces table scan? Olaf Faaland wrote: Hi, I am currently using mysql 4.0.18 as distributed with red hat Linux. I find when I perform a select on a varchar(30) field, the index is used only if I have quoted the value in the where clause. Otherwise, mysql performs a table scan. A varchar(30) field is a string, so only comparisons to strings really make sense. If you don't quote the comparison value in the WHERE clause, you're asking for trouble. First, the unquoted value must make sense. There are 3 possibilities: it's a number, it's a function, or it's a column name. Hence, you can get away with WHERE varchar_field = 170807 but probably not WHERE varachar_field = dogs (unless you have a column named dogs). I expect you know all that. My point is, why aren't you quoting the comparison string? The queries in question are: This query uses the index: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This is a string column compared to a string constant, so the index can be used. This query performs a table scan: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This is a string compared to a number. You are asking mysql to make an implicit conversion so it can make the comparison. You expected the number to be converted to a string, but that's not how it works -- it's the other way around, the string is converted to a number. Why? Because many strings convert to the same number. Consider: mysql SELECT '17' + 0, '17.0' + 0, '+17' + 0, '17,345' + 0, '17 dogs' + 0; +--++---+--+---+ | '17' + 0 | '17.0' + 0 | '+17' + 0 | '17,345' + 0 | '17 dogs' + 0 | +--++---+--+---+ | 17 | 17 |17 | 17 |17 | +--++---+--+---+ 1 row in set (0.00 sec) The map from string to number is many-to-one, so the only safe course is to convert the string to a number, then do a numeric comparison. Of course, that renders the string index useless, so the full table scan is required. My question is this: is the issue here that mysql is converting every single itran_log_actionid value, from all 1.5 million rows, and hence the index is not useful and not used? Yes. My initial assumption was that the constant value 170807 in the second query, would be converted to text before the query was executed, and so the index could be used. This does not seem to be the case. Right, it's not. I ask both for my own edification, and also because it seems to me this should be mentioned in the manual for newbies like myself. I thought it was mentioned in the manual, though I must admit I didn't find it just now after a brief search. I'm wondering, however, if we're really approaching this from the right angle. Your column is named itran_log_actionid, and it seems to contain numbers such as 170807. The obvious question, then, is why is itran_log_actionid a VARCHAR(30) instead of one of the INT types? If it were, you wouldn't need to quote the constant in the WHERE clause, and the comparison would be numeric (faster than string) with no type conversion thanks, Olaf Michael -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: implicit cast forces table scan?
Gleb Paharenko wrote: Hello. I suggest you to check you query with MySQL 4.1.14 which might has a bit clever optimizer. Did you read my earlier reply? This is neither an optimizer nor version issue. Consider: mysql SELECT VERSION(); +---+ | VERSION() | +---+ | 4.1.15| +---+ 1 row in set (0.01 sec) CREATE TABLE ict (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, vcf VARCHAR(30), INDEX (vcf) ); INSERT INTO ict (vcf) VALUES ('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('10'), ('11'),('12'),('13'),('14'),('15'),('16'),('17'),('18'),('19'),('20'); EXPLAIN SELECT * FROM ict WHERE vcf = '17'; ++-+---+--++-+---+--+-+ | id | select_type | table | type | p_keys | key | ref | rows | Extra | ++-+---+--++-+---+--+-+ | 1 | SIMPLE | ict | ref | vcf| vcf | const |1 | Using where | ++-+---+--++-+---+--+-+ 1 row in set (0.01 sec) EXPLAIN SELECT * FROM ict WHERE vcf = 17; ++-+---+--++--+--+--+-+ | id | select_type | table | type | p_keys | key | ref | rows | Extra | ++-+---+--++--+--+--+-+ | 1 | SIMPLE | ict | ALL | vcf| NULL | NULL | 20 | Using where | ++-+---+--++--+--+--+-+ 1 row in set (0.01 sec) Why? Because there are many strings which evaluate to a given number. For example: INSERT INTO ict (vcf) VALUES (' 17'), ('+17'), ('17.0'), ('17,34'), ('17 dogs'); SELECT * FROM ict WHERE vcf = '17'; ++--+ | id | vcf | ++--+ | 17 | 17 | ++--+ 1 row in set (0.00 sec) SELECT * FROM ict WHERE vcf = 17; ++-+ | id | vcf | ++-+ | 17 | 17 | | 21 | 17 | | 22 | +17 | | 23 | 17.0| | 24 | 17,34 | | 25 | 17 dogs | ++-+ 6 rows in set (0.00 sec) You see? The two WHERE clauses are actually different, so the optimizer must treat them differently. No amount of optimizer cleverness can change that. The moral of the story: Use numeric types to store numbers, not string types. Michael -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: implicit cast forces table scan?
Olaf Faaland wrote: Hi, I am currently using mysql 4.0.18 as distributed with red hat Linux. I find when I perform a select on a varchar(30) field, the index is used only if I have quoted the value in the where clause. Otherwise, mysql performs a table scan. A varchar(30) field is a string, so only comparisons to strings really make sense. If you don't quote the comparison value in the WHERE clause, you're asking for trouble. First, the unquoted value must make sense. There are 3 possibilities: it's a number, it's a function, or it's a column name. Hence, you can get away with WHERE varchar_field = 170807 but probably not WHERE varachar_field = dogs (unless you have a column named dogs). I expect you know all that. My point is, why aren't you quoting the comparison string? The queries in question are: This query uses the index: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This is a string column compared to a string constant, so the index can be used. This query performs a table scan: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This is a string compared to a number. You are asking mysql to make an implicit conversion so it can make the comparison. You expected the number to be converted to a string, but that's not how it works -- it's the other way around, the string is converted to a number. Why? Because many strings convert to the same number. Consider: mysql SELECT '17' + 0, '17.0' + 0, '+17' + 0, '17,345' + 0, '17 dogs' + 0; +--++---+--+---+ | '17' + 0 | '17.0' + 0 | '+17' + 0 | '17,345' + 0 | '17 dogs' + 0 | +--++---+--+---+ | 17 | 17 |17 | 17 |17 | +--++---+--+---+ 1 row in set (0.00 sec) The map from string to number is many-to-one, so the only safe course is to convert the string to a number, then do a numeric comparison. Of course, that renders the string index useless, so the full table scan is required. My question is this: is the issue here that mysql is converting every single itran_log_actionid value, from all 1.5 million rows, and hence the index is not useful and not used? Yes. My initial assumption was that the constant value 170807 in the second query, would be converted to text before the query was executed, and so the index could be used. This does not seem to be the case. Right, it's not. I ask both for my own edification, and also because it seems to me this should be mentioned in the manual for newbies like myself. I thought it was mentioned in the manual, though I must admit I didn't find it just now after a brief search. I'm wondering, however, if we're really approaching this from the right angle. Your column is named itran_log_actionid, and it seems to contain numbers such as 170807. The obvious question, then, is why is itran_log_actionid a VARCHAR(30) instead of one of the INT types? If it were, you wouldn't need to quote the constant in the WHERE clause, and the comparison would be numeric (faster than string) with no type conversion thanks, Olaf Michael -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
implicit cast forces table scan?
Hi, I am currently using mysql 4.0.18 as distributed with red hat Linux. I find when I perform a select on a varchar(30) field, the index is used only if I have quoted the value in the where clause. Otherwise, mysql performs a table scan. The queries in question are: This query uses the index: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This query performs a table scan: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; My question is this: is the issue here that mysql is converting every single itran_log_actionid value, from all 1.5 million rows, and hence the index is not useful and not used? My initial assumption was that the constant value 170807 in the second query, would be converted to text before the query was executed, and so the index could be used. This does not seem to be the case. I ask both for my own edification, and also because it seems to me this should be mentioned in the manual for newbies like myself. thanks, Olaf Details on versions, table structures, indexes, etc. below == == $ rpm -qa | grep -i mysql MySQL-shared-compat-4.0.15-0 MySQL-client-4.0.18-0 php-mysql-4.1.2-7.2.6 MySQL-server-4.0.18-0 $ /usr/bin/mysql -V /usr/bin/mysql Ver 12.22 Distrib 4.0.18, for pc-linux (i686) == mysql describe itran_log; ++--+--+-++- ---+ | Field | Type | Null | Key | Default| Extra | ++--+--+-++- ---+ | itran_user_id | varchar(100) | | || | | itran_log_date | date | | MUL | -00-00 | | | itran_log_time | time | | | 00:00:00 | | | itran_log_filename | varchar(100) | | || | | itran_log_action | varchar(25) | | MUL || | | itran_log_actionid | varchar(30) | | MUL || | | itran_site_id | varchar(100) | YES | MUL | NULL | | | itran_log_instructions | text | | || | | itran_log_id | bigint(20) | | PRI | NULL | auto_increment | ++--+--+-++- ---+ mysql show indexes from itran_log; +---++---+--+--- -+---+-+--++--++ -+ | Table | Non_unique | Key_name | Seq_in_index | Column_name| Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +---++---+--+--- -+---+-+--++--++ -+ | itran_log | 0 | PRIMARY |1 | itran_log_id | A | 1500793 | NULL | NULL | | BTREE | | | itran_log | 1 | itran_site_id_ix |1 | itran_site_id | A |NULL | 15 | NULL | YES | BTREE | | | itran_log | 1 | itran_log_action_ix |1 | itran_log_action | A |NULL |3 | NULL | | BTREE | | | itran_log | 1 | itran_log_actionid_ix |1 | itran_log_actionid | A |NULL | NULL | NULL | | BTREE | | | itran_log | 1 | itran_log_date_ix |1 | itran_log_date | A |NULL | NULL | NULL | | BTREE | | +---++---+--+--- -+---+-+--++--++ -+ mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; +---+--+---+--+-+--+ -+-+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---+--+---+--+-+--+ -+-+ | itran_log | ALL | itran_log_actionid_ix | NULL |NULL | NULL | 1500775 | Using where | +---+--+---+--+-+--+ -+-+ 1 row in set (0.02 sec) mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; +---+--+---+---+ -+---+--+-+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---+--+---+---+ -+---+--+-+ |
Re: implicit cast forces table scan?
On Wednesday 19 October 2005 01:15 pm, Olaf Faaland wrote: The queries in question are: This query uses the index: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; This query performs a table scan: mysql explain - select itran_log_date, itran_log_actionid from itran_log where - itran_log_actionid = 170807; My question is this: is the issue here that mysql is converting every single itran_log_actionid value, from all 1.5 million rows, and hence the index is not useful and not used? My initial assumption was that the constant value 170807 in the second query, would be converted to text before the query was executed, and so the index could be used. This does not seem to be the case. I ask both for my own edification, and also because it seems to me this should be mentioned in the manual for newbies like myself. It doesnt know what value your giving it. If it thought to assume converting the data, you could have 17h120, and it would fail converting the data. Mysql, nor any DB for that matter, should not, and do not, assume anything. It just happens in the case your dealing with numeric data. If thats the case, you should have made the column numeric in type. (int whatever) Jeff Jeff pgpNnLJGNfYAz.pgp Description: PGP signature