RE: LIKE sql optimization
Done. Thand you very much! Zhigang _ From: Jesper Wisborg Krogh [mailto:my...@wisborg.dk] Sent: Wednesday, February 12, 2014 5:30 PM To: Morgan Tocker; Zhigang Zhang Cc: mysql@lists.mysql.com Subject: Re: LIKE sql optimization On 12/02/2014 13:16, Morgan Tocker wrote: Hi Zhigang, On Feb 11, 2014, at 8:48 PM, Zhigang Zhang <mailto:zzgang2...@gmail.com> wrote: I want to know the reason, in my opinion, to scan the smaller index data has better performance than to scan the whole table data. I think I understand the question - you are asking why MySQL will not index scan, find matching records, and then look them up rather than table scan? I believe the answer is that there is no way of knowing if 1 row matches, or all rows match. In the worst case (where all rows match), it is much more expensive to traverse between index and data rows for-each-record. So a table scan is a “safe choice" / has less variance. In addition to what Morgan writes, then with an index scan you will end up doing a lot of random I/O: even if the index scan itself is one sequential scan (which is not guaranteed) then for each match, it will be necessary to look up the actual row. On the other hand a table scan will generally be more of a sequential read as you already have all the data available for each match. Random I/O is more expensive than sequential I/O - particularly on spinning disks - so in general the optimizer will try to reduce the amount of random I/O. In some cases though, you may see the index scan be preferred. Assume you have a query like: SELECT val FROM table WHERE condition LIKE '%abcd'; and you have an index (condition, val) or (val, condition) then the whole query can be satisfied from the index (it's called a covering index). In that case the index scan is usually preferred over the table scan. For the purpose of using an index to do index lookups to find the matching rows rather than doing either a table or index scan for WHERE clauses like "LIKE '%abcd'" you can do a couple of things: * Duplicate the column used in the WHERE clause, but reverse the string. That way the above WHERE clause becomes: WHERE condition_revers LIKE 'dcba%' This can use an index as it is a left prefix. * If you always look for around the same number of characters at the end in your WHERE clause, you can create a column with just those last characters, e.g. so the WHERE clause becomes: WHERE condition_suffix = 'abcd' Do however be careful that you ensure you have enough selectivity that way. If for example 90% of the rows ends in 'abcd' an index will not do you much good (unless you are looking for the last 10% of the rows). Best regards, Jesper Krogh MySQL Support
Re: LIKE sql optimization
On 12/02/2014 13:16, Morgan Tocker wrote: > Hi Zhigang, > > On Feb 11, 2014, at 8:48 PM, Zhigang Zhang wrote: > >> I want to know the reason, in my opinion, to scan the smaller index data has >> better performance than to scan the whole table data. > I think I understand the question - you are asking why MySQL will not index > scan, find matching records, and then look them up rather than table scan? > > I believe the answer is that there is no way of knowing if 1 row matches, or > all rows match. In the worst case (where all rows match), it is much more > expensive to traverse between index and data rows for-each-record. > > So a table scan is a “safe choice" / has less variance. In addition to what Morgan writes, then with an index scan you will end up doing a lot of random I/O: even if the index scan itself is one sequential scan (which is not guaranteed) then for each match, it will be necessary to look up the actual row. On the other hand a table scan will generally be more of a sequential read as you already have all the data available for each match. Random I/O is more expensive than sequential I/O - particularly on spinning disks - so in general the optimizer will try to reduce the amount of random I/O. In some cases though, you may see the index scan be preferred. Assume you have a query like: SELECT val FROM table WHERE condition LIKE '%abcd'; and you have an index (condition, val) or (val, condition) then the whole query can be satisfied from the index (it's called a covering index). In that case the index scan is usually preferred over the table scan. For the purpose of using an index to do index lookups to find the matching rows rather than doing either a table or index scan for WHERE clauses like "LIKE '%abcd'" you can do a couple of things: * Duplicate the column used in the WHERE clause, but reverse the string. That way the above WHERE clause becomes: WHERE condition_revers LIKE 'dcba%' This can use an index as it is a left prefix. * If you always look for around the same number of characters at the end in your WHERE clause, you can create a column with just those last characters, e.g. so the WHERE clause becomes: WHERE condition_suffix = 'abcd' Do however be careful that you ensure you have enough selectivity that way. If for example 90% of the rows ends in 'abcd' an index will not do you much good (unless you are looking for the last 10% of the rows). Best regards, Jesper Krogh MySQL Support
RE: LIKE sql optimization
Thank you very much! Zhigang -Original Message- From: Morgan Tocker [mailto:morgan.toc...@oracle.com] Sent: Wednesday, February 12, 2014 10:16 AM To: Zhigang Zhang Cc: mysql@lists.mysql.com Subject: Re: LIKE sql optimization Hi Zhigang, On Feb 11, 2014, at 8:48 PM, Zhigang Zhang wrote: > I want to know the reason, in my opinion, to scan the smaller index data has > better performance than to scan the whole table data. I think I understand the question - you are asking why MySQL will not index scan, find matching records, and then look them up rather than table scan? I believe the answer is that there is no way of knowing if 1 row matches, or all rows match. In the worst case (where all rows match), it is much more expensive to traverse between index and data rows for-each-record. So a table scan is a “safe choice" / has less variance.=
Re: LIKE sql optimization
Hi Zhigang, On Feb 11, 2014, at 8:48 PM, Zhigang Zhang wrote: > I want to know the reason, in my opinion, to scan the smaller index data has > better performance than to scan the whole table data. I think I understand the question - you are asking why MySQL will not index scan, find matching records, and then look them up rather than table scan? I believe the answer is that there is no way of knowing if 1 row matches, or all rows match. In the worst case (where all rows match), it is much more expensive to traverse between index and data rows for-each-record. So a table scan is a “safe choice" / has less variance. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: LIKE sql optimization
I checked a myisam table index, the index is a copy of the whole field. Zhigang -Original Message- From: Reindl Harald [mailto:h.rei...@thelounge.net] Sent: Wednesday, February 12, 2014 10:02 AM To: mysql@lists.mysql.com Subject: Re: LIKE sql optimization because a index is not just a dumb copy of the whole field and you simply can't seek in the middle of it? http://en.wikipedia.org/wiki/B-tree http://mattfleming.com/node/192 Am 12.02.2014 02:48, schrieb Zhigang Zhang: > I want to know the reason, in my opinion, to scan the smaller index data has > better performance than to scan the whole table data. > > From: Mathieu Desharnais [mailto:mdesharn...@diffusion.cc] > Sent: Wednesday, February 12, 2014 9:41 AM > To: Zhigang Zhang; mysql@lists.mysql.com > Subject: Re: LIKE sql optimization > > Sql database doesn't use index in like statement if it starts with % .. > > like 'abcd%' would work though... > > To use an index you can store your value using reverse function and index it > .. then your like would use the index. > > 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : > > For example: > > Select * from T where col like ‘%abcd’; > > The table T is myisam table and we created a index on col. > > As we known, this like sql does not use the index created on col, it confuse > me, why? > > I think in mysiam engine, the index data is smaller, it can use index link > list to optimize it so as to reduce the disk scan than to the whole table > scan.
Re: LIKE sql optimization
Same reason as why composite index works only if you supply first field or fields .. example index on a,b,c if you have a query : select * from tbl where a = 'whatever' and b = 'something it will use the index .. but a query like this one : select * from tbl where b = 'something' and c = 'something else' won't use the index .. - Just like an index in a book ... 2014-02-11 21:03 GMT-05:00 louis liu : > MySQL can't use index when '%' condition gives even oracle and you > can try full-text search > > > 2014-02-12 9:55 GMT+08:00 kitlenv : > >> *read how a index works technically* >> >> >> >> On Wed, Feb 12, 2014 at 12:48 PM, Zhigang Zhang > >wrote: >> >> > I want to know the reason, in my opinion, to scan the smaller index data >> > has >> > better performance than to scan the whole table data. >> > >> > >> > >> > >> > >> > zhigang >> > >> > >> > >> > _ >> > >> > From: Mathieu Desharnais [mailto:mdesharn...@diffusion.cc] >> > Sent: Wednesday, February 12, 2014 9:41 AM >> > To: Zhigang Zhang; mysql@lists.mysql.com >> > Subject: Re: LIKE sql optimization >> > >> > >> > >> > Sql database doesn't use index in like statement if it starts with % .. >> > >> > >> > >> > like 'abcd%' would work though... >> > >> > >> > >> > To use an index you can store your value using reverse function and >> index >> > it >> > .. then your like would use the index. >> > >> > >> > >> > 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : >> > >> > For example: >> > >> > >> > >> > Select * from T where col like '%abcd'; >> > >> > >> > >> > The table T is myisam table and we created a index on col. >> > >> > >> > >> > As we known, this like sql does not use the index created on col, it >> > confuse >> > me, why? >> > >> > >> > >> > I think in mysiam engine, the index data is smaller, it can use index >> link >> > list to optimize it so as to reduce the disk scan than to the whole >> table >> > scan. >> > >> > >> > >> > Thanks. >> > >> > >> > >> > Zhigang >> > >> > >> > >> > >> > > > > -- > Phone: +86 1868061 > Email & Gtalk: yloui...@gmail.com > Personal Blog: http://www.vmcd.org >
Re: LIKE sql optimization
MySQL can't use index when '%' condition gives even oracle and you can try full-text search 2014-02-12 9:55 GMT+08:00 kitlenv : > *read how a index works technically* > > > On Wed, Feb 12, 2014 at 12:48 PM, Zhigang Zhang >wrote: > > > I want to know the reason, in my opinion, to scan the smaller index data > > has > > better performance than to scan the whole table data. > > > > > > > > > > > > zhigang > > > > > > > > _ > > > > From: Mathieu Desharnais [mailto:mdesharn...@diffusion.cc] > > Sent: Wednesday, February 12, 2014 9:41 AM > > To: Zhigang Zhang; mysql@lists.mysql.com > > Subject: Re: LIKE sql optimization > > > > > > > > Sql database doesn't use index in like statement if it starts with % .. > > > > > > > > like 'abcd%' would work though... > > > > > > > > To use an index you can store your value using reverse function and index > > it > > .. then your like would use the index. > > > > > > > > 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : > > > > For example: > > > > > > > > Select * from T where col like '%abcd'; > > > > > > > > The table T is myisam table and we created a index on col. > > > > > > > > As we known, this like sql does not use the index created on col, it > > confuse > > me, why? > > > > > > > > I think in mysiam engine, the index data is smaller, it can use index > link > > list to optimize it so as to reduce the disk scan than to the whole table > > scan. > > > > > > > > Thanks. > > > > > > > > Zhigang > > > > > > > > > -- Phone: +86 1868061 Email & Gtalk: yloui...@gmail.com Personal Blog: http://www.vmcd.org
Re: LIKE sql optimization
because a index is not just a dumb copy of the whole field and you simply can't seek in the middle of it? http://en.wikipedia.org/wiki/B-tree http://mattfleming.com/node/192 Am 12.02.2014 02:48, schrieb Zhigang Zhang: > I want to know the reason, in my opinion, to scan the smaller index data has > better performance than to scan the whole table data. > > From: Mathieu Desharnais [mailto:mdesharn...@diffusion.cc] > Sent: Wednesday, February 12, 2014 9:41 AM > To: Zhigang Zhang; mysql@lists.mysql.com > Subject: Re: LIKE sql optimization > > Sql database doesn't use index in like statement if it starts with % .. > > like 'abcd%' would work though... > > To use an index you can store your value using reverse function and index it > .. then your like would use the index. > > 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : > > For example: > > Select * from T where col like ‘%abcd’; > > The table T is myisam table and we created a index on col. > > As we known, this like sql does not use the index created on col, it confuse > me, why? > > I think in mysiam engine, the index data is smaller, it can use index link > list to optimize it so as to reduce the disk scan than to the whole table > scan. signature.asc Description: OpenPGP digital signature
Re: LIKE sql optimization
*read how a index works technically* On Wed, Feb 12, 2014 at 12:48 PM, Zhigang Zhang wrote: > I want to know the reason, in my opinion, to scan the smaller index data > has > better performance than to scan the whole table data. > > > > > > zhigang > > > > _ > > From: Mathieu Desharnais [mailto:mdesharn...@diffusion.cc] > Sent: Wednesday, February 12, 2014 9:41 AM > To: Zhigang Zhang; mysql@lists.mysql.com > Subject: Re: LIKE sql optimization > > > > Sql database doesn't use index in like statement if it starts with % .. > > > > like 'abcd%' would work though... > > > > To use an index you can store your value using reverse function and index > it > .. then your like would use the index. > > > > 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : > > For example: > > > > Select * from T where col like '%abcd'; > > > > The table T is myisam table and we created a index on col. > > > > As we known, this like sql does not use the index created on col, it > confuse > me, why? > > > > I think in mysiam engine, the index data is smaller, it can use index link > list to optimize it so as to reduce the disk scan than to the whole table > scan. > > > > Thanks. > > > > Zhigang > > > >
RE: LIKE sql optimization
I want to know the reason, in my opinion, to scan the smaller index data has better performance than to scan the whole table data. zhigang _ From: Mathieu Desharnais [mailto:mdesharn...@diffusion.cc] Sent: Wednesday, February 12, 2014 9:41 AM To: Zhigang Zhang; mysql@lists.mysql.com Subject: Re: LIKE sql optimization Sql database doesn't use index in like statement if it starts with % .. like 'abcd%' would work though... To use an index you can store your value using reverse function and index it .. then your like would use the index. 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : For example: Select * from T where col like ‘%abcd’; The table T is myisam table and we created a index on col. As we known, this like sql does not use the index created on col, it confuse me, why? I think in mysiam engine, the index data is smaller, it can use index link list to optimize it so as to reduce the disk scan than to the whole table scan. Thanks. Zhigang
Re: LIKE sql optimization
Sql database doesn't use index in like statement if it starts with % .. like 'abcd%' would work though... To use an index you can store your value using reverse function and index it .. then your like would use the index. 2014-02-11 20:23 GMT-05:00 Zhigang Zhang : > For example: > > > > Select * from T where col like '%abcd'; > > > > The table T is myisam table and we created a index on col. > > > > As we known, this like sql does not use the index created on col, it > confuse > me, why? > > > > I think in mysiam engine, the index data is smaller, it can use index link > list to optimize it so as to reduce the disk scan than to the whole table > scan. > > > > Thanks. > > > > Zhigang > >
Re: LIKE sql optimization
Am 12.02.2014 02:23, schrieb Zhigang Zhang: > For example: > > Select * from T where col like ‘%abcd’; > > The table T is myisam table and we created a index on col. > > As we known, this like sql does not use the index created on col, it confuse > me, why? > > I think in mysiam engine, the index data is smaller, it can use index link > list to optimize it so as to reduce the disk scan than to the whole table > scan because ‘%abcd’ can't work in case of a index how do you imagine that? read how a index works technically 'abcd%' may work but '%abcd' is impossible independent what engine, this don't work and won't ever work you may have luck with fulltext search (and it's other drawbacks) https://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html signature.asc Description: OpenPGP digital signature