Re: One table gets locked by itself
- Original Message -
From: nixofortune nixofort...@gmail.com

> Few more things. You can't have a deadlock on a MyISAM table.

You *can* have deadlocks in MyISAM; the concept is not tied to any specific engine, or even to databases. What you can't have is a deadlock on a single table :-)

--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
Re: One table gets locked by itself
Yes indeed, but I think we are talking about MySQL-level deadlocks, which can happen only with row-level locking and transactions. If the deadlock is generated at the application level, then you can have it on anything, even BLACKHOLE :-)

Claudio

2012/5/9 Johan De Meersman vegiv...@tuxera.be:
> You *can* have deadlocks in MyISAM; the concept is not tied to any
> specific engine, or even to databases. What you can't have is a
> deadlock on a single table :-)
> [snip]
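To see that deadlock is a generic locking concept rather than a database feature, here is a minimal Python sketch (all names are illustrative): two workers take the same two locks in opposite order and block each other forever.

```python
import threading
import time

lock_a = threading.Lock()  # stands in for "resource A" (a row, a file, ...)
lock_b = threading.Lock()  # stands in for "resource B"

def worker(first, second):
    with first:            # grab one resource...
        time.sleep(0.1)    # ...give the other thread time to grab the other...
        with second:       # ...then block forever waiting for it
            print("never reached")

# Opposite acquisition order in the two threads -> circular wait -> deadlock.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b), daemon=True)
t2 = threading.Thread(target=worker, args=(lock_b, lock_a), daemon=True)
t1.start(); t2.start()
t1.join(timeout=2)
print("still deadlocked:", t1.is_alive() and t2.is_alive())  # True
```

With a single lock (a single table) the cycle cannot form: whoever holds it eventually finishes and releases it, which is exactly the point made above.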
Re: One table gets locked by itself
- Original Message -
From: Claudio Nanni claudio.na...@gmail.com

> Yes indeed, but I think we are talking about MySQL-level deadlocks,
> which can happen only with row-level locking and transactions. If the
> deadlock is generated at the application level, then you can have it
> on anything, even BLACKHOLE :-)

Yup, but you don't know where the lock comes from, so that's possible. However, I just checked the documentation, and he *is* right after all: MySQL claims to be deadlock-free for table-locking engines; they simply lock all tables at once (I suppose that means any lock statement will invalidate the previous one) and always lock tables in the same order.
Re: One table gets locked by itself
On 5/9/2012 6:17 AM, Johan De Meersman wrote:
> However, I just checked the documentation, and he *is* right after
> all: MySQL claims to be deadlock-free for table-locking engines; they
> simply lock all tables at once and always lock tables in the same
> order.

Actually, we do not lock all tables at the same time. Each table is locked as it is needed by the connection that needs it. As long as this is an implicit lock (created by the SQL command to be executed) and not a lock held by an explicit LOCK command, there is no way for any two connections to deadlock on a MyISAM (or other non-transactional) table. It is only when a lock extends beyond the single-object scenario that a deadlock becomes possible. As each statement against a MyISAM table generally only needs to lock one object (the table), no two MySQL-based statements can deadlock. Even in those cases where one statement uses two objects (such as an INSERT...SELECT... or a multi-table UPDATE command), the lock prioritization process for MyISAM prevents any two threads from cross-locking at the same time.

--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
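The "always lock tables in the same order" strategy quoted from the documentation is easy to illustrate outside MySQL as well. A minimal Python sketch (names and structure are illustrative, not MySQL internals): if every thread acquires all the locks it needs in one fixed global order, no cycle can form in the wait-for graph, so deadlock is impossible.

```python
import threading

# One lock per "table"; the fix is a single global acquisition order.
tables = {"a": threading.Lock(), "b": threading.Lock()}

def run_statement(needed, work):
    ordered = sorted(needed)          # fixed global order: always by name
    for name in ordered:              # acquire everything up front...
        tables[name].acquire()
    try:
        work()                        # ...do the statement's work...
    finally:
        for name in reversed(ordered):
            tables[name].release()    # ...and release in reverse order

# Both statements touch "a" and "b", listed in opposite orders by the
# callers -- but they are always *locked* as a-then-b, so no deadlock:
threading.Thread(target=run_statement,
                 args=(["b", "a"], lambda: print("statement 1"))).start()
threading.Thread(target=run_statement,
                 args=(["a", "b"], lambda: print("statement 2"))).start()
```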
RE: Re: Why is creating indexes faster after inserting massive data rows?
A BTree that is small enough to be cached in RAM can be quickly maintained. Even the "block splits" are not too costly without the I/O.

A big file that needs sorting – bigger than can be cached in RAM – is more efficiently done with a dedicated "sort merge" program. A "big" INDEX on a table may be big enough to fall into this category.

I/O is the most costly part of any of these operations. My rule of thumb for MySQL SQL statements is: if everything is cached, the query will run ten times as fast as it would if things have to be fetched from disk.

Sortmerge works this way (a sketch in code follows this message):
1. Sort as much of the file as you can in RAM. Write that sorted piece to disk.
2. Repeat for the next chunk of the file. Repeat until the input file is broken into sorted chunks.
3. Now "merge" those chunks together – take the first row from each, decide which is the "smallest", send it to the output.
4. Repeat until finished with all the pieces. For a really big task, there may have to be more than one "merge" pass.

Note how sort merge reads the input sequentially once, writes the output sequentially once, and has sequential I/O for each merge chunk. "Sequential" I/O is faster than "random" I/O – no arm motion on traditional disks. (SSDs are a different matter; I won't go into that.) The "output" from the sortmerge is fed into code that builds the BTree for the table. This building of the BTree is sequential – fill the first block, move on to the next block, and never have to go back.

BTrees (when built randomly), if they need to spill to disk, will involve random I/O. (And we are talking about an INDEX that is so big that it needs to spill to disk.) When a block "splits", one full block becomes two half-full blocks. Randomly filling a BTree leads, on average, to the index being 69% full. This is not a big factor in the overall issue, but perhaps worth noting.

How bad can it get? Here's an example.
* You have an INDEX on some random value, such as a GUID or MD5.
* The INDEX will be 5 times as big as you can fit in RAM.
* MySQL is adding to the BTree one row at a time (the non-sortmerge way).

When it is nearly finished, only 1 of 5 updates to the BTree can be done immediately in RAM; 4 out of 5 updates will have to hit disk. If you are using normal disks, that is on the order of 125 rows per second that you can insert – terrible! Sortmerge is likely to average over 10,000.

From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn]
Sent: Tuesday, May 08, 2012 9:13 PM
To: Rick James
Cc: mysql@lists.mysql.com
Subject: Re: Why is creating indexes faster after inserting massive data rows?

James...

> * By doing all the indexes after building the table (or at least all
> the non-UNIQUE indexes), sort merge can be used. This technique has
> been highly optimized over the past half-century, and is more
> efficient.

I have a question about sort merge: why do the whole sort merge at all? In my opinion, it could just maintain the B-tree and insert each key into the right B-tree node, which already keeps its keys sorted; that should also give good performance. If it only does the sort merge, the B-tree data structure has to be created separately, which wastes some performance. Doesn't it?
From: Rick James rja...@yahoo-inc.com
To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn
Cc: mysql@lists.mysql.com
Date: Tuesday, May 8, 2012, 12:35 AM
Subject: RE: Why is creating indexes faster after inserting massive data rows?

* Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to INDEX updating speed.
* The cache size is quite important for dealing with indexing during INSERT; see http://mysql.rjweb.org/doc.php/memory
* Note that mysqldump sets up for an efficient creation of indexes after loading the data. This is not practical (or necessarily efficient) when incrementally INSERTing into a table.

As for the original question...

* Updating the index(es) for one row often involves random BTree traversals. When the index(es) are too big to be cached, this can involve disk hit(s) for each row inserted.
* By doing all the indexes after building the table (or at least all the non-UNIQUE indexes), sort merge can be used. This technique has been highly optimized over the past half-century, and is more efficient.

-----Original Message-----
From: Johan De Meersman [mailto:vegiv...@tuxera.be]
Sent: Monday, May 07, 2012 1:29 AM
To: Zhangzhigang
Cc: mysql@lists.mysql.com
Subject: Re: Why is creating indexes faster after inserting massive data rows?

- Original Message -
From: Zhangzhigang zzgang_2...@yahoo.com.cn
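A minimal Python sketch of the numbered sortmerge steps above (the chunk size, temp-file handling, and single merge pass are simplifying assumptions; a real external sort may need several merge passes and is far more tuned):

```python
import heapq
import os
import tempfile

def spill(sorted_chunk):
    """Write one sorted chunk (a "run") to its own temp file, sequentially."""
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as f:
        f.writelines(sorted_chunk)
    return path

def external_sort(lines, chunk_size=100_000):
    """Steps 1-2: sort RAM-sized chunks and spill each to disk.
    Steps 3-4: one k-way merge pass that repeatedly emits the smallest
    head line. chunk_size stands in for "as much as fits in RAM";
    every line is assumed to be newline-terminated."""
    runs, chunk = [], []
    for line in lines:                      # one sequential read of the input
        chunk.append(line)
        if len(chunk) >= chunk_size:
            runs.append(spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(spill(sorted(chunk)))
    files = [open(r) for r in runs]
    try:
        yield from heapq.merge(*files)      # sequential I/O within each run
    finally:
        for f in files:
            f.close()
            os.remove(f.name)

# Usage: feed the sorted stream to a sequential BTree builder --
# for key in external_sort(open("keys.txt")):
#     append key to the rightmost leaf, never revisiting earlier blocks
```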
Re: Re: Why is creating indexes faster after inserting massive data rows?
This thread is going on and on and on and on; does anyone have time to actually measure I/O? Let's make numbers talk.

Claudio

2012/5/9 Rick James rja...@yahoo-inc.com:
> A BTree that is small enough to be cached in RAM can be quickly
> maintained. Even the "block splits" are not too costly without the
> I/O.
> [snip]
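In that spirit, a rough Python sketch for making the sequential-versus-random numbers talk (the file name, block size, and block count are arbitrary assumptions, and the OS page cache will flatter both figures unless the file is much larger than RAM or the cache is dropped first):

```python
import os
import random
import time

PATH = "testfile.bin"        # arbitrary scratch file
BLOCK = 4096                 # one "page"
BLOCKS = 25_000              # ~100 MB total

with open(PATH, "wb") as f:  # build the test file once, sequentially
    f.write(os.urandom(BLOCK) * BLOCKS)

def timed_reads(label, offsets):
    """Read one block at each offset and report the elapsed time."""
    with open(PATH, "rb") as f:
        t0 = time.perf_counter()
        for off in offsets:
            f.seek(off)
            f.read(BLOCK)
        print(f"{label}: {time.perf_counter() - t0:.2f}s")

timed_reads("sequential", [i * BLOCK for i in range(BLOCKS)])
timed_reads("random", [random.randrange(BLOCKS) * BLOCK for _ in range(BLOCKS)])

os.remove(PATH)
```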
RE: Re: Why is creating indexes faster after inserting massive data rows?
Disagree all the way; numbers are numbers, and better than words, always.

Claudio

On May 9, 2012 7:22 PM, Rick James rja...@yahoo-inc.com wrote:
> Numbers can be misleading – one benchmark will show no difference;
> another will show a 10x difference. Recommend you benchmark _*your*_
> case.
>
> From: Claudio Nanni [mailto:claudio.na...@gmail.com]
> Sent: Wednesday, May 09, 2012 8:34 AM
> To: Rick James
> Cc: Zhangzhigang; mysql@lists.mysql.com
> Subject: Re: Re: Why is creating indexes faster after inserting
> massive data rows?
>
> This thread is going on and on and on and on; does anyone have time
> to actually measure I/O? Let's make numbers talk.
> [snip]
Re: Re: Why is creating indexes faster after inserting massive data rows?
> The "output" from the sortmerge is fed into code that builds the
> BTree for the table. This building of the BTree is sequential – fill
> the first block, move on to the next block, and never have to go
> back.

James...

Thanks for your answer, so clearly explained.

Firstly: before accepting this answer, I thought that building the BTree had to do block splits, which means random I/O. Now I know that MySQL optimizes this: building the BTree from the sort-merged output is sequential, so it avoids the extra random I/O.

Secondly: it bypasses the BTree traversals that, when the index is too big to be cached, involve disk hit(s) for each row inserted.

Thank you very much.

Sincerely yours,
Zhigang Zhang

From: Rick James rja...@yahoo-inc.com
To: Zhangzhigang zzgang_2...@yahoo.com.cn
Cc: mysql@lists.mysql.com
Date: Wednesday, May 9, 2012, 11:21 PM
Subject: RE: Re: Why is creating indexes faster after inserting massive data rows?

> A BTree that is small enough to be cached in RAM can be quickly
> maintained. Even the "block splits" are not too costly without the
> I/O.
> [snip]
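Rick's "on the order of 125 rows per second" figure follows from back-of-the-envelope arithmetic. A tiny Python rendering of it, where the ~100 random I/Os per second for a spinning disk is an assumed ballpark figure, not a measured one:

```python
# Ballpark assumption: a "normal" spinning disk does ~100 random I/Os
# per second (roughly 8-10 ms per seek).
random_iops = 100

# Rick's scenario: the index is 5x the size of RAM, so only 1 in 5
# BTree updates finds its block cached; the other 4 of 5 hit disk.
disk_hits_per_row = 4 / 5

rows_per_second = random_iops / disk_hits_per_row
print(rows_per_second)  # 125.0 -- "on the order of 125 rows per second"
```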