Re: One table gets locked by itself

2012-05-09 Thread Johan De Meersman
- Original Message -
 From: nixofortune nixofort...@gmail.com
 
 Few more things. You can't have a deadlock on a MyISAM table. You can

You *can* have deadlocks in MyISAM; the concept is not tied to any specific 
engine - or even to databases.

What you can't have is a deadlock on a single table :-)

-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql



Re: One table gets locked by itself

2012-05-09 Thread Claudio Nanni
Yes indeed,
but I think we are talking about MySQL-level deadlocks,
which can happen only with row-level locking and transactions.
If the deadlock is generated at the application level then you can have it on
anything, even BLACKHOLE :-)

Claudio

2012/5/9 Johan De Meersman vegiv...@tuxera.be

 - Original Message -
  From: nixofortune nixofort...@gmail.com
 
  Few more things. You can't have a deadlock on a MyISAM table. You can

 You *can* have deadlocks in MyISAM; the concept is not tied to any
 specific engine - or even to databases.

 What you can't have is a deadlock on a single table :-)

 --
 Bier met grenadyn
 Is als mosterd by den wyn
 Sy die't drinkt, is eene kwezel
 Hy die't drinkt, is ras een ezel





-- 
Claudio


Re: One table gets locked by itself

2012-05-09 Thread Johan De Meersman
- Original Message -

 From: Claudio Nanni claudio.na...@gmail.com

 Yes indeed,
 but I think we are talking about MySQL-level deadlocks,
 which can happen only with row-level locking and transactions.
 If the deadlock is generated at the application level then you can have
 it on anything, even BLACKHOLE :-)
Yup, but you don't know where the lock comes from, so that's possible. 

However, I just checked the documentation, and he *is* right after all: MySQL 
claims to be deadlock-free for table-locking engines; they simply lock all 
tables at once (I suppose that means any lock statement will invalidate the 
previous one) and always lock tables in the same order. 
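
To make the ordering argument concrete, here is a minimal Python sketch (an 
illustration only, not MySQL internals; the table names and locking helpers are 
invented): if every session acquires all the table locks it needs up front, 
always in one fixed global order, no circular wait can form.

import threading

# Hypothetical per-table locks, just for the illustration.
table_locks = {"customers": threading.Lock(), "orders": threading.Lock()}

def lock_tables(names):
    # Acquire every requested lock at once, in one fixed (sorted) order.
    ordered = sorted(names)
    for name in ordered:
        table_locks[name].acquire()
    return ordered

def unlock_tables(ordered):
    for name in reversed(ordered):
        table_locks[name].release()

def worker(names):
    held = lock_tables(names)
    try:
        pass  # work on the locked tables here
    finally:
        unlock_tables(held)

# Two sessions asking for the same tables in opposite order still acquire them
# in the same global order, so neither can hold one lock while waiting for the
# other's; the classic deadlock pattern never appears.
t1 = threading.Thread(target=worker, args=(["orders", "customers"],))
t2 = threading.Thread(target=worker, args=(["customers", "orders"],))
t1.start(); t2.start(); t1.join(); t2.join()
print("both sessions finished, no deadlock")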

-- 

Bier met grenadyn 
Is als mosterd by den wyn 
Sy die't drinkt, is eene kwezel 
Hy die't drinkt, is ras een ezel 


Re: One table gets locked by itself

2012-05-09 Thread Shawn Green

On 5/9/2012 6:17 AM, Johan De Meersman wrote:

- Original Message -


From: Claudio Nanni claudio.na...@gmail.com



Yes indeed,
but I think we are talking about MySQL-level deadlocks,
which can happen only with row-level locking and transactions.
If the deadlock is generated at the application level then you can have
it on anything, even BLACKHOLE :-)

Yup, but you don't know where the lock comes from, so that's possible.

However, I just checked the documentation, and he *is* right after all: MySQL 
claims to be deadlock-free for table-locking engines; they simply lock all 
tables at once (I suppose that means any lock statement will invalidate the 
previous one) and always lock tables in the same order.



Actually, we do not lock all tables at the same time. Each table is 
locked as it is needed by the connection that needs it.


As long as this is an implicit lock (created by the SQL command being 
executed) and not a lock held by an explicit LOCK TABLES command, there is 
no way for any two connections to deadlock on a MyISAM (or other 
non-transactional) table. It is only when a lock extends beyond the 
single-object scenario that a deadlock becomes possible. As each 
statement against a MyISAM table generally only needs to lock one object 
(the table), no two MySQL-based statements can deadlock. Even in those 
cases where one statement uses two objects (such as an INSERT...SELECT 
or a multi-table UPDATE command), the lock prioritization process for 
MyISAM prevents any two threads from cross-locking at the same time.
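
A tiny sketch of that last point (again just an illustration, not MySQL's 
actual lock manager): a deadlock needs a cycle in the waits-for relation, and 
a connection that holds nothing while it waits can never be part of one.

def has_deadlock(waits_for):
    # waits_for maps a connection to the single connection it is waiting on.
    for start in waits_for:
        seen, node = set(), start
        while node in waits_for:
            if node in seen:
                return True   # a cycle: each member holds something another wants
            seen.add(node)
            node = waits_for[node]
    return False

# One MyISAM table, two connections: the waiter holds no lock, so no cycle.
print(has_deadlock({"conn_b": "conn_a"}))                      # False
# Cross-locking two objects is what creates the cycle (and the deadlock).
print(has_deadlock({"conn_a": "conn_b", "conn_b": "conn_a"}))  # True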


--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql



RE: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-09 Thread Rick James
A BTree that is small enough to be cached in RAM can be quickly maintained.  
Even the “block splits” are not too costly without the I/O.

A big file that needs sorting – bigger than can be cached in RAM – is more 
efficiently done with a dedicated “sort merge” program.  A “big” INDEX on a 
table may be big enough to fall into this category.

I/O is the most costly part of any of these operations.  My rule of thumb for 
MySQL SQL statements is:  If everything is cached, the query will run ten times 
as fast as it would if things have to be fetched from disk.

Sortmerge works this way:

1.   Sort as much of the file as you can in RAM.  Write that sorted piece 
to disk.

2.   Repeat for the next chunk of the file.  Repeat until the input file is 
broken into sorted chunks.

3.   Now, “merge” those chunks together – take the first row from each, 
decide which is the “smallest”, send it to the output.

4.   Repeat until finished with all the pieces.
For a really big task, there may have to be more than one “merge” pass.
Note how sort merge reads the input sequentially once, writes the output 
sequentially once, and has sequential I/O for each merge chunk.
“Sequential” I/O is faster than “random” I/O – no arm motion on traditional 
disks.  (SSDs are a different matter; I won’t go into that.)
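
Here is a minimal Python sketch of those four steps (an illustration only, not 
MySQL's implementation; chunk_size stands in for the amount that fits in RAM): 
sort each chunk in memory, spill the sorted runs to disk, then merge the runs, 
always emitting the smallest head.

import heapq, os, random, tempfile

def _spill(sorted_chunk):
    # Write one sorted run to a temporary file, one value per line.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.writelines(f"{v}\n" for v in sorted_chunk)
    return path

def sort_merge(values, chunk_size):
    run_files, chunk = [], []
    for v in values:                       # steps 1 and 2: sort chunks, spill them
        chunk.append(v)
        if len(chunk) >= chunk_size:
            run_files.append(_spill(sorted(chunk)))
            chunk = []
    if chunk:
        run_files.append(_spill(sorted(chunk)))

    runs = [open(path) for path in run_files]
    try:                                   # steps 3 and 4: merge the sorted runs
        for v in heapq.merge(*(map(int, r) for r in runs)):
            yield v                        # feed this stream to the B-tree builder
    finally:
        for r, path in zip(runs, run_files):
            r.close()
            os.remove(path)

data = [random.randrange(10**9) for _ in range(5000)]
assert list(sort_merge(data, chunk_size=1000)) == sorted(data)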

The “output” from the sortmerge is fed into code that builds the BTree for the 
table.  This building of the BTree is sequential – fill the first block, move 
on to the next block, and never have to go back.

BTrees (when built randomly), if they need to spill to disk, will involve 
random I/O.  (And we are talking about an INDEX that is so big that it needs to 
spill to disk.)

When a block “splits”, one full block becomes two half-full blocks.  Randomly 
filling a BTree leads to, on average, the index being 69% full.  This is not a 
big factor in the overall issue, but perhaps worth noting.

How bad can it get?  Here’s an example.

· You have an INDEX on some random value, such as a GUID or MD5.

· The INDEX will be 5 times as big as you can fit in RAM.

· MySQL is adding to the BTree one row at a time (the non-sortmerge way)
When it is nearly finished, only 1 of 5 updates to the BTree can be done 
immediately in RAM; 4 out of 5 updates to the BTree will have to hit disk.  If 
you are using normal disks, that is on the order of 125 rows per second that 
you can insert – Terrible!  Sortmerge is likely to average over 10,000.
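
For what it's worth, the 125 figure follows from a simple back-of-the-envelope 
calculation, assuming a conventional disk does roughly 100 random I/Os per second:

random_ios_per_second = 100      # assumed for a typical spinning disk
cache_miss_fraction = 4 / 5      # index is 5x RAM, so ~4 of 5 updates miss the cache
rows_per_second = random_ios_per_second / cache_miss_fraction
print(rows_per_second)           # 125.0 rows/s, gated almost entirely by seeks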



From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn]
Sent: Tuesday, May 08, 2012 9:13 PM
To: Rick James
Cc: mysql@lists.mysql.com
Subject: Re: Why is creating indexes faster after inserting massive data rows?

James...
* By doing all the indexes after building the table (or at least all the 
non-UNIQUE indexes), sort merge can be used.  This technique had been highly 
optimized over the past half-century, and is more efficient.

I have a question about sort merge:

Why does it do the whole sort merge?

In my opinion, it could just maintain the B-tree and insert each key into a B-tree 
node that still has room for sorted keys, which would give good performance.

If it only does the sort merge, the B-tree data structure still has to be 
created separately, and that wastes some performance.

Doesn't it?



 From: Rick James rja...@yahoo-inc.com
To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn
Cc: mysql@lists.mysql.com
Date: Tuesday, May 8, 2012, 12:35 AM
Subject: RE: Why is creating indexes faster after inserting massive data rows?

* Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to 
INDEX updating speed.
* The cache size is quite important for dealing with indexing during INSERT; see 
http://mysql.rjweb.org/doc.php/memory
* Note that mysqldump sets up for an efficient creation of indexes after 
loading the data.  This is not practical (or necessarily efficient) when 
incrementally INSERTing into a table.

As for the original question...
* Updating the index(es) for one row often involves random BTree traversals.  
When the index(es) are too big to be cached, this can involve disk hit(s) for 
each row inserted.
* By doing all the indexes after building the table (or at least all the 
non-UNIQUE indexes), sort merge can be used.  This technique had been highly 
optimized over the past half-century, and is more efficient.


 -Original Message-
 From: Johan De Meersman [mailto:vegiv...@tuxera.be]
 Sent: Monday, May 07, 2012 1:29 AM
 To: Zhangzhigang
 Cc: mysql@lists.mysql.com
 Subject: Re: Why is creating indexes faster after inserting massive
 data rows?

 - Original Message -
  From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
  

Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-09 Thread Claudio Nanni
This thread is going on and on and on and on,
does anyone have time to actually measure I/O?
Let's make numbers talk.

Claudio


2012/5/9 Rick James rja...@yahoo-inc.com

 A BTree that is small enough to be cached in RAM can be quickly
 maintained.  Even the “block splits” are not too costly without the I/O.

 A big file that needs sorting – bigger than can be cached in RAM – is more
 efficiently done with a dedicated “sort merge” program.  A “big” INDEX on a
 table may be big enough to fall into this category.

 I/O is the most costly part of any of these operations.  My rule of thumb
 for MySQL SQL statements is:  If everything is cached, the query will run
 ten times as fast as it would if things have to be fetched from disk.

 Sortmerge works this way:

 1.   Sort as much of the file as you can in RAM.  Write that sorted
 piece to disk.

 2.   Repeat for the next chunk of the file.  Repeat until the input
 file is broken into sorted chunks.

 3.   Now, “merge” those chunks together – take the first row from
  each, decide which is the “smallest”, send it to the output.

 4.   Repeat until finished with all the pieces.
 For a really big task, there may have to be more than one “merge” pass.
 Note how sort merge reads the input sequentially once, writes the output
 sequentially once, and has sequential I/O for each merge chunk.
 “Sequential” I/O is faster than “random” I/O – no arm motion on
 traditional disks.  (SSDs are a different matter; I won’t go into that.)

 The “output” from the sortmerge is fed into code that builds the BTree for
 the table.  This building of the BTree is sequential – fill the first
 block, move on to the next block, and never have to go back.

 BTrees (when built randomly), if they need to spill to disk, will involve
 random I/O.  (And we are talking about an INDEX that is so big that it
 needs to spill to disk.)

 When a block “splits”, one full block becomes two half-full blocks.
  Randomly filling a BTree leads to, on average, the index being 69% full.
  This is not a big factor in the overall issue, but perhaps worth noting.

 How bad can it get?  Here’s an example.

 · You have an INDEX on some random value, such as a GUID or MD5.

 · The INDEX will be 5 times as big as you can fit in RAM.

 · MySQL is adding to the BTree one row at a time (the
 non-sortmerge way)
 When it is nearly finished, only 1 of 5 updates to the BTree can be done
 immediately in RAM; 4 out of 5 updates to the BTree will have to hit disk.
  If you are using normal disks, that is on the order of 125 rows per second
 that you can insert – Terrible!  Sortmerge is likely to average over 10,000.



 From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn]
 Sent: Tuesday, May 08, 2012 9:13 PM
 To: Rick James
 Cc: mysql@lists.mysql.com
 Subject: Re: Why is creating indexes faster after inserting massive data
 rows?

 James...
 * By doing all the indexes after building the table (or at least all the
 non-UNIQUE indexes), sort merge can be used.  This technique had been
 highly optimized over the past half-century, and is more efficient.

 I have a question about sort merge:

 Why does it do the whole sort merge?

 In my opinion, it could just maintain the B-tree and insert each key into a
 B-tree node that still has room for sorted keys, which would give good performance.

 If it only does the sort merge, the B-tree data structure still has to be
 created separately, and that wastes some performance.

 Doesn't it?


 
 From: Rick James rja...@yahoo-inc.com
 To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn
 Cc: mysql@lists.mysql.com
 Date: Tuesday, May 8, 2012, 12:35 AM
 Subject: RE: Why is creating indexes faster after inserting massive data rows?

 * Batch INSERTs run faster than one-row-at-a-time, but this is unrelated
 to INDEX updating speed.
 * The cache size is quite important for dealing with indexing during
 INSERT; see http://mysql.rjweb.org/doc.php/memory
 * Note that mysqldump sets up for an efficient creation of indexes after
 loading the data.  This is not practical (or necessarily efficient) when
 incrementally INSERTing into a table.

 As for the original question...
 * Updating the index(es) for one row often involves random BTree
 traversals.  When the index(es) are too big to be cached, this can involve
 disk hit(s) for each row inserted.
 * By doing all the indexes after building the table (or at least all the
 non-UNIQUE indexes), sort merge can be used.  This technique had been
 highly optimized over the past half-century, and is more efficient.


  -Original Message-
  From: Johan De Meersman [mailto:vegiv...@tuxera.be]
  Sent: Monday, May 07, 2012 1:29 AM
  To: Zhangzhigang
  Cc: 

RE: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-09 Thread Claudio Nanni
Disagree all the way, numbers are numbers, and better than words, always.
Claudio
On May 9, 2012 7:22 PM, Rick James rja...@yahoo-inc.com wrote:

 Numbers can be misleading – one benchmark will show no difference; another
 will show 10x difference.

 Recommend you benchmark _*your*_ case.


 *From:* Claudio Nanni [mailto:claudio.na...@gmail.com]
 *Sent:* Wednesday, May 09, 2012 8:34 AM
 *To:* Rick James
 *Cc:* Zhangzhigang; mysql@lists.mysql.com
 *Subject:* Re: Re: Why is creating indexes faster after inserting massive
 data rows?


 This thread is going on and on and on and on,

 does anyone have time to actually measure I/O?

 Let's make numbers talk.


 Claudio


 2012/5/9 Rick James rja...@yahoo-inc.com

 A BTree that is small enough to be cached in RAM can be quickly
 maintained.  Even the “block splits” are not too costly without the I/O.

 A big file that needs sorting – bigger than can be cached in RAM – is more
 efficiently done with a dedicated “sort merge” program.  A “big” INDEX on a
 table may be big enough to fall into this category.

 I/O is the most costly part of any of these operations.  My rule of thumb
 for MySQL SQL statements is:  If everything is cached, the query will run
 ten times as fast as it would if things have to be fetched from disk.

 Sortmerge works this way:

 1.   Sort as much of the file as you can in RAM.  Write that sorted
 piece to disk.

 2.   Repeat for the next chunk of the file.  Repeat until the input
 file is broken into sorted chunks.

 3.   Now, “merge” those chunks together – take the first row from
  each, decide which is the “smallest”, send it to the output.

 4.   Repeat until finished with all the pieces.
 For a really big task, there may have to be more than one “merge” pass.
 Note how sort merge reads the input sequentially once, writes the output
 sequentially once, and has sequential I/O for each merge chunk.
 “Sequential” I/O is faster than “random” I/O – no arm motion on
 traditional disks.  (SSDs are a different matter; I won’t go into that.)

 The “output” from the sortmerge is fed into code that builds the BTree for
 the table.  This building of the BTree is sequential – fill the first
 block, move on to the next block, and never have to go back.

 BTrees (when built randomly), if they need to spill to disk, will involve
 random I/O.  (And we are talking about an INDEX that is so big that it
 needs to spill to disk.)

 When a block “splits”, one full block becomes two half-full blocks.
  Randomly filling a BTree leads to, on average, the index being 69% full.
  This is not a big factor in the overall issue, but perhaps worth noting.

 How bad can it get?  Here’s an example.

 · You have an INDEX on some random value, such as a GUID or MD5.

 · The INDEX will be 5 times as big as you can fit in RAM.

 · MySQL is adding to the BTree one row at a time (the
 non-sortmerge way)
 When it is nearly finished, only 1 of 5 updates to the BTree can be done
 immediately in RAM; 4 out of 5 updates to the BTree will have to hit disk.
  If you are using normal disks, that is on the order of 125 rows per second
 that you can insert – Terrible!  Sortmerge is likely to average over 10,000.



 From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn]
 Sent: Tuesday, May 08, 2012 9:13 PM
 To: Rick James
 Cc: mysql@lists.mysql.com
 Subject: Re: Why is creating indexes faster after inserting massive data
 rows?


 James...
 * By doing all the indexes after building the table (or at least all the
 non-UNIQUE indexes), sort merge can be used.  This technique had been
 highly optimized over the past half-century, and is more efficient.

 I have a question about sort merge:

 Why does it do the whole sort merge?

 In my opinion, it could just maintain the B-tree and insert each key into a
 B-tree node that still has room for sorted keys, which would give good performance.

 If it only does the sort merge, the B-tree data structure still has to be
 created separately, and that wastes some performance.

 Doesn't it?


 
 From: Rick James rja...@yahoo-inc.com
 To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn
 Cc: mysql@lists.mysql.com

 Date: Tuesday, May 8, 2012, 12:35 AM
 Subject: RE: Why is creating indexes faster after inserting massive data rows?

 * Batch INSERTs run faster than one-row-at-a-time, but this is unrelated
 to INDEX updating speed.

 * The cache size is quite important for dealing with indexing during
 INSERT; see http://mysql.rjweb.org/doc.php/memory

 * Note that mysqldump sets up for an efficient creation of indexes after
 loading the data.  This is not practical (or necessarily efficient) when
 incremental 

Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-09 Thread Zhangzhigang
The “output” from the sortmerge is fed into code that builds the BTree for 
the table.  This building of the BTree is sequential – fill the first 
block, move on to the next block, and never have to go back.

 James...


Thanks for your answer; it is very clear.

Firstly:

Before accepting this answer, I thought that the block splits involved in building 
the BTree had to be done with random I/O.

Now I understand that MySQL uses the sort-merge optimization to avoid block splits 
when building the BTree, so it does not do as much random I/O.


Secondly:

It also bypasses the BTree traversals that, when the index is too big to be cached, 
would involve disk hit(s) for each row inserted.


Thank you very much.


Sincerely yours
Zhigang Zhang



From: Rick James rja...@yahoo-inc.com
To: Zhangzhigang zzgang_2...@yahoo.com.cn
Cc: mysql@lists.mysql.com
Date: Wednesday, May 9, 2012, 11:21 PM
Subject: RE: Re: Why is creating indexes faster after inserting massive data rows?
 

A BTree that is small enough to be cached in RAM can be quickly maintained.  
Even the “block splits” are not too costly without the I/O.
 
A big file that needs sorting – bigger than can be cached in RAM – is more 
efficiently done with a dedicated “sort merge” program.  A “big” INDEX on a 
table may be big enough to fall into this category.
 
I/O is the most costly part of any of these operations.  My rule of thumb for 
MySQL SQL statements is:  If everything is cached, the query will run ten times 
as fast as it would if things have to be fetched from disk.
 
Sortmerge works this way:
1.   Sort as much of the file as you can in RAM.  Write that sorted piece 
to disk.
2.   Repeat for the next chunk of the file.  Repeat until the input file is 
broken into sorted chunks.
3.   Now, “merge” those chunks together – take the first row from each, 
decide which is the “smallest”, send it to the output.
4.   Repeat until finished with all the pieces.
For a really big task, there may have to be more than one “merge” pass.
Note how sort merge reads the input sequentially once, writes the output 
sequentially once, and has sequential I/O for each merge chunk.
“Sequential” I/O is faster than “random” I/O – no arm motion on traditional 
disks.  (SSDs are a different matter; I won’t go into that.)
 
The “output” from the sortmerge is fed into code that builds the BTree for the 
table.  This building of the BTree is sequential – fill the first block, move 
on to the next block, and never have to go back.
 
BTrees (when built randomly), if they need to spill to disk, will involve 
random I/O.  (And we are talking about an INDEX that is so big that it needs to 
spill to disk.)
 
When a block “splits”, one full block becomes two half-full blocks.  Randomly 
filling a BTree leads to, on average, the index being 69% full.  This is not a 
big factor in the overall issue, but perhaps worth noting.
 
How bad can it get?  Here’s an example.
· You have an INDEX on some random value, such as a GUID or MD5.
· The INDEX will be 5 times as big as you can fit in RAM.
· MySQL is adding to the BTree one row at a time (the non-sortmerge way)
When it is nearly finished, only 1 of 5 updates to the BTree can be done 
immediately in RAM; 4 out of 5 updates to the BTree will have to hit disk.  If 
you are using normal disks, that is on the order of 125 rows per second that 
you can insert – Terrible!  Sortmerge is likely to average over 10,000.
 
 
 
From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn] 
Sent: Tuesday, May 08, 2012 9:13 PM
To: Rick James
Cc: mysql@lists.mysql.com
Subject: Re: Why is creating indexes faster after inserting massive data rows?
 
James...
* By doing all the indexes after building the table (or at least all the 
non-UNIQUE indexes), sort merge can be used.  This technique had been highly 
optimized over the past half-century, and is more efficient.
 
I have a question about sort merge:
 
Why does it do the whole sort merge?
 
In my opinion, it could just maintain the B-tree and insert each key into a B-tree 
node that still has room for sorted keys, which would give good performance.
 
If it only does the sort merge, the B-tree data structure still has to be 
created separately, and that wastes some performance.
 
Doesn't it?
 
 



From: Rick James rja...@yahoo-inc.com
To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang zzgang_2...@yahoo.com.cn
Cc: mysql@lists.mysql.com
Date: Tuesday, May 8, 2012, 12:35 AM
Subject: RE: Why is creating indexes faster after inserting massive data rows?

* Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to 
INDEX updating speed.
* The cache size is quite important for dealing with indexing during INSERT; see 
http://mysql.rjweb.org/doc.php/memory 
* Note that mysqldump sets up for an efficient creation of indexes after 
loading the data.  This is not practical (or necessarily efficient) when 
incrementally INSERTing into a table.