Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-09 Thread Zhangzhigang
The “output” from the sortmerge is fed into code that builds the BTree for 
the table.  This building of the BTree is sequential – fill the first 
block, move on to the next block, and never have to go back.

 James...


Thanks for your answer; it is very clear.

Firstly:

Before accepting this answer, I thought that building the BTree required 
block splits, and therefore random I/O.

Now I understand that MySQL avoids block splits by building the BTree from the 
sort-merge output, so it does not do more random I/O.


Secondly:

It also bypasses BTree traversals, which involve disk hit(s) for each row 
inserted when the index is too big to be cached.


Thank you very much.


Sincerely yours
Zhigang Zhang



From: Rick James rja...@yahoo-inc.com
To: Zhangzhigang zzgang_2...@yahoo.com.cn 
Cc: mysql@lists.mysql.com mysql@lists.mysql.com 
Date: Wednesday, May 9, 2012, 11:21 PM
Subject: RE: Re: Why is creating indexes faster after inserting massive data rows?
 

A BTree that is small enough to be cached in RAM can be quickly maintained.  
Even the “block splits” are not too costly without the I/O.
 
A big file that needs sorting – bigger than can be cached in RAM – is more 
efficiently done with a dedicated “sort merge” program.  A “big” INDEX on a 
table may be big enough to fall into this category.
 
I/O is the most costly part of any of these operations.  My rule of thumb for 
MySQL SQL statements is:  If everything is cached, the query will run ten times 
as fast as it would if things had to be fetched from disk.
 
Sortmerge works this way:
1.   Sort as much of the file as you can in RAM.  Write that sorted piece 
to disk.
2.   Repeat for the next chunk of the file.  Repeat until the input file is 
broken into sorted chunks.
3.   Now, “merge” those chunks together – take the first row from each, 
decide which is the “smallest”, and send it to the output.
4.   Repeat until finished with all the pieces.
For a really big task, there may have to be more than one “merge” pass.
Note how sort merge reads the input sequentially once, writes the output 
sequentially once, and has sequential I/O for each merge chunk.
“Sequential” I/O is faster than “random” I/O – no arm motion on traditional 
disks.  (SSDs are a different matter; I won’t go into that.)
 
The “output” from the sortmerge is fed into code that builds the BTree for the 
table.  This building of the BTree is sequential – fill the first block, move 
on to the next block, and never have to go back.
 
BTrees (when built randomly), if they need to spill to disk, will involve 
random I/O.  (And we are talking about an INDEX that is so big that it needs to 
spill to disk.)
 
When a block “splits”, one full block becomes two half-full blocks.  Randomly 
filling a BTree leads to, on average, the index being 69% full.  This is not a 
big factor in the overall issue, but perhaps worth noting.
 
How bad can it get?  Here’s an example.
· You have an INDEX on some random value, such as a GUID or MD5.
· The INDEX will be 5 times as big as you can fit in RAM.
· MySQL is adding to the BTree one row at a time (the non-sortmerge way)
When it is nearly finished, only 1 of 5 updates to the BTree can be done 
immediately in RAM; 4 out of 5 updates to the BTree will have to hit disk.  If 
you are using normal disks (on the order of 100–150 random seeks per second), 
that is on the order of 125 rows per second that you can insert – terrible!  
Sortmerge is likely to average over 10,000.
 
 
 
From: Zhangzhigang [mailto:zzgang_2...@yahoo.com.cn] 
Sent: Tuesday, May 08, 2012 9:13 PM
To: Rick James
Cc: mysql@lists.mysql.com
Subject: Re: Why is creating indexes faster after inserting massive data rows?
 
James...
* By doing all the indexes after building the table (or at least all the 
non-UNIQUE indexes), sort merge can be used.  This technique has been highly 
optimized over the past half-century, and is more efficient.
 
I have a question about sort merge:
 
Why does it do the full sort merge? 
 
In my opinion, it could just maintain the BTree, inserting each key into a 
BTree node that has few sorted keys, and that would perform well.
 
If it only does the sort merge, the BTree data structure has to be created 
separately, which wastes some performance.
 
Doesn't it?
 
 



From: Rick James rja...@yahoo-inc.com
To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang 
zzgang_2...@yahoo.com.cn 
Cc: mysql@lists.mysql.com mysql@lists.mysql.com 
Date: Tuesday, May 8, 2012, 12:35 AM
Subject: RE: Why is creating indexes faster after inserting massive data rows?

* Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to 
INDEX updating speed.
* The cache size is quite important for dealing with indexing during INSERT; see 
http://mysql.rjweb.org/doc.php/memory 
* Note that mysqldump sets up for an efficient creation of indexes after 
loading the data.  This is not practical (or necessarily efficient) when 
incrementally INSERTing into a table.

Re: Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-08 Thread Zhangzhigang
Ok, OS cache.
 There isn't really a million block writes.   The record gets 
added to the block, but that gets modified in the OS cache if we assume 
MyISAM tables and in the InnoDB buffer if we assume InnoDB tables.

As I understand it, MySQL writes the data to disk directly, and does not use 
the OS cache while the table is being updated.

If it wrote to the OS cache, that would lead to massive numbers of system calls 
when a lot of rows are inserted into the table one by one. 





From: Karen Abgarian a...@apple.com
To: mysql@lists.mysql.com 
Date: Tuesday, May 8, 2012, 11:37 AM
Subject: Re: Re: Re: Why is creating indexes faster after inserting massive data 
rows?
 
Honestly, I did not understand that.   I did not say anything about being 
complicated.  What does MySQL not use – caching?

Judging by experience, creating a unique index on, say, a 200G table could be a 
bitter experience.  


On 07.05.2012, at 19:26, Zhangzhigang wrote:

 Karen...
 
 MySQL does not use the approach you describe, which is complicated.
 
 I agree with Johan De Meersman.
 
 
 
 From: Karen Abgarian a...@apple.com
 To: mysql@lists.mysql.com 
 Date: Tuesday, May 8, 2012, 1:30 AM
 Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
 
 Hi, 
 
 A couple cents to this. 
 
 There isn't really a million block writes.   The record gets added to the 
 block, but that gets modified in the OS cache if we assume MyISAM tables and in 
 the InnoDB buffer if we assume InnoDB tables.   In both cases, the actual 
 writing does not take place and does not slow down the process.    What does 
 however happen for each operation, is processing the statement, locating the 
 entries to update in the index, index block splits and, for good reason, 
 committing.  
 
 When it comes to creating an index, what needs to happen, is to read the 
 whole table and to sort all rows by the index key.   The latter process will 
 be the most determining factor in answering the original question, because 
 for the large tables the sort will have to do a lot of disk I/O.    The point 
 I am trying to make is there will be situations when creating indexes and 
 then inserting the rows will be faster than creating an index afterwards.   
 If we try to determine such situations, we could notice that the likelihood 
 of the sort going to disk increases with the amount of distinct values to be 
 sorted.   For this reason, my choice would be to create things like 
 primary/unique keys beforehand unless I am certain that everything will fit 
 in the available memory. 
 
 Peace
 Karen
 
 
 
 On May 7, 2012, at 8:05 AM, Johan De Meersman wrote:
 
 - Original Message -
 
 From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
 Ok, creating the index *after* the inserts, the index gets created in
 a single operation.
 But don't the indexes have to be updated row by row after the data rows have
 all been inserted? Does it work in this way?
 No, when you create an index on an existing table (like after a mass 
 insert), what happens is that the engine does a single full tablescan and 
 builds the index in a single pass, which is a lot more performant than 
 updating a single disk block for every record, for the simple reason that a 
 single disk block can contain dozens of index entries. 
 
 Imagine that you insert one million rows, and you have 100 index entries in 
 a disk block (random numbers, to make a point. Real numbers will depend on 
 storage, file system, index, et cetera). Obviously there's no way to write 
 less than a single block to disk - that's how it works. 
 
 You can update your index for each record in turn. That means you will need 
 to do 1 million index - and thus block - writes; plus additional reads for 
 those blocks you don't have in memory - that's the index cache. 
 
 Now, if you create a new index on an existing table, you are first of all 
 bypassing any index read operations - there *is* no index to read, yet. Then 
 the system is going to do a full tablescan - considered slow, but you need 
 all the data, so there's no better way anyway. The index will be built - 
 in-memory as much as possible - and the system will automatically prefer to 
 write only complete blocks - 10.000 of them. That's the exact same number of 
 index blocks, but you only write each block once, so that's only 10.000 
 writes instead of 1.000.000. 
 
 Now there's a lot more at play, things like B-tree balancing and whatnot, 
 but that's the basic picture. 
 
 -- 
 
 Bier met grenadyn 
 Is als mosterd by den wyn 
 Sy die't drinkt, is eene kwezel 
 Hy die't drinkt, is ras een ezel 
 
 

Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-08 Thread Zhangzhigang
Ok, thanks for your help.




From: Johan De Meersman vegiv...@tuxera.be
To: Zhangzhigang zzgang_2...@yahoo.com.cn 
Cc: mysql@lists.mysql.com; Karen Abgarian a...@apple.com 
Date: Tuesday, May 8, 2012, 6:07 PM
Subject: Re: Re: Re: Re: Why is creating indexes faster after inserting massive data 
rows?
 
- Original Message -
 From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
 As I understand it, MySQL writes the data to disk directly, and does not
 use the OS cache while the table is being updated.

If it were to use the OS cache for reading but not writing, then the OS cache 
would be inconsistent with the underlying filesystem as soon as you wrote a 
block, and you'd need some complicated logic to figure out which of the two was 
correct.

No, the MyISAM engine will simply yield to whatever the kernel/VFS wants to do 
with the blocks; whereas InnoDB explicitly opens the files with O_SYNC and 
bypasses the OS cache entirely, because it manages its own buffer cache.

 If it wrote to the OS cache, that would lead to massive numbers of system
 calls when a lot of rows are inserted into the table one by one.

From the code's point of view, you simply request a read or a write. Whether or 
not the OS cache gets in between is entirely a matter for the kernel to decide, 
assuming you specified no specific options at file open time.


-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel


Re: Re: Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-08 Thread Zhangzhigang
 Oh... I thought that it used its own buffer cache, the same as InnoDB.  My 
mistake – thanks!




From: Karen Abgarian a...@apple.com
To: mysql@lists.mysql.com 
Date: Wednesday, May 9, 2012, 2:51 AM
Subject: Re: Re: Re: Re: Why is creating indexes faster after inserting massive data 
rows?
 
Hi, 

If MyISAM tables were being written directly to disk, they would be so slow 
that nobody would ever use them.    That's the cornerstone of their 
performance: the writes do not wait for the physical I/O to complete!



On May 8, 2012, at 3:07 AM, Johan De Meersman wrote:

 - Original Message -
 From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
 As I understand it, MySQL writes the data to disk directly, and does not
 use the OS cache while the table is being updated.
 
 If it were to use the OS cache for reading but not writing, then the OS cache 
 would be inconsistent with the underlying filesystem as soon as you wrote a 
 block, and you'd need some complicated logic to figure out which of the two 
 was correct.
 
 No, the MyISAM engine will simply yield to whatever the kernel/VFS wants to 
 do with the blocks; whereas InnoDB explicitly opens the files with O_SYNC and 
 bypasses the OS cache entirely, because it manages its own buffer cache.
 
 If it wrote to the OS cache, that would lead to massive numbers of system
 calls when a lot of rows are inserted into the table one by one.
 
 From the code's point of view, you simply request a read or a write. Whether 
 or not the OS cache gets in between is entirely a matter for the kernel to 
 decide, assuming you specified no specific options at file open time.
 
 
 -- 
 Bier met grenadyn
 Is als mosterd by den wyn
 Sy die't drinkt, is eene kwezel
 Hy die't drinkt, is ras een ezel
 
 



Re: Why is creating indexes faster after inserting massive data rows?

2012-05-08 Thread Zhangzhigang
James...
* By doing all the indexes after building the table (or at least all the 
non-UNIQUE indexes), sort merge can be used.  This technique has been highly 
optimized over the past half-century, and is more efficient.


I have a question about sort merge:

Why does it do the full sort merge? 


In my opinion, it could just maintain the BTree, inserting each key into a 
BTree node that has few sorted keys, and that would perform well.

If it only does the sort merge, the BTree data structure has to be created 
separately, which wastes some performance.

Doesn't it?




From: Rick James rja...@yahoo-inc.com
To: Johan De Meersman vegiv...@tuxera.be; Zhangzhigang 
zzgang_2...@yahoo.com.cn 
Cc: mysql@lists.mysql.com mysql@lists.mysql.com 
Date: Tuesday, May 8, 2012, 12:35 AM
Subject: RE: Why is creating indexes faster after inserting massive data rows?
 
* Batch INSERTs run faster than one-row-at-a-time, but this is unrelated to 
INDEX updating speed.
* The cache size is quite important for dealing with indexing during INSERT; see 
http://mysql.rjweb.org/doc.php/memory 
* Note that mysqldump sets up for an efficient creation of indexes after 
loading the data.  This is not practical (or necessarily efficient) when 
incrementally INSERTing into a table.

As for the original question...
* Updating the index(es) for one row often involves random BTree traversals.  
When the index(es) are too big to be cached, this can involve disk hit(s) for 
each row inserted.
* By doing all the indexes after building the table (or at least all the 
non-UNIQUE indexes), sort merge can be used.  This technique has been highly 
optimized over the past half-century, and is more efficient.
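
For example, the batch-INSERT and load-then-index patterns might look roughly 
like this (a sketch for a MyISAM table; the test table and idx_name are 
illustrative names, and DISABLE KEYS affects only non-UNIQUE indexes):

    -- Batch INSERT: many rows per statement instead of one row at a time
    INSERT INTO test (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c');

    -- Deferred index maintenance, the pattern mysqldump sets up:
    ALTER TABLE test DISABLE KEYS;   -- stop maintaining non-UNIQUE indexes
    -- ... bulk INSERTs / LOAD DATA INFILE ...
    ALTER TABLE test ENABLE KEYS;    -- rebuild those indexes in one pass, by sort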


 -Original Message-
 From: Johan De Meersman [mailto:vegiv...@tuxera.be]
 Sent: Monday, May 07, 2012 1:29 AM
 To: Zhangzhigang
 Cc: mysql@lists.mysql.com
 Subject: Re: Why is creating indexes faster after inserting massive
 data rows?
 
 - Original Message -
  From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
  Creating indexes after inserting massive data rows is faster than
  creating them before inserting the data rows.
  Please tell me why.
 
 Plain and simple: the indices get updated after every insert statement,
 whereas if you only create the index *after* the inserts, the index
 gets created in a single operation, which is a lot more efficient.
 
 I seem to recall that inside of a transaction (thus, InnoDB or so) the
 difference is markedly less; I might be wrong, though.
 
 
 --
 Bier met grenadyn
 Is als mosterd by den wyn
 Sy die't drinkt, is eene kwezel
 Hy die't drinkt, is ras een ezel
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql 
 To unsubscribe:    http://lists.mysql.com/mysql 

Re: Why is creating indexes faster after inserting massive data rows?

2012-05-07 Thread Zhangzhigang
Version: MySQL 5.1 

Engine: MyISAM.

The indexes are normal – neither primary key nor unique key.

I should describe the question more clearly.

When inserting massive numbers of data rows into a table that needs indexes, I 
can create the indexes before inserting the data rows; the other way is to 
insert all the data rows first and then create the indexes. Normally, the total 
time (inserting data rows plus creating indexes) of the first way is longer 
than that of the second way, as sketched below.

Please tell me why?
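
For concreteness, the two ways might look like this in SQL (a sketch; the test 
table matches the example used later in this thread, and idx_name is an 
illustrative index name):

    -- Way 1: create the index first, then insert (slower in total)
    CREATE TABLE test (id INT PRIMARY KEY, name CHAR(20)) ENGINE=MyISAM;
    ALTER TABLE test ADD INDEX idx_name (name);
    -- ... insert massive data rows ...

    -- Way 2: insert all rows first, then build the index (faster in total)
    CREATE TABLE test (id INT PRIMARY KEY, name CHAR(20)) ENGINE=MyISAM;
    -- ... insert massive data rows ...
    ALTER TABLE test ADD INDEX idx_name (name);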








 
From: Ananda Kumar anan...@gmail.com
To: Zhangzhigang zzgang_2...@yahoo.com.cn 
Cc: mysql@lists.mysql.com mysql@lists.mysql.com 
Date: Monday, May 7, 2012, 3:31 PM
Subject: Re: Why is creating indexes faster after inserting massive data rows?
 

Which version of MySQL are you using?

Is this a secondary index?




On Mon, May 7, 2012 at 12:07 PM, Zhangzhigang zzgang_2...@yahoo.com.cn wrote:

hi all:

I have a question:

Creating indexes after inserting massive data rows is faster than creating 
them before inserting the data rows.
Please tell me why.


Re: Why is creating indexes faster after inserting massive data rows?

2012-05-07 Thread Zhangzhigang
Johan...

Plain and simple: the indices get updated after every insert statement, 
whereas if you only create the index *after* the inserts, the index gets 
created in a single operation, which is a lot more efficient.


Ok, creating the index *after* the inserts, the index gets created in a single 
operation.
But don't the indexes have to be updated row by row after the data rows have 
all been inserted? Does it work in this way?
So I cannot see the difference in overhead between the two ways.






From: Johan De Meersman vegiv...@tuxera.be
To: Zhangzhigang zzgang_2...@yahoo.com.cn 
Cc: mysql@lists.mysql.com 
Date: Monday, May 7, 2012, 4:28 PM
Subject: Re: Why is creating indexes faster after inserting massive data rows?
 
- Original Message -
 From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
 Creating indexes after inserting massive data rows is faster than
 creating them before inserting the data rows.
 Please tell me why.

Plain and simple: the indices get updated after every insert statement, whereas 
if you only create the index *after* the inserts, the index gets created in a 
single operation, which is a lot more efficient.

I seem to recall that inside of a transaction (thus, InnoDB or so) the 
difference is markedly less; I might be wrong, though.


-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-07 Thread Zhangzhigang
OK, but my opinion is that the sorting algorithm does not account for this 
difference; both ways do B+ tree inserts.



From: Claudio Nanni claudio.na...@gmail.com
To: Zhangzhigang zzgang_2...@yahoo.com.cn 
Cc: Johan De Meersman vegiv...@tuxera.be; mysql@lists.mysql.com 
mysql@lists.mysql.com 
Date: Monday, May 7, 2012, 5:01 PM
Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
 

Creating the index in one go is one macro-sort operation;
updating the index at every row is doing the operation over and over again.
If you do not understand the difference, I recommend you read some basics 
about sorting algorithms – 
very interesting reading anyway.

Claudio 


2012/5/7 Zhangzhigang zzgang_2...@yahoo.com.cn

Johan...

Plain and simple: the indices get updated after every insert statement,
whereas if you only create the index *after* the inserts, the index gets 
created in a single operation, which is a lot more efficient.


Ok, creating the index *after* the inserts, the index gets created in a single 
operation.
But don't the indexes have to be updated row by row after the data rows have 
all been inserted? Does it work in this way?
So I cannot see the difference in overhead between the two ways.






From: Johan De Meersman vegiv...@tuxera.be

To: Zhangzhigang zzgang_2...@yahoo.com.cn
Cc: mysql@lists.mysql.com
Date: Monday, May 7, 2012, 4:28 PM

Subject: Re: Why is creating indexes faster after inserting massive data rows?


- Original Message -
 From: Zhangzhigang zzgang_2...@yahoo.com.cn

 Creating indexes after inserting massive data rows is faster than
 creating them before inserting the data rows.
 Please tell me why.

Plain and simple: the indices get updated after every insert statement, 
whereas if you only create the index *after* the inserts, the index gets 
created in a single operation, which is a lot more efficient.

I seem to recall that inside of a transaction (thus, InnoDB or so) the 
difference is markedly less; I might be wrong, though.


--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel


-- 
Claudio

Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-07 Thread Zhangzhigang
Thanks – I had thought of this answer before, and I appreciate your reply.




From: Alex Schaft al...@quicksoftware.co.za
To: mysql@lists.mysql.com 
Date: Monday, May 7, 2012, 4:59 PM
Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
 
On 2012/05/07 10:53, Zhangzhigang wrote:
 Johan...
 Plain and simple: the indices get updated after every insert statement,
 whereas if you only create the index *after* the inserts, the index gets 
 created in a single operation, which is a lot more efficient.
 
 
 Ok, creating the index *after* the inserts, the index gets created in a 
 single operation.
 But don't the indexes have to be updated row by row after the data rows have 
 all been inserted? Does it work in this way?
 So I cannot see the difference in overhead between the two ways.
My simplified 2c. When inserting rows one by one into a table with active 
indexes (INSERT), MySQL has to:

1) lock the space for the data to be added,
2) write the data,
3) lock the index,
4) write the index key(s),
5) unlock the index,
6) unlock the data

This happens for each row

When first doing all the data without an index, only 1, 2, and 6 happen. When 
you then create an index, it can lock the index, read all the data, write all 
the index keys in one go, and then unlock the index.

If you make an omelet, do you fetch your eggs from the fridge one by one, or 
all at the same time? :)

HTH,
Alex



Re: Re: Why is creating indexes faster after inserting massive data rows?

2012-05-07 Thread Zhangzhigang
Karen...

MySQL does not use the approach you describe, which is complicated.

I agree with Johan De Meersman.



From: Karen Abgarian a...@apple.com
To: mysql@lists.mysql.com 
Date: Tuesday, May 8, 2012, 1:30 AM
Subject: Re: Re: Why is creating indexes faster after inserting massive data rows?
 
Hi, 

A couple cents to this. 

There isn't really a million block writes.   The record gets added to the 
block, but that gets modified in the OS cache if we assume MyISAM tables and in 
the InnoDB buffer if we assume InnoDB tables.   In both cases, the actual writing 
does not take place and does not slow down the process.    What does however 
happen for each operation, is processing the statement, locating the entries to 
update in the index, index block splits and, for good reason, committing.  

When it comes to creating an index, what needs to happen, is to read the whole 
table and to sort all rows by the index key.   The latter process will be the 
most determining factor in answering the original question, because for the 
large tables the sort will have to do a lot of disk I/O.    The point I am 
trying to make is there will be situations when creating indexes and then 
inserting the rows will be faster than creating an index afterwards.   If we 
try to determine such situations, we could notice that the likelihood of the 
sort going to disk increases with the amount of distinct values to be sorted.   
For this reason, my choice would be to create things like primary/unique keys 
beforehand unless I am certain that everything will fit in the available 
memory. 

Peace
Karen



On May 7, 2012, at 8:05 AM, Johan De Meersman wrote:

 - Original Message -
 
 From: Zhangzhigang zzgang_2...@yahoo.com.cn
 
 Ok, creating the index *after* the inserts, the index gets created in
 a single operation.
 But don't the indexes have to be updated row by row after the data rows have
 all been inserted? Does it work in this way?
 No, when you create an index on an existing table (like after a mass insert), 
 what happens is that the engine does a single full tablescan and builds the 
 index in a single pass, which is a lot more performant than updating a single 
 disk block for every record, for the simple reason that a single disk block 
 can contain dozens of index entries. 
 
 Imagine that you insert one million rows, and you have 100 index entries in a 
 disk block (random numbers, to make a point. Real numbers will depend on 
 storage, file system, index, et cetera). Obviously there's no way to write 
 less than a single block to disk - that's how it works. 
 
 You can update your index for each record in turn. That means you will need 
 to do 1 million index - and thus block - writes; plus additional reads for 
 those blocks you don't have in memory - that's the index cache. 
 
 Now, if you create a new index on an existing table, you are first of all 
 bypassing any index read operations - there *is* no index to read, yet. Then 
 the system is going to do a full tablescan - considered slow, but you need 
 all the data, so there's no better way anyway. The index will be built - 
 in-memory as much as possible - and the system will automatically prefer to 
 write only complete blocks - 10.000 of them. That's the exact same number of 
 index blocks, but you only write each block once, so that's only 10.000 
 writes instead of 1.000.000. 
 
 Now there's a lot more at play, things like B-tree balancing and whatnot, but 
 that's the basic picture. 
 
 -- 
 
 Bier met grenadyn 
 Is als mosterd by den wyn 
 Sy die't drinkt, is eene kwezel 
 Hy die't drinkt, is ras een ezel 



Re: update query

2012-04-30 Thread Zhangzhigang
OK, there is another approach if you are using a shell script.

Step 1: Create a MySQL user that can access the database without a password.
Step 2: Shell script:

c=0
for i in `mysql -u username -e "use database; show tables;"`
do
    # Skip the first value: the 'Tables_in_database' header printed by mysql.
    if [ $c -ge 1 ]
    then
        mysql -u username -e "use database; update $i set client='NEW'"
    fi
    let c=$c+1
done





From: Ananda Kumar anan...@gmail.com
To: Pothanaboyina Trimurthy skd.trimur...@gmail.com 
Cc: mysql@lists.mysql.com 
Date: Monday, April 30, 2012, 5:26 PM
Subject: Re: update query
 
Do you just want to replace the current value in the client column with NEW?
You can write a stored proc with a cursor, loop through the cursor, and
update each table, as in the sketch below.

regards
anandkl
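
A minimal sketch of that stored-procedure approach (the procedure name is 
illustrative, and it assumes every base table in the current schema really 
has a client column):

DELIMITER $$
CREATE PROCEDURE update_all_clients()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE tbl VARCHAR(64);
  -- Cursor over all base tables in the current schema
  DECLARE cur CURSOR FOR
    SELECT table_name FROM information_schema.tables
    WHERE table_schema = DATABASE() AND table_type = 'BASE TABLE';
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  OPEN cur;
  read_loop: LOOP
    FETCH cur INTO tbl;
    IF done THEN LEAVE read_loop; END IF;
    -- Table names cannot be bound as parameters, so build the UPDATE dynamically
    SET @sql = CONCAT('UPDATE `', tbl, '` SET client = ''NEW''');
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END LOOP;
  CLOSE cur;
END$$
DELIMITER ;

CALL update_all_clients();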

On Mon, Apr 30, 2012 at 2:47 PM, Pothanaboyina Trimurthy 
skd.trimur...@gmail.com wrote:

 Hi all,
      I have one database with 120 tables, and each table contains one
 common column, client. Now I want to update all the tables to set
 the client column to 'NEW'. Is it possible to write a single query
 to do this?

 please help me.

 thanks in advance

 Thanks & Kind Regards,
 Trimurthy.p




RE: Why does the limit use the early row lookup.

2012-04-23 Thread Zhangzhigang
 If you are doing Pagination via OFFSET and LIMIT --
 Don't.  Instead, remember where you left off. 
 (More details upon request.)

Thanks for your answer.

Can you tell us the better approach to pagination, one that avoids scanning 
all the table rows? 
How do I use “remember where you left off”?
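
For reference, a sketch of the “remember where you left off” (keyset) idea 
Rick describes, using the test table defined later in this thread; 1000010 
stands for the last id seen on the previous page:

    SELECT id, name
        FROM test
        WHERE id > 1000010      -- highest id from the previous page
        ORDER BY id
        LIMIT 10;

The PRIMARY KEY turns this into a short range scan instead of counting off a 
million rows.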




--- On Tue, Apr 24, 2012, Rick James rja...@yahoo-inc.com wrote:

 From: Rick James rja...@yahoo-inc.com
 Subject: RE: Why does the limit use the early row lookup.
 To: 张志刚 zhig...@leju.sina.com.cn, mysql@lists.mysql.com 
 mysql@lists.mysql.com
 Date: Tuesday, April 24, 2012, 2:54 AM
 InnoDB or MyISAM?
 PRIMARY KEY (id) is a separate index in MyISAM, so scanning
 1,000,010 rows is faster than for InnoDB, where the PK is
 clustered with the data.  That is, MyISAM scans a
 narrow, 2-column, index (id + pointer); InnoDB scans wide
 rows (all columns).
 
 There is no way to avoid scanning 1,000,010 rows of something
 (data or index).
 
 If you are doing Pagination via OFFSET and LIMIT --
 Don't.  Instead, remember where you left off. 
 (More details upon request.)
 
 You can trick MySQL into doing late row lookup via a self
 join:
 SELECT a.*
     FROM tbl a
     JOIN ( SELECT id FROM tbl ORDER BY id
            LIMIT 1000000, 10 ) b
         ON a.id = b.id
 
 Meanwhile, see if it is already a feature request at
 bugs.mysql.com .  If not, add it.
 
 Probably the optimization needs heuristics to decide which
 way to go.  The choice of early vs late may depend on
 all of these:
   * OFFSET
   * LIMIT
   * Number of rows in the table
   * Width of the table versus width of the key
 involved.
 
  -Original Message-
  From: 张志刚 [mailto:zhig...@leju.sina.com.cn]
  Sent: Thursday, April 19, 2012 7:30 PM
  To: mysql@lists.mysql.com
  Subject: Why does the limit use the early row lookup.
  
  Dear all:
  
  I encountered a problem: LIMIT does not use the index
  to look up rows when I issue a SQL statement.
  
  Theoretically, the lock is used when the SQL updates
  table data and updates table indexes; it ensures that
  updating data and updating indexes are synchronous.
  
  Why does the limit use early row lookup but not late
 row lookup?
  
  For example :
  
  Create table test (id int primary key, name char(20));
  
  select * from test order by id limit 1000000, 10;
  
  The above SQL is very slow when a fetched column is not
  in the index and the offset is more than a million.
  
  The above SQL counts off 1,000,010 rows and returns the
  top 10 rows.
  
  But the SQL “select id from test order by id limit
  1000000, 10” is very fast: it skips over the 1,000,000
  rows by using the index, and counts off only 10 rows.
  
  I don’t know the reason.
  
  My point is that the LIMIT could use late row lookup:
  look up rows after checking the index, to optimize the
  select speed.
  
  But the MySQL optimizer does it with early row lookup:
  it looks up all rows before checking the index when a
  fetched column is not in the index.
  
  Tell me why?
  
  Thanks
  
  
  
  Sincerely yours,
  
  Zhigang zhang
 
 

 





Re: Why does the limit use the early row lookup.

2012-04-22 Thread Zhangzhigang
Why doesn't the MySQL developer team do this optimization?

--- On Fri, Apr 20, 2012, Reindl Harald h.rei...@thelounge.net wrote:

 From: Reindl Harald h.rei...@thelounge.net
 Subject: Re: Why does the limit use the early row lookup.
 To: mysql@lists.mysql.com
 Date: Friday, April 20, 2012, 3:50 PM
 
 
 On 20.04.2012 04:29, 张志刚 wrote:
  My point is that the LIMIT could use late row lookup:
  look up rows after checking the index, to optimize the
  select speed.
  
  But the MySQL optimizer does it with early row lookup:
  it looks up all rows before checking the index when a
  fetched column is not in the index.
  
  Tell me why?
 
 because the MySQL optimizer until now is really
 bad in many situations – ORDER BY RAND() makes a
 temporary table with ALL the data, as an example, even with LIMIT
 
 select * from table order by rand() limit 10;
 reads and writes the whole table to disk –
 have fun with large tables :-)
 

