----- Original Message ----- From: "Paul Chu" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Saturday, June 19, 2004 10:55 PM Subject: RE: Clustered Index - Where is data inserted ? Not Answered
> Hi Rhino, > > Thanks for that explanation of how and why to use a clustered index ! :) > Again, I really haven't looked into how MySQL/InnoDB handles clustered indexes. I told you how they work in DB2, which I know much better than MySQL/InnoDB. Before using anything I say, I strongly advise you to check the MySQL/InnoDB documentation to see how *they* handle clustered indexes, it might not be the same. > I am just a complete beginner with MySql. I have extensive experience > with Sql Server, some Oracle and some DB2. > > It seems that if I want to create a very large master table with 300 > million rows using MYISAM > I could use an autoincrement as the PK and create indexes on the columns > I'm searching by: e.g. order date > Remember, clustering indexes have nothing to do with your PK or other indexes. Again, assuming MySQL/InnoDB works the same as DB2, a clustering index has all the same attributes as any other index PLUS one other attribute: it controls the placement of new rows in a table. Which columns should comprise your primary key and what other non-clustering indexes you should create still need to get decided. The primary key is probably the most important decision you need to make: remember, it could be single or multi-column but it must be unique and there can't be nulls in any part of the primary key. (Again, those are the rules for DB2; things may differ in MySQL for all I know.) The choice of primary key normally results from normalization; typically, you put your data into Third (or Fourth) Normal Form and then denormalize on a case by case basis if it seems appropriate. I've already explained the key thinking behind the selection of the clustering index. There's one other thing to remember about clustering indexes: you can only have one of them on a given table! That should be obvious if you think about it for a second: since the clustering index influences the physical placement of data and you can only store the data in one order, naturally there can only be one clustering index. Other non-clustering indexes are often chosen on the basis of normalization or simply familiarity with the data. For instance, most people would intuitively put an index on Social Security Number because they know it is unique and it could easily be something that we know about an employee so that we can look up his records. Other non-clustering indexes will be less obvious. They may or may not be unique. They may be single-column or multi-column. Their main job though will be to help you find things faster. In other words, they are a performance technique. In the case of DB2 (and hopefully MySQL/InnoDB), any query will return the right answer, even if you don't have any indexes. However, having an appropriate index should get you the answer (considerably) more quickly, assuming the optimizer actually uses the index. (That is not always a sure thing in DB2 - usually for good reasons - but I *think* that MySQL lets you specify that you want to use a specific index in a query.) Since I have no idea of how the MySQL optimizer works, I won't make any further remarks about it at this time for fear of putting my foot really deep into my mouth ;-) > Accordingly, new records are inserted at the end of the table and this > would make MASS inserts ( millions of rows ) very quick. > > I hope this assessment is correct ? > It *should* work the way you say, again assuming that clustering works the same in MySQL as in DB2. But, as always with databases, you should definitely test your design with a test database first and then monitor carefully as you scale up the amount of data to be sure that you don't get any rude surprises. And be sure to get someone like Heikki to verify that what I've said is applicable to MySQL/InnoDB. I don't want to sent you astray with inappropriate extrapolations from DB2! Rhino -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]