"Richard Lynch" <[EMAIL PROTECTED]> wrote on 03/10/2005 05:13:09 PM:
> Apologies if this comes through twice... > I *think* I fargled the To: the first time... > > My boss is claiming that having multiple 1-1 tables, with an index on the > keys, is better performance. > > Example of his claim: > > table_1: person_id, name, phone > table_2: person_id, address, city, state, zip, country > is faster than: > table: person_id, name, phone, address, city, state, zip, country > > > Don't seem right to me that you'd get better performance by having the > computer have to look up more crap, even if it is indexed. > > And surely "SELECT name, phone FROM ..." isn't going to be slowed down > *THAT* much by the other data in the table sitting there unused... Is it? > > You can assume we're not dumb enough to do "SELECT * FROM ..." when we > only need name and phone, okay? > > PS Since his largest table has way less than 1000 records, this question > is largely academic anyway, and he should go for more maintainable and > simpler schema. But I've given up on that argument. "We might sell this > to somebody someday with LOTS of records." Yeah, right. > > -- > Like Music? > http://l-i-e.com/artists.htm > > In my general experience, there are a few times when you need to split a single "logical" item across several "physical" tables (where there is a 1-1 between the tables for these records): a) You could put frequently accessed (or required) information on one table and rarely used (or optional) information on another. Smaller tables imply (but do not guarantee) faster response especially if one of the tables is a fixed-width table. b) You wanted/needed to use different ENGINES for different parts of the information. This occurs frequently in MySQL development when you want to use a fulltext index (MyISAM) on some fields while the rest of the data is stored in InnoDB. c) You want to assign separate permissions to groups of columns. You can put each group into their own table. This is related to but slightly different than column-level permissions. Fortunately, MySQL supports both design choices. Then there is the case of normalization. I can understand keeping phone numbers and addresses on separate tables than the person's name, birthdate, and other bits of unique information. People only get one set of names (not counting aliases, they would get a separate table) but they can have multiple phone numbers (home phone, work phone, fax machine, cell phone, etc.) and multiple addresses (work address, home address, vacation address, etc.). It makes great sense to split that kind of associated information into their own tables so that you can create 1-n relationships. However, if you have just 1-1 information and neither table is fixed width, and you do not have any of the other reasons (and I am sure there are more) that I outlined above, then you will be better served by keeping the information in a single table. Here is some more ammunition for your argument. The single slowest thing about using a database is using the hard disk. Each read request takes some average time (from <1ms to 100ms or more, depending on hardware) to find and retrieve any single block of information from the physical recording media. <disclaimer> THIS IS NOT AN OFFICIAL MYSQL EXAMPLE BUT IS MEANT TO BE REPRESENTATIVE OF HOW MANY DATABASES OPERATE Say you want to find a record whose PK value is 16. (If we assume that the OS has already cached the physical file information, then we can eliminate disk reads of the file allocation indexes in order to locate each file). seek 1 - find the first record of PK index file seek 2 - scan/search the index file for record with value 16. Get the offset of the data record in the data file from this index record. seek 3 - Use the offset found in the index to position the read heads at the beginning of the data record in the data file. seek 4 - Scan the disk to read record the entire record into memory (this is usually faster (by a factor of 10 or more) than any of the other previous steps). In essence this takes nearly no time compared to all of the other seeking we have to do just to get to this point. </disclaimer> So by forcing the database to read one "logical" record from multiple physical tables (if it doesn't need to be arranged that way due to other reasons), your "boss" is forcing that cycle to repeat several times while if it were all on one table you would go through that cycle once and be done with it. HTH, Shawn Green Database Administrator Unimin Corporation - Spruce Pine