"Richard Lynch" <[EMAIL PROTECTED]> wrote on 03/10/2005 05:13:09 PM:

> Apologies if this comes through twice...
> I *think* I fargled the To: the first time...
> 
> My boss is claiming that having multiple 1-1 tables, with an index on 
the
> keys, is better performance.
> 
> Example of his claim:
> 
> table_1: person_id, name, phone
> table_2: person_id, address, city, state, zip, country
> is faster than:
> table: person_id, name, phone, address, city, state, zip, country
> 
> 
> Don't seem right to me that you'd get better performance by having the
> computer have to look up more crap, even if it is indexed.
> 
> And surely "SELECT name, phone FROM ..." isn't going to be slowed down
> *THAT* much by the other data in the table sitting there unused...  Is 
it?
> 
> You can assume we're not dumb enough to do "SELECT * FROM ..." when we
> only need name and phone, okay?
> 
> PS Since his largest table has way less than 1000 records, this question
> is largely academic anyway, and he should go for more maintainable and
> simpler schema.  But I've given up on that argument.  "We might sell 
this
> to somebody someday with LOTS of records."  Yeah, right.
> 
> -- 
> Like Music?
> http://l-i-e.com/artists.htm
> 
> 


In my general experience, there are a few times when you need to split a 
single "logical" item across several "physical" tables (where there is a 
1-1 between the tables for these records):

a) You could put frequently accessed (or required) information on one 
table and rarely used (or optional) information on another. Smaller tables 
imply (but do not guarantee) faster response especially if one of the 
tables is a fixed-width table.
b) You wanted/needed to use different ENGINES for different parts of the 
information. This occurs frequently in MySQL development when you want to 
use a fulltext index (MyISAM) on some fields while the rest of the data is 
stored in InnoDB. 
c) You want to assign separate permissions to groups of columns. You can 
put each group into their own table. This is related to but slightly 
different than column-level permissions. Fortunately, MySQL supports both 
design choices.

Then there is the case of normalization. I can understand keeping phone 
numbers and addresses on separate tables than the person's name, 
birthdate, and other bits of unique information. People only get one set 
of names (not counting aliases, they would get a separate table) but they 
can have multiple phone numbers (home phone, work phone, fax machine, cell 
phone, etc.) and multiple addresses (work address, home address, vacation 
address, etc.). It makes great sense to split that kind of associated 
information into their own tables so that you can create 1-n 
relationships. However, if you have just 1-1 information and neither table 
is fixed width, and you do not have any of the other reasons (and I am 
sure there are more) that I outlined above, then you will be better served 
by keeping the information in a single table. 

Here is some more ammunition for your argument. The single slowest thing 
about using a database is using the hard disk. Each read request takes 
some average time (from <1ms to 100ms or more, depending on hardware) to 
find and retrieve any single block of information from the physical 
recording media. 

<disclaimer>
THIS IS NOT AN OFFICIAL MYSQL EXAMPLE BUT IS MEANT TO BE REPRESENTATIVE OF 
HOW MANY DATABASES OPERATE

Say you want to find a record whose PK value is 16. (If we assume that the 
OS has already cached the physical file information, then we can eliminate 
disk reads of the file allocation indexes in order to locate each file). 
seek 1 - find the first record of PK index file
seek 2 - scan/search the index file for record with value 16. Get the 
offset of the data record in the data file from this index record.
seek 3 - Use the offset found in the index to position the read heads at 
the beginning of the data record in the data file.
seek 4 - Scan the disk to read record the entire record into memory (this 
is usually faster (by a factor of 10 or more) than any of the other 
previous steps). In essence this takes nearly no time compared to all of 
the other seeking we have to do just to get to this point.
</disclaimer>


So by forcing the database to read one "logical" record from multiple 
physical tables (if it doesn't need to be arranged that way due to other 
reasons), your "boss" is forcing that cycle to repeat several times while 
if it were all on one table you would go through that cycle once and be 
done with it.

HTH,

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine



Reply via email to