Re: Reason for Auto-increment primary keys?

Kenneth Wagner Wed, 21 Dec 2005 13:28:12 -0800

Hi Rhino,

Excellent question. Felt as you do, initially.


Here's what changed my mind.

Integer keys are fast. And small. Hence, they take very little RAM space.

They are contiguous. A missing PK is easy to find. There's a gap in the number sequence. Can't do this with the part description. No way to tell if a record is missing.
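A minimal sketch of the gap check described above, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `parts`, the column name `id` and the helper `find_missing_ids` are assumptions made for illustration; they are not from the original post. A gap can also come from an ordinary delete or a rolled-back insert, so treat the result as a place to start looking rather than proof of lost data.

```python
import sqlite3


def find_missing_ids(conn, table="parts", key="id"):
    """Return the key values absent between the smallest and largest key."""
    # table and key are hypothetical identifiers supplied by the caller;
    # they are interpolated because SQL placeholders cannot name identifiers.
    rows = conn.execute(f"SELECT {key} FROM {table} ORDER BY {key}").fetchall()
    present = {row[0] for row in rows}
    if not present:
        return []
    return [i for i in range(min(present), max(present) + 1) if i not in present]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (id INTEGER PRIMARY KEY, part_no TEXT, part_description TEXT)"
)
# Row 3 is deliberately left out to simulate a lost record.
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [(1, "A01", "Widget"), (2, "B03", "Grapple Grommet"),
     (4, "D11", "Whisk"), (5, "C04", "Duct Tape")],
)
print(find_missing_ids(conn))  # [3]
```

The same check has nothing to work with when the key is a part number or a description, since there is no expected next value.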


Example: The system gets hung up or crashes and a reboot is needed.

How to test the integrity of the parts table? I.e., anything missing? Checking the PK for continuity is a good place to start. With a timestamp I would even know the date where the file got truncated. Example: It's Dec 20th. The highest date in the file is Dec 1st at rec# 1203023. That's where the analysis would begin. Other files that didn't get truncated but have the related key # in them would tip me off as to how much is missing. Like an order file.
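The truncation analysis above might look something like this. It is only a sketch: the tables `parts` and `orders`, the columns `created_at` and `part_id`, and the helper `truncation_report` are hypothetical names, and sqlite3 stands in for MySQL. It reports the last surviving key with its timestamp, then lists the part keys that the order table still refers to but the parts table no longer holds.

```python
import sqlite3


def truncation_report(conn):
    """Return (last_id, last_timestamp, part ids referenced by orders but missing)."""
    row = conn.execute(
        "SELECT id, created_at FROM parts ORDER BY id DESC LIMIT 1"
    ).fetchone()
    last_id, last_seen = row if row is not None else (None, None)
    # Orders that point at a part key with no matching row in parts.
    orphans = [
        r[0]
        for r in conn.execute(
            "SELECT DISTINCT o.part_id FROM orders o "
            "LEFT JOIN parts p ON p.id = o.part_id "
            "WHERE p.id IS NULL ORDER BY o.part_id"
        )
    ]
    return last_id, last_seen, orphans


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, part_no TEXT, created_at TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, part_id INTEGER)")
# The parts table "lost" rows 4 and 5; the order table still refers to them.
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [(1, "A01", "2005-11-29"), (2, "B03", "2005-11-30"), (3, "A02", "2005-12-01")],
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 2), (2, 4), (3, 5), (4, 5)])
print(truncation_report(conn))  # (3, '2005-12-01', [4, 5])
```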

Speed. Especially where related files are concerned. Foreign keys. Links on integer fields are faster, smaller and more efficient. Keys remain smaller and faster.

Activity testing: Let's say I do some statistical testing. Like how many new parts per month on average. Easy to do with the integer PK. Even easier if it has a timestamp. Then if the average suddenly drops or increases I would want to know why. Or modify my DB tables or coding. Note that the timestamp does not have to be in your example table. It could be in an insert/update table that just tracks what has been added or updated by PK, timestamp, activity type and updatedbyuserID.
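One possible shape for such an insert/update tracking table, again sketched with sqlite3 rather than MySQL. The table name `part_activity`, its column names and the helper `new_parts_per_month` are assumptions for illustration only. Timestamps are stored as ISO 8601 text so the first seven characters give the year and month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tracking table: PK of the part, timestamp, activity type, user id.
conn.execute(
    "CREATE TABLE part_activity ("
    " part_id INTEGER NOT NULL,"
    " logged_at TEXT NOT NULL,"
    " activity TEXT NOT NULL,"
    " updated_by INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO part_activity VALUES (?, ?, ?, ?)",
    [
        (1, "2005-10-03 09:15:00", "insert", 7),
        (2, "2005-10-20 14:02:00", "insert", 7),
        (3, "2005-11-01 08:30:00", "insert", 9),
        (4, "2005-11-12 11:45:00", "insert", 9),
        (5, "2005-11-28 16:20:00", "insert", 7),
        (2, "2005-11-29 10:00:00", "update", 9),
        (6, "2005-12-01 13:10:00", "insert", 7),
    ],
)


def new_parts_per_month(conn):
    """Count 'insert' activity rows per YYYY-MM month."""
    rows = conn.execute(
        "SELECT substr(logged_at, 1, 7), COUNT(*) FROM part_activity "
        "WHERE activity = 'insert' "
        "GROUP BY substr(logged_at, 1, 7) ORDER BY 1"
    ).fetchall()
    return dict(rows)


counts = new_parts_per_month(conn)
average = sum(counts.values()) / len(counts)
print(counts)   # {'2005-10': 2, '2005-11': 3, '2005-12': 1}
print(average)  # 2.0
```

A month whose count sits far from the average is the one to go and look at.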

So, there's 2 cents worth.

Wondering how relevant this is?

HTH,

Ken Wagner

----- Original Message -----
From: "Rhino" <[EMAIL PROTECTED]>
To: "mysql" <mysql@lists.mysql.com>
Sent: Wednesday, December 21, 2005 2:54 PM
Subject: Reason for Auto-increment primary keys?

One technique that I see a lot on this mailing list is people putting auto-incremented integer primary keys on their tables.
Maybe I'm just "old school" but I've always thought that you should choose a primary key based on data that is actually in the table whenever possible, rather than generating a new value out of thin air.
The only exception that comes to mind is things like ID numbers; for example, it is better to use an internally-generated integer for an employee number than it is to use an employee's name. Even the combination of first name and last name is not necessarily unique - I could cite a real life example - and, of course, people can change their names. That makes names less desirable than a generated value when you are trying to uniquely identify such entities. In such a case, a nice, reasonably short integer is easier.
I just found this rather good definition of primary keys at http://www.utexas.edu/its/windows/database/datamodeling/dm/keys.html. The relevant bit says that a primary key must have:
- a non-null value for each instance of the entity
- a value that is unique for each instance of an entity
- a value that must not change or become null during the life of each instance of the entity
That article makes the same basic remarks about name vs. ID but makes the point that it is more commonly the case that table designers will use something like a social security number - an _externally_ generated number - to distinguish between employees rather than an internally-generated number.
But the trend in this mailing list is toward using generated values as primary keys in virtually EVERY table, even when good primary keys can be found in the (non-generated) data already existing in the table.
Now, I haven't done anything remotely resembling a quantified analysis so maybe I'm wildly exaggerating this trend. But I do seem to recall a lot of table descriptions with auto-generated keys and I don't think they were all a name vs. ID scenario....
Has anyone else noticed a similar trend?
If this trend is real, it doesn't seem like a very good trend to me. For example, if you were keeping track of parts in a warehouse, why would anyone make a table that looked like this:
ID (autogenerated PK)   PART_NO   PART_DESCRIPTION
1                       A01       Widget
2                       B03       Grapple Grommet
3                       A02       Snow Shovel
4                       D11       Whisk
5                       C04       Duct Tape

when this table is simpler:

PART_NO (PK)   PART_DESCRIPTION
A01            Widget
B03            Grapple Grommet
A02            Snow Shovel
D11            Whisk
C04            Duct Tape
Would anyone care to convince me that the first version of the table is "better" than the second version in some way?
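For concreteness, the two layouts can be written out as schemas. This is a sketch in sqlite3 rather than MySQL, and the table names `parts_with_id` and `parts_natural` and the helper `duplicate_rejected` are made up for illustration. The UNIQUE constraint on `part_no` in the first layout is an added assumption: the first table as drawn does not say whether duplicate part numbers are prevented, whereas the second layout gets that guarantee from its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Layout 1: generated integer key. UNIQUE on part_no is an assumption added here.
conn.execute(
    "CREATE TABLE parts_with_id ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " part_no TEXT NOT NULL UNIQUE,"
    " part_description TEXT NOT NULL)"
)
# Layout 2: the part number itself is the key.
conn.execute(
    "CREATE TABLE parts_natural ("
    " part_no TEXT PRIMARY KEY,"
    " part_description TEXT NOT NULL)"
)
rows = [("A01", "Widget"), ("B03", "Grapple Grommet"), ("A02", "Snow Shovel"),
        ("D11", "Whisk"), ("C04", "Duct Tape")]
conn.executemany(
    "INSERT INTO parts_with_id (part_no, part_description) VALUES (?, ?)", rows
)
conn.executemany("INSERT INTO parts_natural VALUES (?, ?)", rows)


def duplicate_rejected(conn, table):
    """Return True if inserting a second 'A01' row into the table is refused."""
    try:
        conn.execute(
            f"INSERT INTO {table} (part_no, part_description) VALUES ('A01', 'Duplicate')"
        )
    except sqlite3.IntegrityError:
        return True
    return False


print(conn.execute("SELECT id, part_no FROM parts_with_id ORDER BY id").fetchall())
print(duplicate_rejected(conn, "parts_with_id"), duplicate_rejected(conn, "parts_natural"))
```

With the constraint in place both layouts refuse a second A01; without it, only the second one would.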
I just want to be sure that no one has come along with some new and compelling reason to autogenerate keys when perfectly good keys can be found within the data already. I don't mind being "old school" but I don't want to be "out to lunch" :-)
Rhino





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




