> -----Original Message----- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Josh Berkus > Sent: Thursday, January 19, 2006 10:09 AM > To: Martijn van Oosterhout > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Surrogate keys (Was: enums) > > Martjin, > > > In any of these either misspellings, changes of names, ownership or > > even structure over time render the obvious useless as keys. There are > > techniques for detecting and reducing duplication but the point is that > > for any of these duplicates *can* be valid data. > > Please point me out where, in the writings of E.F. Codd or in the SQL > Standard, it says that keys have to be immutable for the life of the row.
Only do that for data that you care about. If you think that the data has no value, there is no need to have a way to identify a row. > Duplicate *values* can be valid data. Duplicate *tuples* show some > serious flaws in your database design. If you have a personnel > directory on which you've not bothered to define any unique constraints > other than the ID column, then you can't match your data to reality. If > you have two rows with the same first and last name, you don't know if > they are two different people or the same person, duplicated. Which > will be a big problem come paycheck time. > > Per E.F. Codd, each tuple is a *unique* predicate (or key) comprising a > set of values definining a *unique* data entity. i.e. "The employeee > named "John" "Little" at extension "4531". There is nothing anywhere > said about keys never changing. > > This is Databases 101 material. Really! I give it an 'F.' When the data changes, the problems generated are not just due to repercussions related to the child and parent tables related through the primary key. Someone has an invoice, and they call in with a question. A combination of their name and address was used as a primary key. They moved, and sent in a forwarding address. The DBA was smart enough to design the database to cascade results, so that there are no orphan records and we have not compromised the structure of the database. The customer calls in with a question about an old invoice. "We have no record of that transaction." I was a DBA for a database for a company with many millions of customers worldwide (e.g. the product registration table was 24 GB). Their design had natural keys in it. It caused dozens of problems, every single day. I content that most people are not smart enough to decide when a natural key is a good idea. The engineers that designed the database were probably pretty smart, since it sort of worked and had thousands of tables and hundreds of millions of rows in it. But one bad decision on a natural key will cause literally millions of dollars of damage. The primary defense I have heard so far is that the Oids are hard to understand. They are nothing in comparison with understanding what to do when you have 25 changes to primary keys on various tables every single day. Once you get used to Oids, I find it hard to believe that any intelligent person finds them confusing. Confusion resulting from having primary keys that are a moving target? Now that's confusion for you. IMO-YMMV. I think it is time for me to give it a rest, though. My experience may be very atypical and I feel strangely passionate about it. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org