On 1/4/07, Daniel Kiss <[EMAIL PROTECTED]> wrote:
Hi All, I'm designing a database where it is required for each row in a table to have a unique identifier accross a bunch of running MySQL servers on a network. I'm considering two options: 1. Have a two-column primary key, where the first column is a normal auto incremented ID field, the second identifies the database and contains a constant ID per server. Something like this: CREATE TABLE MyTable ( ROWID int unsigned not null auto_increment, DBID int unsigned not null, AnyData varchar(10) not null, PRIMARY KEY (ROWID, DBID) ); INSERT INTO MyTable (DBID, AnyData) VALUES (8, 'Any text'); 2. I would use a traditional one-column binary primary key populated by the built-in uuid() fuction of MySQL. Like this: CREATE TABLE MyTable ( ID binary(36) not null, AnyData varchar(10) not null, PRIMARY KEY (ID) ); INSERT INTO MyTable (ID, AnyData) VALUES (uuid(), 'Any text'); In my view both solutions have their adventages and disadvantages. The first is more optimal in storage space and speed, the second is easier to maintain, administer and query. And there is another aspect, which is needed to be tested, I guess... How fast is the second solution when I execute complex queries based on primary key relations? Does it pay to use the ease of the second solution? Anyone has any experience in similar problems? What is your opinion?
Human readability is important in case something goes wrong, and I like the first solution better. I looked at the format of UUID in the MySQL documentation ... not all that human-friendly. In all the database code I've written (admittedly, all on a single server), I've always had a function (written in PHP) that returns server unique identifiers (and a globally unique identifier is along the same lines). They have typically been fairly long character strings that include the Unix time (seconds and microseconds) and the PID. I typically also spin-lock until the microtime changes--that way it is guaranteed that two processes can't have the same PID at the same time. I'd say go with (a)generation that you understand and control, and (b)human-readability in case you have to dissect it. The server-unique identifiers that I described above have the advantage that they double as timestamps. Just an opinion. Dave.