On Thu, 2005-05-26 at 13:49 +0200, Paul J Stevens wrote:
> Ok, let me recapitulate:
> 
> - we want to replace all auto-incremented fields with bigint fields to
> hold uuids in order to accommodate N-clustered databases.

....

> - we can't directly use those uuids as UID message attributes because
> those are required to be 32bit, whereas uuids will need to be 64bit or
> longer.

They're _close_ to 32bit. Zero is explicitly disallowed, and some IMAP
clients (Thunderbird, IIRC) sometimes treat the UID as signed, so it's
better to limit UIDs to 1 through 2^31-1 (31 bits, skipping zero).

> So where does that leave us: we will have to map uuids to message-uids.
>
> The simplest scheme I can come up with that is not lossy is:
> 
> dbmail_messages.mailbox_idnr - dbmail_messages.message_idnr;
>
> This is assuming that mailbox_idnr *and* message_idnr are time-based
> uuids.

What if two machines inserting have the same clock?

Worse: what if a machine with a fast clock receives a message after a
machine with a slow clock?

-- Don't fool yourself into thinking these cases are unlikely: an
attacker can easily cause them on any distributed dbmail setup.


FAR Worse: What if a stupid system administrator sets their clock
backwards (using ntpdate or something because of complaints)?

Don't assume administrators know what they're doing. How many
administrators do you know who run DBMAIL with ulimit ahead of it? :)

EVEN FAR WORSE: What happens in 60 years? :)
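
To make the first two cases concrete, here is a small C sketch. It
assumes the simplest possible time-based id (just the node's local
clock reading); time_based_id and strictly_ascending are hypothetical
names, and a real UUID scheme would differ in detail:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical time-based id: the inserting node's local clock reading,
 * i.e. the true time plus that node's clock skew (both in seconds). */
uint64_t time_based_id(int64_t true_time, int64_t clock_skew)
{
    assert(true_time + clock_skew >= 0);
    return (uint64_t)(true_time + clock_skew);
}

/* Returns 1 if the ids, listed in arrival order, are strictly ascending
 * (what UIDs within one mailbox must be), 0 otherwise. */
int strictly_ascending(const uint64_t *ids, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (ids[i] <= ids[i - 1])
            return 0;
    }
    return 1;
}
```

Two nodes with the same clock reading produce the same id. With one
node 30 seconds slow and another 30 seconds fast, arrivals at true
times 1000 (slow), 1010 (fast) and 1020 (slow) yield 970, 1040, 990,
so the last message gets a smaller id than the one before it.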





The problem is that a global sequence number generator is REQUIRED
because generating UID numbers is something that ABSOLUTELY MUST be
serialized (according to RFC2060).


Fortunately, this problem _is_ solvable, but it doesn't involve
UUID/GUIDs so you might not like it :)

1. Pg supports SEQUENCEs. Even with replication these will never
collide.

2. SQLite supports single files only, so this is moot.

3. MySQL supports sequences as well; older versions might have an issue,
but those versions won't support replication _either_.

So it would seem that the best method is to have a:

int db_get_next_sequence(const char *d, unsigned int *e);

function. It returns 0 if the sequence space is exhausted (and
UIDVALIDITY needs updating, reordering, etc). *e will contain the
resulting uid number if db_get_next_sequence() returns nonzero.

d is a "domain" string, essentially the "sequence name". This would be
unique per mailbox per user.
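
A rough in-memory sketch of those semantics, only to pin down the
contract. Everything except the db_get_next_sequence() prototype is an
assumption made for illustration (the table, its limits, the 2^31-1
cap), and a real implementation would ask the database for the next
value (e.g. nextval() on a Pg sequence) rather than keep counters in
process memory:

```c
#include <assert.h>
#include <string.h>

#define SEQ_MAX_DOMAINS 64          /* illustrative limit */
#define SEQ_NAME_LEN    128         /* illustrative limit */
#define SEQ_UID_MAX     2147483647u /* 2^31 - 1; assumes 32-bit unsigned int */

/* One named counter per "domain" (mailbox per user). */
struct seq_entry {
    char name[SEQ_NAME_LEN];
    unsigned int last;              /* last uid handed out, 0 = none yet */
};

static struct seq_entry seq_table[SEQ_MAX_DOMAINS];
static size_t seq_count;

/* Look up the counter for d, creating it on first use.
 * Returns NULL if the table is full or the name is too long. */
static struct seq_entry *seq_find_or_create(const char *d)
{
    for (size_t i = 0; i < seq_count; i++) {
        if (strcmp(seq_table[i].name, d) == 0)
            return &seq_table[i];
    }
    if (seq_count == SEQ_MAX_DOMAINS || strlen(d) >= SEQ_NAME_LEN)
        return NULL;
    strcpy(seq_table[seq_count].name, d);
    seq_table[seq_count].last = 0;
    return &seq_table[seq_count++];
}

/* Returns nonzero and stores the next uid in *e on success.
 * Returns 0 and leaves *e untouched if the sequence space is exhausted
 * (this sketch also returns 0 when its own table is full). */
int db_get_next_sequence(const char *d, unsigned int *e)
{
    assert(d != NULL && e != NULL);
    struct seq_entry *s = seq_find_or_create(d);
    if (s == NULL || s->last >= SEQ_UID_MAX)
        return 0;
    *e = ++s->last;
    return 1;
}
```

Each domain counts up from 1 independently, so "user1/INBOX" and
"user1/Sent" (made-up domain strings) never share numbers. This sketch
is not thread-safe; the serialization would come from the database.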


-- 
Internet Connection High Quality Web Hosting
http://www.internetconnection.net/
