Re: [Dbmail-dev] Replace unique_id with GUID for LoadBalancing & Failover

Kevin Baker Fri, 27 May 2005 00:36:26 +0200 (CEST)

What about going back to the idea of a uid provider...

The RFC talks a good bit about a "next unique identifier"
value. If this value were in fact transferred to all servers
in the cluster, it could be used to increment the id.
The server ID could be appended to the end of the incremented
value to ensure uniqueness even if all machines receive a
message at the same time.


This could be set up with a simple socket connection, or
possibly an ftp or scp transfer of the file that holds the
next unique id.

So:
$ cat /var/mail/uid/uid_next
7

If there were 3 machines with id=1, id=2, id=3, the resulting
message ids, if they all received a message at the same
time, would be:

71, 72, 73

If they received messages at different times, it could end
up as:
71, 82, 93

Machine one would always end in "1", two with "2"...
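
As a rough sketch of the numbering above (Python; `make_uid` and the
fixed single-digit server id are my own assumptions for illustration,
not anything that exists in dbmail): the shared counter value is
shifted one decimal place and the server id becomes the last digit.

```python
UID_MAX = 2**32 - 1  # IMAP UIDs are 32-bit values


def make_uid(next_value: int, server_id: int, id_digits: int = 1) -> int:
    """Append a fixed-width decimal server id to the shared counter value."""
    base = 10 ** id_digits
    if not 0 <= server_id < base:
        raise ValueError("server_id does not fit in id_digits decimal digits")
    uid = next_value * base + server_id
    if uid > UID_MAX:
        raise OverflowError("uid no longer fits in 32 bits")
    return uid


# All three machines read uid_next = 7 and receive a message at once.
simultaneous = [make_uid(7, server_id) for server_id in (1, 2, 3)]

# The counter advances by one between deliveries on machines 1, 2 and 3.
staggered = [make_uid(7 + step, server_id)
             for step, server_id in enumerate((1, 2, 3))]

print(simultaneous)
print(staggered)
```

With more than nine servers the id would need a second digit, which
leaves the counter only a tenth of the range it has with one digit.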


This would of course require that a restarted machine
sync with the cluster before accepting messages,
but this could just be part of the startup process.


Sounds a bit less elegant but might work fine. Plus, it's just
another idea to put out there. It would likely take care
of the 32-bit restrictions with timestamps.



Kevin Baker


> Here's a thought:
>
> Mandatory mailbox compaction. If we go a route that uses time for message
> insertion, and allows only (say) 6 months of uniqueness, we can be fully
> multimaster, no worries, BUT once every 6 months, the UID's MUST be
> compacted, and the UIDVALIDITY value of the mailbox incremented, as well
> as the "last compaction time" which will be the value subtracted from the
> current time to yield the truncated time (24 bits or whatever):
>
> http://lists.ccil.org/pipermail/fetchmail-friends/2004-May/008697.html
>
> And (big ass RFC 3501 quote):
>
>
> 2.3.1.1.        Unique Identifier (UID) Message Attribute
>
>    A 32-bit value assigned to each message, which when used with the
>    unique identifier validity value (see below) forms a 64-bit value
>    that MUST NOT refer to any other message in the mailbox or any
>    subsequent mailbox with the same name forever.  Unique identifiers
>    are assigned in a strictly ascending fashion in the mailbox; as each
>    message is added to the mailbox it is assigned a higher UID than the
>    message(s) which were added previously.  Unlike message sequence
>    numbers, unique identifiers are not necessarily contiguous.
>
>    The unique identifier of a message MUST NOT change during the
>    session, and SHOULD NOT change between sessions.  Any change of
>    unique identifiers between sessions MUST be detectable using the
>    UIDVALIDITY mechanism discussed below.  Persistent unique identifiers
>    are required for a client to resynchronize its state from a previous
>    session with the server (e.g., disconnected or offline access
>    clients); this is discussed further in [IMAP-DISC].
>
>    Associated with every mailbox are two values which aid in unique
>    identifier handling: the next unique identifier value and the unique
>    identifier validity value.
>
>    The next unique identifier value is the predicted value that will be
>    assigned to a new message in the mailbox.  Unless the unique
>    identifier validity also changes (see below), the next unique
>    identifier value MUST have the following two characteristics.  First,
>    the next unique identifier value MUST NOT change unless new messages
>    are added to the mailbox; and second, the next unique identifier
>    value MUST change whenever new messages are added to the mailbox,
>    even if those new messages are subsequently expunged.
>
>         Note: The next unique identifier value is intended to
>         provide a means for a client to determine whether any
>         messages have been delivered to the mailbox since the
>         previous time it checked this value.  It is not intended to
>         provide any guarantee that any message will have this
>         unique identifier.  A client can only assume, at the time
>         that it obtains the next unique identifier value, that
>         messages arriving after that time will have a UID greater
>         than or equal to that value.
>
>    The unique identifier validity value is sent in a UIDVALIDITY
>    response code in an OK untagged response at mailbox selection time.
>    If unique identifiers from an earlier session fail to persist in this
>    session, the unique identifier validity value MUST be greater than
>    the one used in the earlier session.
>
>         Note: Ideally, unique identifiers SHOULD persist at all
>         times.  Although this specification recognizes that failure
>         to persist can be unavoidable in certain server
>         environments, it STRONGLY ENCOURAGES message store
>         implementation techniques that avoid this problem.  For
>         example:
>
>          1) Unique identifiers MUST be strictly ascending in the
>             mailbox at all times.  If the physical message store is
>             re-ordered by a non-IMAP agent, this requires that the
>             unique identifiers in the mailbox be regenerated, since
>             the former unique identifiers are no longer strictly
>             ascending as a result of the re-ordering.
>
>          2) If the message store has no mechanism to store unique
>             identifiers, it must regenerate unique identifiers at
>             each session, and each session must have a unique
>             UIDVALIDITY value.
>
>          3) If the mailbox is deleted and a new mailbox with the
>             same name is created at a later date, the server must
>             either keep track of unique identifiers from the
>             previous instance of the mailbox, or it must assign a
>             new UIDVALIDITY value to the new instance of the
>             mailbox.  A good UIDVALIDITY value to use in this case
>             is a 32-bit representation of the creation date/time of
>             the mailbox.  It is alright to use a constant such as
>             1, but only if it guaranteed that unique identifiers
>             will never be reused, even in the case of a mailbox
>             being deleted (or renamed) and a new mailbox by the
>             same name created at some future time.
>
>          4) The combination of mailbox name, UIDVALIDITY, and UID
>             must refer to a single immutable message on that server
>             forever.  In particular, the internal date, [RFC-2822]
>             size, envelope, body structure, and message texts
>             (RFC822, RFC822.HEADER, RFC822.TEXT, and all BODY[...]
>             fetch data items) must never change.  This does not
>             include message numbers, nor does it include attributes
>             that can be set by a STORE command (e.g., FLAGS).
>
>
>
> On Thu, May 26, 2005, "Aaron Stone" <[EMAIL PROTECTED]> said:
>
>> I think an important question to ask is how many messages we really want
>> to be able to fit into a mailbox. IMAP's 32 bit limit, and that some
>> clients treat those 32 bits as signed (is this true? do we need to
>> worry?), indicates that there's already a limit of 2 billion messages per
>> mailbox.
>>
>> If we're comfortable with a limit of, say, 16 million messages per
>> mailbox, then we can go with 24 bits incrementing, 7 bits server id, and 1
>> bit lost.
>>
>> 2^24 == 16,777,216
>> 60 sec * 60 min * 24 hours * 365 days == 31,536,000
>> Unfortunately, 24 bits will only hold about 6 months worth of seconds.
>>
>> Using UNIX time only gives us until 2038 before we're screwed; and that's
>> if there's one message per second. Using a time window could help with
>> this, which is what I think Paul might have mentioned in a previous email:
>>
>> next_uid = (current_time - mailbox_time)
>>
>> By subtracting the mailbox's creation time from the current time, we
>> effectively restart the sequence, and, in a weird way, we give each
>> mailbox about 50 years to live from the time of the mailbox's creation.
>>
>> If we do this:
>>
>> next_uid = (current_time - mailbox_time) (24 bits) . server id (7 bits)
>> Then the mailbox only has 6 months, with a cluster of 128 machines.
>>
>> next_uid = (current_time - mailbox_time) (27 bits) . server id (4 bits)
>> Then the mailbox has 5 years to live, with a cluster of 16 machines.
>>
>> next_uid = (current_time - mailbox_time) (29 bits) . server id (2 bits)
>> Then the mailbox has 17 years to live, with a cluster of 4 machines.
>>
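
The three layouts quoted above can be checked with a short Python
sketch. `pack_uid` and `lifetime_years` are hypothetical helper names,
and putting the server id in the low bits is only my reading of what
"." means in the formulas.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # 31,536,000, as in the quoted sum


def pack_uid(current_time: int, mailbox_time: int, server_id: int,
             time_bits: int, server_bits: int) -> int:
    """Pack elapsed seconds (high bits) and server id (low bits) into one uid."""
    elapsed = current_time - mailbox_time
    if not 0 <= elapsed < (1 << time_bits):
        raise OverflowError("mailbox has outlived its time window")
    if not 0 <= server_id < (1 << server_bits):
        raise ValueError("server_id does not fit in server_bits")
    return (elapsed << server_bits) | server_id


def lifetime_years(time_bits: int) -> float:
    """Years of one-second ticks that fit in time_bits."""
    return (1 << time_bits) / SECONDS_PER_YEAR


for time_bits, server_bits in ((24, 7), (27, 4), (29, 2)):
    machines = 1 << server_bits
    years = lifetime_years(time_bits)
    print(f"{time_bits} + {server_bits} bits: "
          f"{machines} machines, {years:.2f} years")
```

Taking a year as 365 days, the windows come out at roughly 0.53, 4.26
and 17.02 years, so the 27-bit case is a little under the quoted 5.
Every layout uses 31 bits in total, which keeps the value positive for
clients that treat the UID as signed.
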
>> ---
>>
>> Sadly, it's looking like 32 bits is just too small to combine time with
>> anything else. We could go to a keyserver architecture, where there's one
>> machine that has the task of doling out the next keys, which means we
>> would be bound not by time but by the number of messages, but it also
>> means that we'd lose significant clustering flexibility.
>>
>> ***
>>
>> Why isn't there a spec for 64 bit IMAP ids yet?! Should we write one?
>>
>> Aaron
>>
>>
>> On Thu, May 26, 2005, "Kevin Baker" <[EMAIL PROTECTED]> said:
>>
>>>> Geo Carncross wrote:
>>>>> On Thu, 2005-05-26 at 13:49 +0200, Paul J Stevens wrote:
>>>>>
>>>>>>Ok, let me recapitulate:
>>>>>>
>>>>>>- we want to replace all auto-incremented fields with bigint fields to
>>>>>>hold uuids in order to accommodate N-clustered databases.
>>>
>>> So the problem seems to be generating a sequential 32bit
>>> char based on time and server id.
>>>
>>> I'll have to do some reading on 32bit chars... to get my
>>> head around this...
>>>
>>>
>>> Kevin
>>>
>>> <snip: comments that seem to go over stuff from last year's
>>> thread>
>>>
>>> _______________________________________________
>>> Dbmail-dev mailing list
>>> Dbmail-dev@dbmail.org
>>> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>>>
