On Thu, 2005-05-26 at 15:13 -0700, Kevin Baker wrote:
> What about going back to the idea of a uid provider...

This is a good idea. But at this point, why not make the database the
UID provider? SEQUENCEs are adequate for this purpose.

Believe it or not, a single UID provider isn't hard, but there's no
redundancy on it. Might as well use a single instance of Pg.

There _DO_ exist distributed algorithms for generating sequence numbers,
but they're very ugly- and databases have to do them anyway to provide
multi-master replication.

The best ones involve a ring of hosts:

 a -> b -> c
/|\        |
 |        \|/
 f <- e <- d

(use the IPv4 addresses to form a ring, that part's the easy part).

"a" starts with the token. if it has any messages ready to go in the
database, it increments the token and writes the messages to the
database. when it's done, it sends the token to "b". "B" then repeats
this algorithm until "a" gets it again.

Don't bother adding acks to that. A simple UDP unicast message is
sufficient.

To detect if the token has been lost (e.g. network ate it, machine
crashed, etc), we wait until a machine wants it.

"e" wants the token, so it tries to generate a new one. It gets sent to
"f" which says it hasn't seen a newer token so it sends it to "a" and so
on. Each machine acks _THIS STAGE_ (just use TCP connects, they're
simpler) so that the previous knows the token wasn't lost. Once the
token gets back to "e" it's the new token.

Now, "b" had this token, "a" never reached it so it sent the message to
"c":

      b (too busy to talk)

 a ------> c
/|\        |
 |        \|/
 f <- e <- d

"b" now writes it's message and tries to give it to "c". "c" notices
immediately this token is too old and deletes the messages "b" has
added.

when "b" gets the real token, it noticed it got regenerated somewhere
along the way so it reattempts it's last delivery batch.

"e" will eventually see this new token again and know it's made a full
circle so the messages it wrote last time have no objections. "e" marks
those messages as good and then moves on to the rest of it's work.

So, general order of operations are:

loop:
do we have messages ready? 
        yes: wait for 5 minutes.
                timeout: generate pseudotoken

get token:
        same as last token? mark my messages as ok
        write messages to db
        increment token
        pass to peer

generate pseudotoken:
        send to next peer
        if it fails, send to it's next peer
        never skip more than 1/2 of the hosts.
        if see same pseudotoken, make it real token and "get token"


This is resistant to 1/2 of the hosts being blown up by act of angry
network administrator.

If I missed something, it's my own fault and not the inventor of the
algorithm. I'll try and clear it up if you see anything :)


Other algorithms are more complex and MAY give better latency, but in
practice, the token ring has been good enough.

[[ those other algorithms ARE better if you have reliable group
communication. If you _WANT_ to try implementing that, fine, but it's a
whole other can of worms. ]]


-- 
Internet Connection High Quality Web Hosting
http://www.internetconnection.net/

Reply via email to