Alex Stapleton wrote:
As Steve says in his follow-up, we also need to handle collisions while creating the continuum. Simply incrementing the point value by 1 until we find an empty place is going to give us a rather biased continuum. I think that incrementing the point number by 1 and recalculating the point value from that should give a better distribution while being fairly easy to actually implement.
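
For concreteness, here is a minimal sketch of the increment-and-rehash idea Alex describes, assuming an MD5-based point function; point_for, insert_points, and the 10-points-per-node default are illustrative names, not from any particular implementation:

    import hashlib

    def point_for(node, i):
        # Hash the node name plus a point index to a 32-bit continuum position.
        digest = hashlib.md5(f"{node}-{i}".encode()).digest()
        return int.from_bytes(digest[:4], "big")

    def insert_points(continuum, node, points_per_node=10):
        # Alex's scheme: on a collision, advance the point index and
        # recompute the point value, rather than nudging the value by 1.
        placed, i = 0, 0
        while placed < points_per_node:
            p = point_for(node, i)
            i += 1
            if p in continuum:
                continue  # bucket taken: recompute from the next point index
            continuum[p] = node
            placed += 1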

I'm not a big fan of that idea, because an implementer would have to be very careful to avoid insertion-order bugs. You need the mapping from a key to a host to be the same whether the host list was initialized as (A,B,C,D) or as (A,B,D) with C added later, and any kind of "oops, that bucket is taken already, do something else with the node I'm trying to insert" algorithm can, unless it's implemented carefully, give you a subtly different data structure in those two cases.
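
To see the divergence concretely, here is a toy demonstration that deliberately uses a tiny hash space so collisions actually occur; with naive "bump the value by 1 until free" resolution, two insertion orders of the same node set can produce different continuums (the function names and parameters are made up for the demo):

    import hashlib

    def point_for(node, i, space=64):
        # Tiny hash space so the demo actually collides.
        digest = hashlib.md5(f"{node}-{i}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % space

    def build_naive(nodes, points_per_node=4, space=64):
        # Naive resolution: bump a colliding point value by 1 until free.
        continuum = {}
        for node in nodes:
            for i in range(points_per_node):
                p = point_for(node, i, space)
                while p in continuum:
                    p = (p + 1) % space
                continuum[p] = node
        return continuum

    # Same node set, two insertion orders:
    print(build_naive(["A", "B", "C", "D"]) == build_naive(["D", "C", "B", "A"]))
    # May print False: the final structure depends on who grabbed each slot first.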

I'm much more in favor of making "n" large enough that it's reasonable to say something like, "Oops, that bucket is taken already; overwrite it with me if the address of the node it currently points to is numerically less than mine, else don't insert myself here." Then you will deterministically end up with exactly the same search tree no matter what order the nodes are inserted. If you are inserting each node 10 times in a tree and your search space is a full 32 bits, the occasional node that's inserted 9 times instead is going to be basically irrelevant to the health of the system. Not perfectly balanced, sure, but there will almost never be any collisions to begin with and it isn't worth optimizing such a rare edge case.
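
A sketch of that tie-break, using string comparison on the node name as a stand-in for "numerically less" on the address; each bucket simply keeps the maximum of its contenders, so the result is the same for every insertion order:

    import hashlib
    import itertools

    def point_for(node, i):
        digest = hashlib.md5(f"{node}-{i}".encode()).digest()
        return int.from_bytes(digest[:4], "big")

    def build_deterministic(nodes, points_per_node=10):
        continuum = {}
        for node in nodes:
            for i in range(points_per_node):
                p = point_for(node, i)
                # The "larger" node wins the bucket; the loser just ends up
                # with one fewer point, which is harmless in a 32-bit space.
                if p not in continuum or continuum[p] < node:
                    continuum[p] = node
        return continuum

    base = build_deterministic(["A", "B", "C", "D"])
    assert all(build_deterministic(list(order)) == base
               for order in itertools.permutations(["A", "B", "C", "D"]))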

-Steve
