On Tue, Mar 11, 2008 at 22:17:51 +0300, Tomash Brechko wrote: > So, can we all agree on fixed (user controllable) number of points per > server,
The last thing to consider is whether or not to hash the namespace prefix along with the key. In Cache::Memcached compatibility mode namespace prefix is not hashed, but I think for compatible Ketama implementation it should be hashed. Not hashing the prefix is counter-intuitive: the same "some/key" may end up being hosted on different servers, because one client could use namespace: "some", key: "key", and the other namespace: "", key: "some/key". My guess is that most users expect this to be interchangeable (and this is also a requirement for interoperability with clients that do not have the notion of namespace prefix). The reference implementation may be found at http://git.openhack.ru/?p=Cache-Memcached-Fast.git;a=blob;f=src/dispatch_key.c;hb=ketama-compat Number of points for a given server is computed on line 160. Hashing of "server:port<4-byte little-endian seqno starting from zero>" happens on lines 166-185 ("host" and "port" are exactly as specified by the user). Fast-forward on insert collision is on lines 208-209. Namespace prefix hashing happens on line 273, and requested key hashing is on line 237. Binary search that is used both to find the position for new points on server insert, and to find a server for a given key, is on lines 48-83. In particular, rewind to the first server among the run of equal points is on lines 71-72 (note that this code is executed very rarely). Wraparound is on lines 79-80 (on insert we detect wraparound and insert to the end, lines 195-198). FNV-1a hash function is implemented in a separate file (because shouldn't be copyrighted), http://git.openhack.ru/?p=Cache-Memcached-Fast.git;a=blob;f=src/compute_fnv1a.h;hb=ketama-compat I'm ready to release the code, but to call it "compatible Ketama implementation" there ought to be more than one client implementing it. So I ask client authors: please either explicitly express your commitment to the above algorithm, or share your doubts/concerns. When, say, at least 3 more client authors will commit to it, I will consider this as carved in stone, and release the next version of C::M::F. But please do actually try to implement the algorithm before agreeing on it. The implementation won't take long, but perhaps you will encounter some difficulties with it for your language and/or client architecture. Thanks in advance! ;) -- Tomash Brechko
