Brian Beuning wrote:
> I do not see "sequence of identical selections" being
> a problem.  It is like saying you flip a coin 10 times
> and getting 5 heads in a row means the coin is broken.
Except that, with the current algorithm, you are guaranteed, for certain keys and a given number of retries, to always end up on the same single server. That is not a slight statistical hazard, it is a prescribed catastrophe.
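
To make the determinism concrete, here is a minimal sketch (not the actual client code; Server, h and pick_server are names I made up for the illustration) of a retry scheme where every fallback candidate is derived purely from the key. Under such a scheme, the keys that happen to map to a dead server on every attempt are the same keys on every request:

    import hashlib
    from dataclasses import dataclass

    @dataclass
    class Server:
        name: str
        alive: bool = True

    def h(s: str) -> int:
        # stable hash so the example behaves identically across runs
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def pick_server(key: str, servers: list, max_tries: int = 3):
        # every retry re-derives its candidate purely from the key, so the
        # sequence of candidates is identical on every request for that key
        for attempt in range(max_tries):
            idx = h(f"{key}:{attempt}") % len(servers)
            if servers[idx].alive:
                return servers[idx]
        return None  # an "unlucky" key ends up here on every single request

With two servers and one of them down, roughly 1/2^max_tries of the key space fails on every attempt, and it is always the same, identifiable set of keys.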
> I did not get your numbers.  Why not do a test with N
> random keys?  If about N/2 go to the 2 servers then it
> is OK.  No matter how many happen "in a row".
The methodology of the test does not matter that much. It was tested with keys of the form "1", "2", ... I am pretty sure that random keys drawn from some more "random" distribution would produce the same effect.
> If your app does 1000 key lookups while 1 of 2 hosts are
> down, you would expect about 500 to fail (given the current
> memcached availability model).  What does it matter if
> 500 in a row fail, or every other one fails?
The key issue here is determinism. If a random key in a random situation is inaccessible one time in a thousand requests, that is OK. (Well, there is a problem, but let's pretend we can sweep it under the carpet.) On the other hand, if the situation is deterministic and the "guilty" key (our one in a thousand, but a specific one) happens to permanently make the home page of a web site load 5 times slower, because it was caching the result of a heavy SQL query... well, talk about hazards.
> With any hashing algorithm, it is always possible your
> application uses keys that cause the hash to be unfair.
This is not a matter of fairness. In fact, the algorithm is fair and, from a certain point of view, too fair. A solution to my problem would guarantee that every server potentially gets tried within a round, while the choice still stays random and fair in the statistical sense. Which brings us to the random permutation solution I proposed, sketched below.
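
For comparison, a minimal sketch of one way to read the random permutation idea (again, the names and the Server helper are assumptions of mine, not real client code): keep the primary server deterministic by key hash, but on failure walk the remaining servers in a fresh random permutation per request, so every server is eventually tried and no key is permanently pinned to a dead fallback:

    import hashlib
    import random
    from dataclasses import dataclass

    @dataclass
    class Server:
        name: str
        alive: bool = True

    def h(s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def pick_server(key: str, servers: list):
        order = list(servers)
        # the primary is still chosen deterministically by the key hash,
        # so cache locality for healthy servers is unchanged
        primary_idx = h(key) % len(order)
        order[0], order[primary_idx] = order[primary_idx], order[0]
        # the failover order is a fresh random permutation on each request,
        # so a dead server never permanently black-holes a particular key
        rest = order[1:]
        random.shuffle(rest)
        for server in [order[0]] + rest:
            if server.alive:
                return server
        return None  # only reached when every server is down

The first choice stays consistent, so normal caching behaviour is untouched; only the failover order is randomized.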
