> Our 50+ node consistent-hashing cluster is very reliable under normal
> operation; incr/decr, get, set, multiget, etc. are not a problem. If
> we had a problem with keys landing on the wrong servers in the
> continuum, we should be seeing more problems, which we currently are
> not. The cluster is always under relatively high load (the number of
> connections, for example, is very high due to the 160+ webservers in
> front). We are now seeing, in a very few cases, that this locking
> mechanism does not work: two different clients succeed in locking with
> the same key. (If you want to prevent multiple inserts into a database
> on the same primary key, you have to explicitly use one key shared by
> all clients, not a key with client-unique hashes in it.) It works
> millions of times as expected; we generate a large number of
> user-triggered database inserts (~60/sec.) with this construct. But a
> handful of locks do not work and show the behaviour described. So my
> question again: is it thinkable (even if very implausible) that a
> multithreaded memcached does not provide a 100% atomic add()?
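For readers following along, the locking pattern being described relies on memcached's add() storing a key only if it does not already exist, so exactly one client should win. Here is a minimal sketch of that pattern; the `FakeMemcache` class and the lock key name are hypothetical stand-ins (a real deployment would use an actual memcached client library), used here only to illustrate the contended-add semantics:

```python
import threading

class FakeMemcache:
    """Hypothetical in-memory stand-in for a memcached client.
    add() stores a key only if it is absent and returns True on
    success; these are the semantics the locking scheme depends on."""
    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def add(self, key, value):
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

def try_insert(client, results, idx):
    # Every worker contends for the SAME lock key, as the post says:
    # one fixed key shared by all clients, not per-client hashed keys.
    # In real use the key would carry a short expiry (or be deleted
    # once the insert commits); omitted here so the race is visible.
    if client.add("lock:insert:42", "1"):
        results[idx] = "inserted"   # the winner performs the DB insert
    else:
        results[idx] = "skipped"    # everyone else backs off

client = FakeMemcache()
results = [None] * 8
threads = [threading.Thread(target=try_insert, args=(client, results, i))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# If add() is atomic, exactly one thread ends up with "inserted".
print(results.count("inserted"), results.count("skipped"))
```

The reported bug, then, is that on rare occasions two clients both see add() return success for the same key, which this pattern assumes can never happen.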
Restart memcached with -t 1 (a single worker thread) and see if it stops happening. As I already said, it's not possible.