On Wed, Jan 27, 2010 at 8:34 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> While being able to write (with CL.ZERO or new-in-0.6 ANY) even if all
> the real write targets are down is cool, but since your goal in real
> life is to keep enough replicas alive that you can actually do reads,
> I'm not sure how useful it is. HH also has a measurable performance
> problem in small clusters (that is, where cluster size is not much
> larger than replication factor) since having a node go down means you
> will increase the write load on the remaining nodes a non-negligible
> amount to write the hints, which can be a nasty surprise if you
> weren't planning for it.
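
To put rough numbers on the small-cluster cost described above, here is a back-of-envelope sketch. It is not taken from the Cassandra code: the function name `hint_overhead` and its parameters are made up for illustration, and it assumes writes are spread uniformly over the ring and that each write aimed at the down node is stored as exactly one hint on some surviving node. It only gives the average extra load per survivor; actual hint placement may concentrate that load on fewer nodes.

```python
def hint_overhead(cluster_size: int, replication_factor: int,
                  client_writes_per_sec: float) -> dict:
    """Average extra write load per surviving node while one node is down.

    Hypothetical model: uniform key distribution, one hint per missed write.
    """
    if cluster_size < 2:
        raise ValueError("need at least two nodes for one to be down")
    if not 1 <= replication_factor <= cluster_size:
        raise ValueError("replication factor must be between 1 and cluster size")

    # Replica writes each node handles when the whole cluster is up.
    normal_per_node = client_writes_per_sec * replication_factor / cluster_size

    # Writes meant for the down node become hints shared by the survivors.
    hints_per_survivor = normal_per_node / (cluster_size - 1)

    return {
        "normal_per_node": normal_per_node,
        "hints_per_survivor": hints_per_survivor,
        "relative_increase": hints_per_survivor / normal_per_node,
    }


if __name__ == "__main__":
    for nodes in (4, 10, 100):
        result = hint_overhead(nodes, 3, 1000.0)
        print(nodes, round(result["relative_increase"], 3))
```

Under these assumptions the average increase is 1/(N-1) for an N-node cluster: about a third for four nodes, about one percent for a hundred.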

The HH code currently tries to send the hints to nodes other than the
natural endpoints. If small-scale performance is a problem, we could
make the natural endpoints responsible for the hints. This reduces
durability a bit, but might be a decent tradeoff.

> As for HH's consistency-improving characteristics, remember that HH is
> not reliable (it's possible for a node to be down for several seconds
> before HH gets turned on for it; it's also possible that the node with
> the hints itself goes down before the target node comes back up),
> which is why we needed the anti-entropy repair code. So I think you
> could make the case that now that we have anti-entropy, read repair
> will be sufficient to handle inconsistency on "hot" keys, with
> anti-entropy to handle infrequently accessed ones. (Remembering of
> course that if you wanted strong consistency in the first place, you
> need to be doing quorum reads and writes and HH doesn't really matter.
> So we are talking about how to reduce inconsistency, when the client
> has explicitly told us they're okay with seeing a little.)

I'm not up to date on the latest AES work, but if it's running
automatically (not just admin-triggered) and converges relatively
quickly, I think this is a good move.

> Finally, I note that Cliff Moon, the author of Dynomite (probably the
> most advanced pure Dynamo clone), deliberately left HH out for I
> believe substantially these reasons. (CC'd in case he wants to chime
> in. :)

Please share the secret knowledge. :)

-ryan