On 6/28/10 3:29 AM, Lu Ming wrote:
> Every one hour HintedHandOffManager will check hintedhandoff
> ColumnFamily then send out the big rowmutations to alive nodes,
> It fails again because of the TimeoutException, so the task will never
> finish and the big rowmutation is sending again and again.
> In multi-datacenters, a big rowmutation can not be transferred in
> several seconds. so It is a potential risk when a big rowmutation occurs.

This is one of several reasons that folks running Cassandra in production often turn Hinted Handoff off via the storage-conf.xml tunables, which I gather are now (0.6) settable at per-keyspace granularity. [1]
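
To make the retry loop described in the quoted message concrete, here is a rough sketch in Python. It is a toy model, not Cassandra code: the function name, the fixed link speed, and the timeout value are all assumptions chosen for illustration. It only shows that if sending the whole mutation takes longer than the timeout, resending the same mutation every hour never succeeds, and each attempt still pushes bytes across the link.

```python
def replay_hint(mutation_bytes, link_bytes_per_sec, timeout_sec, hourly_runs):
    """Toy model of an hourly hint replay (hypothetical, not Cassandra's code).

    Assumes a fixed link speed and that every attempt resends the whole
    mutation from scratch, so a timed-out attempt makes no lasting progress.
    """
    transfer_sec = mutation_bytes / link_bytes_per_sec
    wasted_bytes = 0
    for run in range(1, hourly_runs + 1):
        if transfer_sec <= timeout_sec:
            # The mutation fits inside the timeout, so this run delivers it.
            return {"delivered": True, "runs": run, "wasted_bytes": wasted_bytes}
        # Timed out: everything sent before the timeout is thrown away.
        wasted_bytes += link_bytes_per_sec * timeout_sec
    return {"delivered": False, "runs": hourly_runs, "wasted_bytes": wasted_bytes}


MIB = 1024 * 1024

# Assumed numbers: a 200 MiB mutation, a 10 MiB/s cross-datacenter link,
# a 5 second timeout, and one day of hourly runs.
big = replay_hint(200 * MIB, 10 * MIB, 5, 24)
small = replay_hint(1024, 10 * MIB, 5, 24)
print(big)
print(small)
```

With these assumed numbers the big mutation needs 20 seconds per send, so none of the 24 hourly runs delivers it, while the small one goes through on the first run.
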
Another unpleasant possible consequence of the current implementation is the following scenario:

a) Nodes A, B, C are responsible for a range.
b) Node A experiences a load spike and becomes non-responsive.
c) Nodes B and C take over A's read *AND WRITE* load, as they are writing hints via Hinted Handoff. [2]
d) The additional load on node B (due in part to the relatively inefficient current implementation of HH [3]) causes enough thrash for node B to become unresponsive.
e) Node C now starts storing hints for node B in addition to its node A hints (and takes 3x its normal read traffic!), which encourages its own meltdown.

=Rob

[1] https://issues.apache.org/jira/browse/CASSANDRA-894
[2] This could be slightly mitigated by using CL.ANY, which would write to coordinating nodes, of which there are more than replica nodes.
[3] https://issues.apache.org/jira/browse/CASSANDRA-1142