What can be the reason for the handoff process not to finish?
Check the logs for other errors about timing out during hint replay.
What would be the best way to recover from this situation?
If the hints are really causing trouble, drop them, either via the
HintedHandoffManager JMX MBean or by stopping the node and deleting the hint
files on disk. Then run repair later so the target node gets the writes those
hints carried.
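As a rough sketch of the JMX route, something like the following should work. It assumes the MBean is registered as org.apache.cassandra.db:type=HintedHandoffManager and exposes a deleteHintsForEndpoint(String) operation, and that JMX is on the default port 7199 without authentication. Confirm all of that in jconsole for your version first. The class and method names are only for illustration.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class DropHints {
    // Assumed MBean name; check it in jconsole on your Cassandra version.
    static final String HINTS_MBEAN = "org.apache.cassandra.db:type=HintedHandoffManager";

    // Build the JMX service URL for a node (7199 is Cassandra's default JMX port).
    static JMXServiceURL jmxUrl(String host, int port) throws Exception {
        return new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
    }

    // Ask the connected node to delete the hints it stores for one target endpoint.
    // The operation name deleteHintsForEndpoint is an assumption about the MBean.
    static void deleteHintsFor(MBeanServerConnection mbs, String endpoint) throws Exception {
        mbs.invoke(new ObjectName(HINTS_MBEAN), "deleteHintsForEndpoint",
                   new Object[] { endpoint },
                   new String[] { String.class.getName() });
    }

    public static void main(String[] args) throws Exception {
        String node = args[0];      // node that is holding the hints
        String endpoint = args[1];  // endpoint the hints are destined for
        try (JMXConnector connector = JMXConnectorFactory.connect(jmxUrl(node, 7199))) {
            deleteHintsFor(connector.getMBeanServerConnection(), endpoint);
        }
    }
}
```

Run it against the node whose hints column family is growing, passing the address of the remote-DC host as the second argument.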
What can be done to prevent this from happening again?
Hints are stored when either the node is down before the request starts or when
the coordinator times out waiting for the remote node. Check the logs for nodes
going down, and check the MessagingService MBean for TimedOuts from other
nodes. This may indicate issues with a cross-DC connection.
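If you would rather script the MessagingService check than click through jconsole, here is a minimal sketch. It assumes the MBean is org.apache.cassandra.net:type=MessagingService with a TimeoutsPerHost attribute that maps peer address to a timeout count, and JMX on port 7199 without authentication. Verify the attribute name on your version. The class and helper names are made up for the example.

```java
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class TimeoutReport {
    // Assumed MBean name; check it in jconsole on your Cassandra version.
    static final String MESSAGING_MBEAN = "org.apache.cassandra.net:type=MessagingService";

    // Keep only peers whose timeout count is at or above the threshold, sorted by address.
    static Map<String, Long> peersAtOrAbove(Map<String, Long> timeoutsPerHost, long threshold) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : timeoutsPerHost.entrySet()) {
            if (e.getValue() >= threshold) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Read the per-peer timeout counters; the attribute name is an assumption.
    @SuppressWarnings("unchecked")
    static Map<String, Long> fetchTimeouts(MBeanServerConnection mbs) throws Exception {
        return (Map<String, Long>) mbs.getAttribute(
            new ObjectName(MESSAGING_MBEAN), "TimeoutsPerHost");
    }

    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + args[0] + ":7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            Map<String, Long> noisy =
                peersAtOrAbove(fetchTimeouts(connector.getMBeanServerConnection()), 1);
            for (Map.Entry<String, Long> e : noisy.entrySet()) {
                System.out.println(e.getKey() + " timed out " + e.getValue() + " times");
            }
        }
    }
}
```

Running this on each coordinator and comparing the output should show whether the timeouts cluster on hosts in the other data center.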
Cheers
-
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
On 27/09/2013, at 11:18 PM, Tom van den Berge t...@drillster.com wrote:
Hi,
On one of my nodes, the (storage) load increased dramatically (doubled),
within one or two hours. The hints column family was causing the growth. I
noticed one HintedHandoff process that was started some two hours ago, but
hadn't finished. Normally, these processes take only a few seconds, 15
seconds max, in my cluster.
The unfinished process was handing the hints over to a host in another
data center. There were no warning or error messages in the logs, other than
repeated "flushing high-traffic column family" messages for hints.
I'm using Cassandra 1.2.3.
What can be the reason for the handoff process not to finish?
What would be the best way to recover from this situation?
What can be done to prevent this from happening again?
Thanks in advance,
Tom