[jira] Commented: (CASSANDRA-2058) Nodes periodically spike in load

David King (JIRA) Tue, 25 Jan 2011 22:50:15 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986865#action_12986865 ]


David King commented on CASSANDRA-2058:
---------------------------------------

bq. You were running 0.6.8 + DS before? Or is "it" not DynamicSnitch?

I was running 0.6.8 with no DES. Then I upgraded to 0.6.10 and turned it on. I 
had the aforementioned problems.

Now I'm running 0.6.10 with the DES turned off. (As of this writing, I'm still 
seeing the momentary spikes but thus far no sustained ones.)

If I continue to have the momentary or sustained spikes (I'll probably know by 
the morning), then I'll revert to 0.6.8, and turn *on* the DES.

If I continue to have problems after that, I'll revert to 0.6.8 with no 
DES, which is at least a configuration in which I didn't have any of these 
problems.

> Nodes periodically spike in load
> --------------------------------
>
>                 Key: CASSANDRA-2058
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2058
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.10
>            Reporter: David King
>         Attachments: cassandra.pmc01.log.bz2, cassandra.pmc14.log.bz2,
>                      graph a.png, graph b.png
>
>
> (Filing as a placeholder bug as I gather information.)
> At ~10p 24 Jan, I upgraded our 20-node cluster from 0.6.8->0.6.10, turned on 
> the DES, and moved some CFs from one KS into another (drain whole cluster, 
> take it down, move files, change schema, put it back up). Since then, I've 
> had four storms whereby a node's load will shoot to 700+ (400% CPU on a 4-cpu 
> machine) and become totally unresponsive. After a moment or two like that, 
> its neighbour dies too, and the failure cascades around the ring. 
> Unfortunately because of the high load I'm not able to get into the machine 
> to pull a thread dump to see wtf it's doing as it happens.
> I've also had an issue where a single node spikes up to high load, but 
> recovers. This may or may not be the same issue from which the nodes don't 
> recover as above, but both are new behaviour.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
