A quick and dirty way is to run jstack a few times and see if you can spot some common methods where code is spending time.
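
If eyeballing the dumps gets tedious, here is a rough sketch of how the tallying could be scripted. It assumes you have saved a few dumps to text files and that they follow the usual HotSpot jstack layout (threads separated by blank lines, a `java.lang.Thread.State:` line, then `at ...` frames). The function names `top_runnable_frames` and `hot_methods` are made up for illustration; they are not part of jstack or Cassandra.

```python
import re
import sys
from collections import Counter


def top_runnable_frames(dump_text, depth=1):
    """Count the top `depth` stack frames of each RUNNABLE thread in one dump.

    Hypothetical helper: assumes the usual HotSpot jstack text layout.
    """
    counts = Counter()
    # Thread entries in a jstack dump are separated by blank lines.
    for block in re.split(r"\n\s*\n", dump_text):
        # Only RUNNABLE threads can be burning CPU; skip waiting/blocked ones.
        if "java.lang.Thread.State: RUNNABLE" not in block:
            continue
        # Frame lines look like "at pkg.Class.method(File.java:123)".
        frames = re.findall(r"^\s*at (\S+?)\(", block, flags=re.MULTILINE)
        for frame in frames[:depth]:
            counts[frame] += 1
    return counts


def hot_methods(dumps, depth=1):
    """Sum frame counts over several dumps, most frequent first."""
    total = Counter()
    for dump_text in dumps:
        total.update(top_runnable_frames(dump_text, depth))
    return total.most_common()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Each command-line argument is a file holding one saved jstack dump.
    texts = []
    for path in sys.argv[1:]:
        with open(path) as fh:
            texts.append(fh.read())
    for method, count in hot_methods(texts)[:20]:
        print(count, method)
```

To use it, grab a few dumps a couple of seconds apart while the CPU is spiking (`jstack <pid> > dump1.txt`, and so on) and pass the files to the script. Methods that keep showing up at the top of RUNNABLE threads across dumps are the likely suspects, though this is only sampling and can miss short-lived work.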

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Thu, Sep 10, 2015 at 1:05 AM, Roman Tkachenko <ro...@mailgunhq.com> wrote:

> Hey guys,
>
> We've been having issues in the past couple of days with CPU usage / load
> average suddenly skyrocketing on some nodes of the cluster, affecting
> performance significantly so majority of requests start timing out. It can
> go on for several hours, with CPU spiking through the roof then coming back
> down to norm and so on. Weirdly, it affects only a subset of nodes and it's
> always the same ones. The boxes Cassandra is running on are pretty beefy,
> 24 cores, and these CPU spikes go up to >1000%.
>
> What is the best way to debug such kind of issues and find out what
> Cassandra is doing during spikes like this? Doesn't seem to be compaction
> related as sometimes during these spikes "nodetool compactionstats" says no
> compactions are running.
>
> Thanks!