[ https://issues.apache.org/jira/browse/CASSANDRA-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959338#comment-14959338 ]
Jeff Griffith edited comment on CASSANDRA-10515 at 10/15/15 7:19 PM:
---------------------------------------------------------------------

Yeah, i saw all the blocked threads behind it. checking to see what monitoring tools are not checking for the previous instance to finish. but this is just an ugly side effect, isn't it? (a side effect of lock?) i will disable all monitoring and restart to be sure. (UPDATE: looks like a cron job piled those up after things got stuck. i disabled it to be sure.)

was (Author: jeffery.griffith):
Yeah, i saw all the blocked threads behind it. checking to see what monitoring tools are not checking for the previous instance to finish. but this is just an ugly side effect, isn't it? (a side effect of lock?) i will disable all monitoring and restart to be sure.

> Commit logs back up with move to 2.1.10
> ---------------------------------------
>
>                 Key: CASSANDRA-10515
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10515
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: redhat 6.5, cassandra 2.1.10
>            Reporter: Jeff Griffith
>            Assignee: Branimir Lambov
>            Priority: Critical
>              Labels: commitlog, triage
>         Attachments: CommitLogProblem.jpg, CommitLogSize.jpg, stacktrace.txt,
> system.log.clean
>
>
> After upgrading from cassandra 2.0.x to 2.1.10, we began seeing problems
> where some nodes break the 12G commit log max we configured and go as high as
> 65G or more before it restarts. Once it reaches the state of more than 12G
> commit log files, "nodetool compactionstats" hangs. Eventually C* restarts
> without errors (not sure yet whether it is crashing but I'm checking into it)
> and the cleanup occurs and the commit logs shrink back down again. Here is
> the nodetool compactionstats immediately after restart.
> {code}
> jgriffith@prod1xc1.c2.bf1:~$ ndc
> pending tasks: 2185
> compaction type   keyspace   table     completed          total   unit    progress
> Compaction        SyncCore   *cf1*   61251208033   170643574558   bytes     35.89%
> Compaction        SyncCore   *cf2*   19262483904    19266079916   bytes     99.98%
> Compaction        SyncCore   *cf3*    6592197093     6592316682   bytes    100.00%
> Compaction        SyncCore   *cf4*    3411039555     3411039557   bytes    100.00%
> Compaction        SyncCore   *cf5*    2879241009     2879487621   bytes     99.99%
> Compaction        SyncCore   *cf6*   21252493623    21252635196   bytes    100.00%
> Compaction        SyncCore   *cf7*   81009853587    81009854438   bytes    100.00%
> Compaction        SyncCore   *cf8*    3005734580     3005768582   bytes    100.00%
> Active compaction remaining time : n/a
> {code}
> I was also doing periodic "nodetool tpstats" which were working but not being
> logged in system.log on the StatusLogger thread until after the compaction
> started working again.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
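
The comment above traces the pile of blocked threads to a cron job that kept launching new monitoring runs without checking whether the previous one had finished. One common way to avoid that kind of pile-up is to have each run take a non-blocking lock and exit immediately if the lock is already held. The following is only a minimal sketch of that idea, not the script used on these nodes: the lock path and the helper names `try_lock` and `run_once` are hypothetical, and it assumes a Linux host where `fcntl.flock` is available.

```python
import fcntl
import os
import subprocess
import tempfile

# Hypothetical lock file location; any path writable by the cron user would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "nodetool-monitor.lock")


def try_lock(path=LOCK_PATH):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the holder.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_once(command, path=LOCK_PATH):
    """Run command unless a previous run still holds the lock.

    Returns the command's exit code, or None if the run was skipped.
    """
    lock = try_lock(path)
    if lock is None:
        return None  # previous run has not finished; do not stack another one
    try:
        return subprocess.run(command, check=False).returncode
    finally:
        lock.close()  # closing the file releases the lock
```

A cron entry would then call something like `run_once(["nodetool", "tpstats"])`, so that a hung invocation causes later runs to be skipped rather than queued up behind it.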