[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-9914:
---
Fix Version/s: (was: 2.1.x)

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
Assignee: Marcus Eriksson
Attachments: cass_high_cpu.png, high_pending_compactions.txt

We have a 3-node test cluster (initially running 2.1.8) with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again seconds later. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with CASSANDRA-9637, but we observed the same behavior.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-9914:
---
Attachment: high_pending_compactions.txt

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
Attachments: high_pending_compactions.txt
[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-9914:
---
Description:

We have a 3-node test cluster with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again seconds later. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with [CASSANDRA-9637|https://issues.apache.org/jira/browse/CASSANDRA-9637], but we observed the same behavior.

was:

We have a 3-node test cluster with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with [CASSANDRA-9637|https://issues.apache.org/jira/browse/CASSANDRA-9637], but we observed the same behavior.

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-9914:
---
Description:

We have a 3-node test cluster with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again seconds later. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with CASSANDRA-9637, but we observed the same behavior.

was:

We have a 3-node test cluster with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again seconds later. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with [CASSANDRA-9637|https://issues.apache.org/jira/browse/CASSANDRA-9637], but we observed the same behavior.

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-9914:
---
Attachment: cass_high_cpu.png

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
Attachments: cass_high_cpu.png, high_pending_compactions.txt
[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-9914:
---
Description:

We have a 3-node test cluster (initially running 2.1.8) with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again seconds later. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with CASSANDRA-9637, but we observed the same behavior.

was:

We have a 3-node test cluster with *zero traffic* and about 10GB of data on each node. It's showing millions of pending compaction tasks (but no actual work in progress), and the CPUs are pegged on all three nodes. The task count goes down rapidly, but then jumps back up again seconds later. All tables are set to STCS. The issue persists after restart, but takes a few minutes before it becomes a problem. SSTable counts are below 10 for every table. We're also seeing 20s Old Gen GC pauses about every 2-3 mins.

This started happening after bulk loading some old data. We started seeing very long GC pauses (sometimes 30 min or more) that would bring down the nodes. We then truncated this table, which resulted in the current behavior. We attempted to roll back our cluster to 2.1.7 patched with CASSANDRA-9637, but we observed the same behavior.

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
Attachments: cass_high_cpu.png, high_pending_compactions.txt
[jira] [Updated] (CASSANDRA-9914) Millions of fake pending compaction tasks + high CPU
[ https://issues.apache.org/jira/browse/CASSANDRA-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9914:
---
Assignee: Marcus Eriksson
Fix Version/s: 2.1.x

Millions of fake pending compaction tasks + high CPU

Key: CASSANDRA-9914
URL: https://issues.apache.org/jira/browse/CASSANDRA-9914
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS
Reporter: Robbie Strickland
Assignee: Marcus Eriksson
Fix For: 2.1.x
Attachments: cass_high_cpu.png, high_pending_compactions.txt