[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202520#comment-16202520 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

Yeah, getting the lock is clearly a problem. My thinking was that we should see whether there is a big improvement when lowering the throughput, and thus finishing fewer compactions, since each finished compaction grabs the write lock to replace sstables. You might also want to try the patch in CASSANDRA-13948.

> Infinite compaction of L0 SSTables in JBOD
> ------------------------------------------
>
>         Key: CASSANDRA-13943
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-13943
>     Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.11.0 / Centos 6
>    Reporter: Dan Kinder
>    Assignee: Marcus Eriksson
> Attachments: cassandra-jstack-2017-10-12-infinite-sstable-adding.txt, cassandra-jstack-2017-10-12.txt, cassandra.yaml, debug.log, debug.log-with-commit-d8f3f2780, debug.log.1.zip, debug.log.zip, jvm.options
>
> I recently upgraded from 2.2.6 to 3.11.0.
> I am seeing Cassandra loop infinitely, compacting the same data over and over. Attaching logs.
> It is compacting two tables, one on /srv/disk10, the other on /srv/disk1. It does create new SSTables but immediately recompacts again. Note that I am not inserting anything at the moment; there is no flushing happening on this table (Memtable switch count has not changed).
> My theory is that it somehow thinks those should be compaction candidates. But they shouldn't be: they are on different disks, and I ran nodetool relocatesstables as well as nodetool compact. So it tries to compact them together, but the compaction results in the exact same 2 SSTables on the 2 disks, because the keys are split by data disk.
> This is pretty serious, because all our nodes right now are consuming CPU doing this for multiple tables, it seems.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202502#comment-16202502 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

Sure, I can try, but I'm not sure how that would help; these nodes have plenty of CPU and IOPS available. But there is nonetheless a problem, because flushers are getting stuck. Looking at the stack trace, it seems to come down to the readLock in CompactionStrategyManager. The flushers are blocked on it:
{noformat}
Thread 7385: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(int) @bci=83, line=967 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(int) @bci=10, line=1283 (Compiled frame)
 - java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock() @bci=5, line=727 (Compiled frame)
 - org.apache.cassandra.db.compaction.CompactionStrategyManager.createSSTableMultiWriter(org.apache.cassandra.io.sstable.Descriptor, long, long, org.apache.cassandra.io.sstable.metadata.MetadataCollector, org.apache.cassandra.db.SerializationHeader, java.util.Collection, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=4, line=901 (Compiled frame)
 - org.apache.cassandra.db.ColumnFamilyStore.createSSTableMultiWriter(org.apache.cassandra.io.sstable.Descriptor, long, long, org.apache.cassandra.io.sstable.metadata.MetadataCollector, org.apache.cassandra.db.SerializationHeader, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=21, line=515 (Compiled frame)
 - org.apache.cassandra.db.Memtable$FlushRunnable.createFlushWriter(org.apache.cassandra.db.lifecycle.LifecycleTransaction, java.lang.String, org.apache.cassandra.db.PartitionColumns, org.apache.cassandra.db.rows.EncodingStats) @bci=104, line=506 (Compiled frame)
 - org.apache.cassandra.db.Memtable$FlushRunnable.<init>(org.apache.cassandra.db.Memtable, java.util.concurrent.ConcurrentNavigableMap, org.apache.cassandra.db.Directories$DataDirectory, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=253, line=447 (Compiled frame)
 - org.apache.cassandra.db.Memtable$FlushRunnable.<init>(org.apache.cassandra.db.Memtable, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.Directories$DataDirectory, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=19, line=417 (Compiled frame)
 - org.apache.cassandra.db.Memtable.createFlushRunnables(org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=125, line=318 (Interpreted frame)
 - org.apache.cassandra.db.Memtable.flushRunnables(org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=2, line=300 (Compiled frame)
 - org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(org.apache.cassandra.db.Memtable, boolean) @bci=82, line=1137 (Compiled frame)
 - org.apache.cassandra.db.ColumnFamilyStore$Flush.run() @bci=85, line=1102 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Compiled frame)
 - org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable) @bci=1, line=81 (Compiled frame)
 - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5.run() @bci=4 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
{noformat}
All the callers in CompactionStrategyManager that *aren't* blocked on a lock are trying to get background tasks:
{noformat}
Thread 1468: (state = IN_JAVA)
 - java.util.HashMap$HashIterator.nextNode() @bci=95, line=1441 (Compiled frame; information may be imprecise)
 - java.util.HashMap$KeyIterator.next() @bci=1, line=1461 (Compiled frame)
 - java.util.AbstractCollection.toArray() @bci=39, line=141 (Compiled frame)
 - java.util.ArrayList.<init>(java.util.Collection) @bci=6, line=177 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledManifest.ageSortedSSTables(java.util.Collection) @bci=5, line=731 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(int) @bci=187, line=644 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates() @bci=290, line=385 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(int) @bci=4, line=119 (Compiled frame)
{noformat}
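The blocking pattern in the trace above can be reproduced in isolation with a plain {{ReentrantReadWriteLock}}: while one thread holds the write lock (as a finishing compaction does when replacing sstables), any thread that needs the read lock (as the flusher's createSSTableMultiWriter path does) stalls. This is a minimal standalone sketch, not Cassandra code; the "compactor"/"flusher" naming and the latches are illustrative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FlushBlockDemo {
    // Returns whether a "flusher" can take the read lock while a
    // "compactor" thread is holding the write lock.
    static boolean flusherCanAcquireWhileCompacting() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch writeLockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread compactor = new Thread(() -> {
            lock.writeLock().lock(); // e.g. swapping in compacted sstables
            writeLockHeld.countDown();
            try {
                release.await();
            } catch (InterruptedException ignored) {
            } finally {
                lock.writeLock().unlock();
            }
        });
        compactor.start();
        writeLockHeld.await(); // make sure the write lock is held first

        // The flusher's createSSTableMultiWriter path needs the read lock;
        // tryLock shows immediately whether it would have to park.
        boolean acquired = lock.readLock().tryLock();
        if (acquired)
            lock.readLock().unlock();
        release.countDown();
        compactor.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("flusher got read lock during compaction: "
                + flusherCanAcquireWhileCompacting()); // prints false
    }
}
```

The longer a compaction-finish holds the write lock (and the more compactions finish per unit time), the more often flush threads park exactly like Thread 7385 above.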
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202454#comment-16202454 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

I think you could lower the number of {{concurrent_compactors}} (maybe to 10 or so), and you probably need to set {{compaction_throughput_mb_per_sec}} to something conservative; I would start with 10 and then increase it if things look good.
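In cassandra.yaml terms, the suggestion above would look something like this (both keys exist in 3.11; the values are starting points to be tuned, not universal recommendations):

{noformat}
# Throttle how many compactions run at once and how fast they read/write.
concurrent_compactors: 10
compaction_throughput_mb_per_sec: 10
{noformat}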
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202320#comment-16202320 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

But you're not seeing any infinite compactions in the logs anymore? Also, could you post your cassandra.yaml and describe how much data you are writing?
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201286#comment-16201286 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

I've rolled out https://github.com/krummas/cassandra/commits/marcuse/13215 now, per the last comment on https://issues.apache.org/jira/browse/CASSANDRA-13215, and am currently running relocatesstables. So far it is not doing the infinite-compaction thing. However, it's a bit painful: nodetool compactionstats and nodetool cfstats no longer seem to work; they hang indefinitely for me.
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199363#comment-16199363 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

Hm, it looks like the new patch does not change any behavior, though, and without any changes I don't think my nodes will be able to do any flushing. I might try it on just one node and send those logs.
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198777#comment-16198777 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

[~krummas] it looks like this latest patch does not include the changes from the simple-cache patch. I'm assuming I should leave that one applied, i.e. use both patches?
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198778#comment-16198778 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

Ope, didn't see your message. Got it.
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198770#comment-16198770 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

[~dkinder] you should probably revert that patch, as it doesn't invalidate the cache. I'll hopefully post my patch to CASSANDRA-13215 today.
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198769#comment-16198769 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

Yeah, I am running with the patch from https://issues.apache.org/jira/browse/CASSANDRA-13215. I'll try that latest patch today. Thanks [~krummas]
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198388#comment-16198388 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

It would be really helpful if you could start one of the nodes with [this|https://github.com/krummas/cassandra/commits/marcuse/log_compactionindex] patch and post the logs.
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198356#comment-16198356 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

There is clearly a bug in the startsWith code; a patch for that is [here|https://github.com/krummas/cassandra/commits/marcuse/13943]. But since you have another subdirectory after the similar prefix, I don't think that is the problem here.

Reading back some other issues, it seems you were running the (somewhat broken) patch from CASSANDRA-13215 for a while. Is that still true on these nodes? Could you start these nodes with a patch that logs a bit more?
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197682#comment-16197682 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

FYI:
{noformat}
data_file_directories:
    - /srv/disk1/cassandra-data
    - /srv/disk2/cassandra-data
    - /srv/disk3/cassandra-data
    - /srv/disk4/cassandra-data
    - /srv/disk5/cassandra-data
    - /srv/disk6/cassandra-data
    - /srv/disk7/cassandra-data
    - /srv/disk8/cassandra-data
    - /srv/disk9/cassandra-data
    - /srv/disk10/cassandra-data
    - /srv/disk11/cassandra-data
    - /srv/disk12/cassandra-data
{noformat}
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197459#comment-16197459 ]

Dan Kinder commented on CASSANDRA-13943:
----------------------------------------

I do see a questionable {{startsWith}} here:
https://github.com/apache/cassandra/blob/7d4d1a32581ff40ed1049833631832054bcf2316/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java#L309

Also here:
https://github.com/apache/cassandra/blob/3cec208c40b85e1be0ff8c68fc9d9017945a1ed8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L570
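A quick way to see why a raw string {{startsWith}} on directory paths is suspect: {{/srv/disk10/...}} string-starts-with {{/srv/disk1}}, so an sstable on disk10 can be wrongly attributed to disk1. The sketch below is illustrative, not the Cassandra code; comparing path components (as {{java.nio.file.Path.startsWith}} does) avoids the false match:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class DiskPrefixDemo {
    // The suspect check: raw string prefix matching.
    // "/srv/disk10/..." starts with "/srv/disk1" as a string.
    static boolean naiveMatch(String dataDir, String sstablePath) {
        return sstablePath.startsWith(dataDir);
    }

    // Component-wise check: Path.startsWith compares whole name
    // elements, so /srv/disk10 is NOT under /srv/disk1.
    static boolean componentMatch(String dataDir, String sstablePath) {
        Path dir = Paths.get(dataDir);
        Path file = Paths.get(sstablePath);
        return file.startsWith(dir);
    }

    public static void main(String[] args) {
        String onDisk10 = "/srv/disk10/cassandra-data/ks/tbl/mc-1-big-Data.db";
        String onDisk1  = "/srv/disk1/cassandra-data/ks/tbl/mc-2-big-Data.db";
        System.out.println(naiveMatch("/srv/disk1", onDisk10));     // true: false positive
        System.out.println(componentMatch("/srv/disk1", onDisk10)); // false: correct
        System.out.println(componentMatch("/srv/disk1", onDisk1));  // true: correct
    }
}
```

With twelve data directories named disk1..disk12, disk1's prefix also string-matches disk10, disk11, and disk12, which is exactly the kind of cross-disk confusion described in this ticket.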
[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197350#comment-16197350 ]

Marcus Eriksson commented on CASSANDRA-13943:
---------------------------------------------

{{/srv/disk10/..., /srv/disk1/...}} - I guess there is a prefix-matching problem somewhere. I'll get a patch out tomorrow.