[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-12 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202520#comment-16202520 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

Yeah, getting the lock is clearly a problem.

My thinking was that we should see whether there is a big improvement when we 
lower the throughput and thus finish fewer compactions, since it is on 
finishing a compaction that we grab the write lock to replace sstables.

You might also want to try the patch in CASSANDRA-13948
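
The throttle can also be lowered at runtime without a restart; a minimal 
sketch, assuming the standard 3.11 nodetool commands (the value is 
illustrative):
{noformat}
nodetool setcompactionthroughput 10   # MB/s; 0 would disable the throttle
nodetool getcompactionthroughput      # verify the new setting
{noformat}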

> Infinite compaction of L0 SSTables in JBOD
> --
>
> Key: CASSANDRA-13943
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13943
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.11.0 / Centos 6
>Reporter: Dan Kinder
>Assignee: Marcus Eriksson
> Attachments: cassandra-jstack-2017-10-12-infinite-sstable-adding.txt, 
> cassandra-jstack-2017-10-12.txt, cassandra.yaml, debug.log, 
> debug.log-with-commit-d8f3f2780, debug.log.1.zip, debug.log.zip, jvm.options
>
>
> I recently upgraded from 2.2.6 to 3.11.0.
> I am seeing Cassandra loop infinitely compacting the same data over and over. 
> Attaching logs.
> It is compacting two tables, one on /srv/disk10, the other on /srv/disk1. It 
> does create new SSTables but immediately recompacts again. Note that I am not 
> inserting anything at the moment, there is no flushing happening on this 
> table (Memtable switch count has not changed).
> My theory is that it somehow thinks those should be compaction candidates. 
> But they shouldn't be, they are on different disks and I ran nodetool 
> relocatesstables as well as nodetool compact. So, it tries to compact them 
> together, but the compaction results in the exact same 2 SSTables on the 2 
> disks, because the keys are split by data disk.
> This is pretty serious, because all our nodes right now are consuming CPU 
> doing this for multiple tables, it seems.






[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-12 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202502#comment-16202502 ]

Dan Kinder commented on CASSANDRA-13943:


Sure, I can try, but I'm not sure how that would help; these nodes have plenty 
of CPU and IOPS available... But there is nonetheless a problem, because the 
flushers are getting stuck.

Looking at the stack trace, it seems to come down to the readLock in 
{{CompactionStrategyManager}}. The flushers are blocked on it:
{noformat}
Thread 7385: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(int) @bci=83, line=967 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(int) @bci=10, line=1283 (Compiled frame)
 - java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock() @bci=5, line=727 (Compiled frame)
 - org.apache.cassandra.db.compaction.CompactionStrategyManager.createSSTableMultiWriter(org.apache.cassandra.io.sstable.Descriptor, long, long, org.apache.cassandra.io.sstable.metadata.MetadataCollector, org.apache.cassandra.db.SerializationHeader, java.util.Collection, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=4, line=901 (Compiled frame)
 - org.apache.cassandra.db.ColumnFamilyStore.createSSTableMultiWriter(org.apache.cassandra.io.sstable.Descriptor, long, long, org.apache.cassandra.io.sstable.metadata.MetadataCollector, org.apache.cassandra.db.SerializationHeader, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=21, line=515 (Compiled frame)
 - org.apache.cassandra.db.Memtable$FlushRunnable.createFlushWriter(org.apache.cassandra.db.lifecycle.LifecycleTransaction, java.lang.String, org.apache.cassandra.db.PartitionColumns, org.apache.cassandra.db.rows.EncodingStats) @bci=104, line=506 (Compiled frame)
 - org.apache.cassandra.db.Memtable$FlushRunnable.<init>(org.apache.cassandra.db.Memtable, java.util.concurrent.ConcurrentNavigableMap, org.apache.cassandra.db.Directories$DataDirectory, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=253, line=447 (Compiled frame)
 - org.apache.cassandra.db.Memtable$FlushRunnable.<init>(org.apache.cassandra.db.Memtable, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.Directories$DataDirectory, org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=19, line=417 (Compiled frame)
 - org.apache.cassandra.db.Memtable.createFlushRunnables(org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=125, line=318 (Interpreted frame)
 - org.apache.cassandra.db.Memtable.flushRunnables(org.apache.cassandra.db.lifecycle.LifecycleTransaction) @bci=2, line=300 (Compiled frame)
 - org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(org.apache.cassandra.db.Memtable, boolean) @bci=82, line=1137 (Compiled frame)
 - org.apache.cassandra.db.ColumnFamilyStore$Flush.run() @bci=85, line=1102 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Compiled frame)
 - org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable) @bci=1, line=81 (Compiled frame)
 - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5.run() @bci=4 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
{noformat}
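
For illustration, the shape of the contention, as a minimal self-contained 
sketch rather than the actual Cassandra code (the flush-side method name 
comes from the trace above; the writer-side name and both bodies are made up):
{noformat}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the locking pattern in CompactionStrategyManager: flushers take
// the read lock for a short call, while replacing sstables after a finished
// compaction takes the write lock.
public class StrategyManagerLockSketch
{
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Flush path (Memtable$FlushRunnable in the trace above).
    public void createSSTableMultiWriter()
    {
        lock.readLock().lock();   // parks if a writer holds or heads the queue
        try
        {
            // choose the writer/strategy for the new sstable ...
        }
        finally
        {
            lock.readLock().unlock();
        }
    }

    // Compaction-finished path: sstable replacement excludes all readers,
    // i.e. every flusher.
    public void replaceSSTables()
    {
        lock.writeLock().lock();
        try
        {
            // swap the old sstables for the newly compacted ones ...
        }
        finally
        {
            lock.writeLock().unlock();
        }
    }
}
{noformat}
Even in non-fair mode, a writer waiting at the head of the queue makes later 
{{readLock().lock()}} calls park, so a steady stream of finishing compactions 
can keep every flusher stuck, which matches the BLOCKED flush threads above.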

All the callers in CompactionStrategyManager that *aren't* blocked on a lock 
are trying to get background tasks:
{noformat}
Thread 1468: (state = IN_JAVA)
 - java.util.HashMap$HashIterator.nextNode() @bci=95, line=1441 (Compiled frame; information may be imprecise)
 - java.util.HashMap$KeyIterator.next() @bci=1, line=1461 (Compiled frame)
 - java.util.AbstractCollection.toArray() @bci=39, line=141 (Compiled frame)
 - java.util.ArrayList.<init>(java.util.Collection) @bci=6, line=177 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledManifest.ageSortedSSTables(java.util.Collection) @bci=5, line=731 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(int) @bci=187, line=644 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates() @bci=290, line=385 (Compiled frame)
 - org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(int) @bci=4, line=119 (Compiled frame)
{noformat}

[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-12 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202454#comment-16202454 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

I think you could lower the number of {{concurrent_compactors}} (maybe to 10 
or so), and you probably need to set {{compaction_throughput_mb_per_sec}} to 
something conservative; I would start with 10 and then increase it if things 
look good.
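
A minimal {{cassandra.yaml}} sketch of those two settings, using the starting 
values suggested above (illustrative, to be tuned per workload):
{noformat}
# cassandra.yaml
concurrent_compactors: 10              # cap the number of parallel compactions
compaction_throughput_mb_per_sec: 10   # conservative throttle; raise if things look good
{noformat}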







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-12 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202320#comment-16202320 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

But you're not seeing any infinite compactions in the logs anymore?

And could you post your cassandra.yaml and describe how much data you are 
writing?







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-11 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201286#comment-16201286 ]

Dan Kinder commented on CASSANDRA-13943:


I've rolled out https://github.com/krummas/cassandra/commits/marcuse/13215 now, 
per the last comment on https://issues.apache.org/jira/browse/CASSANDRA-13215, 
and am currently running relocatesstables. So far it is not doing the infinite 
compaction thing.

However, it's a bit painful: it seems like {{nodetool compactionstats}} and 
{{nodetool cfstats}} no longer work; they hang indefinitely for me.







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199363#comment-16199363 ]

Dan Kinder commented on CASSANDRA-13943:


Hm, it looks like the new patch does not change any behavior, though, and 
without any changes I don't think my nodes will be able to do any flushing... 
I might try it on just one node and send those logs.







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198777#comment-16198777 ]

Dan Kinder commented on CASSANDRA-13943:


[~krummas] it looks like this latest patch does not include the changes from 
the simple-cache patch; I'm assuming I should leave that one applied, i.e. use 
both patches?







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198778#comment-16198778 ]

Dan Kinder commented on CASSANDRA-13943:


Ope didn't see your message. Got it.







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198770#comment-16198770 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

[~dkinder] you should probably revert that patch as it doesn't invalidate the 
cache.

I'll hopefully post my patch to 13215 today







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198769#comment-16198769 ]

Dan Kinder commented on CASSANDRA-13943:


Yeah I am running with the patch from 
https://issues.apache.org/jira/browse/CASSANDRA-13215

I'll try that latest patch today. Thanks [~krummas]







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198388#comment-16198388 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

It would be really helpful if you could start one of the nodes with 
[this|https://github.com/krummas/cassandra/commits/marcuse/log_compactionindex] 
patch and post the logs







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-10 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198356#comment-16198356 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

There is clearly a bug in the {{startsWith}} code; patch for that 
[here|https://github.com/krummas/cassandra/commits/marcuse/13943].

But since you have another subdirectory after the similar prefix, I don't 
think that is the problem here. Reading back some other issues, it seems you 
were running the (somewhat broken) patch from CASSANDRA-13215 for a while - 
is that still true on these nodes?

Could you start these nodes with a patch that logs a bit more?







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-09 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197682#comment-16197682 ]

Dan Kinder commented on CASSANDRA-13943:


FYI:
{noformat}
data_file_directories:
- /srv/disk1/cassandra-data
- /srv/disk2/cassandra-data
- /srv/disk3/cassandra-data
- /srv/disk4/cassandra-data
- /srv/disk5/cassandra-data
- /srv/disk6/cassandra-data
- /srv/disk7/cassandra-data
- /srv/disk8/cassandra-data
- /srv/disk9/cassandra-data
- /srv/disk10/cassandra-data
- /srv/disk11/cassandra-data
- /srv/disk12/cassandra-data
{noformat}







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-09 Thread Dan Kinder (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197459#comment-16197459 ]

Dan Kinder commented on CASSANDRA-13943:


I do see a questionable {{startsWith}} here: 
https://github.com/apache/cassandra/blob/7d4d1a32581ff40ed1049833631832054bcf2316/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java#L309

Also here: 
https://github.com/apache/cassandra/blob/3cec208c40b85e1be0ff8c68fc9d9017945a1ed8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L570
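
For illustration, a minimal self-contained sketch (not the actual Cassandra 
code) of why raw {{String.startsWith}} on data directory paths goes wrong in 
a JBOD layout where {{/srv/disk1}} is a character prefix of {{/srv/disk10}}, 
and a component-wise comparison that avoids it:
{noformat}
import java.nio.file.Paths;

// Illustrative only: shows the failure mode of raw string prefix matching
// on JBOD data directories, not the actual CompactionStrategyManager code.
public class PrefixMatchSketch
{
    // Buggy shape: "/srv/disk10/..." starts with "/srv/disk1", so an
    // sstable on disk10 is wrongly attributed to disk1.
    static boolean buggyBelongsTo(String sstablePath, String dataDir)
    {
        return sstablePath.startsWith(dataDir);
    }

    // Safer shape: java.nio.file.Path.startsWith compares whole path
    // components, so disk1 no longer matches disk10.
    static boolean belongsTo(String sstablePath, String dataDir)
    {
        return Paths.get(sstablePath).startsWith(Paths.get(dataDir));
    }

    public static void main(String[] args)
    {
        String sstable = "/srv/disk10/cassandra-data/ks/tbl/mc-1-big-Data.db";
        System.out.println(buggyBelongsTo(sstable, "/srv/disk1")); // true (wrong)
        System.out.println(belongsTo(sstable, "/srv/disk1"));      // false
        System.out.println(belongsTo(sstable, "/srv/disk10"));     // true
    }
}
{noformat}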







[jira] [Commented] (CASSANDRA-13943) Infinite compaction of L0 SSTables in JBOD

2017-10-09 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197350#comment-16197350 ]

Marcus Eriksson commented on CASSANDRA-13943:
-

{{/srv/disk10/..., /srv/disk1/...}} - I guess there is a prefix matching 
problem somewhere - I'll get a patch out tomorrow




