[ 
https://issues.apache.org/jira/browse/CASSANDRA-16637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-16637:
----------------------------------------
    Fix Version/s:     (was: 4.0-rc)
                   4.0.x

This looks like it can happen for two reasons. First, we seem to notify the 
compaction strategies twice about the sstables that are removed after 
compaction: once via a 
[SSTableDeletingNotification|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogTransaction.java#L192]
and then via a 
[SSTableListChangedNotification|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LifecycleTransaction.java#L233].
This means there is a small window where LCS knows about neither the old 
nor the new sstables, and during that window we might pick sstables for 
compaction that overlap with the new sstables.
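
The first window can be illustrated with a small model. This is only a sketch under stated assumptions: `SSTable`, `StrategyView` and `replay` are hypothetical names invented for the illustration, not Cassandra classes, and the two notifications are reduced to plain `remove` and `add` calls on the strategy's view.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the first window; none of these types are Cassandra classes.
class NotificationWindowSketch {
    // An sstable reduced to a name, a level and an inclusive token range.
    static final class SSTable {
        final String name;
        final int level;
        final long first;
        final long last;

        SSTable(String name, int level, long first, long last) {
            this.name = name;
            this.level = level;
            this.first = first;
            this.last = last;
        }

        boolean overlaps(SSTable other) {
            return first <= other.last && other.first <= last;
        }
    }

    // The set of sstables a compaction strategy currently knows about.
    static final class StrategyView {
        private final List<SSTable> known = new ArrayList<>();

        void add(SSTable sstable) { known.add(sstable); }

        void remove(SSTable sstable) { known.remove(sstable); }

        // True if the view holds an sstable in `level` that overlaps `candidate`.
        boolean seesOverlap(SSTable candidate, int level) {
            for (SSTable s : known)
                if (s.level == level && s.overlaps(candidate))
                    return true;
            return false;
        }
    }

    // Replays the two notifications and records what the strategy sees at each step.
    static boolean[] replay() {
        SSTable oldL3 = new SSTable("old-L3", 3, 0, 100);
        SSTable newL3 = new SSTable("new-L3", 3, 0, 100);
        SSTable candidateL2 = new SSTable("candidate-L2", 2, 40, 60);

        StrategyView view = new StrategyView();
        view.add(oldL3);
        boolean before = view.seesOverlap(candidateL2, 3);

        // First notification (assumed shape): the old sstable is being deleted.
        view.remove(oldL3);
        boolean inWindow = view.seesOverlap(candidateL2, 3);

        // Second notification (assumed shape): old removed again (a no-op here), new added.
        view.remove(oldL3);
        view.add(newL3);
        boolean after = view.seesOverlap(candidateL2, 3);

        return new boolean[] { before, inWindow, after };
    }

    public static void main(String[] args) {
        boolean[] seen = replay();
        System.out.println("before=" + seen[0] + " inWindow=" + seen[1] + " after=" + seen[2]);
    }
}
```

Between the two calls the view holds neither L3 sstable, so a candidate picked at that moment looks safe to promote into L3 even though it overlaps the new sstable.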

Second, there is a window after compaction where both the old and new sstables 
are 'live', so if we 
[reload|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java#L509]
the compaction strategies at just that moment, both will be added to the 
compaction strategy and there will be overlaps. For example, when compacting 
L2 -> L3 we create a bunch of new L3 sstables from a set of L2 + L3 sstables, 
so the old + new L3 sstables will overlap.

Both of these issues are currently handled by sending the offending sstable to 
L0, which of course is inefficient if it's the new sstable getting sent 
there. This is not a new issue in 4.0 so I don't think we should block GA on it.
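
The L0 fallback described above can be sketched roughly as follows. This is a simplified illustration, not the code in `LeveledGenerations` or `LeveledManifest`: the `Manifest` class, its `add` method and the token ranges are hypothetical, and the sketch assumes that the sstable arriving second is the one treated as offending.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "send the offending sstable to L0"; not Cassandra's implementation.
class L0FallbackSketch {
    // An sstable reduced to a name, its intended level and an inclusive token range.
    static final class SSTable {
        final String name;
        final int level;
        final long first;
        final long last;

        SSTable(String name, int level, long first, long last) {
            this.name = name;
            this.level = level;
            this.first = first;
            this.last = last;
        }

        boolean overlaps(SSTable other) {
            return first <= other.last && other.first <= last;
        }
    }

    // One list of sstables per level; levels above 0 must stay free of overlaps.
    static final class Manifest {
        private final List<List<SSTable>> levels = new ArrayList<>();

        Manifest(int levelCount) {
            for (int i = 0; i < levelCount; i++)
                levels.add(new ArrayList<>());
        }

        // Adds the sstable and returns the level it actually ended up in.
        int add(SSTable sstable) {
            int target = sstable.level;
            if (target > 0) {
                for (SSTable other : levels.get(target)) {
                    if (sstable.overlaps(other)) {
                        target = 0; // overlap found: fall back to L0, where overlap is allowed
                        break;
                    }
                }
            }
            levels.get(target).add(sstable);
            return target;
        }
    }

    public static void main(String[] args) {
        Manifest manifest = new Manifest(4);
        // Both the old and the new L3 sstable are 'live' when the strategy is reloaded.
        System.out.println(manifest.add(new SSTable("old-L3", 3, 0, 100)));
        System.out.println(manifest.add(new SSTable("new-L3", 3, 50, 150)));
    }
}
```

In this model the freshly written L3 sstable is the one demoted, which is the inefficient case mentioned above: it has to be compacted back up from L0.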

I'll work on fixes for these issues, but changing fix version.

> LongLeveledCompactionStrategyCQLTest.stressTestCompactionStrategyManager fails
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16637
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16637
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction/LCS
>            Reporter: Adam Holmberg
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0.x
>
>
> Test is failing occasionally as follows:
> {noformat}
> Caused by: java.lang.AssertionError: Got unexpected overlap in level 3
>       at org.apache.cassandra.db.compaction.LeveledGenerations.addAll(LeveledGenerations.java:161)
>       at org.apache.cassandra.db.compaction.LeveledManifest.addSSTables(LeveledManifest.java:131)
>       at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.addSSTable(LeveledCompactionStrategy.java:365)
>       at org.apache.cassandra.db.compaction.CompactionStrategyManager.startup(CompactionStrategyManager.java:312)
>       at org.apache.cassandra.db.compaction.CompactionStrategyManager.reload(CompactionStrategyManager.java:532)
>       at org.apache.cassandra.db.compaction.CompactionStrategyManager.maybeReloadDiskBoundaries(CompactionStrategyManager.java:495)
>       at org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(CompactionStrategyManager.java:743)
>       at org.apache.cassandra.db.lifecycle.Tracker.notify(Tracker.java:508)
>       at org.apache.cassandra.db.lifecycle.Tracker.notifyDiscarded(Tracker.java:502)
>       at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:373)
>       at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1592)
>       at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1194)
>       at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1075)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> {noformat}
> [Recent ci|https://ci-cassandra.apache.org/job/Cassandra-trunk/476/testReport/junit/org.apache.cassandra.db.compaction/LongLeveledCompactionStrategyCQLTest/stressTestCompactionStrategyManager/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
