[ https://issues.apache.org/jira/browse/CASSANDRA-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868832#comment-17868832 ]
Cameron Zemek commented on CASSANDRA-19776: ------------------------------------------- Okay I have found the cause of this. The EstimatedPartitionCount metric asks for reference to the CANONICAL sstables. {code:java} ViewFragment view = select(filter); Refs<SSTableReader> refs = Refs.tryRef(view.sstables); {code} Meanwhile there is a compaction running that includes a fully expired sstable. {code:java} Set<SSTableReader> actuallyCompact = Sets.difference(transaction.originals(), fullyExpiredSSTables); // ... try (Refs<SSTableReader> refs = Refs.ref(actuallyCompact);{code} But the compaction doesn't take a reference on the fully expired sstable. So the selectAndReference call by the EstimatedPartitionCount is stuck looping trying to take a reference to the fully expired sstable as that sstable has no references and so fails due to counts check: {code:java} boolean ref() { while (true) { int cur = counts.get(); if (cur < 0) return false; if (counts.compareAndSet(cur, cur + 1)) return true; } } {code} It spins until the compaction is completed when the fully expired sstable is removed from the CANONICAL set of sstables. > Spinning trying to capture readers > ---------------------------------- > > Key: CASSANDRA-19776 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19776 > Project: Cassandra > Issue Type: Bug > Reporter: Cameron Zemek > Priority: Normal > Attachments: extract.log > > > On a handful of clusters we are noticing Spin locks occurring. I traced back > all the calls to the EstimatedPartitionCount metric (eg. > org.apache.cassandra.metrics:type=Table,keyspace=testks,scope=testcf,name=EstimatedPartitionCount) > Using the following patched function: > {code:java} > public RefViewFragment selectAndReference(Function<View, > Iterable<SSTableReader>> filter) > { > long failingSince = -1L; > boolean first = true; > while (true) > { > ViewFragment view = select(filter); > Refs<SSTableReader> refs = Refs.tryRef(view.sstables); > if (refs != null) > return new RefViewFragment(view.sstables, view.memtables, > refs); > if (failingSince <= 0) > { > failingSince = System.nanoTime(); > } > else if (System.nanoTime() - failingSince > > TimeUnit.MILLISECONDS.toNanos(100)) > { > List<SSTableReader> released = new ArrayList<>(); > for (SSTableReader reader : view.sstables) > if (reader.selfRef().globalCount() == 0) > released.add(reader); > NoSpamLogger.log(logger, NoSpamLogger.Level.WARN, 1, > TimeUnit.SECONDS, > "Spinning trying to capture readers {}, > released: {}, ", view.sstables, released); > if (first) > { > first = false; > try { > throw new RuntimeException("Spinning trying to > capture readers"); > } catch (Exception e) { > logger.warn("Spin lock stacktrace", e); > } > } > failingSince = System.nanoTime(); > } > } > } > {code} > Digging into this code I found it will fail if any of the sstables are in > released state (ie. reader.selfRef().globalCount() == 0). > See the extract.log for an example of one of these spin lock occurrences. > Sometimes these spin locks last over 5 minutes. Across the worst cluster with > this issue, I ran a log processing script that everytime the 'Spinning trying > to capture readers' was different to previous one it would output if the > released tables were in Compacting state. Every single occurrence has it spin > locking with released listing a sstable that is compacting. > In the extract.log example its spin locking saying that nb-320533-big-Data.db > has been released. But you can see prior to it spinning that sstable is > involved in a compaction. The compaction completes at 01:03:36 and the > spinning stops. nb-320533-big-Data.db is deleted at 01:03:49 along with the > other 9 sstables involved in the compaction. > -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org