[ 
https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193934#comment-17193934
 ] 

Caleb Rackliffe commented on CASSANDRA-15164:
---------------------------------------------

Let's revisit the stack trace from CASSANDRA-15326, which is almost certainly 
the same failure we are seeing here:

{noformat}
Exception in thread Thread[CompactionExecutor:113041,1,main] java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
    at org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:220) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:115) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.io.sstable.format.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1926) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:424) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:99) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:183) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:153) ~[apache-cassandra-3.11.4.jar:3.11.4]
{noformat}

All of our compaction strategies, at some point, need to know the average 
ratio of droppable tombstones to cells across the partitions in an SSTable. 
However, if a partition ever has more than about 1.9 billion cells, the 
{{EstimatedHistogram}} that tracks this will overflow. Then, when compaction 
attempts to get the mean number of cells per partition, {{EstimatedHistogram}} 
throws an {{IllegalStateException}} that aborts the compaction attempt. This 
can continue indefinitely.
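
To make the failure mode concrete, here is a minimal sketch of a bucketed 
histogram with an overflow slot. It is a simplified illustration, not the 
actual {{EstimatedHistogram}} source: the class name {{SketchHistogram}}, the 
1.2 bucket growth factor, and the tiny bucket count are assumptions chosen for 
readability. Only the exception type and message are taken from the stack 
trace above.

```java
import java.util.Arrays;

/**
 * Simplified illustration only; NOT Cassandra's EstimatedHistogram.
 * The growth factor and bucket layout are assumptions for this sketch.
 */
final class SketchHistogram {
    // Inclusive upper bound of each regular bucket.
    private final long[] offsets;
    // One counter per regular bucket, plus a final overflow counter.
    private final long[] buckets;

    SketchHistogram(int bucketCount) {
        offsets = new long[bucketCount];
        long last = 1;
        offsets[0] = last;
        for (int i = 1; i < bucketCount; i++) {
            long next = Math.round(last * 1.2); // assumed growth factor
            if (next == last) next++;           // always make progress
            offsets[i] = next;
            last = next;
        }
        buckets = new long[bucketCount + 1];
    }

    /** Largest value that still lands in a regular bucket. */
    long largestTrackedValue() {
        return offsets[offsets.length - 1];
    }

    void add(long value) {
        int index = Arrays.binarySearch(offsets, value);
        if (index < 0) index = -index - 1; // first bucket whose bound >= value
        // index == offsets.length means the value exceeds every bound,
        // so it is counted in the overflow slot.
        buckets[index]++;
    }

    boolean isOverflowed() {
        return buckets[buckets.length - 1] > 0;
    }

    /** Mean of the recorded values, estimated from bucket upper bounds. */
    long mean() {
        if (isOverflowed())
            throw new IllegalStateException(
                "Unable to compute ceiling for max when histogram overflowed");
        long elements = 0;
        long sum = 0;
        for (int i = 0; i < offsets.length; i++) {
            elements += buckets[i];
            sum += buckets[i] * offsets[i];
        }
        return elements == 0 ? 0 : (long) Math.ceil((double) sum / elements);
    }
}
```

In this sketch, a single value above {{largestTrackedValue()}} is enough to 
make every later call to {{mean()}} throw. A caller that asks for the mean on 
every scheduling pass, as the compaction path in the stack trace does, would 
therefore fail on every pass until the overflowed histogram is replaced.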

In C* 4.0, full checksum validation for metadata components exists, but it's 
also possible that, in previous versions, the serialization/deserialization 
cycle for {{EstimatedHistogram}} could introduce corruption that breaks the 
mean calculation.

> Overflowed Partition Cell Histograms Can Prevent Compactions from Executing
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15164
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15164
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL/Interpreter
>            Reporter: Ankur Jha
>            Assignee: Caleb Rackliffe
>            Priority: Urgent
>              Labels: compaction, partition
>
> Hi, we are running a 6-node Cassandra cluster in production with 3 seed 
> nodes, but since last night one of our seed nodes has been continuously 
> throwing an error like this:
> cassandra.protocol.ServerError: <Error from server: code=0000 [Server error] 
> message="java.lang.IllegalStateException: Unable to compute ceiling for max 
> when histogram overflowed">
> To keep the cluster up and running, I drained this node.
> Can somebody help me out with this?
>  
> Any help or lead would be appreciated 
>  
> Note : We are using Cassandra version 3.7



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
