[jira] [Commented] (CASSANDRA-21321) Add RowsRead and RowsMutated counters to TableMetrics for accurate per-table row throughput tracking

Stefan Miklosovic (Jira) Tue, 26 May 2026 04:31:05 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18083479#comment-18083479
 ]


Stefan Miklosovic commented on CASSANDRA-21321:
-----------------------------------------------

There is already a similar metric, {{writeLatency}}. It is updated in 
{{ColumnFamilyStore.apply}} like this:

{code}
metric.writeLatency.addNano(nanoTime() - start);
{code}

This is then used for {{localWriteCount}} in {{TableStatsHolder}}, which calls 
{{getCount()}} on {{WriteLatency}}. It is basically what {{Local read count: }} 
and {{Local write count: }} in {{nodetool tablestats}} return for a specific 
table. That metric also operates on {{PartitionUpdate}}, but not at as granular 
a level as the newly proposed metric. The new one calls 
{{PartitionUpdate.affectedRowCount}}, whereas {{getCount()}} on the existing one 
seems to be just a way to get _how many times we did a write_ rather than _how 
many rows a particular PU is affecting_. These are two different things. 
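
To illustrate the difference, here is a minimal standalone sketch. It does not use Cassandra's classes: {{WriteCountVsRowCount}}, its fields and its {{apply(int)}} method are hypothetical stand-ins, and the {{affectedRowCount}} parameter only mimics the value the patch would take from {{PartitionUpdate.affectedRowCount}}.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in, not Cassandra code: contrasts "how many writes"
// with "how many rows those writes affected".
public class WriteCountVsRowCount {
    // Plays the role of the count behind writeLatency.getCount(): one tick per write.
    private final LongAdder writeOps = new LongAdder();
    // Plays the role of the proposed RowsMutated counter: one tick per affected row.
    private final LongAdder rowsMutated = new LongAdder();

    // Assumed shape of a write path; affectedRowCount is supplied by the caller here.
    void apply(int affectedRowCount) {
        writeOps.increment();
        rowsMutated.add(affectedRowCount);
    }

    long writeCount() { return writeOps.sum(); }

    long rowsMutated() { return rowsMutated.sum(); }

    public static void main(String[] args) {
        WriteCountVsRowCount metrics = new WriteCountVsRowCount();
        metrics.apply(1);   // a single-row update
        metrics.apply(50);  // a batch touching 50 rows of one partition
        metrics.apply(3);
        System.out.println("writes=" + metrics.writeCount()
                           + " rows=" + metrics.rowsMutated());
    }
}
```

With three writes touching 1, 50 and 3 rows, the write count is 3 while the row count is 54, so the existing metric cannot be used to recover the new one.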

I think this patch makes sense, if I read the situation correctly, but I wonder 
whether we should also expose this new metric in {{nodetool tablestats}}. 
That might come as a follow-up, though. 

> Add RowsRead and RowsMutated counters to TableMetrics for accurate per-table 
> row throughput tracking
> ----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21321
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21321
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Observability/Metrics
>            Reporter: Piotr Walczak
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 7.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Cassandra already exposes {{LiveScannedHistogram}} (rows returned per read) 
> and {{RowsMutatedPerWriteHistogram}} (rows touched per write - 
> https://issues.apache.org/jira/browse/CASSANDRA-21320) at the table level. 
> These histograms are valuable for understanding the *distribution* of rows 
> per operation, but they are *insufficient for measuring total row throughput* 
> over a time window — which is essential for capacity planning and predictions.
> Specifically:
>  * A histogram records _how many rows a single operation touched_ (e.g., 
> "this query returned 47 rows").
>  * To derive total rows from a histogram you would need to multiply bucket 
> midpoints by counts — an approximation that becomes increasingly inaccurate 
> for skewed distributions.
>  * Histograms also do not give you a simple delta between two scrape 
> timestamps, making windowed rate calculations fragile.
> A *monotonically-increasing counter* solves all of these problems: scrape the 
> value at {{t1}} and {{t2}}, subtract, divide by the interval, and you get exact 
> rows/sec with no approximation.
>  
> Changes can look similar to: 
> [https://github.com/apache/cassandra/compare/trunk...pwalczak:cassandra:pwalczak/CASSANDRA-21321?expand=1]
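
The windowed-rate calculation from the quoted description can be sketched as follows. This is only an illustration under stated assumptions: {{CounterRate}} and {{rowsPerSecond}} are hypothetical names, the scrape values are made up, and the handling of a counter that went backwards (for example after a node restart) is one possible choice, not something the ticket prescribes.

```java
// Hypothetical helper, not part of Cassandra: derives rows/sec from two
// scrapes of a monotonically increasing counter.
public class CounterRate {

    // countT1/countT2 are counter values scraped at t1Millis/t2Millis.
    static double rowsPerSecond(long countT1, long countT2,
                                long t1Millis, long t2Millis) {
        if (t2Millis <= t1Millis) {
            throw new IllegalArgumentException("t2 must be later than t1");
        }
        if (countT2 < countT1) {
            // Assumption: treat a decreasing counter as a reset and refuse to guess.
            throw new IllegalArgumentException("counter decreased between scrapes");
        }
        // Exact delta divided by the window length in seconds.
        return (countT2 - countT1) * 1000.0 / (t2Millis - t1Millis);
    }

    public static void main(String[] args) {
        // 6000 rows over a 60 second window.
        System.out.println(rowsPerSecond(1_000, 7_000, 0, 60_000));
    }
}
```

No bucket midpoints are involved: the result depends only on the two counter values and the length of the window.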



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
