[ https://issues.apache.org/jira/browse/CASSANDRA-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764329#comment-17764329 ]
Caleb Rackliffe commented on CASSANDRA-18580:
---------------------------------------------

+1

> Baseline Metrics for Accord Transactions
> ----------------------------------------
>
>                 Key: CASSANDRA-18580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18580
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Accord, Observability/JMX, Observability/Metrics
>            Reporter: Caleb Rackliffe
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Based on some conversations w/ [~benedict] and [~dcapwell], this is the
> initial set of metrics that seem both feasible to implement and useful as we
> monitor the health of a cluster performing Accord transactions:
>
> 1.) Basic latency metrics for transactions up to the point of COMMIT and rate
> metrics for preemption, failure, and timeouts at the coordinator.
> This has already been implemented and split into read and write-specific
> metrics. Our position for now is that metrics around preemption should be
> useful in place of a more difficult-to-define metric around how many
> transactions are completed via recovery.
>
> 2.) Global cache stats/metrics (i.e. aggregated for all command stores)
> We could, at some point, build metrics scoped to a specific {{CommandStore}},
> but they might be awkward in MBean/JMX space, as command stores would have to
> be identified by ID or key range… the latter possibly being able to change
> across epochs. (An alternative would be just publishing command
> store-specific stats on-demand to a virtual table instead.)
>
> 3.) Something like a decaying histogram of the number of dependencies per
> transaction (or per partial transaction).
> If this is getting worse over time, it could be useful to know/be a way for
> us to detect that contention is increasing. We should be able to hook this up
> to {{ProgressLog}} notifications. Recording for PartialDeps/PartialTxn (which
> ProgressLog gives us at pre-accept) seems acceptable, given this is a
> directional metric.
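For illustration only, item 3 above (a decaying histogram of dependencies per transaction) could be sketched roughly as follows. This is a minimal, standalone sketch and not the code attached to this ticket: the class name DecayingDepsHistogram, the power-of-two bucket layout, the half-life parameter, and the injected clock are all assumptions made for the example. An actual patch would more likely reuse the project's existing metrics infrastructure and call something like record(...) from a ProgressLog notification at pre-accept.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch: an exponentially decaying histogram of the number of
 * dependencies observed per (partial) transaction. Names and bucket layout
 * are illustrative assumptions, not Accord or Cassandra APIs.
 */
final class DecayingDepsHistogram {
    // Bucket 0 holds count 0; bucket i >= 1 holds counts in [2^(i-1), 2^i - 1];
    // the last bucket also absorbs everything larger.
    private static final int BUCKETS = 16;

    private final double[] weights = new double[BUCKETS];
    private final double halfLifeNanos;
    private final LongSupplier nanoClock;
    private long lastDecayNanos;

    DecayingDepsHistogram(long halfLifeNanos, LongSupplier nanoClock) {
        this.halfLifeNanos = halfLifeNanos;
        this.nanoClock = nanoClock;
        this.lastDecayNanos = nanoClock.getAsLong();
    }

    /** Record the dependency count seen for one transaction. */
    synchronized void record(int dependencyCount) {
        decay();
        weights[bucketOf(dependencyCount)] += 1.0;
    }

    /** Sum of all decayed weights, i.e. the "effective" sample count. */
    synchronized double totalWeight() {
        decay();
        double total = 0.0;
        for (double w : weights) total += w;
        return total;
    }

    /** Upper bound of the bucket containing quantile q (0 < q <= 1). */
    synchronized int quantileUpperBound(double q) {
        decay();
        double total = 0.0;
        for (double w : weights) total += w;
        if (total == 0.0) return 0;
        double target = q * total;
        double cumulative = 0.0;
        for (int i = 0; i < BUCKETS; i++) {
            cumulative += weights[i];
            if (cumulative >= target) return upperBound(i);
        }
        return upperBound(BUCKETS - 1);
    }

    // Halve every bucket's weight once per elapsed half-life, so old samples fade.
    private void decay() {
        long now = nanoClock.getAsLong();
        long elapsed = now - lastDecayNanos;
        if (elapsed <= 0) return;
        double factor = Math.pow(0.5, elapsed / halfLifeNanos);
        for (int i = 0; i < BUCKETS; i++) weights[i] *= factor;
        lastDecayNanos = now;
    }

    private static int bucketOf(int n) {
        if (n <= 0) return 0;
        return Math.min(BUCKETS - 1, 32 - Integer.numberOfLeadingZeros(n));
    }

    private static int upperBound(int bucket) {
        if (bucket == 0) return 0;
        if (bucket == BUCKETS - 1) return Integer.MAX_VALUE;
        return (1 << bucket) - 1;
    }
}
```

Because old samples lose weight over time, a rising median or tail in this histogram would reflect recent transactions rather than the lifetime average, which is what makes it usable as a directional signal for increasing contention. Bucket-level resolution is coarse, but that seems consistent with the "directional metric" framing in the description.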