[ https://issues.apache.org/jira/browse/CASSANDRA-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752279#comment-17752279 ]
Jacek Lewandowski commented on CASSANDRA-18580:
-----------------------------------------------

I've added cache metrics for {{AccordStateCache}}, both global and per-instance, which means that they are reported individually for {{Command}} and {{CommandsForKey}}. I also added a simple meter for progress log size.

I have a question about what exactly is needed for "Size of CommandsForKey". Do you need a histogram of the serialized size of loaded objects? Should that be recorded in {{AccordStateCache}}?

> Baseline Metrics for Accord Transactions
> ----------------------------------------
>
>                 Key: CASSANDRA-18580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18580
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Accord, Observability/JMX, Observability/Metrics
>            Reporter: Caleb Rackliffe
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Based on some conversations w/ [~benedict] and [~dcapwell], this is the
> initial set of metrics that seem both feasible to implement and useful as we
> monitor the health of a cluster performing Accord transactions:
>
> 1.) Basic latency metrics for transactions up to the point of COMMIT and rate
> metrics for preemption, failure, and timeouts at the coordinator.
> This has already been implemented and split into read and write-specific
> metrics. Our position for now is that metrics around preemption should be
> useful in place of a more difficult-to-define metric around how many
> transactions are completed via recovery.
>
> 2.) Global cache stats/metrics (i.e. aggregated for all command stores)
> We could, at some point, build metrics scoped to a specific {{CommandStore}},
> but they might be awkward in MBean/JMX space, as command stores would have to
> be identified by ID or key range…the latter possibly being able to change
> across epochs. (An alternative would be just publishing command
> store-specific stats on-demand to a virtual table instead.)
>
> 3.) Something like a decaying histogram of the number of dependencies per
> transaction (or per partial transaction).
> If this is getting worse over time, it could be useful to know/be a way for
> us to detect that contention is increasing. We should be able to hook this up
> to {{ProgressLog}} notifications. Recording for PartialDeps/PartialTxn (which
> ProgressLog gives us at pre-accept) seems acceptable, given this is a
> directional metric.
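
For illustration only, the decaying histogram of dependencies per transaction described in item 3 could be sketched roughly as below. This is a minimal stand-alone sketch, not the Cassandra or Accord implementation: the class name {{DependencyCountHistogram}}, the bucket bounds, the half-life parameter, and the explicit timestamp argument are all assumptions made for the example. It applies plain exponential decay to bucket weights, which is simpler than the reservoir-based histograms a metrics library would normally provide.

```java
// Hypothetical sketch: bucketed histogram whose weights decay exponentially,
// so recent dependency counts dominate older ones. Names are illustrative only.
final class DependencyCountHistogram {
    private final long[] upperBounds;   // inclusive bucket upper bounds, ascending
    private final double[] weights;     // one extra slot at the end for overflow
    private final double halfLifeSeconds;
    private double lastUpdateSeconds;

    DependencyCountHistogram(long[] upperBounds, double halfLifeSeconds) {
        this.upperBounds = upperBounds.clone();
        this.weights = new double[upperBounds.length + 1];
        this.halfLifeSeconds = halfLifeSeconds;
    }

    // Scale every bucket by 0.5^(elapsed / halfLife); ignore non-advancing clocks.
    private void decayTo(double nowSeconds) {
        double elapsed = nowSeconds - lastUpdateSeconds;
        if (elapsed <= 0) return;
        double factor = Math.pow(0.5, elapsed / halfLifeSeconds);
        for (int i = 0; i < weights.length; i++) weights[i] *= factor;
        lastUpdateSeconds = nowSeconds;
    }

    // Record the number of dependencies observed for one transaction.
    synchronized void record(long dependencyCount, double nowSeconds) {
        decayTo(nowSeconds);
        int bucket = 0;
        while (bucket < upperBounds.length && dependencyCount > upperBounds[bucket]) bucket++;
        weights[bucket] += 1.0;
    }

    // Return a copy of the decayed bucket weights as of the given time.
    synchronized double[] snapshot(double nowSeconds) {
        decayTo(nowSeconds);
        return weights.clone();
    }
}
```

In such a sketch, a caller would invoke {{record}} once per transaction from wherever the dependency count becomes known (the description suggests {{ProgressLog}} notifications at pre-accept), and weight shifting toward the higher buckets over time would be the directional signal that contention is increasing.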