Matt Byrd created CASSANDRA-14531:
-------------------------------------

             Summary: Only include data owned by the node in totals for repaired, un-repaired and pending repair.
                 Key: CASSANDRA-14531
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14531
             Project: Cassandra
          Issue Type: Improvement
          Components: Metrics, Repair
            Reporter: Matt Byrd
             Fix For: 4.x

If data left over from a topology change has not yet been cleaned up, it is included in the totals for the BytesRepaired, BytesUnrepaired and BytesPendingRepair metrics. This can distort the totals and lead to misleading metrics (albeit potentially short-lived). An operator who wants to keep track of percent repaired might not get an accurate picture of the relevant percent repaired under such conditions.

I propose we only include data owned by the node in the totals for BytesRepaired, BytesUnrepaired, BytesPendingRepair and PercentRepaired. It feels more logical to only emit metrics like repaired/un-repaired for data which can actually be repaired.

When an SSTable is only partially owned by the node, we can compute the size which falls within the node's token ranges by binary searching the index for the uncompressed offsets.

Finally, we can also emit a metric covering all the data which is not owned by the node. This might help operators discover that a node holds data it does not own, and hence that cleanup needs to be run.

One slight complication is that with a large number of SSTables and a reasonable number of vnodes, computing these values becomes a bit expensive. There is probably a way of keeping some of these metrics updated online rather than re-computing them periodically, though this might be a bit fiddly. Alternatively, using something like the interval tree or some other data structure might be enough to ensure it performs sufficiently well and doesn't add undue overhead.
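
To illustrate the binary-search idea for a partially owned SSTable, here is a minimal, self-contained sketch. It is not Cassandra code: the class OwnedBytesSketch and its methods are hypothetical, the index is modelled as two parallel arrays (partition tokens in ascending order and the uncompressed offset at which each partition starts), and the owned ranges are assumed to be disjoint, non-wrapping (left, right] token ranges.

```java
final class OwnedBytesSketch {

    /** Index of the first entry whose token is strictly greater than the given token. */
    static int firstIndexAfter(long[] tokens, long token) {
        int lo = 0;
        int hi = tokens.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (tokens[mid] <= token) {
                lo = mid + 1;   // entry at mid is at or before the token, look right
            } else {
                hi = mid;       // entry at mid is after the token, keep it as a candidate
            }
        }
        return lo;
    }

    /** Uncompressed offset where entry 'index' starts, or the data length past the last entry. */
    static long offsetAt(long[] offsets, long dataLength, int index) {
        return index < offsets.length ? offsets[index] : dataLength;
    }

    /**
     * Uncompressed bytes that fall inside the owned ranges.
     * Each element of ownedRanges is {left, right}, meaning the range (left, right].
     * Assumption: ranges are disjoint and do not wrap around the ring.
     */
    static long ownedBytes(long[] tokens, long[] offsets, long dataLength, long[][] ownedRanges) {
        long total = 0;
        for (long[] range : ownedRanges) {
            long start = offsetAt(offsets, dataLength, firstIndexAfter(tokens, range[0]));
            long end = offsetAt(offsets, dataLength, firstIndexAfter(tokens, range[1]));
            total += end - start;
        }
        return total;
    }

    /** Uncompressed bytes outside every owned range, i.e. what cleanup would remove. */
    static long unownedBytes(long[] tokens, long[] offsets, long dataLength, long[][] ownedRanges) {
        return dataLength - ownedBytes(tokens, offsets, dataLength, ownedRanges);
    }
}
```

Each range costs two binary searches, so one SSTable costs O(R log N) for R owned ranges and N index entries. Multiplied across many SSTables and vnodes, this is the expense mentioned above, and pre-filtering SSTables by their first and last token (for example via an interval tree) would avoid searching SSTables that are entirely owned or entirely unowned.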