[ https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910811#comment-13910811 ]
sankalp kohli commented on CASSANDRA-6758:
------------------------------------------

If you are relying on extrapolation, you can see how many ranges did not match during repair, since this information is logged. That will give you a similar answer. Also, since a tree range/leaf can hold multiple rows, you can estimate the number of rows per leaf by taking the number of rows per instance and dividing by the 32k leaves of the Merkle tree.

> Measure data consistency in the cluster
> ---------------------------------------
>
>                 Key: CASSANDRA-6758
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6758
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jimmy Mårdell
>            Priority: Minor
>
> Running multi-DC Cassandra can be a challenge as the cluster easily tends to
> get out-of-sync. We have been thinking it would be nice to measure how out of
> sync a cluster is and expose those metrics somehow.
> One idea would be to just run the first half of the repair process and output
> the result of the differencer. If you use Random or the Murmur3 partitioner,
> it should be enough to calculate the merkle tree over a small subset of the
> ring as the result can be extrapolated.
> This could be exposed in nodetool. Either a separate command or perhaps a
> dry-run flag to repair?
> Not sure about the output format. I think it would be nice to have one value
> ("% consistent"?) within a DC, and also one value for every pair of DCs
> perhaps?

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
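
The arithmetic suggested in the comment above (rows per instance divided by the 32k Merkle tree leaves, combined with the count of mismatched ranges from the repair log) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "32k" is taken to mean 32768 leaves, rows are assumed to be spread evenly across leaves, and every row in a mismatched leaf is counted as out of sync, which makes the row count a rough upper bound rather than an exact figure.

```python
# Hypothetical helper names; none of this is part of Cassandra or nodetool.

# "32k" leaves as stated in the comment, assumed here to mean 32 * 1024.
MERKLE_LEAVES = 32 * 1024


def rows_per_leaf(rows_per_instance: int, leaves: int = MERKLE_LEAVES) -> float:
    """Average number of rows covered by one leaf, assuming an even spread."""
    if leaves <= 0:
        raise ValueError("leaves must be positive")
    return rows_per_instance / leaves


def estimate_out_of_sync(mismatched_ranges: int,
                         rows_per_instance: int,
                         leaves: int = MERKLE_LEAVES) -> tuple[float, float]:
    """Return (fraction of leaves that mismatched, rough upper bound on rows).

    mismatched_ranges is the number of ranges reported as not matching
    during repair, read from the log by whatever means is available.
    """
    if not 0 <= mismatched_ranges <= leaves:
        raise ValueError("mismatched_ranges must be between 0 and leaves")
    fraction = mismatched_ranges / leaves
    # Treat every row in a mismatched leaf as out of sync (pessimistic).
    rows = mismatched_ranges * rows_per_leaf(rows_per_instance, leaves)
    return fraction, rows


if __name__ == "__main__":
    fraction, rows = estimate_out_of_sync(mismatched_ranges=64,
                                          rows_per_instance=3_276_800)
    print(f"mismatched leaves: {fraction:.4%}, rows affected (at most): {rows:.0f}")
```

A "% consistent" figure of the kind proposed in the issue description would then be one minus the mismatched fraction, at leaf granularity rather than row granularity.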