[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2017-12-11 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287089#comment-16287089
 ] 

Jeff Jirsa commented on CASSANDRA-6758:
---

I think this is done.
In 4.0 we have CASSANDRA-11503 (nodetool repaired/unrepaired by sstables), 
CASSANDRA-13774 (repaired/unrepaired by bytes), CASSANDRA-13289 (track an ideal 
consistency level beyond what acks the write), and CASSANDRA-13257 (repair 
preview). That seems to cover the intent of this ticket.

I propose we close this as wontfix, since it's essentially addressed by those tickets.


> Measure data consistency in the cluster
> ---
>
> Key: CASSANDRA-6758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6758
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jimmy Mårdell
>Priority: Minor
>  Labels: proposed-wontfix
>
> Running multi-DC Cassandra can be a challenge as the cluster easily tends to 
> get out-of-sync. We have been thinking it would be nice to measure how out of 
> sync a cluster is and expose those metrics somehow.
> One idea would be to just run the first half of the repair process and output 
> the result of the differencer. If you use Random or the Murmur3 partitioner, 
> it should be enough to calculate the merkle tree over a small subset of the 
> ring as the result can be extrapolated.
> This could be exposed in nodetool. Either a separate command or perhaps a 
> dry-run flag to repair?
> Not sure about the output format. I think it would be nice to have one value 
> ("% consistent"?) within a DC, and also one value for every pair of DC's 
> perhaps?
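The dry-run idea described in the ticket could be sketched roughly as follows. This is a toy illustration only, not Cassandra's actual MerkleTree/Differencer code: the leaf count, the bucketing scheme, and all function names are invented for the example, and leaves are compared directly rather than via tree traversal.

```python
import hashlib

LEAVES = 8  # toy size; Cassandra's repair trees use up to 2**15 leaves

def leaf_hashes(partitions):
    """Bucket partitions into LEAVES groups (a stand-in for contiguous
    token ranges) and hash each group's contents."""
    buckets = [[] for _ in range(LEAVES)]
    for token, value in sorted(partitions.items()):
        buckets[token % LEAVES].append((token, value))
    return [hashlib.sha256(repr(b).encode()).digest() for b in buckets]

def percent_consistent(replica_a, replica_b):
    """Compare leaf hashes of two replicas and report the match rate."""
    a, b = leaf_hashes(replica_a), leaf_hashes(replica_b)
    matching = sum(1 for ha, hb in zip(a, b) if ha == hb)
    return 100.0 * matching / LEAVES

a = {token: "value" for token in range(32)}
b = dict(a)
b[5] = "stale"                    # one partition out of sync on replica b
print(percent_consistent(a, a))   # 100.0
print(percent_consistent(a, b))   # 87.5 -- one of eight leaves differs
```

Running this over a small sampled subrange and extrapolating, as the ticket suggests, works because a random partitioner spreads keys roughly uniformly over the ring.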



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910212#comment-13910212
 ] 

Benedict commented on CASSANDRA-6758:
-

This doesn't seem like a bad idea at all. The only problem that I can see is 
that the first half of the repair process is actually one of the more 
expensive actions a cluster can perform, as the entire cluster needs to walk 
all of its data to compute its merkle tree. I wonder if it would be possible to 
calculate and save an abbreviated merkle tree when writing each sstable, that 
could be combined cheaply to give this answer.
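One way such per-sstable summaries could combine cheaply is to make the per-leaf combining operator order-independent, e.g. XOR of row hashes. A rough sketch of the idea (hypothetical, not Cassandra code):

```python
import hashlib

def leaf_hash(rows):
    """Hash the rows one sstable contributes to a given leaf."""
    h = 0
    for key, value in rows:
        digest = hashlib.sha256(f"{key}:{value}".encode()).digest()
        h ^= int.from_bytes(digest, "big")  # XOR: order-independent combine
    return h

# Two sstables covering the same leaf; flush order doesn't matter.
sstable1 = [(1, "a"), (3, "c")]
sstable2 = [(2, "b")]
combined = leaf_hash(sstable1) ^ leaf_hash(sstable2)
assert combined == leaf_hash([(2, "b"), (1, "a"), (3, "c")])
```

(As the follow-up comment notes, this only holds while a key has a single version across sstables.)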



[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910214#comment-13910214
 ] 

Benedict commented on CASSANDRA-6758:
-

bq. I wonder if it would be possible to calculate and save an abbreviated 
merkle tree when writing each sstable, that could be combined cheaply to give 
this answer.

Hmm. Thinking about it for just a few seconds more, this is highly unlikely to 
be workable, since we would be hashing multiple versions of a partition key. It 
might be workable for datasets that are append-only, but I'm not sure it's a 
worthwhile optimisation for that case alone.
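The objection can be made concrete: with an overwrite, an order-independent combine of per-sstable hashes reflects both versions of the key, while a tree built from merged data hashes only the surviving value, so two fully in-sync replicas could appear to differ. A hypothetical sketch, not Cassandra code:

```python
import hashlib

def row_hash(key, value):
    """Hash a single (key, value) row version."""
    digest = hashlib.sha256(f"{key}:{value}".encode()).digest()
    return int.from_bytes(digest, "big")

# sstable1 holds the old version of key 1; sstable2 holds the overwrite.
combined = row_hash(1, "old") ^ row_hash(1, "new")  # combined sstable summaries
merged = row_hash(1, "new")                         # tree built after merging versions
assert combined != merged  # the shortcut disagrees with the real repair tree
```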



[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910293#comment-13910293
 ] 

Jonathan Ellis commented on CASSANDRA-6758:
---

That is (one reason) why we took a different approach with CASSANDRA-5351.



[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread Jimmy Mårdell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910298#comment-13910298
 ] 

Jimmy Mårdell commented on CASSANDRA-6758:
--

Right. I realize this ticket is almost irrelevant in 2.1, but that's still a 
long way away (at least if you follow DSE). This would be some kind of 
mitigation until then, and preferably it should be done in 1.2.




[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910811#comment-13910811
 ] 

sankalp kohli commented on CASSANDRA-6758:
--

If you are relying on extrapolation, you can instead see how many ranges did not 
match during repair; this information is logged, so it gives you a similar 
answer. Also, since a tree range/leaf can cover multiple rows, you can estimate 
the number of rows per leaf by taking the number of rows per instance and 
dividing by the 32k leaves of the Merkle tree. 
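A back-of-envelope version of that estimate, with illustrative numbers (the per-instance row count and mismatch count are invented for the example):

```python
LEAVES = 32_768                  # 2**15 leaves per Merkle tree
rows_per_instance = 100_000_000  # illustrative figure
rows_per_leaf = rows_per_instance / LEAVES

mismatched_leaves = 500          # taken from the repair logs (hypothetical)
estimated_rows_out_of_sync = mismatched_leaves * rows_per_leaf

print(round(rows_per_leaf))               # 3052 rows covered by each leaf
print(round(estimated_rows_out_of_sync))  # 1525879 rows, roughly
```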



[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread Jimmy Mårdell (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910830#comment-13910830
 ] 

Jimmy Mårdell commented on CASSANDRA-6758:
--

A range is not the same as a leaf, is it? If two leaves with the same parent 
mismatch, it's still only one range (I think?). So it's hard to know from the 
logs how much was out of sync.
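The distinction matters because adjacent mismatched leaves coalesce into a single logged range, so counting ranges under-counts mismatched leaves. A small sketch (invented helper, not the Differencer's actual code):

```python
def merge_adjacent(leaf_ranges):
    """Coalesce touching (start, end) leaf ranges into contiguous ranges."""
    merged = []
    for start, end in sorted(leaf_ranges):
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)  # extend the previous range
        else:
            merged.append((start, end))
    return merged

# Three mismatched leaves, two of them siblings under one parent:
print(merge_adjacent([(0, 10), (10, 20), (40, 50)]))
# [(0, 20), (40, 50)] -- logged as 2 ranges, not 3 leaves
```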

We've had problems in the past with overstreaming causing serious performance 
problems. Had we known the cluster was that out of sync, we might have taken 
some extra measures before running the repair. With subrange repairs, and 
CASSANDRA-6713, perhaps this will no longer be an issue.




[jira] [Commented] (CASSANDRA-6758) Measure data consistency in the cluster

2014-02-24 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911092#comment-13911092
 ] 

sankalp kohli commented on CASSANDRA-6758:
--

Yes, if there is a mismatch at an inner node of the tree, it will log that. 
Maybe we can sum the ranges which do not match in Differencer in 1.2. 

Regarding performance problems with a lot of streaming: I think we should pause 
the streams if Cassandra detects that a lot of data is being transferred, 
causing the disk to fill up or L0 to grow. I created this JIRA:
https://issues.apache.org/jira/browse/CASSANDRA-6752

This would also make such problems easier to operate around, since you wouldn't 
need to do subrange repairs. 
