[jira] [Updated] (CASSANDRA-5263) Increase merkle tree depth as needed
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-5263:
    Labels: repair  (was: )

> Increase merkle tree depth as needed
>
>                 Key: CASSANDRA-5263
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>    Affects Versions: 1.1.9
>            Reporter: Ahmed Bashir
>            Assignee: Yuki Morishita
>              Labels: repair
>             Fix For: 2.1.1
>
>         Attachments: 5263-2.1-v1.txt, 5263-formatted.txt
>
>
> Currently, the maximum depth allowed for Merkle trees is hardcoded as 15.
> This value should be configurable, just like phi_convict_threshold and other
> properties.
> Given a cluster with nodes responsible for a large number of row keys, Merkle
> tree comparisons can result in a large number of unnecessary row keys being
> streamed.
> Empirical testing indicates that reasonable changes to this depth (18, 20,
> etc.) don't affect the Merkle tree generation and differencing timings all
> that much, and they can significantly reduce the amount of data being
> streamed during repair.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
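To make the streaming concern in the description concrete, here is a minimal sketch of the arithmetic, not code from Cassandra. It assumes a full binary tree with 2^depth leaves and row keys spread evenly across them; the class name, method name, and the one-billion-key figure are hypothetical and chosen only for illustration. A mismatch in a single leaf means every key in that leaf's range is a candidate for streaming, so fewer keys per leaf means less over-streaming.

```java
// Illustrative only: not Cassandra's MerkleTree implementation.
final class MerkleLeafMath {
    private MerkleLeafMath() {}

    /**
     * Approximate number of row keys covered by one leaf, assuming a full
     * tree of the given depth and an even key distribution (an assumption).
     */
    static long keysPerLeaf(long totalKeys, int depth) {
        if (totalKeys < 0 || depth < 0 || depth > 62) {
            throw new IllegalArgumentException("unsupported arguments");
        }
        long leaves = 1L << depth;                 // 2^depth leaves
        return (totalKeys + leaves - 1) / leaves;  // ceiling division
    }

    public static void main(String[] args) {
        long totalKeys = 1_000_000_000L;           // hypothetical node size
        for (int depth : new int[] {15, 18, 20}) {
            System.out.println("depth " + depth + ": ~"
                    + keysPerLeaf(totalKeys, depth) + " keys per leaf");
        }
    }
}
```

Under these assumptions, moving from depth 15 to depth 20 shrinks each leaf's range by a factor of 32, which is the effect the reporter describes.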
[jira] [Updated] (CASSANDRA-5263) Increase merkle tree depth as needed
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-5263:
    Component/s: Streaming and Messaging

> Increase merkle tree depth as needed
>
>                 Key: CASSANDRA-5263
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>    Affects Versions: 1.1.9
>            Reporter: Ahmed Bashir
>            Assignee: Yuki Morishita
>              Labels: repair
>             Fix For: 2.1.1
>
>         Attachments: 5263-2.1-v1.txt, 5263-formatted.txt
[jira] [Updated] (CASSANDRA-5263) Increase merkle tree depth as needed
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5263:
    Attachment: 5263-formatted.txt

This patch will err on the side of overestimating the depth, which is imo the right behavior. +1 from me.

(Patch attached w/ minor reformatting.)

Increase merkle tree depth as needed

                Key: CASSANDRA-5263
                URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
            Project: Cassandra
         Issue Type: Improvement
   Affects Versions: 1.1.9
           Reporter: Ahmed Bashir
           Assignee: Yuki Morishita
            Fix For: 2.1.1

        Attachments: 5263-2.1-v1.txt, 5263-formatted.txt
[jira] [Updated] (CASSANDRA-5263) Increase merkle tree depth as needed
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-5263:
    Attachment: 5263-2.1-v1.txt

Attaching my first attempt.

* Merkle tree creation is moved back into validation compaction so that it can reference the SSTables involved.
* MT depth is calculated from the number of partition keys, as Jonathan/Minh commented, and capped at 20.

Increase merkle tree depth as needed

                Key: CASSANDRA-5263
                URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
            Project: Cassandra
         Issue Type: Improvement
   Affects Versions: 1.1.9
           Reporter: Ahmed Bashir
           Assignee: Yuki Morishita
            Fix For: 2.1.1

        Attachments: 5263-2.1-v1.txt
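The depth rule in the patch notes above (derive the depth from the number of partition keys, cap it at 20) could be sketched roughly as follows. This is not the attached patch: the class and method names are hypothetical, and rounding log2 of the key count upward is an assumption, chosen here because it overestimates rather than underestimates the depth. Only the cap of 20 comes from the thread.

```java
// Illustrative only: not the code in 5263-2.1-v1.txt.
final class MerkleDepthSketch {
    /** Cap stated in the thread. */
    static final int MAX_DEPTH = 20;

    private MerkleDepthSketch() {}

    /**
     * Smallest depth whose leaf count (2^depth) is at least the estimated
     * number of partition keys, capped at MAX_DEPTH. Rounding up is an
     * assumption of this sketch.
     */
    static int depthFor(long estimatedPartitions) {
        if (estimatedPartitions <= 1) {
            return 0;                              // a single leaf suffices
        }
        // ceil(log2(n)) for n >= 2, computed without floating point
        int ceilLog2 = 64 - Long.numberOfLeadingZeros(estimatedPartitions - 1);
        return Math.min(ceilLog2, MAX_DEPTH);
    }

    public static void main(String[] args) {
        for (long n : new long[] {0L, 1_000L, 100_000L, 10_000_000L}) {
            System.out.println(n + " partitions -> depth " + depthFor(n));
        }
    }
}
```

With this rule a small validation builds a shallow tree, while anything above roughly a million partitions hits the cap of 20.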
[jira] [Updated] (CASSANDRA-5263) Increase merkle tree depth as needed
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5263:
      Component/s:     (was: Config)
    Fix Version/s: 2.1.1
         Assignee: Yuki Morishita  (was: Minh Do)
          Summary: Increase merkle tree depth as needed  (was: Allow Merkle tree maximum depth to be configurable)

Increase merkle tree depth as needed

                Key: CASSANDRA-5263
                URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
            Project: Cassandra
         Issue Type: Improvement
   Affects Versions: 1.1.9
           Reporter: Ahmed Bashir
           Assignee: Yuki Morishita
            Fix For: 2.1.1