[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-11 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17342:

Fix Version/s: 4.1

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Assignee: Paul Chandler
>Priority: Normal
> Fix For: 4.0.3, 4.1
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-11 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17342:

  Fix Version/s: 4.0.3
 (was: 4.0.x)
  Since Version: 4.0.0
Source Control Link: 
https://github.com/apache/cassandra/commit/c60ad61b3b6145af100578f2c652819f61729018
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed to 4.0 and merged up, thanks again for the patch!

trunk tests look bad, but similar to [non-patched 
trunk|https://app.circleci.com/pipelines/github/krummas/cassandra/775/workflows/b0ede5ae-db7c-4a1d-b6ff-22245922bb46]

[circleci 
4.0|https://app.circleci.com/pipelines/github/krummas/cassandra/770/workflows/edfe8c85-0de6-4191-b4be-e7c4cb1a4c1e]
[circleci 
trunk|https://app.circleci.com/pipelines/github/krummas/cassandra/769/workflows/6eea562c-0354-41e2-b253-32da2f929193]

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Assignee: Paul Chandler
>Priority: Normal
> Fix For: 4.0.3
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-11 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17342:

Status: Ready to Commit  (was: Review In Progress)

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Assignee: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-11 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17342:

Reviewers: Brandon Williams, Marcus Eriksson, Marcus Eriksson  (was: 
Brandon Williams, Marcus Eriksson)
   Brandon Williams, Marcus Eriksson, Marcus Eriksson  (was: 
Brandon Williams, Marcus Eriksson)
   Status: Review In Progress  (was: Patch Available)

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Assignee: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-08 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17342:
-
Reviewers: Brandon Williams, Marcus Eriksson  (was: Marcus Eriksson)

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Assignee: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-03 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17342:
-
Test and Documentation Plan: run CI
 Status: Patch Available  (was: Open)

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Assignee: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-03 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17342:

Reviewers: Marcus Eriksson

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-02 Thread Paul Chandler (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Chandler updated CASSANDRA-17342:
--
Attachment: LocalSessions.java
RepairedState.java
BulkRepairStateTest.java

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: BulkRepairStateTest.java, 
> IncrementalRepairStartupTest.java, LocalSessions.java, RepairedState.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17342) Performance problem for node restart with incremental range repairs

2022-02-02 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17342:
-
 Bug Category: Parent values: Degradation(12984)Level 1 values: Performance 
Bug/Regression(12997)
   Complexity: Normal
Discovered By: User Report
Fix Version/s: 4.0.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Performance problem for node restart with incremental range repairs
> ---
>
> Key: CASSANDRA-17342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: IncrementalRepairStartupTest.java
>
>
> There is a performance problem when restarting cassandra for clusters doing 
> incremental repairs with range repairs. 
> We have clusters with 16 vnodes per node, and are splitting each vnode into 
> 100 ranges, this causes a node to take over 30 minutes to process the data 
> stored in the system.repairs table before the node can restart. Even when we 
> reduce this to 10 ranges per vnode this still takes 2 minutes to process. The 
> cluster has 22 keyspaces and a rf of 3, this creates around 8100 records in 
> the system.repairs table.
>  
> The problem seems to occur in the 
> org.apache.cassandra.repair.consistent.RepairState class where the add method 
> re processes the complete list, including sorting, every time a new Range is 
> added. This leads is an exponential growth in processing time, this is 
> demonstrated in the attached unit test.
>  
> I have created a change, that collects the data read in from the 
> system.repairs table, in the 
> org.apache.cassandra.repair.consistent.LocalSessions class, before processing 
> it as a group at the end, this reduces the processing time to a couple of 
> seconds even for the 100 range version.
>  
> This is my first attempt at changing the cassandra code, so I am in need of a 
> mentor to help me with the process, and validate what I have done.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org