[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190326#comment-17190326 ] Sam Tunnicliffe commented on CASSANDRA-14801: - Thanks [~Ge], looks good as far as I'm concerned. I've pulled in your patch and I'm happy to commit pending CI and a second +1. > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190073#comment-17190073 ] Aleksandr Sorokoumov commented on CASSANDRA-14801: -- [~samt] Your suggestions look good! As a nitpick, I propose to replace the while-true loop with filtering out moving nodes and selecting one from what's left ([8d6dfc090da2a|https://github.com/Ge/cassandra/commit/8d6dfc090da2a49dcc07c65029f33deeff7e186b]). > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083399#comment-17083399 ] Aleksandr Sorokoumov commented on CASSANDRA-14801: -- [~fcofdezc] reviewed the PR and based on his suggestion I re-wrote the randomized test using QuickTheories. As the run time of that test ended comparable to the other unit tests, I included it in the suite as well ([a0246e|https://github.com/apache/cassandra/pull/495/commits/a0246e1e8ba481b98f76896e5005e9a8cc277586]). > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17075318#comment-17075318 ] Benedict Elliott Smith commented on CASSANDRA-14801: bq. Please refrain from focusing on who is employed by whom and instead focus on their individual merit and contribution to the code-base in the future. It is the case that new contributors can generally be assumed to have no knowledge of complex areas of the codebase. There is however precisely one company I am aware of with people who may secretly have the requisite knowledge and skill. There is also only one company with _full time_ contributors that were previously unknown to the community, who may have the time to either obtain the necessary skill, or to have weeks to dedicate to difficult tasks such as this. Were this to be a random individual, I could quite reasonably just direct them to another ticket. In this case I thought it was best only to issue a fair warning that this _might_ not be suitable but that I do not have sufficient knowledge to say for sure. I am perceiving a pattern on your part, of choosing to interpret my actions in an unwarrantedly negative light. Please check yourself, before you check others. > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17075201#comment-17075201 ] Josh McKenzie commented on CASSANDRA-14801: --- Thanks for raising that this is a thorny area of the code-base where specific care needs to be taken. There would probably be a lot of value in a class level comment on TokenMetaData.java indicating this for future people as they ramp on the code-base (and likely other areas of the code-base as well!). {quote}I can see you work at DataStax, so perhaps you have the time and skill to dedicate to this, but please be confident before you address it, and be willing to wait a while for a sufficient review. {quote} Regarding [earned authority|https://www.apache.org/theapacheway/]: {quote}their influence is based on publicly earned merit – what they contribute to the community. Merit lies with the individual, does not expire, is not influenced by employment status or employer {quote} Please refrain from focusing on who is employed by whom and instead focus on their individual merit and contribution to the code-base in the future. > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074336#comment-17074336 ] Michael Semb Wever commented on CASSANDRA-14801: ||branch||circleci||jenkins|| |[trunk_14801|https://github.com/apache/cassandra/compare/trunk...Ge:14801-4.0]|[circleci|https://circleci.com/gh/Ge/workflows/cassandra/tree/14801-4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/13/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/13]| > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068746#comment-17068746 ] Aleksandr Sorokoumov commented on CASSANDRA-14801: -- I reproduced the bug using simple [randomized test|https://gist.github.com/Ge/f59dc5dedaf6501bb31d79b068244213] that creates a cluster of N nodes and adds, moves, and removes nodes from it. I also found another bug where {{TokenMetadata#calculatePendingRanges}} fails during move-affected replica calculation. Patch that addresses both problems ([link to PR|https://github.com/apache/cassandra/pull/495]): * [a534b2|https://github.com/Ge/cassandra/commit/a534b2be9a653fb0cdda75043c0b79e481ca1701] adds tests that reproduce both bugs. * [bb36cd|https://github.com/Ge/cassandra/commit/bb36cd09c164840ad9ab3231f1039c443f8f040c] fixes the newly discovered bug ({{test1Leave1Move}} in [a534b2|https://github.com/Ge/cassandra/commit/a534b2be9a653fb0cdda75043c0b79e481ca1701]). There, the failure happens because for the same pending range we can include 2 replicas with the same endpoint and different ranges (violation of {{PendingRangeMaps Conflict.DUPLICATE}} policy). This happens because right now we include in {{PendingRangeMaps}} the entire new replica after leave for leave-affected ranges and only the pending part of it for move-affected ranges. This commit marks as pending only the missing part of the new replica. * [a33b03|https://github.com/Ge/cassandra/commit/a33b032cd75044db3475505cd32e6afd6f98ad40] solves the original issue. Without this commit {{getPendingRanges}} can fail if the same replica covers more than 1 pending range. I think that this is a valid situation. Consider {{testLeave2}} in [a534b2|https://github.com/Ge/cassandra/commit/a534b2be9a653fb0cdda75043c0b79e481ca1701]. In this case replica {{Full(127.0.0.1:7012,(-9,0])}} covers 2 ranges - {{(-9,-4]}} and {{(-4,0]}}. If we run the same test against 3.11 {{(-9, 0]}} is represented as {{\{(-9, -4], (-4, 0]\}}} and that possible representation would not trigger the bug in 4.0. However, I think that we should not base safety of {{getPendingRanges}} execution on the way a particular {{AbstractReplicationStrategy}} represents a range and we should allow duplicate entries while building pending {{RangesAtEndpoint}}. With these changes, the randomized test hasn't failed after running for a while. As I mentioned, it is a very simplistic test, so any suggestions for improvement are welcome. I haven't included it in the PR, as it seems to be a good candidate to become flaky. Maybe it is worth adding as a {{long}} test? > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054911#comment-17054911 ] Aleksandr Sorokoumov commented on CASSANDRA-14801: -- Thank you for a comprehensive response, [~benedict]! I am quite new to C* 4.0 code, so it will take me some time to ramp up. If anyone has planned to work on this issue in the next 1-2 weeks, it probably makes sense for me to work on something else. Otherwise, I'd be happy to contribute. In the latter case, in the next couple of days, I plan to read on how pending ranges are calculated and what changes since 3.11 introduced the bug. Then I'll write a test case that reproduces the issue. > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Priority: Normal > Fix For: 4.0, 4.0-beta > > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054832#comment-17054832 ] Benedict Elliott Smith commented on CASSANDRA-14801: Nobody is actively working on it, but this is one of the most deceptively complex tickets that needs to be accomplished before 4.0 is released. I can see you work at DataStax, so perhaps you have the time and skill to dedicate to this, but please be confident before you address it, and be willing to wait a while for a sufficient review. The class in which the change is needed has had numerous bugs (and in fact has inherent conceptually bugs wrt range movements that are mostly out of scope to address here), so a great deal of care is needed. Ideally this ticket would attempt to address some of the ugliness that permitted the bug, and _certainly_ needs to be accompanied by a sophisticated-ish randomised correctness test. > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Priority: Normal > Fix For: 4.0, 4.0-beta > > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054830#comment-17054830 ] Aleksandr Sorokoumov commented on CASSANDRA-14801: -- Is anyone working on this ticket? If not, I would like to work on it. > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Priority: Normal > Fix For: 4.0, 4.0-beta > > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918949#comment-16918949 ] Joseph Lynch commented on CASSANDRA-14801: -- [~benedict] do you think this should block the first alpha or it can wait for beta? > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict >Priority: Normal > Fix For: 4.0 > > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org