[jira] [Comment Edited] (CASSANDRA-16418) Unsafe to run nodetool cleanup during bootstrap or decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17792799#comment-17792799 ] Jacek Lewandowski edited comment on CASSANDRA-16418 at 12/4/23 12:15 PM: - Why that check in {{CompactionManager}} was removed? Was it needed for tests to make them run? I'm afraid that the check could have been legit for production use. cc [~szymon.miezal] was (Author: jlewandowski): Why that check in {{CompactionManager}} was removed? Was it needed for tests to make them run? I'm afraid that the check could have been legit for production use. > Unsafe to run nodetool cleanup during bootstrap or decommission > --- > > Key: CASSANDRA-16418 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16418 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: James Baker >Assignee: Lindsey Zurovchak >Priority: Normal > Fix For: 4.0.8, 4.1.1, 5.0-alpha1, 5.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > What we expected: Running a cleanup is a safe operation; the result of > running a query after a cleanup should be the same as the result of running a > query before a cleanup. > What actually happened: We ran a cleanup during a decommission. All the > streamed data was silently deleted, the bootstrap did not fail, the cluster's > data after the decommission was very different to the state before. > Why: Cleanups do not take into account pending ranges and so the cleanup > thought that all the data that had just been streamed was redundant and so > deleted it. We think that this is symmetric with bootstraps, though have not > verified. > Not sure if this is technically a bug but it was very surprising (and > seemingly undocumented) behaviour. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16418) Unsafe to run nodetool cleanup during bootstrap or decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678437#comment-17678437 ] Paulo Motta edited comment on CASSANDRA-16418 at 1/20/23 6:45 PM: -- Resubmitted CI after [test fix|https://github.com/linzuro/cassandra/commit/8de9c73d28291d2df67727ffcb7292f8c21b3442]: |branch||CI|| |[CASSANDRA-16418-4.0|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.0]|[#2205|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2205/] (finished)| |[CASSANDRA-16418-4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.1]|[#2206|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2206/] (finished)| |[CASSANDRA-16418-trunk|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-trunk]|[#2207|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2207/] (finished)| was (Author: paulo): Resubmitted CI after [test fix|https://github.com/linzuro/cassandra/commit/8de9c73d28291d2df67727ffcb7292f8c21b3442]: |branch||CI|| |[CASSANDRA-16418-4.0|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.0]|[#2205|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2205/] (running)| |[CASSANDRA-16418-4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.1]|[#2206|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2206/] (running)| |[CASSANDRA-16418-trunk|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-trunk]|[#2207|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2207/] (queued)| > Unsafe to run nodetool cleanup during bootstrap or decommission > --- > > Key: CASSANDRA-16418 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16418 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: James Baker >Assignee: Lindsey Zurovchak >Priority: Normal > Fix For: 4.0.x > > Time Spent: 3h > Remaining Estimate: 0h > > What we expected: Running a cleanup is a safe operation; the result of > running a query after a cleanup should be the same as the result of running a > query before a cleanup. > What actually happened: We ran a cleanup during a decommission. All the > streamed data was silently deleted, the bootstrap did not fail, the cluster's > data after the decommission was very different to the state before. > Why: Cleanups do not take into account pending ranges and so the cleanup > thought that all the data that had just been streamed was redundant and so > deleted it. We think that this is symmetric with bootstraps, though have not > verified. > Not sure if this is technically a bug but it was very surprising (and > seemingly undocumented) behaviour. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16418) Unsafe to run nodetool cleanup during bootstrap or decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677987#comment-17677987 ] Paulo Motta edited comment on CASSANDRA-16418 at 1/18/23 7:04 PM: -- Prepared [~linzuro]'s patch for commit on 4.0/4.1/trunk and submitted CI: |branch||CI|| |[CASSANDRA-16418-4.0|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.0]|[#2202|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2202/] (finished)| |[CASSANDRA-16418-4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.1]|[#2203|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2203/] (finished)| |[CASSANDRA-16418-trunk|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-trunk]|[#2204|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2204/] (finished)| was (Author: paulo): Prepared [~linzuro]'s patch for commit on 4.0/4.1/trunk and submitted CI: |branch||CI|| |[CASSANDRA-16418-4.0|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.0]|[#2202|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2202/] (running)| |[CASSANDRA-16418-4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-4.1]|[#2203|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2202/] (running)| |[CASSANDRA-16418-trunk|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-16418-trunk]|[#2204|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2204/] (queued)| > Unsafe to run nodetool cleanup during bootstrap or decommission > --- > > Key: CASSANDRA-16418 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16418 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: James Baker >Assignee: Lindsey Zurovchak >Priority: Normal > Fix For: 4.0.x > > Time Spent: 1h 20m > Remaining Estimate: 0h > > What we expected: Running a cleanup is a safe operation; the result of > running a query after a cleanup should be the same as the result of running a > query before a cleanup. > What actually happened: We ran a cleanup during a decommission. All the > streamed data was silently deleted, the bootstrap did not fail, the cluster's > data after the decommission was very different to the state before. > Why: Cleanups do not take into account pending ranges and so the cleanup > thought that all the data that had just been streamed was redundant and so > deleted it. We think that this is symmetric with bootstraps, though have not > verified. > Not sure if this is technically a bug but it was very surprising (and > seemingly undocumented) behaviour. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16418) Unsafe to run nodetool cleanup during bootstrap or decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17652521#comment-17652521 ] Paulo Motta edited comment on CASSANDRA-16418 at 12/28/22 3:22 PM: --- Nice work [~linzuro]! The approach and test looks mostly good to me, added a few comments to the PR. Can you add a similar regression test for bootstrap so we can detect if this is ever broken in the future? The test should fail when the bootstrap safeguard is removed. I think you can find some bootstrap dtest examples on {{{}org.apache.cassandra.distributed.test.ring.BootstrapTest{}}}. I have submitted a preliminary CI run for your branch on: * [https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2151/] was (Author: paulo): Nice work [~linzuro]! The approach and test looks mostly good to me, added a few comments to the PR. Can you add a similar regression test for bootstrap? The test should fail when the bootstrap safeguard is removed. I think you can find some bootstrap dtest examples on \{{org.apache.cassandra.distributed.test.ring.BootstrapTest}}. I have submitted a preliminary CI run for your branch on: * https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2151/ > Unsafe to run nodetool cleanup during bootstrap or decommission > --- > > Key: CASSANDRA-16418 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16418 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: James Baker >Assignee: Lindsey Zurovchak >Priority: Normal > Time Spent: 20m > Remaining Estimate: 0h > > What we expected: Running a cleanup is a safe operation; the result of > running a query after a cleanup should be the same as the result of running a > query before a cleanup. > What actually happened: We ran a cleanup during a decommission. All the > streamed data was silently deleted, the bootstrap did not fail, the cluster's > data after the decommission was very different to the state before. > Why: Cleanups do not take into account pending ranges and so the cleanup > thought that all the data that had just been streamed was redundant and so > deleted it. We think that this is symmetric with bootstraps, though have not > verified. > Not sure if this is technically a bug but it was very surprising (and > seemingly undocumented) behaviour. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org