[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635664#comment-17635664 ]

Marcus Eriksson commented on CASSANDRA-17870:
---------------------------------------------

+1, rerunning CircleCI [here|https://app.circleci.com/pipelines/github/krummas/cassandra/839/workflows/87cb325e-27b2-4e8b-84c6-2a172e784dfa] - will commit once that looks good

> nodetool/rebuild: Add flag to exclude nodes from local datacenter
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-17870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17870
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/nodetool
>            Reporter: Saranya Krishnakumar
>            Assignee: Saranya Krishnakumar
>            Priority: Normal
>         Attachments: fix_nodetool_rebuild.diff
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> During expansion by DC, we issue nodetool/rebuild from the new DC to rebuild
> the data from the other DCs. If src-dc is not passed explicitly, C* tries to
> rebuild the data from the same (new) DC.
> We don’t exclude other nodes in the same DC. Only down sources and the local
> node itself are excluded.
> ```
> // We're _always_ filtering out a local node and down sources
> addSourceFilter(new RangeStreamer.FailureDetectorSourceFilter(failureDetector));
> addSourceFilter(new RangeStreamer.ExcludeLocalNodeFilter());
> ```
> We should fix nodetool/rebuild to exclude the local DC (from where we’re
> executing the command) when nodetool/rebuild is issued without passing src-dc.
>
> Example:
> In a 3-DC cluster,
> ks1 has DC1, DC2
> ks2 has DC1, DC2, DC3
> ks3 has DC2
> Now we add a new DC [DC4] and configure it for all 3 keyspaces.
> If we run rebuild with src DC as DC1, ks3 will fail as it does not have DC1.
> Without a src DC, the expectation is that rebuild automatically picks a DC for
> each keyspace (let's say ks1: DC1, ks2: DC1, ks3: DC2) and never fails
> due to under-replicated keyspaces.
> The issue with this approach (without src DC) is that DC4 is getting picked
> up during rebuild (as src), but DC4 does not have any data yet!
> So, with the patch (ignore local dc flag), DC4 can be filtered out, letting
> the database pick the right DC for each keyspace [from the existing 3 DCs].
> -- this is the expectation after the patch.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
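The keyspace example in the issue description can be walked through with a small, self-contained sketch. This is not Cassandra code: `RebuildSourceSketch`, `pickSources`, and `missingFor` are hypothetical names, and the order of each DC list is an assumption standing in for proximity-based source selection (the local DC4 is listed first because its nodes are closest).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only; none of these names exist in Cassandra.
class RebuildSourceSketch {

    /** For each keyspace, pick the first replicating DC that is not excluded. */
    public static Map<String, String> pickSources(Map<String, List<String>> replication,
                                                  Set<String> excludedDcs) {
        Map<String, String> chosen = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : replication.entrySet()) {
            for (String dc : e.getValue()) {
                if (!excludedDcs.contains(dc)) {
                    chosen.put(e.getKey(), dc);
                    break;
                }
            }
        }
        return chosen;
    }

    /** Keyspaces that cannot be rebuilt from an explicitly given source DC. */
    public static List<String> missingFor(Map<String, List<String>> replication, String srcDc) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : replication.entrySet()) {
            if (!e.getValue().contains(srcDc)) {
                missing.add(e.getKey());
            }
        }
        return missing;
    }

    /** The cluster from the issue, after DC4 has been added to all keyspaces. */
    public static Map<String, List<String>> exampleReplication() {
        Map<String, List<String>> r = new TreeMap<>();
        r.put("ks1", List.of("DC4", "DC1", "DC2"));
        r.put("ks2", List.of("DC4", "DC1", "DC2", "DC3"));
        r.put("ks3", List.of("DC4", "DC2"));
        return r;
    }

    public static void main(String[] args) {
        Map<String, List<String>> r = exampleReplication();
        // Explicit source DC1: ks3 has no replicas there.
        System.out.println(missingFor(r, "DC1"));           // [ks3]
        // No source DC, nothing excluded: the empty local DC is chosen.
        System.out.println(pickSources(r, Set.of()));       // {ks1=DC4, ks2=DC4, ks3=DC4}
        // No source DC, local DC excluded: each keyspace gets a DC with data.
        System.out.println(pickSources(r, Set.of("DC4")));  // {ks1=DC1, ks2=DC1, ks3=DC2}
    }
}
```

The third call reproduces the per-keyspace assignment the description expects after the patch (ks1: DC1, ks2: DC1, ks3: DC2).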
[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635583#comment-17635583 ]

Saranya Krishnakumar commented on CASSANDRA-17870:
--------------------------------------------------

Circle CI run: [https://app.circleci.com/pipelines/github/sarankk/cassandra?branch=fix-nodetool-rebuild]
[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630660#comment-17630660 ]

Francisco Guerrero commented on CASSANDRA-17870:
------------------------------------------------

Another +1 from me as well (non-committer +1 :( )
[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630654#comment-17630654 ]

Yifan Cai commented on CASSANDRA-17870:
---------------------------------------

+1 on the patch. Thanks for addressing all my comments.
[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620650#comment-17620650 ]

Saranya Krishnakumar commented on CASSANDRA-17870:
--------------------------------------------------

[~yifanc] created a PR: [https://github.com/apache/cassandra/pull/1931]
[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609606#comment-17609606 ]

Francisco Guerrero commented on CASSANDRA-17870:
------------------------------------------------

{quote}
Why prefer excluding the local DC instead of setting a source DC?
{quote}
[~yifanc] This would be for the case where you are building out a new DC, and your local [new] DC nodes do not have data yet. There's already a way to specify a *{{src-dc-name}}* (which can be the local DC) [see|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/tools/nodetool/Rebuild.java#L30].
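The difference between the two always-on filters and an additional local-DC exclusion can be sketched as a chain of predicates. This is a simplified model under stated assumptions, not the `RangeStreamer` API: `SourceFilterSketch`, `Node`, `buildFilters`, and the `excludeLocalDc` parameter are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative model only; these types are not part of Cassandra.
class SourceFilterSketch {

    /** Hypothetical stand-in for a candidate streaming source. */
    static final class Node {
        final String name;
        final String dc;
        final boolean alive;

        Node(String name, String dc, boolean alive) {
            this.name = name;
            this.dc = dc;
            this.alive = alive;
        }
    }

    /** Down sources and the local node are always dropped; the local DC only on request. */
    public static List<Predicate<Node>> buildFilters(Node local, boolean excludeLocalDc) {
        List<Predicate<Node>> filters = new ArrayList<>();
        filters.add(n -> n.alive);                    // drop down sources
        filters.add(n -> !n.name.equals(local.name)); // drop the local node itself
        if (excludeLocalDc) {
            filters.add(n -> !n.dc.equals(local.dc)); // drop every node in the local DC
        }
        return filters;
    }

    /** Names of the candidates that pass every filter, in input order. */
    public static List<String> eligible(List<Node> candidates, Node local, boolean excludeLocalDc) {
        List<Predicate<Node>> filters = buildFilters(local, excludeLocalDc);
        List<String> names = new ArrayList<>();
        for (Node n : candidates) {
            if (filters.stream().allMatch(f -> f.test(n))) {
                names.add(n.name);
            }
        }
        return names;
    }

    public static Node exampleLocal() {
        return new Node("dc4-a", "DC4", true);
    }

    public static List<Node> exampleCandidates() {
        return List.of(
            new Node("dc4-a", "DC4", true),   // the node running rebuild
            new Node("dc4-b", "DC4", true),   // same new DC, no data yet
            new Node("dc1-a", "DC1", true),
            new Node("dc2-a", "DC2", false)); // down
    }

    public static void main(String[] args) {
        System.out.println(eligible(exampleCandidates(), exampleLocal(), false)); // [dc4-b, dc1-a]
        System.out.println(eligible(exampleCandidates(), exampleLocal(), true));  // [dc1-a]
    }
}
```

With only the two default filters, the empty neighbour `dc4-b` remains a valid source; adding the local-DC filter leaves only nodes from DCs that already hold data.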
[jira] [Commented] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607984#comment-17607984 ]

Yifan Cai commented on CASSANDRA-17870:
---------------------------------------

Can you send a PR instead? It can be created on [https://github.com/apache/cassandra]