[ https://issues.apache.org/jira/browse/CASSANDRA-15863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133115#comment-17133115 ]
Berenguer Blasi edited comment on CASSANDRA-15863 at 6/11/20, 10:22 AM:
------------------------------------------------------------------------

This ticket fixes a number of failures, so here is some direction for reviewers:

*test_resume_failed_replace, test_restart_failed_replace_with_reset_resume_state & test_resume_failed_replace*

These tests fail waiting for [this|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/Server.java#L164] log trace. It is never reached because the test makes bootstrap fail, and it is therefore marked IN_PROGRESS. As a result the daemon never gets that far: we [exit|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L568] before reaching that point. The solution is to replace the nodes [without|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR482] waiting for that log trace and to check the bootstrap status in an [alternative|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR485] way.

*test_resume_failed_replace*

Once the above was fixed, we would still never hit the resume complete [log|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR507]. This is because {{StorageService#resumeBootstrap}} [here|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1625] would throw an exception while starting the daemon. That exception was being swallowed; now it is logged. I also had to add a native transport [init|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1623] to avoid that exception and let the daemon start correctly. I am worried about side effects of this extra native transport init, so somebody with broader knowledge of the codebase should chime in.

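For reviewers who want a picture of the "alternative" bootstrap check mentioned for the first group of tests: the idea is to poll the node's bootstrap state rather than wait on a log line that is never written. The following is only a minimal sketch under stated assumptions, not the code from the dtest PR. The helper name {{wait_for_bootstrap_state}} and the {{read_state}} callable are hypothetical; in a real test {{read_state}} would be whatever the test uses to read the node's bootstrap state as a string.

```python
import time


def wait_for_bootstrap_state(read_state, expected, timeout=60.0, interval=1.0):
    """Poll read_state() until it returns `expected` or `timeout` seconds pass.

    `read_state` is a hypothetical zero-argument callable supplied by the
    test; it is an assumption of this sketch, not part of ccm or
    cassandra-dtest.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = read_state()
        if state == expected:
            return state
        if time.monotonic() >= deadline:
            # Report the last observed state so a failure is easy to diagnose.
            raise TimeoutError(
                "bootstrap state is %r, expected %r" % (state, expected))
        time.sleep(interval)
```

A test would call it as {{wait_for_bootstrap_state(read_state, 'IN_PROGRESS')}} after starting the replacement node. Polling with a deadline keeps the test from hanging when the expected state never shows up.
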
*test_replace_nonexistent_node, test_replace_first_boot, test_replace_shutdown_node & test_replace_stopped_node*

These turned out to be failures caused by log messages having changed across versions.

> Boostrap resume and TestReplaceAddress fixes
> --------------------------------------------
>
>                 Key: CASSANDRA-15863
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15863
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission, Test/dtest
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0-alpha
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has been
> [broken|https://ci-cassandra.apache.org/job/Cassandra-trunk/159/testReport/dtest-large.replace_address_test/TestReplaceAddress/test_restart_failed_replace/history/]
> for ages

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org