[ https://issues.apache.org/jira/browse/CASSANDRA-17894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608787#comment-17608787 ]
Brandon Williams edited comment on CASSANDRA-17894 at 9/23/22 2:53 PM:
-----------------------------------------------------------------------

As we know, "Unknown keyspace ks" is a schema problem, usually from lack of waiting for agreement. The [create_ks call|https://github.com/apache/cassandra-dtest/blob/trunk/topology_test.py#L479] already has agreement built in so this shouldn't happen, but there are other "unprotected" schema changes present so it's hard to know exactly what happened, especially since we lack the logs due to CASSANDRA-17901.

I have [a branch|https://github.com/driftx/cassandra-dtest/tree/CASSANDRA-17894] that waits for agreement after the other changes and a [circle run|https://app.circleci.com/pipelines/github/driftx/cassandra/651/workflows/45b3841e-522a-4169-8c27-4a0b518223ed/jobs/7355] just to prove it didn't break anything.

I think given that this is the first failure we've ever seen and that waiting for agreement is correct regardless, we should commit this and we can reopen if it recurs, hopefully armed with the logs from CASSANDRA-17901.

> Test Failure:
> dtest-large.topology_test.TestTopology.test_stop_decommission_too_few_replicas_multi_dc
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17894
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17894
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Josh McKenzie
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 4.1-beta, 4.1.x, 4.x
>
>
> Link to failure:
> https://ci-cassandra.apache.org/job/Cassandra-4.1/162/testReport/dtest-large.topology_test/TestTopology/test_stop_decommission_too_few_replicas_multi_dc/
> Error Message
> test teardown failure
> Stacktrace
> {code}
> Unexpected error found in node logs (see stdout for full details).
> Errors:
> [[node1] 'ERROR [OptionalTasks:1] 2022-09-13 02:10:09,576
> JVMStabilityInspector.java:68 - Exception in thread
> Thread[OptionalTasks:1,5,OptionalTasks]\njava.lang.AssertionError: Unknown
> keyspace ks\n\tat
> org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:339)\n\tat
> org.apache.cassandra.db.Keyspace.lambda$open$0(Keyspace.java:163)\n\tat
> org.apache.cassandra.utils.concurrent.LoadingMap.blockingLoadIfAbsent(LoadingMap.java:105)\n\tat
> org.apache.cassandra.schema.Schema.maybeAddKeyspaceInstance(Schema.java:228)\n\tat
> org.apache.cassandra.db.Keyspace.open(Keyspace.java:163)\n\tat
> org.apache.cassandra.db.Keyspace.open(Keyspace.java:152)\n\tat
> com.google.common.collect.Iterators$6.transform(Iterators.java:785)\n\tat
> com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)\n\tat
> org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:74)\n\tat
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:124)\n\tat
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)\n\tat
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)\n\tat
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)\n\tat
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat
> java.lang.Thread.run(Thread.java:748)']
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
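
For readers unfamiliar with the "waiting for agreement" step mentioned in the comment, here is a minimal sketch of the general idea: after a schema change, poll until every node reports the same schema version before continuing. This is not the code from the linked branch or the dtest helpers. The function name `wait_for_schema_agreement` and the `get_versions` callable are hypothetical, introduced only for illustration. In a real cluster the versions could come from the `schema_version` column of `system.local` and `system.peers`.

```python
import time


def wait_for_schema_agreement(get_versions, timeout=30.0, interval=0.5,
                              clock=time.monotonic, sleep=time.sleep):
    """Poll until all nodes report one identical schema version.

    get_versions is a hypothetical callable (an assumption of this sketch)
    that returns an iterable with the schema version each node currently
    reports; None stands for a node whose version is not yet known.
    Returns the agreed version, or raises TimeoutError if the nodes have
    not converged once the timeout has elapsed.
    """
    deadline = clock() + timeout
    while True:
        versions = set(get_versions())
        # Agreement means exactly one distinct version, and it is known.
        if len(versions) == 1 and None not in versions:
            return next(iter(versions))
        if clock() >= deadline:
            seen = sorted(str(v) for v in versions)
            raise TimeoutError(
                f"no schema agreement after {timeout}s, versions seen: {seen}")
        sleep(interval)
```

A test would call such a helper after each schema-changing statement, not only after keyspace creation, so that background tasks on other nodes do not run against a schema that has not yet propagated.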