[ https://issues.apache.org/jira/browse/CASSANDRA-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818894#comment-17818894 ]
Stefan Miklosovic commented on CASSANDRA-19120:
-----------------------------------------------

[CASSANDRA-19120-trunk|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19120-trunk]

{noformat}
java17_pre-commit_tests
java17_separate_tests
java11_pre-commit_tests
  ✓ j11_build                              7m 58s
  ✓ j11_cqlsh_dtests_py311                 9m 58s
  ✓ j11_cqlsh_dtests_py311_vnode          10m 23s
  ✓ j11_cqlsh_dtests_py38                  7m 52s
  ✓ j11_cqlsh_dtests_py38_vnode           10m 31s
  ✓ j11_cqlshlib_cython_tests             11m 38s
  ✓ j11_cqlshlib_tests                    12m 1s
  ✓ j11_dtests                            37m 13s
  ✓ j11_dtests_vnode                      41m 2s
  ✓ j11_jvm_dtests_vnode                  20m 5s
  ✓ j11_simulator_dtests                   9m 0s
  ✓ j17_cqlsh_dtests_py311                 6m 53s
  ✓ j17_cqlsh_dtests_py311_vnode           7m 12s
  ✓ j17_cqlsh_dtests_py38                  6m 51s
  ✓ j17_cqlsh_dtests_py38_vnode            7m 13s
  ✓ j17_cqlshlib_cython_tests              8m 19s
  ✓ j17_cqlshlib_tests                     7m 4s
  ✓ j17_dtests                            35m 30s
  ✓ j17_dtests_vnode                      35m 34s
  ✕ j11_jvm_dtests                        26m 8s
      org.apache.cassandra.fuzz.ring.ConsistentBootstrapTest coordinatorIsBehindTest
  ✕ j11_unit_tests                        15m 46s
      org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest updateTest
  ✕ j11_utests_oa                         15m 46s
      org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest updateTest
  ✕ j11_utests_system_keyspace_directory  16m 17s
      org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest updateTest
  ✕ j17_jvm_dtests                        25m 12s
      org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest testEndpointVerificationEnabledIpNotInSAN TIMEOUTED
  ✕ j17_jvm_dtests_vnode                  23m 24s
      junit.framework.TestSuite org.apache.cassandra.fuzz.harry.integration.model.InJVMTokenAwareExecutorTest TIMEOUTED
      org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest testEndpointVerificationEnabledIpNotInSAN TIMEOUTED
  ✕ j17_unit_tests                        14m 54s
      org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest updateTest
  ✕ j17_utests_oa                         13m 18s
      org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest updateTest
java11_separate_tests
{noformat}
[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3896/workflows/b64e0655-b604-4fad-a917-cbe6429819c6]
[java17_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3896/workflows/afeaf2b1-3d7b-4231-b669-9be4d6e9d7a7]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3896/workflows/2d7b6b75-8bdd-4c81-959d-9e3db922565e]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3896/workflows/74474984-97f7-4a1f-8541-b66291e93ef9]

> local consistencies may get timeout if blocking read repair is sending the
> read repair mutation to other DC
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19120
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19120
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Runtian Liu
>            Assignee: Runtian Liu
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>         Attachments: image-2023-11-29-15-26-08-056.png, signature.asc
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a cluster with two DCs, when a new node is being added to DC1, a blocking
> read repair triggered by a LOCAL_QUORUM read in DC1 must send the read repair
> mutation to an extra node (1)(2). The read repair selector may pick *ANY* node
> that has not been contacted before (3) instead of preferring DC1 nodes. If a
> node from DC2 is selected, the read times out every time because of the bug
> described below:
>
> When the latch for blocking read repair is initialized (4), the shouldBlockOn
> function returns true only for local nodes (5), and blockFor is reduced when a
> local node does not require repair (6). blockFor therefore equals the number of
> read repair mutations sent out. But when the coordinator receives responses
> from the target nodes, the latch is counted down only for nodes in the same
> DC (7). The latch waits until it times out, and the read request fails with a
> timeout.
>
> This can be reproduced with a constant load on a 3 + 3 cluster while adding a
> node, given some way to trigger blocking read repair (for example, load
> generated with the stress tool). With a constant LOCAL_QUORUM read-after-write
> workload in the DC where the node is being added, you will see read timeouts
> from time to time because of the bug described above.
>
> I think that when read repair selects the extra node to repair, it should
> prefer local nodes over nodes from the other DC. We also need to fix the latch
> accounting so that, even if we send a mutation to nodes in another DC, the read
> does not time out.
>
> (1) [https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/locator/ReplicaPlans.java#L455]
> (2) [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L183]
> (3) [https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/locator/ReplicaPlans.java#L458]
> (4) [https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L96]
> (5) [https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L71]
> (6) [https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L88]
> (7) [https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L113]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail:
commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
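The latch mismatch described in the quoted issue can be sketched in a few lines of Java. This is a deliberately simplified model, not Cassandra's actual BlockingPartitionRepair API: the class, record, and method names below are illustrative. The point it demonstrates is the asymmetry in the report — the latch is sized by the number of repair mutations sent (which can include a DC2 replica), but acknowledgements pass through a local-DC-only filter, so the remote replica's ack never counts down the latch and the coordinator waits until it times out.

```java
import java.util.List;

// Simplified model of the blocking-read-repair latch accounting bug
// (names are illustrative, not Cassandra's real API).
public class LatchMismatchSketch {

    record Replica(String endpoint, String dc) {}

    static final String LOCAL_DC = "DC1";

    // Mirrors the shouldBlockOn check: only replicas in the coordinator's
    // DC are considered when counting down the latch.
    static boolean shouldBlockOn(Replica r) {
        return LOCAL_DC.equals(r.dc());
    }

    // The latch starts at the number of repair mutations sent (one per
    // target, including any remote-DC target), but acks are filtered by
    // the same local-DC-only check — so a remote target's ack is dropped.
    static int remainingAfterAllAcks(List<Replica> repairTargets) {
        int latch = repairTargets.size();   // blockFor: one count per mutation sent
        for (Replica r : repairTargets) {
            if (shouldBlockOn(r))           // remote-DC acks are ignored here
                latch--;
        }
        return latch;                       // > 0 => coordinator blocks until timeout
    }

    public static void main(String[] args) {
        // LOCAL_QUORUM read in DC1, but the "extra" repair target landed in DC2.
        List<Replica> targets = List.of(
                new Replica("10.0.0.1", "DC1"),
                new Replica("10.0.0.2", "DC1"),
                new Replica("10.1.0.1", "DC2"));
        System.out.println(remainingAfterAllAcks(targets)); // prints 1: remote ack never counted
    }
}
```

With only DC1 targets the count reaches zero and the read completes; as soon as one DC2 replica is among the targets, the remainder is permanently positive, which matches the 100% timeout behaviour the reporter observed. The fix proposed in the issue amounts to making the two sides symmetric: either never send the extra mutation to a remote DC, or count remote acks toward the latch.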