[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434483#comment-17434483 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

Committed to 3.0 as [7f54fe02298b90e6152acc026384c033a96ce621|https://github.com/apache/cassandra/commit/7f54fe02298b90e6152acc026384c033a96ce621] and merged to [3.11|https://github.com/apache/cassandra/commit/c76a939c3eb9aa68abd0b892ab09bcbf09157e10], [4.0|https://github.com/apache/cassandra/commit/530bc914cdf28c9c10eb53e3614b16cb9da0787b] and [trunk|https://github.com/apache/cassandra/commit/37830770d1e54703c4b30a67c259b50317e3d4e3].

Dtest committed as [7c89cade286fa122bf347f9b8660370e57afb5fa|https://github.com/apache/cassandra-dtest/commit/7c89cade286fa122bf347f9b8660370e57afb5fa].

Thanks for the patch.

> Replica failure causes timeout on multi-DC write
> ------------------------------------------------
>
> Key: CASSANDRA-16334
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16334
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination, Messaging/Internode
> Reporter: Paulo Motta
> Assignee: Aleksandr Sorokoumov
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws
> a write error on a single DC keyspace with RF=3:
> {noformat}
> cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to
> execute write] message="Operation failed - received 0 responses and 3
> failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN
> from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses':
> 1, 'received_responses': 0, 'failures': 3}
> {noformat}
> The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each):
> {noformat}
> cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed
> out waiting for replica nodes' responses] message="Operation timed out -
> received only 0 responses." info={'consistency': 'LOCAL_ONE',
> 'required_responses': 1, 'received_responses': 0}
> {noformat}
> Reproduction steps:
> {noformat}
> # Setup cluster
> ccm create -n 3:3 test
> for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >>
> ~/.ccm/test/node$i/conf/cassandra.yaml; done
> ccm start
> # Create schema
> ccm node1 cqlsh
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy',
> 'dc1': 3, 'dc2': 3};
> CREATE TABLE test.test (key int PRIMARY KEY, val blob);
> exit;
> # Insert data
> python
> from cassandra.cluster import Cluster
> cluster = Cluster()
> session = cluster.connect('test')
> blob = f = open("2mbBlob", "rb").read().hex()
> session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob
> + "'))")
> {noformat}
> Reproduced in 3.0, 3.11, 4.0, trunk.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
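As a rough check of why the reproduction insert is rejected by the replicas, the sketch below compares the payload size with the configured limit. It assumes the file named "2mbBlob" holds exactly 2 MiB and that the limit is interpreted as 1000 * 1024 bytes; both are assumptions for illustration, and per-mutation overhead is ignored.

```python
# Back-of-the-envelope size check for the reproduction steps (not Cassandra code).
MAX_MUTATION_SIZE_IN_KB = 1000          # value appended to cassandra.yaml in the repro
ASSUMED_BLOB_BYTES = 2 * 1024 * 1024    # assumption: "2mbBlob" is exactly 2 MiB

# The repro hex-encodes the file, so every input byte becomes two characters.
hex_text = bytes(ASSUMED_BLOB_BYTES).hex()

# textAsBlob() stores the text itself, so the value is about as large as the hex string.
payload_bytes = len(hex_text.encode("utf-8"))
limit_bytes = MAX_MUTATION_SIZE_IN_KB * 1024  # assumption: 1 KB = 1024 bytes

exceeds_limit = payload_bytes > limit_bytes
print(payload_bytes, limit_bytes, exceeds_limit)
```

Under these assumptions the value alone is roughly four times the limit, so every replica is expected to fail the write, which is why the single-DC case reports three failures.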
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434316#comment-17434316 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

Here is another round of CI after rebase, just in case:

||branch||CI||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1079/workflows/d6923796-59d4-49d9-b240-d14dd57aa46b]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1076/workflows/7e5eb711-9ec5-4ea7-9c8e-56aae8b5caff]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1078/workflows/42870d8b-58fc-46c1-9d5b-953f4bcc695d] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1078/workflows/4571c5af-8539-45b6-8d59-1595f682aa4b]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1077/workflows/057382c9-5526-4e11-9bac-1f23b8f6e659] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1077/workflows/0db4af4e-8419-4d26-a860-f215c7d67c38]|

The changes look good to me, and I'd say that the many test failures above are not related. I'll commit the changes in a bit.
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432952#comment-17432952 ]

Paulo Motta commented on CASSANDRA-16334:
-----------------------------------------

LGTM, nice work!
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17430088#comment-17430088 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

Oh, I forgot to increase the resource class for the multiplexer (CASSANDRA-17043), here are the right runs:

||branch||CI||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1070/workflows/2306fd16-704b-4fdf-ad79-c5f518ec2c11]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1068/workflows/8773d6d4-b5e4-41c2-8287-676b479060d9]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1069/workflows/0321dc65-0e8c-48ce-a0c6-2d3752f04ca1] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1069/workflows/2e2e8434-cd67-4594-994f-5ac8dc245064]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1067/workflows/00415ca5-2c59-4b74-9c3d-8593ab66804c] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1067/workflows/ba5ea2b7-db21-442e-b973-1a563a97ea72]|
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17430080#comment-17430080 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

Great, there is CI for the last changes, rebased and with some repeated runs for the new dtest:

||branch||CI||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1064/workflows/5c44e8b5-6ff1-4e1c-a623-75478837d44b]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1063/workflows/1dde4581-cd2c-4550-90c4-355b1855c62f]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1065/workflows/31306cc0-b030-4a4e-8c8b-d70b110521c8] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1065/workflows/3182ca80-1892-4f95-a0c4-4109b3a48653]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1066/workflows/2b484995-2716-4d34-91ce-b6cc22a1adbb] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1066/workflows/9f0129b1-3a36-44fe-aa77-1eedf4d6e496]|
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17429575#comment-17429575 ]

Aleksandr Sorokoumov commented on CASSANDRA-16334:
--------------------------------------------------

Thank you for the review and running the CI [~adelapena]!

I added the non-null check in 3.0 and 3.11 branches. The same check is not necessary in 4.0 onward, because BatchlogManager no longer passes null as a configured CL level.
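The non-null check mentioned in this comment can be pictured with a small model. This is only an illustrative sketch in Python, not the Java patch: the `ConsistencyLevel` enum, its members, and both helper functions are hypothetical stand-ins, and `None` plays the role of the `null` consistency level passed by BatchlogManager in 3.0 and 3.11.

```python
from enum import Enum
from typing import Optional


class ConsistencyLevel(Enum):
    """Hypothetical subset of consistency levels: (name, is DC-local)."""
    ONE = ("ONE", False)
    QUORUM = ("QUORUM", False)
    LOCAL_ONE = ("LOCAL_ONE", True)
    LOCAL_QUORUM = ("LOCAL_QUORUM", True)

    @property
    def is_dc_local(self) -> bool:
        return self.value[1]


def unsafe_is_dc_local(level: Optional[ConsistencyLevel]) -> bool:
    # Dereferences the level unconditionally; fails when no level is configured.
    return level.is_dc_local


def null_safe_is_dc_local(level: Optional[ConsistencyLevel]) -> bool:
    # Treats a missing consistency level as "not DC-local" instead of failing.
    return level is not None and level.is_dc_local


print(null_safe_is_dc_local(None))                        # False
print(null_safe_is_dc_local(ConsistencyLevel.LOCAL_ONE))  # True
```

Calling `unsafe_is_dc_local(None)` raises `AttributeError`, which is the Python analogue of the NPE seen in the 3.0 and 3.11 dtest runs.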
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17428545#comment-17428545 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

It seems that the new dtest failures in the above CI runs for 3.0 and 3.11 are caused by an NPE [here|https://github.com/Ge/cassandra/blob/0cfe205c012605affb06d5a53edcd594e6682c76/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java#L125], where the {{consistencyLevel}} can be {{null}} in the new check for DC local.
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17428161#comment-17428161 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

Ok, the repeated dtest runs are failing because the MID resources config for Circle is using medium runners, while it should use large runners. Indeed, the test passes as part of the regular dtest jobs because those jobs correctly use large runners. I'll open a ticket for fixing this. In the meantime, I have manually set the right resource class and the repeated runs pass, as expected:

||branch||CI||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1006/workflows/4c3774df-f49a-4e0b-b1a4-9e5bfee06087]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1005/workflows/cd497fac-1348-4736-8b0d-fccb4dbaacbe]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1007/workflows/c54bf75a-8005-4843-a432-40487f77b435] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1007/workflows/8814397a-d3fe-41ab-bdb5-fa10ca021494]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1004/workflows/0fc6f761-4eb1-4f88-a281-cccf0c79cb48] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1004/workflows/97e60123-d1af-46d6-9973-654f14f7eb21]|
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427856#comment-17427856 ]

Andres de la Peña commented on CASSANDRA-16334:
-----------------------------------------------

[~Ge] here are CircleCI runs for the patches:

||branch||CI||
|3.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/999/workflows/e3c6d75c-8c2c-4193-abea-7c1f582e27ee]|
|3.11|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/997/workflows/c6696afa-d529-4bb8-9fae-2880985417d1]|
|4.0|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/996/workflows/f37daddc-baec-48fa-8384-db5d15ae55b2] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/996/workflows/ec9dc32a-55d0-4032-a437-0a75d193ee99]|
|trunk|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/998/workflows/16728796-0e1b-4991-b579-e45a9dd39715] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/998/workflows/31dd24b9-2428-4142-8925-38f68007483f]|

They include 100 repeated runs for the new dtests. These runs are failing but I think that's because of a problem with the Circle environment since they work locally. I'll take a look at this.
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426575#comment-17426575 ]

Aleksandr Sorokoumov commented on CASSANDRA-16334:
--------------------------------------------------

I have described the root cause in the previous comment. Two distinct bugs make replica failures appear as timeouts: one affects DC-local consistency levels and the other affects global consistency levels. Fixing the latter also resolves the "zombie-hint" issue I described at the end of the previous message.

The reason a replica failure appears as a timeout with a DC-local consistency level is that {{AbstractWriteResponseHandler}} counts nodes in all DCs as potential candidates to wait for. The fix is to wait only for the DC-local nodes.

The second bug, which is responsible both for the "zombie-hints" and for the timeout issue with global consistency levels, is related to forwarding replica failures to the correct address. This patch makes replicas send request failures to the original coordinator rather than to the DC-local node that forwarded them the message. Besides, in 3.0 and 3.11, I also added the missing respond-on-failure flag to the forwarded messages.

Patches:
* [dtest|https://github.com/apache/cassandra-dtest/pull/165]
* [3.0|https://github.com/apache/cassandra/pull/1259]
* [3.11|https://github.com/apache/cassandra/pull/1260]
* [4.0|https://github.com/apache/cassandra/pull/1261]
* [trunk|https://github.com/apache/cassandra/pull/1262]

[~paulo] Can you please start the CI?
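The routing change for failure responses can be sketched as a toy model. Everything here is hypothetical and only mirrors the description in this comment: `ForwardedWrite`, its fields, and the two target functions are illustrative names, not Cassandra's messaging API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForwardedWrite:
    """Hypothetical view of a write as seen by a replica."""
    coordinator: str          # node that owns the write response handler
    forwarder: Optional[str]  # remote-DC node that relayed the write, if any


def failure_target_before_patch(msg: ForwardedWrite) -> str:
    # Described old behaviour: the failure goes back to whoever delivered the
    # message, so a forwarded write reports its failure to the forwarder.
    return msg.forwarder if msg.forwarder is not None else msg.coordinator


def failure_target_after_patch(msg: ForwardedWrite) -> str:
    # Described new behaviour: the failure always goes to the original
    # coordinator, which is the node actually waiting for the response.
    return msg.coordinator


relayed = ForwardedWrite(coordinator="dc1-node1", forwarder="dc2-node4")
print(failure_target_before_patch(relayed))  # dc2-node4
print(failure_target_after_patch(relayed))   # dc1-node1
```

In this model the coordinator never hears about remote failures before the patch, which matches the described symptom of a timeout instead of a write failure for global consistency levels.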
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425115#comment-17425115 ]

Aleksandr Sorokoumov commented on CASSANDRA-16334:
--------------------------------------------------

This bug happens in [AbstractWriteResponseHandler#onFailure|https://github.com/apache/cassandra/blob/2e2db4dc40c4935305b9a2d5d271580e96dabe42/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java#L252-L265]:
{code}
@Override
public void onFailure(InetAddressAndPort from, RequestFailureReason failureReason)
{
    logger.trace("Got failure from {}", from);

    int n = waitingFor(from)
            ? failuresUpdater.incrementAndGet(this)
            : failures;

    failureReasonByEndpoint.put(from, failureReason);

    if (blockFor() + n > candidateReplicaCount())
        signal();
}
{code}

In the reproduction steps, {{INSERT INTO TEST}} uses CL {{LOCAL_ONE}}. Accordingly, [DatacenterWriteResponseHandler#waitingFor|https://github.com/apache/cassandra/blob/2e2db4dc40c4935305b9a2d5d271580e96dabe42/src/java/org/apache/cassandra/service/DatacenterWriteResponseHandler.java#L59-L63] only waits for the local nodes:
{code}
private final Predicate<InetAddressAndPort> waitingFor = InOurDcTester.endpoints();

@Override
protected boolean waitingFor(InetAddressAndPort from)
{
    return waitingFor.test(from);
}
{code}

[AbstractWriteResponseHandler#candidateReplicaCount()|https://github.com/apache/cassandra/blob/2e2db4dc40c4935305b9a2d5d271580e96dabe42/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java#L205-L213] in the condition above, however, counts live and down replicas in ALL DCs as valid candidates:
{code}
protected int candidateReplicaCount()
{
    return replicaPlan.liveAndDown().size();
}
{code}

As a result, even after all local nodes respond with {{FAILURE_RSP}}, the coordinator keeps waiting for responses from nodes in other DCs... but never counts them in. In the reproduction, {{blockFor()}} is 1 and at most 3 failures are counted, while {{candidateReplicaCount()}} is 6, so the condition never holds and the write times out instead of failing.

There is more! Following the timeout or request failure, the coordinator creates hints for the nodes in other DCs, which it will try to deliver forever.
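The arithmetic behind the condition above can be checked with a small toy model. This is only a sketch in Python, not Cassandra code: the function name {{coordinator_signals_failure}} and its parameters are made up for illustration, and counting only local-DC replicas as candidates is an assumption used to contrast the two behaviours, not necessarily how the committed patch fixes it.

```python
def coordinator_signals_failure(failed_endpoints, local_dc, candidate_replicas, block_for=1):
    """Toy model of the onFailure check: blockFor + failures > candidateReplicaCount.

    failed_endpoints: iterable of (dc, node) pairs that replied with a failure.
    Only failures from the local DC are counted (as with LOCAL_ONE), but the
    check itself runs on every failure response, whichever DC it comes from.
    """
    counted_failures = 0
    for dc, _node in failed_endpoints:
        if dc == local_dc:
            counted_failures += 1
        if block_for + counted_failures > candidate_replicas:
            return True  # coordinator would signal a write failure
    return False  # never signalled: the client ends up with a timeout


# Two DCs with RF=3 each, as in the reproduction steps; every replica fails.
replicas = [("dc1", n) for n in (1, 2, 3)] + [("dc2", n) for n in (4, 5, 6)]

# Candidates counted across ALL DCs: 1 + 3 > 6 is never true.
all_dcs = coordinator_signals_failure(replicas, "dc1", candidate_replicas=len(replicas))

# Hypothetical alternative: candidates counted in the local DC only: 1 + 3 > 3.
local_only = coordinator_signals_failure(replicas, "dc1", candidate_replicas=3)

print(all_dcs, local_only)  # False True
```

With six candidates the threshold is out of reach for three counted failures, which matches the {{WriteTimeout}} seen on the 2-DC keyspace; with three candidates the third local failure trips the check, matching the {{WriteFailure}} seen on the single-DC keyspace.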