[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2023-03-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694981#comment-17694981
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

branch: [https://github.com/apache/cassandra/pull/1683]
ci: [https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2304/]

I do not think that this is happening in 4.0+.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2023-02-24 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693173#comment-17693173
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

[~jaid] is this happening in 4.0 too? I am trying to figure that out. The logic 
around this stuff was rewritten there.

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/repair/BlockingReadRepair.java#L82-L106

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Priority: Low
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2023-02-24 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693126#comment-17693126
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

Let's give it another shot.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-08-24 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17584077#comment-17584077
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

I do not plan to work on this. I am not sure how to move forward.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-08-24 Thread Jai Bheemsen Rao Dhanwada (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17584057#comment-17584057
 ] 

Jai Bheemsen Rao Dhanwada commented on CASSANDRA-14715:
---

[~stefan.miklosovic] do you have any estimate if this will be released anything 
sooner? thank you

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-06-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554144#comment-17554144
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

I added in-jvm dtest, in this branch.

The problem with that test is that it is flaky and I am not sure why. I tried 
to replicate same steps as [~cam1982] .

If exception is thrown, it can be either detected as part of shutdown process 
when cluster instance is being closed in try or it can be thrown right away. 
Sometimes it is not thrown at all and it just passes.

If you have any ideas [~cam1982] where this flakiness comes from that would be 
great.

https://github.com/apache/cassandra/pull/1683

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-06-07 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551078#comment-17551078
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

The proposed approch seems good to me but I had hard time to test this 
consistently. I tried the steps in the description of this ticket and sometimes 
I got the same response, sometimes I did not. Thinking about testing, I am not 
sure what test to write here. The options we have:

1) jvm dtest - this approach would be about setuping 2x3 cluster, inserting 
data, shutting down the node, removing the sstables of this node and starting 
it again, executing the query and watching its logs. I think that the step 
"removing of sstables" is not necessary because, I think, data dir of that node 
is automatically remove on the shutdown. I am not sure about the details and 
viability of this test approach yet.

2) same as 1 but it would be done in python dtests

3) Testing only RepairMergeListener and its close method. This would be very 
nice to do but what I noticed is that all inner classes in DataResolver 
(RepairMergeListener is inner class of DataResolver) are not static and they 
are all private. I can not just easilly test this class in isolation. I would 
need to rewrite it all to be static classes and so and this might have 
not-so-obvious consequences yet.

What I found interesting while I was testing this is that when I turned the 
node off, removed data, turned it on and listed the data dir of respective 
table, that SSTable was there again. How is this possible? Is not it like 
commit logs were flushed on the startup or something like that? I think we 
would need to remove commit logs too.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out 

[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-06-07 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550869#comment-17550869
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

I am on it.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-05-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542250#comment-17542250
 ] 

Stefan Miklosovic commented on CASSANDRA-14715:
---

[~brandon.williams] Not really, honestly, I forgot this one completely. I can 
take a look, indeed, sometimes next week.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-05-25 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542215#comment-17542215
 ] 

Brandon Williams commented on CASSANDRA-14715:
--

[~stefan.miklosovic] is this still on your radar?

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2022-05-25 Thread Jai Bheemsen Rao Dhanwada (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542210#comment-17542210
 ] 

Jai Bheemsen Rao Dhanwada commented on CASSANDRA-14715:
---

Any plans to fix this in the upcoming versions or atleast 4.0.x version? the 
error message is quite mis-leading.

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Low
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

2018-09-10 Thread Cameron Zemek (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610127#comment-16610127
 ] 

Cameron Zemek commented on CASSANDRA-14715:
---

I should also point out this means that the timeouts don't get captured in the 
read timeout metric either due to the timeout occuring on the close for the 
PartitionIterator returned by StorageProxy:read where the timeouts are caught 
(see readRegular)

> Read repairs can result in bogus timeout errors to the client
> -
>
> Key: CASSANDRA-14715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Cameron Zemek
>Priority: Minor
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
> FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
> // We got all responses, but timed out while repairing
> int blockFor = consistency.blockFor(keyspace);
> if (Tracing.isTracing())
> Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> else
> logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
> throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List results, long ms, 
> MutableInt counter) throws TimeoutException
> {
> for (AsyncOneResponse result : results)
> {
> result.get(ms, TimeUnit.MILLISECONDS);
> counter.increment();
> }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org