[jira] [Created] (CASSANDRA-16919) cassandra local_quorum query is inconsistent

HUANG DUICAN (Jira) Tue, 07 Sep 2021 18:45:08 -0700

HUANG DUICAN created CASSANDRA-16919:
----------------------------------------


             Summary: cassandra local_quorum query is inconsistent
                 Key: CASSANDRA-16919
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16919
             Project: Cassandra
          Issue Type: Bug
            Reporter: HUANG DUICAN


cassandra version: 2.0.15
Number of nodes: dc1: 80, dc2: 80
problem:
Our replication strategy is as follows:
WITH REPLICATION = {'class':'NetworkTopologyStrategy','dc1': 3,'dc2': 3};
We ran into a problem with Cassandra: queries at local_quorum returned 
inconsistent results. We only read and write in dc1.
We write with local_quorum and then query with local_quorum as well.
Even so, repeated runs of the following statement disagreed:
select count(*) from table where partitionKey=?
The query results were inconsistent at first and only became consistent later.
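
For reference, the expectation behind this report can be checked with simple 
arithmetic. The sketch below is not Cassandra code, and the helper names are 
hypothetical. It assumes the usual quorum size of floor(RF / 2) + 1 and the 
rule that a read sees the latest acknowledged write when read replicas plus 
write replicas exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def overlap_guaranteed(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return write_acks + read_acks > rf


rf_dc1 = 3                      # 'dc1': 3 from the keyspace definition
local_quorum = quorum(rf_dc1)   # replicas contacted in the local datacenter

print(local_quorum)
print(overlap_guaranteed(rf_dc1, local_quorum, local_quorum))
```

With 3 replicas in dc1, local_quorum needs 2 of them, and 2 + 2 > 3, so 
repeated local_quorum reads after local_quorum writes would normally be 
expected to agree as long as every coordinator uses the same 3 replicas.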

For example, the first query might return 10000, the second 9998, and the 
third 9997, and the count might finally settle at 10001 (possibly because read 
repair was triggered, which led to the final stabilization).
During this period we carried out a large-scale expansion of the cluster and 
made sure that cleanup was run on every machine. We also found that 
nodetool getendpoints <keyspace> <table> <key> returned different results on 
different machines. In the end, the getendpoints result listed 4 machines in dc1.

We then ran nodetool getsstables on those 4 machines. Only 3 of them returned 
results; the fourth returned nothing. At the same time, we hit a similar 
problem with another partitionKey. That partitionKey was queried only once, 
but because we record the total count for each partitionKey elsewhere, we can 
confirm that the count returned for it was incorrect.

After we restarted the machines in dc1 one by one, the problem went away. 
The total count for the partitionKey now matches the value we recorded, and 
repeating the same query no longer changes the result. 
I therefore suspect that gossip propagates node information too slowly, so 
that different nodes select different replicas for a query, which may lead to 
the inconsistent results.
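
One way to see why a disagreement about replicas could matter is to enumerate 
the quorums by brute force. This is only an illustration of the suspicion 
above, not a confirmed diagnosis: the node names n1 to n4 and the two "views" 
are hypothetical, standing in for the four dc1 endpoints reported earlier. It 
assumes the writing coordinator and the reading coordinator each pick 2 
replicas out of their own idea of the 3 owners.

```python
from itertools import combinations


def disjoint_quorum_pairs(write_view, read_view, quorum_size):
    """Return (write_set, read_set) pairs that share no node."""
    pairs = []
    for w in combinations(sorted(write_view), quorum_size):
        for r in combinations(sorted(read_view), quorum_size):
            if not set(w) & set(r):
                pairs.append((w, r))
    return pairs


# Both coordinators agree on the same 3 replicas: every pair overlaps.
same_view = disjoint_quorum_pairs({"n1", "n2", "n3"}, {"n1", "n2", "n3"}, 2)

# Hypothetical stale view: the reader believes n4 owns the key instead of n1.
stale_view = disjoint_quorum_pairs({"n1", "n2", "n3"}, {"n2", "n3", "n4"}, 2)

print(len(same_view))
print(stale_view)
```

When both views match, no disjoint pair exists. With the mismatched views, a 
write acknowledged by n1 and n2 can be missed by a read served from n3 and n4, 
which would be consistent with counts that vary between runs.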



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
