I see. In that case, I suspect the repair wasn't fully successful. Try
repairing the newly joined node again, and make sure it actually finishes
successfully.
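For example (the keyspace name here is only a placeholder), running
something like "nodetool repair -full my_keyspace" on the new node and
then confirming in its system.log that the repair command reports
finishing successfully should get the pre-existing data streamed to the
new replica before you switch reads back to CL1.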
On 12/10/2021 12:23, Isaeed Mohanna wrote:
Hi
Yes, I am sacrificing consistency to gain higher availability and
faster speed, but my problem is not with newly inserted data that is
missing for a very short period of time; my problem is that data which
existed before the RF change still does not exist on all replicas,
even after repair.
It looks like my cluster configuration is RF3 but the data itself is
still effectively at RF2: when the data is requested from the 3rd (new)
replica, it is not there and an empty record is returned with read CL1.
What can I do to force this data to be synced to all replicas as it
should be, so that a read at CL1 will actually return a correct result?
Thanks
*From:* Bowen Song <bo...@bso.ng>
*Sent:* Monday, October 11, 2021 5:13 PM
*To:* user@cassandra.apache.org
*Subject:* Re: Trouble After Changing Replication Factor
You have RF=3 and both read & write CL=1, which means you are asking
Cassandra to give up strong consistency in order to gain higher
availability and perhaps slightly faster speed, and that's what you get.
If you want strong consistency, you will need to make sure that
(read CL + write CL) > RF.
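As a rough sketch (not something from this thread; it assumes the
DataStax Python driver, and the contact point, keyspace and table names
below are placeholders), using QUORUM for both reads and writes
satisfies that rule with RF=3, since 2 + 2 > 3:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT

    # Default every request to QUORUM: with RF=3 that means 2 replicas,
    # so read CL + write CL = 2 + 2 = 4 > RF = 3.
    profile = ExecutionProfile(consistency_level=ConsistencyLevel.QUORUM)
    cluster = Cluster(['10.0.0.1'],  # placeholder contact point
                      execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    session = cluster.connect('my_keyspace')  # placeholder keyspace

    # Both the write and the read now wait for 2 of the 3 replicas.
    session.execute("INSERT INTO my_table (id, value) VALUES (%s, %s)", (1, 'x'))
    row = session.execute("SELECT value FROM my_table WHERE id = %s", (1,)).one()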
On 10/10/2021 11:55, Isaeed Mohanna wrote:
Hi
We had a cluster with 3 nodes and replication factor 2, and we
were reading with consistency level ONE.
We recently added a 4th node and changed the replication factor
to 3. Once this was done, apps reading from the DB with CL1 would
receive an empty record. Looking around, I was surprised to learn
that after changing the replication factor, if a read request is
sent to a node that should own the record according to the new
replication factor but does not have it yet, an empty record is
returned because of CL1; the record is only written to that node
after the repair operation is over.
We ran the repair operation, which took days in our case (we had to
change the apps to CL2 to avoid serious data inconsistencies).
Now the repair operations are over, but if I revert to CL1 we are
still getting errors that records do not exist in the DB while they
do; using CL2 again, it works fine.
Any ideas what I am missing?
Is there a way to validate that the repair task has actually done
what is needed and that the data is now actually replicated at RF3?
Could it be a Cassandra driver issue? If I issue the request in
cqlsh I do get the record, but I cannot know whether I am hitting
the replica that doesn't hold the record.
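(Perhaps running the same SELECT in cqlsh with TRACING ON would at
least show which replicas are being contacted? I am only guessing here.)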
Thanks for your help