The most likely explanation is that the repair failed and you didn't notice, or that you didn't actually repair every host / every token range.
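For completeness, a minimal sketch of what "repair every host / every range" usually means after an RF change. This is an assumption about the setup: "my_keyspace" is a placeholder keyspace name, and on some Cassandra versions plain `nodetool repair` runs an incremental repair, which can skip data the new replica never received, so `-full` is forced here.

```shell
# Hedged sketch: after raising RF, run a FULL (not incremental) repair.
# "my_keyspace" is a placeholder -- substitute your own keyspace name.
run_full_repair() {
    ks="$1"
    # -pr repairs only this node's primary ranges; run the command on
    # EVERY node so each range in the cluster is repaired exactly once.
    nodetool repair -full -pr "$ks"
}

# Guard so the sketch degrades gracefully when run off-cluster:
if command -v nodetool >/dev/null 2>&1; then
    run_full_repair my_keyspace
else
    echo "nodetool not found; run this on each Cassandra node"
fi
```

If any node was skipped, or a repair session failed partway and the error was missed, some token ranges will still be missing the data on the new replica, which matches the symptoms described below.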
Which version are you using? How did you run the repair?

On Tue, Oct 12, 2021 at 4:33 AM Isaeed Mohanna <isa...@xsense.co> wrote:

> Hi
>
> Yes, I am sacrificing consistency to gain higher availability and faster
> speed, but my problem is not with newly inserted data that is absent for
> a very short period of time; my problem is that the data that was there
> before the RF change still does not exist in all replicas, even after
> repair.
>
> It looks like my cluster configuration is RF3, but the data itself is
> still replicated as if RF2, and when the data is requested from the 3rd
> (new) replica, it is not there and an empty record is returned with
> read CL1.
>
> What can I do to force this data to be synced to all replicas as it
> should be, so that a read CL1 request will actually return a correct
> result?
>
> Thanks
>
> *From:* Bowen Song <bo...@bso.ng>
> *Sent:* Monday, October 11, 2021 5:13 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Trouble After Changing Replication Factor
>
> You have RF=3 and both read & write CL=1, which means you are asking
> Cassandra to give up strong consistency in order to gain higher
> availability and perhaps slightly faster speed, and that's what you get.
> If you want strong consistency, you will need to make sure
> (read CL + write CL) > RF.
>
> On 10/10/2021 11:55, Isaeed Mohanna wrote:
>
> Hi
>
> We had a cluster with 3 nodes with replication factor 2, and we were
> reading with consistency level ONE.
>
> We recently added a 4th node and changed the replication factor to 3.
> Once this was done, apps reading from the DB with CL1 would receive an
> empty record. Looking around, I was surprised to learn that after
> changing the replication factor, if a read request is sent to a node
> that should own the record according to the new replication factor but
> does not yet have it, an empty record is returned because of CL1; the
> record will only be written to that node after the repair operation is
> over.
> We ran the repair operation, which took days in our case (we had to
> switch the apps to CL2 to avoid serious data inconsistencies).
>
> Now the repair operations are over, but if I revert to CL1 we still get
> errors that records do not exist in the DB while they do; using CL2
> again, it works fine.
>
> Any ideas what I am missing?
>
> Is there a way to validate that the repair task has actually done what
> is needed and that the data is now replicated at RF3?
>
> Could it be a Cassandra driver issue? If I issue the request in cqlsh
> I do get the record, but I cannot know whether I am hitting the replica
> that doesn't hold the record.
>
> Thanks for your help
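The strong-consistency rule quoted in the thread, (read CL + write CL) > RF, can be sketched as a simple inequality check. The function name and the replica counts (ONE = 1, TWO = 2, QUORUM = RF // 2 + 1) are illustrative, not part of any driver API:

```python
# Hedged sketch of Cassandra's strong-consistency rule:
# every read overlaps the latest write only when the replica counts
# the two consistency levels wait for sum to MORE than the RF.

def is_strongly_consistent(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when any read replica set must intersect any write replica set."""
    return read_replicas + write_replicas > rf

RF = 3
quorum = RF // 2 + 1  # = 2 for RF=3

print(is_strongly_consistent(1, 1, RF))            # read ONE, write ONE -> False
print(is_strongly_consistent(2, 1, RF))            # read TWO, write ONE -> False (2 + 1 = 3, not > 3)
print(is_strongly_consistent(quorum, quorum, RF))  # QUORUM/QUORUM       -> True
```

Note the middle case: reading at CL2 while still writing at CL1 sums to exactly RF, so by this rule it is not strictly strongly consistent either; it worked in the thread presumably because repair had already placed the old data on enough replicas.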