The most likely explanation is that the repair failed and you didn't notice, or that you didn't actually repair every host / every token range.
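For completeness, a minimal sketch of what "repair every host / every range" usually means after an RF change. This is an assumption about the setup: "my_keyspace" is a placeholder keyspace name, and on some Cassandra versions plain `nodetool repair` runs an incremental repair, which can skip data the new replica never received, so `-full` is forced here.

```shell
# Hedged sketch: after raising RF, run a FULL (not incremental) repair.
# "my_keyspace" is a placeholder -- substitute your own keyspace name.
run_full_repair() {
    ks="$1"
    # -pr repairs only this node's primary ranges; run the command on
    # EVERY node so each range in the cluster is repaired exactly once.
    nodetool repair -full -pr "$ks"
}

# Guard so the sketch degrades gracefully when run off-cluster:
if command -v nodetool >/dev/null 2>&1; then
    run_full_repair my_keyspace
else
    echo "nodetool not found; run this on each Cassandra node"
fi
```

If any node was skipped, or a repair session failed partway and the error was missed, some token ranges will still be missing the data on the new replica, which matches the symptoms described below.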
Which version are you using? How did you run the repair?

On Tue, Oct 12, 2021 at 4:33 AM Isaeed Mohanna <isa...@xsense.co> wrote:

> Hi
>
> Yes, I am sacrificing consistency to gain higher availability and faster
> speed, but my problem is not with newly inserted data that is absent for
> a very short period of time; my problem is that the data that was there
> before the RF change still does not exist in all replicas, even after
> repair.
>
> It looks like my cluster configuration is RF3, but the data itself is
> still replicated as if RF2, and when the data is requested from the 3rd
> (new) replica, it is not there and an empty record is returned with
> read CL1.
>
> What can I do to force this data to be synced to all replicas as it
> should be, so that a read CL1 request will actually return a correct
> result?
>
> Thanks
>
> *From:* Bowen Song <bo...@bso.ng>
> *Sent:* Monday, October 11, 2021 5:13 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Trouble After Changing Replication Factor
>
> You have RF=3 and both read & write CL=1, which means you are asking
> Cassandra to give up strong consistency in order to gain higher
> availability and perhaps slightly faster speed, and that's what you get.
> If you want strong consistency, you will need to make sure
> (read CL + write CL) > RF.
>
> On 10/10/2021 11:55, Isaeed Mohanna wrote:
>
> Hi
>
> We had a cluster with 3 nodes with replication factor 2, and we were
> reading with consistency level ONE.
>
> We recently added a 4th node and changed the replication factor to 3.
> Once this was done, apps reading from the DB with CL1 would receive an
> empty record. Looking around, I was surprised to learn that after
> changing the replication factor, if a read request is sent to a node
> that should own the record according to the new replication factor but
> does not yet have it, an empty record is returned because of CL1; the
> record will only be written to that node after the repair operation is
> over.
> We ran the repair operation, which took days in our case (we had to
> switch the apps to CL2 to avoid serious data inconsistencies).
>
> Now the repair operations are over, but if I revert to CL1 we still get
> errors that records do not exist in the DB while they do; using CL2
> again, it works fine.
>
> Any ideas what I am missing?
>
> Is there a way to validate that the repair task has actually done what
> is needed and that the data is now replicated at RF3?
>
> Could it be a Cassandra driver issue? If I issue the request in cqlsh
> I do get the record, but I cannot know whether I am hitting the replica
> that doesn't hold the record.
>
> Thanks for your help
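The strong-consistency rule quoted in the thread, (read CL + write CL) > RF, can be sketched as a simple inequality check. The function name and the replica counts (ONE = 1, TWO = 2, QUORUM = RF // 2 + 1) are illustrative, not part of any driver API:

```python
# Hedged sketch of Cassandra's strong-consistency rule:
# every read overlaps the latest write only when the replica counts
# the two consistency levels wait for sum to MORE than the RF.

def is_strongly_consistent(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when any read replica set must intersect any write replica set."""
    return read_replicas + write_replicas > rf

RF = 3
quorum = RF // 2 + 1  # = 2 for RF=3

print(is_strongly_consistent(1, 1, RF))            # read ONE, write ONE -> False
print(is_strongly_consistent(2, 1, RF))            # read TWO, write ONE -> False (2 + 1 = 3, not > 3)
print(is_strongly_consistent(quorum, quorum, RF))  # QUORUM/QUORUM       -> True
```

Note the middle case: reading at CL2 while still writing at CL1 sums to exactly RF, so by this rule it is not strictly strongly consistent either; it worked in the thread presumably because repair had already placed the old data on enough replicas.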