Re: How to handle zookeeper data inconsistency

Flavio Junqueira Thu, 21 Jan 2016 07:35:02 -0800

> On 21 Jan 2016, at 15:03, Mohammad arshad <[email protected]> wrote:
> 
> Thanks, Flavio Junqueira, for your response.
> 
> Assume C received the commit request but failed before committing it. When 
> will C be synced? What event at the leader or follower will sync it up?


If C failed, then when it comes back up, it will sync up with the leader and 
learn everything that has been committed. This is part of the recovery process. 
Even though there are multiple steps, you can assume that once C is back 
online, its state will reflect all txns that have previously been committed, 
except for the ones that are still in-flight.
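
As a rough illustration of that recovery step, the sketch below models a 
follower catching up from the leader's committed log. It is a toy model and 
not ZooKeeper's actual sync protocol: the Txn tuple, the catch_up helper, and 
the zxid values are all made up for the example.

```python
from typing import List, NamedTuple


class Txn(NamedTuple):
    zxid: int        # transaction id; higher means ordered later
    op: str          # e.g. "create /znode1" or "delete /znode1"
    committed: bool  # False while the proposal is still in-flight


def catch_up(follower_log: List[Txn], leader_log: List[Txn]) -> List[Txn]:
    """Return the follower's log after it has synced with the leader.

    Only txns the leader has committed and the follower has not yet
    applied are copied over; in-flight proposals are left out.
    """
    last_zxid = follower_log[-1].zxid if follower_log else 0
    missing = [t for t in leader_log if t.committed and t.zxid > last_zxid]
    return follower_log + missing


leader_log = [
    Txn(1, "create /znode1", True),
    Txn(2, "delete /znode1", True),
    Txn(3, "create /znode2", False),  # proposed, not yet committed
]
# Assumed scenario: C crashed after applying zxid 1, before the delete.
c_log = [Txn(1, "create /znode1", True)]

c_log = catch_up(c_log, leader_log)
print([t.op for t in c_log])
```

After the catch-up, C's log contains the delete but not the in-flight create.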

> 
> Here is another scenario we faced.
> A node got deleted successfully on leader node B. But due to a network issue 
> on the leader node, the delete could not sync up to followers A and C. At 
> this moment, the leader node also goes down as faulty. 
> 

If B successfully committed the delete operation, even if the commit message 
didn't go out, then it means that at least one other node got the proposal. In 
your 3-server ensemble, a quorum has size 2, so any proposal needs to be 
persisted and acknowledged by a quorum before it is committed.
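
The quorum arithmetic can be checked with a few lines. This is only a sketch 
of majority quorums in general; the helper names quorum_size and all_quorums 
are made up for the example and are not ZooKeeper APIs.

```python
from itertools import combinations


def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def all_quorums(servers):
    """Every subset of servers that has exactly quorum size."""
    return [set(q) for q in combinations(servers, quorum_size(len(servers)))]


servers = ["A", "B", "C"]
quorums = all_quorums(servers)

# Any two majority quorums share at least one server, so a server that
# acknowledged the delete is part of whichever quorum forms next.
overlaps = all(q1 & q2 for q1 in quorums for q2 in quorums)
print(quorum_size(3), overlaps)
```

For the 3-server ensemble this gives a quorum size of 2, and every pair of 
quorums overlaps in at least one server.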

> Now one of A and C becomes leader, but it has inconsistent data (the delete 
> is not executed there).
> 

It will be executed there because the new leader, A or C, needs to commit the 
initial state of the new epoch and it will do it based on its log state, which 
will include the delete operation.
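
A simplified model of that leader change is sketched below. It assumes a toy 
election rule (highest last zxid wins), whereas the real election also takes 
epochs and server ids into account. The elect_and_sync helper and the zxid 
values are hypothetical.

```python
def elect_and_sync(logs):
    """logs maps server name -> list of zxids it has persisted, in order.

    Returns (leader, logs after the surviving servers have synced).
    """
    # Toy rule: the server with the highest last zxid wins; ties go to
    # the lexicographically larger name.
    leader = max(logs, key=lambda s: (logs[s][-1] if logs[s] else 0, s))
    # Followers adopt the new leader's log as the initial state of the epoch.
    synced = {server: list(logs[leader]) for server in logs}
    return leader, synced


DELETE_ZXID = 2  # hypothetical zxid of the delete of znode1

# B (the old leader) is down. B committed the delete, so at least one of
# A and C must have persisted and acknowledged the proposal; say it was A.
survivors = {"A": [1, DELETE_ZXID], "C": [1]}

leader, synced = elect_and_sync(survivors)
print(leader, synced)
```

In this model A becomes leader because its log is ahead, and C ends up with 
the delete as well once it has synced.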

> As far as I know, this behavior is fine as per the current ZK design. But to 
> solve the above data inconsistency issue, any suggestions? I thought of 
> committing the delete not only on the leader but on at least N/2 nodes in 
> the same client call, and only then marking the delete as successful.

No, not fine. If a quorum has acknowledged a txn, then we guarantee that the 
corresponding operation is durable. The thing that is ok as per ZK design is 
that the delete operation is acknowledged, and a particular server, say C, only 
receives it a little later. In this case, it could happen that a client reads 
the ZK state but misses the delete. However, if the client keeps reading, then 
it should eventually see the delete.
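
To make the stale-read case concrete, here is a toy model of a server that 
applies a committed delete a little late while a client keeps reading. The 
LaggingServer class and read_until_deleted helper are made up for the 
example. The ZooKeeper client API also has a sync call that a client can 
issue before a read when it needs the server to catch up first; the sketch 
does not model it.

```python
class LaggingServer:
    """Toy server that applies committed txns some time after receiving them."""

    def __init__(self, znodes):
        self.znodes = set(znodes)
        self.pending = []  # committed txns not yet applied locally

    def receive(self, op, path):
        self.pending.append((op, path))

    def apply_next(self):
        if self.pending:
            op, path = self.pending.pop(0)
            if op == "delete":
                self.znodes.discard(path)
            elif op == "create":
                self.znodes.add(path)

    def exists(self, path):
        return path in self.znodes


def read_until_deleted(server, path, max_reads=5):
    """Poll exists(); return how many reads it took to see the delete, or None."""
    for attempt in range(1, max_reads + 1):
        if not server.exists(path):
            return attempt
        server.apply_next()  # stands in for time passing between reads
    return None


c = LaggingServer({"/znode1"})
c.receive("delete", "/znode1")  # a quorum (A and B) already acknowledged this

first_read = c.exists("/znode1")  # stale: C has not applied the delete yet
reads_needed = read_until_deleted(c, "/znode1")
print(first_read, reads_needed)
```

The first read still sees znode1; a later read on the same server no longer 
does.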

Another thing that is fine is that if no quorum acknowledges a txn, then the 
txn isn't durable. 

-Flavio
 
> 
> -----Original Message-----
> From: Flavio Junqueira [mailto:[email protected]] 
> Sent: 21 January 2016 19:11
> To: [email protected]
> Cc: dev
> Subject: Re: How to handle zookeeper data inconsistency
> 
> Hi Mohammad,
> 
> A delete operation only needs to reach a quorum to complete, and A and B form 
> a quorum in your 3-server ensemble. If the delete operation never gets 
> propagated to C and other write operations that have been ordered later 
> complete on C, then you have an issue. If C simply stops receiving updates, 
> then you have a problem with your C server and it could be a problem with ZK 
> or just the environment.
> 
> If there have been write operations ordered after the delete and server C has 
> seen those but not the delete, then I'd recommend that you have a look at the 
> txn logs with the log formatter.
> 
>> Shall I check exists from the leader only? But even the leader can have 
>> some node undeleted in the above scenario.
> 
> There is no such requirement, but you need to be aware that server C could 
> definitely make an update visible later compared to other servers. ZooKeeper 
> doesn't guarantee that updates are visible to all clients as soon as they are 
> acknowledged.
> 
> I'd also search for jiras, especially if you're deleting an ephemeral. 
> 
> -Flavio
> 
>> On 21 Jan 2016, at 13:24, Mohammad arshad <[email protected]> wrote:
>> 
>> Hi All,
>> I came across a scenario where ZooKeeper was left in an inconsistent 
>> state (which is valid as per the ZooKeeper theory), and because of 
>> that dependent applications behaved wrongly. The scenario is as follows:
>> 
>> 1) I have a three-server ZooKeeper cluster; let's say the servers are A, B 
>> and C. B is the leader.
>> 2) In one successful delete operation, a znode znode1 was deleted from A and 
>> B but somehow not deleted from C. The reason it was not deleted from C can 
>> be that either the proposal or the commit failed.
>> 3) Now for an application that is connected to C, ZooKeeper.exists 
>> returns znode1, and that is why the application enters the node-exists 
>> flow, which is wrong.
>> 
>> Shall I check exists from the leader only? But even the leader can have 
>> some node undeleted in the above scenario. Any guideline to handle the 
>> above-said valid data inconsistency?
>> 
>> Any suggestion/help is highly appreciated.
>> 
>> Best Regards
>> Mohammad Arshad
>> HUAWEI TECHNOLOGIES CO.LTD.
>> Huawei Technologies India Pvt. Ltd.
>> Near EPIP Industrial Area, Kundalahalli Village Whitefield, 
>> Bangalore-560066 www.huawei.com<http://www.huawei.com/>
>> ----------------------------------------------------------------------
>> -------------------------------------------
>> This e-mail and its attachments contain confidential information from 
>> HUAWEI, which is intended only for the person or entity whose address 
>> is listed above. Any use of the information contained herein in any 
>> way (including, but not limited to, total or partial disclosure, 
>> reproduction, or dissemination) by persons other than the intended
>> recipient(s) is prohibited. If you receive this e-mail in error, 
>> please notify the sender by phone or email immediately and delete it!
>> 
>
