persist happens in 2.

2015-01-05 18:55 GMT+08:00 Rakesh R <[email protected]>:
> Hi,
>
> In your case only A and E have committed the latest transaction; let me
> call it txid=1000. Servers B, C and D are down at this time and don't
> have the changes of txid=1000.
> Also, when B, C and D are restarted, servers A and E are not available.
> The newly elected Leader has seen at most txid=999, and when A and E
> rejoin the quorum they will 'truncate' their logs by deleting txid=1000.
> As you said, the write operation performed will be lost in this case.
>
> I can see this is a rather tricky case of double or multiple failures.
> But I agree this can happen.
> My point is, if a user wants to maintain a reliable cluster then he should
> keep in mind that more failures than the tolerated number of failures
> may lead to unexpected results like this.
>
> Best Regards,
> Rakesh
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: 05 January 2015 15:56
> To: [email protected]
> Subject: Re: Question about the two-phrase commit
>
> Could someone help with this question? Thanks.
>
> [email protected]
>
> From: [email protected]
> Date: 2015-01-05 15:05
> To: [email protected]
> Subject: Question about the two-phrase commit
>
> Hi, Zookeepers,
>
> I have a question about the two-phase commit in ZooKeeper. When a write
> operation happens:
>
> 1. The Leader proposes the change to all the followers (proposal/vote
>    phase).
> 2. Followers ack the proposal and write the change to disk (but
>    not persisted yet?).
> 3. When the Leader receives acks from a majority of followers, the
>    Leader asks the followers to commit the change.
> 4. When each follower receives the commit request, the follower commits
>    the change (persists the change forever?).
>
> In the above process, something rare could happen:
>
> a. Say there are 5 nodes in the quorum (1 leader E, 4 followers A, B, C, D).
> b. The write operation is issued by a client that connects to follower A.
> c. A commits the change and responds to the client that the write
>    succeeded.
> d. Assume that by the time the response from A reaches the client,
>    telling it that the write was successful, the other followers
>    (B, C, D) haven't even received the commit request, and B, C, D go down
>    without getting a chance to commit the change.
>
> Then shut down A and E.
> Restart B, C, D, making sure that they elect a leader, and start A
> later (A's latest transactions will be lost, because A will sync with the
> Leader).
>
> When this is done, is the write operation done before lost?
>
> Is there anything I am missing in the above process? Thanks.
>
> [email protected]
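
To make the "persist happens in 2" point concrete, here is a minimal sketch of the quorum arithmetic for the 5-node scenario in the question. It assumes that a follower has the proposal durably in its log by the time it sends the ack, and that the leader commits only after a majority (counting itself) has logged it. The function names (`majority`, `quorums`, `can_commit`, `every_new_quorum_sees`) are hypothetical and only model the counting; they are not ZooKeeper APIs.

```python
from itertools import combinations

# The five servers from the question; E is the leader in that scenario.
NODES = ["A", "B", "C", "D", "E"]


def majority(n):
    """Smallest number of servers that forms a quorum out of n."""
    return n // 2 + 1


def quorums(nodes):
    """Yield every subset of nodes that contains at least a majority."""
    for size in range(majority(len(nodes)), len(nodes) + 1):
        for subset in combinations(nodes, size):
            yield set(subset)


def can_commit(logged_by, nodes=NODES):
    """Assumption: a commit needs the proposal logged on a majority."""
    return len(set(logged_by)) >= majority(len(nodes))


def every_new_quorum_sees(logged_by, nodes=NODES):
    """True if every possible quorum contains a server that logged the txn."""
    return all(q & set(logged_by) for q in quorums(nodes))


if __name__ == "__main__":
    # Only A and E logged txid=1000: under the assumption, no commit yet.
    print(can_commit({"A", "E"}))                  # False
    # A, E and one more follower logged it: the leader may commit.
    print(can_commit({"A", "B", "E"}))             # True
    # Any later quorum, including {B, C, D}, then holds a copy.
    print(every_new_quorum_sees({"A", "B", "E"}))  # True
    # With copies on A and E only, the quorum {B, C, D} would hold none.
    print(every_new_quorum_sees({"A", "E"}))       # False
```

Under these assumptions, step c of the scenario (A answering the client) implies that at least one of B, C, D already had txid=1000 on disk, so a quorum formed from B, C, D alone still contains a copy of it. Whether that copy ends up in the new leader's history depends on the election and sync rules, which this sketch does not model.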
