Hi Ibrahim,

The example below is taken from your earlier mail thread:
>>>>> 1. leader (L) sends a proposal p with zxid = 10 to F1 and F2.
>>>>> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
>>>>> crashes before receiving P10. L has not received any ACKs.

My thoughts on the above scenario: in your case, the ZK client sees a
successful response from F1. Then assume F2 joins the quorum first and L
becomes the leader again. But the newly formed quorum will not have the
zxid=10 transaction. This will make the cluster inconsistent, won't it?

Apart from the above case, I don't see any other problems with a 3-node
cluster. The above data-loss case can be excluded by assuming that more
than the tolerated number of server failures may affect cluster
consistency and result in data loss.

But I feel this optimization would have more problem cases if we scale
the cluster beyond 3 servers. For now I'm not thinking in that direction,
as your case is limited to a 3-node cluster.

Regards,
Rakesh

On Tue, Sep 29, 2015 at 2:28 PM, Ibrahim El-sanosi (PGR) <
i.s.el-san...@newcastle.ac.uk> wrote:

> Yes Alex, in my post I mentioned that this (small) optimization can only
> work with a 3-server cluster.
>
> Who could confirm that the optimization can work?
>
> Ibrahim
>
> -----Original Message-----
> From: Alexander Shraer [mailto:shra...@gmail.com]
> Sent: Tuesday, September 29, 2015 12:11 AM
> To: user@zookeeper.apache.org
> Subject: Re: 3-server Zab cluster
>
> I'm not 100% sure whether operations that were pending on the leader are
> sent out during sync when this leader loses quorum and is re-elected. If
> so, then maybe you're right. But in any case, this would not work for 5
> or more servers...
>
> On Mon, Sep 28, 2015 at 3:51 PM, Ibrahim El-sanosi (PGR) <
> i.s.el-san...@newcastle.ac.uk> wrote:
>
> > Thank you Alex for replying.
> >
> > When you said "the leader gets re-elected and the operation is
> > truncated from logs at other servers",
> > I thought the new leader would sync its log with the other followers
> > (synchronization phase), so that the operation would be committed by
> > the new quorum. Let me lay out the scenario as steps:
> >
> > 1. leader (L) sends a proposal p with zxid = 10 to F1 and F2.
> > 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
> > crashes before receiving P10. L has not received any ACKs.
> >
> > Possible solution (1)
> > The leader will move to the LOOKING phase as there is no quorum
> > supporting its leadership. Now assume F2 wakes up. F2 forms a quorum
> > with L (the previous leader); L becomes the new leader again as it has
> > the latest zxid (10) in its log.
> > L syncs its state with F2; as a result L, F1 (before crashing) and F2
> > commit P10. Is that correct?
> >
> > Possible solution (2)
> > The leader will move to the LOOKING phase as there is no quorum
> > supporting its leadership. Now assume F1 (with zxid = 10 committed)
> > wakes up. I am not sure who should be the leader (F1 with zxid = 10
> > committed, or L, the previous leader, with zxid = 10 logged). I think
> > F1 becomes the new leader as it has zxid = 10 committed. F1 forms a
> > quorum with L (the previous leader), and F1 becomes the new leader as
> > it has the latest zxid (10). F1 (new leader) syncs its state with L
> > (previous leader, now a follower); as a result zxid 10 is committed by
> > the new quorum. Is that correct?
> >
> > What do you think?
> >
> > Ibrahim
> >
> > -----Original Message-----
> > From: Alexander Shraer [mailto:shra...@gmail.com]
> > Sent: Monday, September 28, 2015 07:27 PM
> > To: user@zookeeper.apache.org
> > Cc: d...@zookeeper.apache.org
> > Subject: Re: 3-server Zab cluster
> >
> > Committing locally when sending an ACK at a server would lead to loss
> > of consistency - it is possible that this is the only server that
> > acks, e.g., this server is temporarily disconnected from the leader,
> > the leader gets re-elected and the operation is truncated from logs at
> > other servers. It's OK to ACK it, but it's not OK to commit, since this
> > exposes it to users as a committed operation that they can see.
> >
> > On Mon, Sep 28, 2015 at 4:19 AM, Ibrahim El-sanosi (PGR) <
> > i.s.el-san...@newcastle.ac.uk> wrote:
> >
> > > In Zab, assume we have a cluster consisting of 3 servers. To deliver
> > > a write request, it must run 3 communication steps: proposal,
> > > acknowledgement and commit.
> > > As Zab uses reliable FIFO channels, it is possible to remove the
> > > commit round. As soon as a follower receives a proposal, it logs it,
> > > sends an ACK and commits locally. Upon receiving an ACK from any
> > > follower, the leader commits the proposal locally; no COMMIT message
> > > needs to be sent to followers. In this case, all servers commit a
> > > proposal in two communication steps, reducing latency, particularly
> > > at followers.
> > >
> > > Note that this optimization can only work in a 3-server cluster
> > > (a follower reaches a majority as soon as it acks).
> > > Does anyone see any problems with this (small) optimization?
> > > Ibrahim
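
For reference, the two-step scenario discussed in this thread can be
replayed with a small Python sketch. It is only an illustration, not
ZooKeeper code: the names Server, replay_scenario and early_commit are
hypothetical, and it assumes that L writes the proposal to its own log
before sending it, which the thread implies but does not state outright.
It records where zxid 10 is logged and where it is already visible to
clients after steps 1 and 2. It does not model leader election or the
synchronization phase.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    log: list = field(default_factory=list)        # zxids written to the local log
    committed: list = field(default_factory=list)  # zxids visible to clients
    up: bool = True


def replay_scenario(early_commit: bool) -> dict:
    """Replay steps 1-2 from the thread on a 3-server cluster.

    early_commit=True models the proposed optimization (a follower commits
    as soon as it has ACKed); False models the usual PROPOSAL/ACK/COMMIT flow.
    """
    leader, f1, f2 = Server("L"), Server("F1"), Server("F2")
    zxid = 10

    # Step 1: L sends proposal zxid=10 to F1 and F2.
    # Assumption: L has logged the proposal itself before sending it.
    leader.log.append(zxid)

    # Step 2: F2 crashes before receiving the proposal.
    f2.up = False

    # F1 logs the proposal and sends an ACK.
    f1.log.append(zxid)
    if early_commit:
        f1.committed.append(zxid)  # visible to F1's clients from here on
    # F1 crashes; its ACK never reaches L, so L never commits.
    f1.up = False

    servers = (leader, f1, f2)
    return {
        "logged_on": [s.name for s in servers if zxid in s.log],
        "committed_on": [s.name for s in servers if zxid in s.committed],
        "alive": [s.name for s in servers if s.up],
    }


if __name__ == "__main__":
    for mode in (False, True):
        print("early_commit =", mode, replay_scenario(mode))
```

With early_commit=True, F1 has already exposed zxid 10 to its clients
while L, the only server still alive, has not committed it. Whether the
transaction survives then depends entirely on what election and sync do
next, which is the open question in this thread.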