Re: improving tolerance to network failures

2018-10-23 Thread Ted Dunning
Michael, I wouldn't characterize the current proposal as broken so much as it talks about connection balancing rather than server balancing. Other than that, I think I agree with what you are saying. So we have two folks with a feeling that server balancing from the client side is significantly

Re: improving tolerance to network failures

2018-10-23 Thread Michael Han
>> Will there be a code effect? There will be - the current rebalancing algorithm will break unless StaticHostProvider.updateServerList is changed to make it aware that multiple server addresses belong to the same server. For example, currently if we add a new server through reconfig, the

Re: improving tolerance to network failures

2018-10-23 Thread Ted Dunning
There have been several comments on the document. I will be porting discussions from the document back to the mailing list each day. Alex Shraer makes a good point that with the design as stated, there is no provision for dealing with the rebalancing of client connections during dynamic

improving tolerance to network failures

2018-10-22 Thread Ted Dunning
I am starting work on a project to improve the tolerance of ZooKeeper to network failures and would like feedback on the idea. The problem is that in environments where link bonding is forbidden (they exist, trust me), ZooKeeper is sensitive to the loss of a single switch or a few network