Jack: Yeah, I understood that you were only killing one ZK at a time.
I think Walter and Shawn are pointing you in the right direction.

On Fri, Aug 31, 2018 at 12:53 PM Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 8/31/2018 12:14 PM, Jack Schlederer wrote:
> > Our working hypothesis is that Solr's JVM is caching the IP addresses for
> > the ZK hosts' DNS names when it starts up, and doesn't re-query DNS for
> > some reason when it finds that that IP address is no longer reachable
> > (i.e., when a ZooKeeper node dies and spins up at a different IP).
>
> It might be the Solr JVM that's doing this, but it is NOT Solr code. It
> is ZooKeeper code.
>
> Solr incorporates the ZooKeeper jar and uses the ZooKeeper API for all
> interaction with ZooKeeper. There is nothing we can do for this DNS
> problem -- it is a problem that must be raised with the ZooKeeper project.
>
> As Walter hinted, ZooKeeper 3.4.x is not capable of dynamically
> adding/removing servers to/from the ensemble. To do this successfully,
> all ZK servers and all ZK clients must be upgraded to 3.5.x. Solr is a
> ZK client when running in cloud mode. The 3.5.x version of ZK is
> currently in beta. When a stable version is released, Solr will have
> its dependency upgraded in the next release. We do not know if you can
> successfully replace the ZK jar in Solr with a 3.5.x version without
> making changes to the code.
>
> Thanks,
> Shawn
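
For what it's worth, the behaviour described in the quoted hypothesis (resolve the name once at startup, then keep using that IP) can be illustrated with a small Java sketch. This is not ZooKeeper or Solr code. The class `ReResolveSketch`, its methods, the `FAKE_DNS` map, and the host name `zk1.example.internal` are all made up for illustration, and the map stands in for real DNS so the example runs offline. The only JDK calls used are `InetAddress.getByAddress`, the `InetSocketAddress` constructor, and `InetSocketAddress.createUnresolved`.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

public class ReResolveSketch {
    // Stand-in for DNS so the example runs offline: host name -> current IPv4 bytes.
    static final Map<String, byte[]> FAKE_DNS = new HashMap<>();

    // Hypothetical helper: look the name up in FAKE_DNS at the moment of the call.
    static InetAddress lookup(String host) {
        byte[] ip = FAKE_DNS.get(host);
        if (ip == null) {
            throw new IllegalStateException("no record for " + host);
        }
        try {
            // Builds an InetAddress from raw bytes; no name service is consulted.
            return InetAddress.getByAddress(host, ip);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    // Pattern A: resolve once at startup; the IP is frozen inside the object.
    static InetSocketAddress resolveOnce(String host, int port) {
        return new InetSocketAddress(lookup(host), port);
    }

    // Pattern B: keep only the name; no lookup happens here.
    static InetSocketAddress keepName(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    // Pattern B, continued: resolve again right before each connection attempt.
    static InetSocketAddress resolveNow(InetSocketAddress byName) {
        return new InetSocketAddress(lookup(byName.getHostString()), byName.getPort());
    }

    public static void main(String[] args) {
        FAKE_DNS.put("zk1.example.internal", new byte[] {10, 0, 0, 5});
        InetSocketAddress frozen = resolveOnce("zk1.example.internal", 2181);
        InetSocketAddress byName = keepName("zk1.example.internal", 2181);

        // The ZK node dies and comes back at a different IP.
        FAKE_DNS.put("zk1.example.internal", new byte[] {10, 0, 0, 9});

        // Pattern A still points at the old IP.
        System.out.println(frozen.getAddress().getHostAddress());             // prints 10.0.0.5
        // Pattern B picks up the new record because it resolves at connect time.
        System.out.println(resolveNow(byName).getAddress().getHostAddress()); // prints 10.0.0.9
    }
}
```

A client that follows pattern A keeps dialing the dead IP after the node moves, which matches the symptom you described. Whether the ZooKeeper client bundled with your Solr version actually behaves this way is something to confirm with the ZooKeeper project, as Shawn says.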