Sounds fine with me, probably should make it a flaggable option.

C
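
For illustration, here is a minimal sketch of what a flaggable re-resolve on reconnect could look like. The class `FlaggedHostResolver`, its method names, and the flag are all hypothetical; this is not the ZooKeeper client's actual code, only an outline of the idea discussed in the quoted thread. With the flag off it resolves once and keeps the shuffled list, which is the behavior DNS RR users rely on; with the flag on it resolves and shuffles again on each reconnect.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of the ZooKeeper client API.
final class FlaggedHostResolver {
    private final String connectString;          // e.g. "zoo1:2181,zoo2:2181"
    private final boolean reResolveOnReconnect;  // the proposed flag (assumed name)
    private List<InetSocketAddress> addresses;   // resolved and shuffled at startup

    FlaggedHostResolver(String connectString, boolean reResolveOnReconnect) {
        this.connectString = connectString;
        this.reResolveOnReconnect = reResolveOnReconnect;
        this.addresses = resolveAndShuffle();
    }

    // Resolve every host in the connect string to all of its IPs, then shuffle.
    private List<InetSocketAddress> resolveAndShuffle() {
        List<InetSocketAddress> resolved = new ArrayList<>();
        for (String hostPort : connectString.split(",")) {
            int colon = hostPort.lastIndexOf(':');
            String host = hostPort.substring(0, colon).trim();
            int port = Integer.parseInt(hostPort.substring(colon + 1).trim());
            try {
                for (InetAddress ip : InetAddress.getAllByName(host)) {
                    resolved.add(new InetSocketAddress(ip, port));
                }
            } catch (UnknownHostException e) {
                throw new UncheckedIOException(e);
            }
        }
        Collections.shuffle(resolved);
        return resolved;
    }

    // Called on reconnect: reuse the startup list unless the flag is set.
    List<InetSocketAddress> addressesForReconnect() {
        if (reResolveOnReconnect) {
            addresses = resolveAndShuffle();
        }
        return addresses;
    }

    public static void main(String[] args) {
        FlaggedHostResolver resolver = new FlaggedHostResolver("localhost:2181", true);
        System.out.println(resolver.addressesForReconnect());
    }
}
```

Note that the lookups still go through the JVM's own DNS cache, so a re-resolve only sees a changed record once that cache entry has expired.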
On Mon, Jan 9, 2012 at 3:33 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> >> If you just have machine names in a list that you pass in, then yes, we
> >> could re-resolve on every reconnect and you could just re-alias that name
> >> to a new IP. But you'll have to put in logic that will do that but not
> >> break people using DNS RR.
>
> Having a list of machine names that can be changed to point to new IPs
> seems reasonable too. To be able to do the upgrade without having to
> restart all clients, besides turning off DNS caching in the JVM, we
> still have to solve the problem of zookeeper client caching the IPs in
> code. Having 2 levels of DNS caching, one in the JVM and one in code
> (which cannot be turned off) doesn't look like a good idea. Unless I'm
> missing the purpose of such IP caching in zookeeper?
>
> >> I realize that moving machines is difficult when you have lots of clients.
> >> I'm a bit surprised your admins can't maintain machine IP addresses on a
> >> machine move given a cluster of that complexity, though
>
> It's not like it can't be done, but it definitely has quite some
> operational overhead. We are trying to brainstorm various approaches
> and come up with one that will involve the least overhead on such
> upgrades going forward.
>
> Having said that, seems like re-resolving host names in reconnect
> doesn't look like a bad idea, provided it doesn't break the DNS RR use
> case. If that sounds good, can I go ahead and file a JIRA for this?
>
> Thanks,
> Neha
>
> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <cami...@apache.org>
> wrote:
> > We don't shuffle IPs after the initial resolution of IP addresses.
> >
> > In DNS RR, you resolve to a list of IPs, shuffle these, and then we round
> > robin through them trying to connect.
> > If you re-resolve on every round-robin, you have to put in logic to know
> > which ones have changed and somehow maintain that shuffle order or you
> > aren't doing a fair back end round robin, which people using the ZK
> > client against DNS RR are relying on today.
> >
> > If you just have machine names in a list that you pass in, then yes, we
> > could re-resolve on every reconnect and you could just re-alias that name
> > to a new IP. But you'll have to put in logic that will do that but not
> > break people using DNS RR.
> >
> > I realize that moving machines is difficult when you have lots of clients.
> > I'm a bit surprised your admins can't maintain machine IP addresses on a
> > machine move given a cluster of that complexity, though. I also think that
> > if we're going to be putting special cases like this in we might just want
> > to go all the way to a pluggable reconnection scheme, but maybe that is too
> > aggressive.
> >
> > C
> >
> > On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <neha.narkh...@gmail.com>
> > wrote:
> >
> >> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
> >> simplest implementation which resolves a hostname to multiple IPs.
> >>
> >> Whatever method you use to map host names to IPs, the problem is that
> >> the zookeeper client code will always cache the IPs. So to be able to
> >> swap out a machine, all clients would have to be restarted, which if
> >> you have 100s of clients, is a major pain. If you want to move the
> >> entire cluster to new machines, this becomes even harder.
> >>
> >> I don't see why re-resolving host names to IPs in the reconnect logic
> >> is a problem for zookeeper, since you shuffle the list of IPs anyways.
> >>
> >> Thanks,
> >> Neha
> >>
> >> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <cami...@apache.org>
> >> wrote:
> >> > You can't sensibly round robin within the client code if you re-resolve
> >> > on every reconnect, if you're using dns rr.
> >> > If that's your goal you'd want a list of dns alias names and re-resolve
> >> > each hostname when you hit it on reconnect. But that will break people
> >> > using dns rr.
> >> > You can look into writing a pluggable reconnect logic into the zk client,
> >> > that's what would be required to do this but at the end of the day you'll
> >> > have to give your users special clients to make that work.
> >> >
> >> > C
> >> > On Jan 9, 2012 1:16 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
> >> >
> >> >> I was reading through the client code and saw that zookeeper client
> >> >> caches the server IPs during startup and maintains them for the rest of
> >> >> its lifetime. If we go with the DNS RR approach or a load balancer
> >> >> approach, and later swap out a server with a new one (with a new IP),
> >> >> all clients would have to be restarted to be able to "forget" the
> >> >> old IP and see the new one. That doesn't look like a clean approach to
> >> >> such upgrades. One way of getting around this problem is adding the
> >> >> resolution of host names to IPs in the "reconnect" logic in addition
> >> >> to the constructor. So when such upgrades happen and the client
> >> >> reconnects, it will see the new list of IPs, and wouldn't need to
> >> >> be restarted.
> >> >>
> >> >> Does this approach sound good or am I missing something here?
> >> >>
> >> >> Thanks,
> >> >> Neha
> >> >>
> >> >> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <cami...@apache.org>
> >> >> wrote:
> >> >> > DNS RR is good. I had good experiences using that for my client
> >> >> > configs for exactly the reasons you are listing.
> >> >> >
> >> >> > On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <neha.narkh...@gmail.com>
> >> >> > wrote:
> >> >> >> Thanks for the responses!
> >> >> >>
> >> >> >>>> How are your clients configured to find the zks now?
> >> >> >>
> >> >> >> Our clients currently use the list of hostnames and ports that
> >> >> >> comprise the zookeeper cluster. For example,
> >> >> >> zoo1:port1,zoo2:port2,zoo3:port3
> >> >> >>
> >> >> >>> - switch DNS,
> >> >> >>> - wait for caches to die,
> >> >> >>
> >> >> >> This is something we thought about however, if I understand it
> >> >> >> correctly, doesn't JVM cache DNS entries forever until it is restarted?
> >> >> >> We haven't specifically turned DNS caching off on our clients. So
> >> >> >> this solution would require us to restart the clients to see the new
> >> >> >> list of zookeeper hosts.
> >> >> >>
> >> >> >> Another thought is to use DNS RR and have the client zk url have one
> >> >> >> name that resolves to and returns a list of IPs to the zookeeper
> >> >> >> client. This has the advantage of being able to perform hardware
> >> >> >> migration without changing the client connection url, in the future.
> >> >> >> Do people have thoughts about using a DNS RR?
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Neha
> >> >> >>
> >> >> >> On Tue, Dec 20, 2011 at 1:06 PM, Ted Dunning <ted.dunn...@gmail.com>
> >> >> >> wrote:
> >> >> >>> In particular, aren't you using DNS names? If you are, then you can
> >> >> >>>
> >> >> >>> - expand the quorum with the new hardware on new IP addresses,
> >> >> >>> - switch DNS,
> >> >> >>> - wait for caches to die,
> >> >> >>> - restart applications without reconfig or otherwise force new connections,
> >> >> >>> - decrease quorum size again
> >> >> >>>
> >> >> >>> On Tue, Dec 20, 2011 at 12:26 PM, Camille Fournier <cami...@apache.org>
> >> >> >>> wrote:
> >> >> >>>
> >> >> >>>> How are your clients configured to find the zks now? How many clients do
> >> >> >>>> you have?
> >> >> >>>>
> >> >> >>>> From my phone
> >> >> >>>> On Dec 20, 2011 3:14 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
> >> >> >>>>
> >> >> >>>> > Hi,
> >> >> >>>> >
> >> >> >>>> > As part of upgrading to Zookeeper 3.3.4, we also have to migrate our
> >> >> >>>> > zookeeper cluster to new hardware. I'm trying to figure out the best
> >> >> >>>> > strategy to achieve that with no downtime.
> >> >> >>>> > Here are some possible solutions I see at the moment, I could have
> >> >> >>>> > missed a few though -
> >> >> >>>> >
> >> >> >>>> > 1. Swap each machine out with a new machine, but with the same host/IP.
> >> >> >>>> >
> >> >> >>>> > Pros: No client side config needs to be changed.
> >> >> >>>> > Cons: Relatively tedious task for Operations
> >> >> >>>> >
> >> >> >>>> > 2. Add new machines, with different host/IPs, to the existing cluster,
> >> >> >>>> > and remove the older machines, taking care to maintain the quorum at
> >> >> >>>> > all times
> >> >> >>>> >
> >> >> >>>> > Pros: Easier for Operations
> >> >> >>>> > Cons: Client side configs need to be changed and clients need to be
> >> >> >>>> > restarted/bounced. Another problem is having a large quorum for
> >> >> >>>> > some time (potentially 9 nodes).
> >> >> >>>> >
> >> >> >>>> > 3. Hide the new cluster behind either a Hardware load balancer or a
> >> >> >>>> > DNS server resolving to all host ips.
> >> >> >>>> >
> >> >> >>>> > Pros: Makes it easier to move hardware around in the future
> >> >> >>>> > Cons: Possible timeout issues with load balancers messing with
> >> >> >>>> > zookeeper functionality or performance
> >> >> >>>> >
> >> >> >>>> > Read this and found it helpful -
> >> >> >>>> >
> >> >> >>>> > http://apache.markmail.org/message/44tbj53q2jufplru?q=load+balancer+list:org%2Eapache%2Ehadoop%2Ezookeeper-user&page=1
> >> >> >>>> >
> >> >> >>>> > But would like to hear from the authors and the users who might have
> >> >> >>>> > tried this in a real production setup.
> >> >> >>>> >
> >> >> >>>> > I'm very interested in finding a long term solution for masking the
> >> >> >>>> > zookeeper host names. Any inputs here are appreciated!
> >> >> >>>> >
> >> >> >>>> > In addition to this, it will also be great to know what people think
> >> >> >>>> > about options 1 and 2, as a solution for hardware changes in
> >> >> >>>> > Zookeeper.
> >> >> >>>> >
> >> >> >>>> > Thanks,
> >> >> >>>> > Neha
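
On the JVM DNS caching question raised in the thread: the JVM's cache of successful lookups is governed by the `networkaddress.cache.ttl` security property, where a value of -1 means "cache forever". The sketch below shows one way a client application might bound that cache. The class name `DnsCacheTtl` and the 60-second value are assumptions for illustration only, and this touches just the JVM-level cache, not the IP list the zookeeper client keeps in its own code.

```java
import java.security.Security;

// Hypothetical helper for illustration; the class name and TTL are assumptions.
final class DnsCacheTtl {
    // Bound how long the JVM caches successful name lookups, in seconds.
    // This should run early, before the application's first name lookup,
    // because the JVM may read the setting only once.
    static void setPositiveTtlSeconds(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) {
        setPositiveTtlSeconds(60);
        System.out.println("networkaddress.cache.ttl = "
                + Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

The same property can also be set in the JRE's `java.security` file, which avoids touching application code.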