Neha,

Filing a JIRA is a great way to further the discussion.
Sent from my iPhone

On Jan 9, 2012, at 15:33, Neha Narkhede <[email protected]> wrote:

>>> If you just have machine names in a list that you pass in, then yes, we
>>> could re-resolve on every reconnect and you could just re-alias that name
>>> to a new IP. But you'll have to put in logic that will do that but not
>>> break people using DNS RR.
>
> Having a list of machine names that can be changed to point to new IPs
> seems reasonable too. To be able to do the upgrade without having to
> restart all clients, besides turning off DNS caching in the JVM, we
> still have to solve the problem of the zookeeper client caching the IPs
> in code. Having two levels of DNS caching, one in the JVM and one in
> code (which cannot be turned off), doesn't look like a good idea,
> unless I'm missing the purpose of such IP caching in zookeeper?
>
>>> I realize that moving machines is difficult when you have lots of
>>> clients. I'm a bit surprised your admins can't maintain machine IP
>>> addresses on a machine move given a cluster of that complexity, though.
>
> It's not that it can't be done; it definitely has quite a bit of
> operational overhead. We are trying to brainstorm various approaches
> and come up with one that will involve the least overhead on such
> upgrades going forward.
>
> Having said that, re-resolving host names on reconnect doesn't look
> like a bad idea, provided it doesn't break the DNS RR use case. If that
> sounds good, can I go ahead and file a JIRA for this?
>
> Thanks,
> Neha
>
> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <[email protected]> wrote:
>> We don't shuffle IPs after the initial resolution of IP addresses.
>>
>> In DNS RR, you resolve to a list of IPs, shuffle these, and then we
>> round robin through them trying to connect.
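The resolve-once, shuffle, round-robin behavior Camille describes can be sketched roughly as follows. This is a hypothetical standalone example in the spirit of the discussion, not the actual ZooKeeper client code; the class and method names are invented.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only -- not the actual ZooKeeper client code.
// Resolve each configured name once up front (this is exactly the IP
// caching the thread complains about), shuffle the combined list, then
// round-robin over the frozen result on every connection attempt.
class RoundRobinHosts {
    private final List<InetAddress> addresses = new ArrayList<>();
    private int next = 0;

    RoundRobinHosts(List<String> hosts) throws UnknownHostException {
        for (String host : hosts) {
            // For a DNS RR name, getAllByName returns every address record;
            // for an IP literal, it just parses the literal.
            Collections.addAll(addresses, InetAddress.getAllByName(host));
        }
        // Shuffle so a fleet of clients spreads evenly across servers.
        Collections.shuffle(addresses);
    }

    InetAddress nextAddress() {
        InetAddress address = addresses.get(next);
        next = (next + 1) % addresses.size(); // cycle; never re-resolves
        return address;
    }
}
```

Because the list is resolved once and then frozen, swapping a server to a new IP is invisible to a client built this way until it is restarted, which is the core problem the thread is circling.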
>> If you re-resolve on every
>> round-robin, you have to put in logic to know which ones have changed
>> and somehow maintain that shuffle order, or you aren't doing a fair
>> back-end round robin, which people using the ZK client against DNS RR
>> are relying on today.
>>
>> If you just have machine names in a list that you pass in, then yes, we
>> could re-resolve on every reconnect and you could just re-alias that
>> name to a new IP. But you'll have to put in logic that will do that but
>> not break people using DNS RR.
>>
>> I realize that moving machines is difficult when you have lots of
>> clients. I'm a bit surprised your admins can't maintain machine IP
>> addresses on a machine move given a cluster of that complexity, though.
>> I also think that if we're going to be putting special cases like this
>> in, we might just want to go all the way to a pluggable reconnection
>> scheme, but maybe that is too aggressive.
>>
>> C
>>
>> On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <[email protected]> wrote:
>>
>>> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
>>> simplest implementation, which resolves a hostname to multiple IPs.
>>>
>>> Whatever method you use to map host names to IPs, the problem is that
>>> the zookeeper client code will always cache the IPs. So to be able to
>>> swap out a machine, all clients would have to be restarted, which, if
>>> you have 100s of clients, is a major pain. If you want to move the
>>> entire cluster to new machines, this becomes even harder.
>>>
>>> I don't see why re-resolving host names to IPs in the reconnect logic
>>> is a problem for zookeeper, since you shuffle the list of IPs anyway.
>>>
>>> Thanks,
>>> Neha
>>>
>>> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <[email protected]> wrote:
>>>> You can't sensibly round robin within the client code if you
>>>> re-resolve on every reconnect, if you're using DNS RR.
>>>> If that's your goal, you'd want a
>>>> list of DNS alias names and re-resolve each hostname when you hit it
>>>> on reconnect. But that will break people using DNS RR.
>>>> You can look into writing pluggable reconnect logic into the ZK
>>>> client; that's what would be required to do this, but at the end of
>>>> the day you'll have to give your users special clients to make that
>>>> work.
>>>>
>>>> C
>>>>
>>>> On Jan 9, 2012 1:16 PM, "Neha Narkhede" <[email protected]> wrote:
>>>>
>>>>> I was reading through the client code and saw that the zookeeper
>>>>> client caches the server IPs during startup and maintains them for
>>>>> the rest of its lifetime. If we go with the DNS RR approach or a
>>>>> load balancer approach, and later swap out a server with a new one
>>>>> (with a new IP), all clients would have to be restarted to be able
>>>>> to "forget" the old IP and see the new one. That doesn't look like a
>>>>> clean approach to such upgrades. One way of getting around this
>>>>> problem is adding the resolution of host names to IPs in the
>>>>> "reconnect" logic in addition to the constructor. So when such
>>>>> upgrades happen and the client reconnects, it will see the new list
>>>>> of IPs and won't need to be restarted.
>>>>>
>>>>> Does this approach sound good, or am I missing something here?
>>>>>
>>>>> Thanks,
>>>>> Neha
>>>>>
>>>>> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <[email protected]> wrote:
>>>>>
>>>>>> DNS RR is good. I had good experiences using that for my client
>>>>>> configs for exactly the reasons you are listing.
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks for the responses!
>>>>>>>
>>>>>>>>> How are your clients configured to find the zks now?
>>>>>>>
>>>>>>> Our clients currently use the list of hostnames and ports that
>>>>>>> comprise the zookeeper cluster.
>>>>>>> For example:
>>>>>>> zoo1:port1,zoo2:port2,zoo3:port3
>>>>>>>
>>>>>>>> - switch DNS,
>>>>>>>> - wait for caches to die,
>>>>>>>
>>>>>>> This is something we thought about; however, if I understand it
>>>>>>> correctly, doesn't the JVM cache DNS entries forever until it is
>>>>>>> restarted? We haven't specifically turned DNS caching off on our
>>>>>>> clients. So this solution would require us to restart the clients
>>>>>>> to see the new list of zookeeper hosts.
>>>>>>>
>>>>>>> Another thought is to use DNS RR and have the client zk url be a
>>>>>>> single name that resolves to a list of IPs for the zookeeper
>>>>>>> client. This has the advantage of being able to perform hardware
>>>>>>> migrations in the future without changing the client connection
>>>>>>> url. Do people have thoughts about using DNS RR?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Neha
>>>>>>>
>>>>>>> On Tue, Dec 20, 2011 at 1:06 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>
>>>>>>>> In particular, aren't you using DNS names? If you are, then you can:
>>>>>>>>
>>>>>>>> - expand the quorum with the new hardware on new IP addresses,
>>>>>>>> - switch DNS,
>>>>>>>> - wait for caches to die,
>>>>>>>> - restart applications without reconfig or otherwise force new connections,
>>>>>>>> - decrease quorum size again
>>>>>>>>
>>>>>>>> On Tue, Dec 20, 2011 at 12:26 PM, Camille Fournier <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> How are your clients configured to find the zks now? How many
>>>>>>>>> clients do you have?
>>>>>>>>>
>>>>>>>>> From my phone
>>>>>>>>> On Dec 20, 2011 3:14 PM, "Neha Narkhede" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> As part of upgrading to Zookeeper 3.3.4, we also have to
>>>>>>>>>> migrate our zookeeper cluster to new hardware. I'm trying to
>>>>>>>>>> figure out the best strategy to achieve that with no downtime.
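Neha's question about the JVM caching DNS entries touches a real knob: the `networkaddress.cache.ttl` and `networkaddress.cache.negative.ttl` security properties. By default the JVM caches successful lookups for an implementation-defined period, and indefinitely when a security manager is installed. A minimal sketch of turning the cache down (the class name here is invented, and the properties must be set before the first lookup):

```java
import java.security.Security;

// Illustrative sketch of the JVM-level DNS cache settings discussed in
// the thread (class name invented). These are java.security properties,
// not system properties, and take effect only for lookups performed
// after they are set.
class DnsCacheConfig {
    static void limitDnsCaching() {
        // Cache successful lookups for at most 30 seconds.
        Security.setProperty("networkaddress.cache.ttl", "30");
        // Cache failed lookups for at most 5 seconds.
        Security.setProperty("networkaddress.cache.negative.ttl", "5");
    }
}
```

Lowering this TTL only addresses the JVM-level cache; as the thread points out, the client-code cache of resolved IPs is a separate problem.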
>>>>>>>>>> Here are some possible solutions I see at the moment; I could
>>>>>>>>>> have missed a few, though:
>>>>>>>>>>
>>>>>>>>>> 1. Swap each machine out with a new machine, but with the same
>>>>>>>>>> host/IP.
>>>>>>>>>>
>>>>>>>>>> Pros: No client-side config needs to be changed.
>>>>>>>>>> Cons: Relatively tedious task for Operations.
>>>>>>>>>>
>>>>>>>>>> 2. Add new machines, with different host/IPs, to the existing
>>>>>>>>>> cluster, and remove the older machines, taking care to maintain
>>>>>>>>>> the quorum at all times.
>>>>>>>>>>
>>>>>>>>>> Pros: Easier for Operations.
>>>>>>>>>> Cons: Client-side configs need to be changed and clients need
>>>>>>>>>> to be restarted/bounced. Another problem is having a large
>>>>>>>>>> quorum for some time (potentially 9 nodes).
>>>>>>>>>>
>>>>>>>>>> 3. Hide the new cluster behind either a hardware load balancer
>>>>>>>>>> or a DNS server resolving to all host IPs.
>>>>>>>>>>
>>>>>>>>>> Pros: Makes it easier to move hardware around in the future.
>>>>>>>>>> Cons: Possible timeout issues with load balancers messing with
>>>>>>>>>> zookeeper functionality or performance.
>>>>>>>>>>
>>>>>>>>>> Read this and found it helpful:
>>>>>>>>>> http://apache.markmail.org/message/44tbj53q2jufplru?q=load+balancer+list:org%2Eapache%2Ehadoop%2Ezookeeper-user&page=1
>>>>>>>>>> But I would like to hear from the authors and the users who
>>>>>>>>>> might have tried this in a real production setup.
>>>>>>>>>>
>>>>>>>>>> I'm very interested in finding a long-term solution for masking
>>>>>>>>>> the zookeeper host names. Any inputs here are appreciated!
>>>>>>>>>>
>>>>>>>>>> In addition to this, it will also be great to know what people
>>>>>>>>>> think about options 1 and 2 as a solution for hardware changes
>>>>>>>>>> in Zookeeper.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Neha
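The re-resolve-on-reconnect idea the thread converges on can be sketched roughly as follows: keep the configured host:port strings from the connect string rather than resolved IPs, and resolve afresh on each reconnect attempt. This is an illustrative sketch under those assumptions, with invented names, not the actual ZooKeeper client implementation.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the re-resolve-on-reconnect proposal (names
// invented; not the actual ZooKeeper client API). The client keeps the
// configured host:port strings instead of resolved IPs, and constructs a
// fresh InetSocketAddress -- which performs a new DNS lookup -- on every
// reconnect attempt, so a server swapped to a new IP is picked up
// without restarting the client.
class ReresolvingHostList {
    private final List<String> hosts = new ArrayList<>();
    private final List<Integer> ports = new ArrayList<>();

    // connectString looks like "zoo1:2181,zoo2:2181,zoo3:2181".
    ReresolvingHostList(String connectString) {
        for (String part : connectString.split(",")) {
            int colon = part.lastIndexOf(':');
            hosts.add(part.substring(0, colon));
            ports.add(Integer.parseInt(part.substring(colon + 1)));
        }
    }

    // Called on each reconnect attempt: resolving here, rather than once
    // in the constructor, is the whole point of the proposal. If the name
    // can't be resolved, the returned address is simply unresolved.
    InetSocketAddress addressFor(int index) {
        return new InetSocketAddress(hosts.get(index), ports.get(index));
    }

    int size() { return hosts.size(); }
}
```

As Camille notes, a real implementation would also have to preserve fair round-robin behavior for clients whose connect string is a single DNS RR name.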
