dup of https://issues.apache.org/jira/browse/ZOOKEEPER-338 ?
Patrick

On Mon, Jan 9, 2012 at 3:17 PM, Ted Dunning <[email protected]> wrote:
> Neha
>
> Filing a jira is a great way to further the discussion.
>
> Sent from my iPhone
>
> On Jan 9, 2012, at 15:33, Neha Narkhede <[email protected]> wrote:
>
>>>> If you just have machine names in a list that you pass in, then yes, we
>>>> could re-resolve on every reconnect and you could just re-alias that name
>>>> to a new IP. But you'll have to put in logic that will do that but not
>>>> break people using DNS RR.
>>
>> Having a list of machine names that can be changed to point to new IPs
>> seems reasonable too. To be able to do the upgrade without having to
>> restart all clients, besides turning off DNS caching in the JVM, we
>> still have to solve the problem of the zookeeper client caching the IPs
>> in code. Having 2 levels of DNS caching, one in the JVM and one in code
>> (which cannot be turned off), doesn't look like a good idea. Unless I'm
>> missing the purpose of such IP caching in zookeeper?
>>
>>>> I realize that moving machines is difficult when you have lots of clients.
>>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>>> machine move given a cluster of that complexity, though.
>>
>> It's not like it can't be done, but it definitely has quite some
>> operational overhead. We are trying to brainstorm various approaches
>> and come up with one that will involve the least overhead on such
>> upgrades going forward.
>>
>> Having said that, re-resolving host names on reconnect doesn't look
>> like a bad idea, provided it doesn't break the DNS RR use case. If that
>> sounds good, can I go ahead and file a JIRA for this?
>>
>> Thanks,
>> Neha
>>
>> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <[email protected]> wrote:
>>> We don't shuffle IPs after the initial resolution of IP addresses.
>>>
>>> In DNS RR, you resolve to a list of IPs, shuffle these, and then we round
>>> robin through them trying to connect.
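The scheme Camille describes — resolve once at startup, shuffle, then round-robin over the fixed list on reconnect — can be sketched roughly as follows. This is illustrative Java only; the class and method names are not from the actual ZooKeeper client.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ShuffledHostList {
    private final List<InetAddress> servers = new ArrayList<>();
    private int next = 0; // index of the next server to try

    // Resolve once at construction time, then shuffle so that many
    // clients starting together spread their connections evenly
    // across the ensemble (the "fair back end round robin").
    public ShuffledHostList(String zkHost) throws UnknownHostException {
        for (InetAddress addr : InetAddress.getAllByName(zkHost)) {
            servers.add(addr);
        }
        Collections.shuffle(servers);
    }

    // Round-robin through the (fixed) shuffled list on every reconnect.
    // Note the list is never re-resolved, which is exactly the problem
    // being discussed: a replaced server's old IP stays in rotation.
    public InetAddress nextServer() {
        InetAddress addr = servers.get(next);
        next = (next + 1) % servers.size();
        return addr;
    }
}
```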
>>> If you re-resolve on every
>>> round-robin, you have to put in logic to know which ones have changed and
>>> somehow maintain that shuffle order, or you aren't doing a fair back end
>>> round robin, which people using the ZK client against DNS RR are relying
>>> on today.
>>>
>>> If you just have machine names in a list that you pass in, then yes, we
>>> could re-resolve on every reconnect and you could just re-alias that name
>>> to a new IP. But you'll have to put in logic that will do that but not
>>> break people using DNS RR.
>>>
>>> I realize that moving machines is difficult when you have lots of clients.
>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>> machine move given a cluster of that complexity, though. I also think that
>>> if we're going to be putting special cases like this in, we might just want
>>> to go all the way to a pluggable reconnection scheme, but maybe that is too
>>> aggressive.
>>>
>>> C
>>>
>>> On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <[email protected]> wrote:
>>>
>>>> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
>>>> simplest implementation, which resolves a hostname to multiple IPs.
>>>>
>>>> Whatever method you use to map host names to IPs, the problem is that
>>>> the zookeeper client code will always cache the IPs. So to be able to
>>>> swap out a machine, all clients would have to be restarted, which, if
>>>> you have 100s of clients, is a major pain. If you want to move the
>>>> entire cluster to new machines, this becomes even harder.
>>>>
>>>> I don't see why re-resolving host names to IPs in the reconnect logic
>>>> is a problem for zookeeper, since you shuffle the list of IPs anyway.
>>>>
>>>> Thanks,
>>>> Neha
>>>>
>>>> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <[email protected]> wrote:
>>>>> You can't sensibly round robin within the client code if you re-resolve
>>>>> on every reconnect, if you're using DNS RR.
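Camille's caveat — that re-resolving must not disturb the fairness of the already-shuffled order — could in principle be handled by a merge: addresses still present keep their slots, vanished ones drop out, and newly appeared ones join in random order. This is a sketch of one possible design from the discussion, not anything the ZooKeeper client actually does.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReResolvingHostList {
    private final String zkHost;
    private List<InetAddress> shuffled = new ArrayList<>();
    private int next = 0;

    public ReResolvingHostList(String zkHost) throws UnknownHostException {
        this.zkHost = zkHost;
        reResolve();
    }

    // Re-resolve the name, keeping the relative order of addresses that
    // are still present so the existing round-robin fairness survives,
    // dropping addresses that disappeared, and appending new ones
    // shuffled so they enter the rotation at a random point.
    public void reResolve() throws UnknownHostException {
        Set<InetAddress> current =
                new HashSet<>(Arrays.asList(InetAddress.getAllByName(zkHost)));
        List<InetAddress> merged = new ArrayList<>();
        for (InetAddress old : shuffled) {
            if (current.remove(old)) {
                merged.add(old); // unchanged entry keeps its slot
            }
        }
        List<InetAddress> fresh = new ArrayList<>(current);
        Collections.shuffle(fresh);
        merged.addAll(fresh); // newly appeared IPs join the rotation
        shuffled = merged;
        if (next >= shuffled.size()) next = 0;
    }

    public InetAddress nextServer() {
        InetAddress addr = shuffled.get(next);
        next = (next + 1) % shuffled.size();
        return addr;
    }
}
```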
>>>>> If that's your goal, you'd want a
>>>>> list of DNS alias names and re-resolve each hostname when you hit it on
>>>>> reconnect. But that will break people using DNS RR.
>>>>> You can look into writing pluggable reconnect logic into the zk client;
>>>>> that's what would be required to do this, but at the end of the day
>>>>> you'll have to give your users special clients to make that work.
>>>>>
>>>>> C
>>>>> On Jan 9, 2012 1:16 PM, "Neha Narkhede" <[email protected]> wrote:
>>>>>
>>>>>> I was reading through the client code and saw that the zookeeper client
>>>>>> caches the server IPs during startup and maintains them for the rest of
>>>>>> its lifetime. If we go with the DNS RR approach or a load balancer
>>>>>> approach, and later swap out a server with a new one (with a new IP),
>>>>>> all clients would have to be restarted to be able to "forget" the
>>>>>> old IP and see the new one. That doesn't look like a clean approach to
>>>>>> such upgrades. One way of getting around this problem is adding the
>>>>>> resolution of host names to IPs to the "reconnect" logic in addition
>>>>>> to the constructor. So when such upgrades happen and the client
>>>>>> reconnects, it will see the new list of IPs and wouldn't need to
>>>>>> be restarted.
>>>>>>
>>>>>> Does this approach sound good, or am I missing something here?
>>>>>>
>>>>>> Thanks,
>>>>>> Neha
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <[email protected]> wrote:
>>>>>>> DNS RR is good. I had good experiences using that for my client
>>>>>>> configs for exactly the reasons you are listing.
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <[email protected]> wrote:
>>>>>>>> Thanks for the responses!
>>>>>>>>
>>>>>>>>>> How are your clients configured to find the zks now?
>>>>>>>>
>>>>>>>> Our clients currently use the list of hostnames and ports that
>>>>>>>> comprise the zookeeper cluster.
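The "pluggable reconnection scheme" floated in this thread might reduce to an interface along these lines. This is purely hypothetical — none of these names come from the ZooKeeper client API — but it shows the plug-in point where an implementation could re-resolve DNS, round-robin a shuffled list, or consult an external service.

```java
import java.net.InetSocketAddress;

// Hypothetical plug-in point for client connection logic, sketching
// the "pluggable reconnection scheme" idea from the discussion.
public interface HostProvider {
    // Number of servers currently known to this provider.
    int size();

    // Next server to try; called on initial connect and on every
    // reconnect. An implementation is free to re-resolve DNS here
    // (solving the stale-IP problem) or to return entries from a
    // fixed shuffled list (preserving DNS RR fairness).
    InetSocketAddress next();

    // Notification that the address last returned by next() produced
    // a healthy session, so the provider can reset any retry state.
    void onConnected();
}
```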
>>>>>>>> For example,
>>>>>>>> zoo1:port1,zoo2:port2,zoo3:port3
>>>>>>>>
>>>>>>>>> - switch DNS,
>>>>>>>>> - wait for caches to die,
>>>>>>>>
>>>>>>>> This is something we thought about. However, if I understand it
>>>>>>>> correctly, doesn't the JVM cache DNS entries forever until it is
>>>>>>>> restarted? We haven't specifically turned DNS caching off on our
>>>>>>>> clients, so this solution would require us to restart the clients
>>>>>>>> to see the new list of zookeeper hosts.
>>>>>>>>
>>>>>>>> Another thought is to use DNS RR and have the client zk url be one
>>>>>>>> name that resolves to a list of IPs for the zookeeper client. This
>>>>>>>> has the advantage of being able to perform hardware migration
>>>>>>>> without changing the client connection url in the future.
>>>>>>>> Do people have thoughts about using a DNS RR?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Neha
>>>>>>>>
>>>>>>>> On Tue, Dec 20, 2011 at 1:06 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>> In particular, aren't you using DNS names? If you are, then you can
>>>>>>>>>
>>>>>>>>> - expand the quorum with the new hardware on new IP addresses,
>>>>>>>>> - switch DNS,
>>>>>>>>> - wait for caches to die,
>>>>>>>>> - restart applications without reconfig or otherwise force new
>>>>>>>>>   connections,
>>>>>>>>> - decrease quorum size again
>>>>>>>>>
>>>>>>>>> On Tue, Dec 20, 2011 at 12:26 PM, Camille Fournier <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> How are your clients configured to find the zks now? How many
>>>>>>>>>> clients do you have?
>>>>>>>>>>
>>>>>>>>>> From my phone
>>>>>>>>>> On Dec 20, 2011 3:14 PM, "Neha Narkhede" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> As part of upgrading to Zookeeper 3.3.4, we also have to migrate
>>>>>>>>>>> our zookeeper cluster to new hardware. I'm trying to figure out
>>>>>>>>>>> the best strategy to achieve that with no downtime.
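On the JVM caching question raised here: as far as the standard Java documentation goes, successful lookups are cached forever by default only when a security manager is installed; otherwise the default TTL is implementation-specific. Either way, the behavior is controlled by the `networkaddress.cache.ttl` security property, which must be set before the first lookup happens:

```java
import java.security.Security;

public class DnsCacheConfig {
    // Must run before the JVM performs its first name lookup, or
    // already-cached entries will keep their original lifetime.
    public static void configure() {
        // Cache successful lookups for 60 seconds instead of the default.
        Security.setProperty("networkaddress.cache.ttl", "60");
        // Do not cache failed lookups at all.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }

    public static void main(String[] args) {
        configure();
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

The same knobs can also be set in the JVM's `java.security` file, which avoids touching application code.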
>>>>>>>>>>> Here are some possible solutions I see at the moment; I could
>>>>>>>>>>> have missed a few though -
>>>>>>>>>>>
>>>>>>>>>>> 1. Swap each machine out with a new machine, but with the same
>>>>>>>>>>> host/IP.
>>>>>>>>>>>
>>>>>>>>>>> Pros: No client side config needs to be changed.
>>>>>>>>>>> Cons: Relatively tedious task for Operations
>>>>>>>>>>>
>>>>>>>>>>> 2. Add new machines, with different host/IPs, to the existing
>>>>>>>>>>> cluster, and remove the older machines, taking care to maintain
>>>>>>>>>>> the quorum at all times
>>>>>>>>>>>
>>>>>>>>>>> Pros: Easier for Operations
>>>>>>>>>>> Cons: Client side configs need to be changed and clients need
>>>>>>>>>>> to be restarted/bounced. Another problem is having a large
>>>>>>>>>>> quorum for some time (potentially 9 nodes).
>>>>>>>>>>>
>>>>>>>>>>> 3. Hide the new cluster behind either a hardware load balancer
>>>>>>>>>>> or a DNS server resolving to all host IPs.
>>>>>>>>>>>
>>>>>>>>>>> Pros: Makes it easier to move hardware around in the future
>>>>>>>>>>> Cons: Possible timeout issues with load balancers messing with
>>>>>>>>>>> zookeeper functionality or performance
>>>>>>>>>>>
>>>>>>>>>>> Read this and found it helpful -
>>>>>>>>>>>
>>>>>>>>>>> http://apache.markmail.org/message/44tbj53q2jufplru?q=load+balancer+list:org%2Eapache%2Ehadoop%2Ezookeeper-user&page=1
>>>>>>>>>>>
>>>>>>>>>>> But would like to hear from the authors and the users who might
>>>>>>>>>>> have tried this in a real production setup.
>>>>>>>>>>>
>>>>>>>>>>> I'm very interested in finding a long term solution for masking
>>>>>>>>>>> the zookeeper host names. Any inputs here are appreciated!
>>>>>>>>>>>
>>>>>>>>>>> In addition to this, it will also be great to know what people
>>>>>>>>>>> think about options 1 and 2, as a solution for hardware changes
>>>>>>>>>>> in Zookeeper.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Neha
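For option 2's config change, the connect string mentioned earlier (`zoo1:port1,zoo2:port2,zoo3:port3`) splits into host/port pairs along roughly these lines. This is a simplified sketch, not the actual client's parser; 2181 is assumed here as the conventional default client port.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

public class ConnectStringParser {
    static final int DEFAULT_PORT = 2181; // ZooKeeper's conventional client port

    // Split "host1:port1,host2:port2,..." into unresolved socket
    // addresses. "Unresolved" means no DNS lookup happens yet, so the
    // caller decides when (and how often) resolution occurs -- the
    // crux of the whole thread.
    public static List<InetSocketAddress> parse(String connectString) {
        List<InetSocketAddress> out = new ArrayList<>();
        for (String part : connectString.split(",")) {
            String host = part;
            int port = DEFAULT_PORT;
            int colon = part.lastIndexOf(':');
            if (colon >= 0) {
                host = part.substring(0, colon);
                port = Integer.parseInt(part.substring(colon + 1));
            }
            out.add(InetSocketAddress.createUnresolved(host.trim(), port));
        }
        return out;
    }
}
```

Updating the string to the new ensemble's hosts is exactly the per-client change (and restart) that option 2's cons refer to.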
