Dup of https://issues.apache.org/jira/browse/ZOOKEEPER-338?

Patrick

On Mon, Jan 9, 2012 at 3:17 PM, Ted Dunning <[email protected]> wrote:
> Neha
>
> Filing a jira is a great way to further the discussion.
>
> Sent from my iPhone
>
> On Jan 9, 2012, at 15:33, Neha Narkhede <[email protected]> wrote:
>
>>>> If you just have machine names in a list that you pass in, then yes, we
>> could re-resolve on every reconnect and you could just re-alias that name
>> to a new IP. But you'll have to put in logic that will do that but not
>> break people using DNS RR.
>>
>> Having a list of machine names that can be changed to point to new IPs
>> seems reasonable too. To be able to do the upgrade without having to
>> restart all clients, besides turning off DNS caching in the JVM, we
>> still have to solve the problem of the ZooKeeper client caching the IPs
>> in code. Having two levels of DNS caching, one in the JVM and one in
>> code (and the latter cannot be turned off), doesn't look like a good
>> idea, unless I'm missing the purpose of such IP caching in ZooKeeper?
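For reference, the JVM-level cache mentioned above is tunable via standard `java.security` networking properties; only the in-code cache has no knob. A minimal sketch (the 60-second TTL is just an example value):

```java
import java.security.Security;

public class DnsCacheConfig {
    public static void main(String[] args) {
        // Without a security manager, the JVM caches successful DNS lookups
        // for 30 seconds by default; with a security manager it caches them
        // forever. Lowering the TTL lets a swapped server IP be picked up.
        Security.setProperty("networkaddress.cache.ttl", "60");
        // Failed lookups are cached separately (10 seconds by default).
        Security.setProperty("networkaddress.cache.negative.ttl", "10");

        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

These properties must be set before the first lookup (or in the JRE's java.security file); they do nothing for IPs the ZooKeeper client has already resolved and cached in code.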
>>
>>>> I realize that moving machines is difficult when you have lots of clients.
>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>> machine move given a cluster of that complexity, though.
>>
>> It's not that it can't be done, but it definitely carries quite some
>> operational overhead. We are trying to brainstorm various approaches
>> and come up with the one that involves the least overhead for such
>> upgrades going forward.
>>
>> Having said that, re-resolving host names on reconnect doesn't seem
>> like a bad idea, provided it doesn't break the DNS RR use case. If
>> that sounds good, can I go ahead and file a JIRA for this?
>>
>> Thanks,
>> Neha
>>
>> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <[email protected]> wrote:
>>> We don't shuffle IPs after the initial resolution of IP addresses.
>>>
>>> In DNS RR, the client resolves to a list of IPs, shuffles these, and
>>> then round robins through them trying to connect. If you re-resolve on
>>> every round robin, you have to put in logic to know which IPs have
>>> changed and somehow maintain that shuffle order, or you aren't doing a
>>> fair back-end round robin, which people using the ZK client against
>>> DNS RR are relying on today.
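The resolve-once, shuffle, then round-robin behaviour described above can be sketched roughly as follows (an illustrative simplification, not the actual client code; the class and method names are made up):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of resolve-once-and-shuffle: the address list
// is built (and shuffled) a single time, then reused for every reconnect.
public class RoundRobinHosts {
    private final List<InetAddress> addresses = new ArrayList<InetAddress>();
    private int next = 0;

    public RoundRobinHosts(String host) throws UnknownHostException {
        // One DNS RR name can resolve to several IPs; cache them all once.
        Collections.addAll(addresses, InetAddress.getAllByName(host));
        // Shuffle so different clients spread load across the ensemble.
        Collections.shuffle(addresses);
    }

    // Each reconnect attempt walks the cached, shuffled list in order; the
    // IPs are never re-resolved, which is the limitation under discussion.
    public InetAddress nextAddress() {
        InetAddress a = addresses.get(next);
        next = (next + 1) % addresses.size();
        return a;
    }
}
```

Because the shuffle is fixed at construction time, every client gets a stable, fair rotation over the back ends; naive re-resolution on each attempt would discard that ordering.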
>>>
>>> If you just have machine names in a list that you pass in, then yes, we
>>> could re-resolve on every reconnect and you could just re-alias that name
>>> to a new IP. But you'll have to put in logic that will do that but not
>>> break people using DNS RR.
>>>
>>> I realize that moving machines is difficult when you have lots of clients.
>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>> machine move given a cluster of that complexity, though. I also think
>>> that if we're going to be putting in special cases like this, we might
>>> just want to go all the way to a pluggable reconnection scheme, but
>>> maybe that is too aggressive.
>>>
>>> C
>>>
>>> On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <[email protected]> wrote:
>>>
>>>> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
>>>> simplest implementation which resolves a hostname to multiple IPs.
>>>>
>>>> Whatever method you use to map host names to IPs, the problem is that
>>>> the ZooKeeper client code will always cache the IPs. So to be able to
>>>> swap out a machine, all clients would have to be restarted, which, if
>>>> you have 100s of clients, is a major pain. If you want to move the
>>>> entire cluster to new machines, this becomes even harder.
>>>>
>>>> I don't see why re-resolving host names to IPs in the reconnect logic
>>>> is a problem for ZooKeeper, since you shuffle the list of IPs anyway.
>>>>
>>>> Thanks,
>>>> Neha
>>>>
>>>>
>>>> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <[email protected]>
>>>> wrote:
>>>>> You can't sensibly round robin within the client code if you
>>>>> re-resolve on every reconnect when you're using DNS RR. If that's
>>>>> your goal, you'd want a list of DNS alias names and re-resolve each
>>>>> hostname when you hit it on reconnect. But that will break people
>>>>> using DNS RR.
>>>>> You could look into writing pluggable reconnect logic for the ZK
>>>>> client; that's what would be required to do this, but at the end of
>>>>> the day you'll have to give your users special clients to make it
>>>>> work.
>>>>>
>>>>> C
>>>>> On Jan 9, 2012 1:16 PM, "Neha Narkhede" <[email protected]> wrote:
>>>>>
>>>>>> I was reading through the client code and saw that the ZooKeeper
>>>>>> client caches the server IPs during startup and keeps them for the
>>>>>> rest of its lifetime. If we go with the DNS RR approach or a load
>>>>>> balancer approach, and later swap out a server for a new one (with a
>>>>>> new IP), all clients would have to be restarted to "forget" the old
>>>>>> IP and see the new one. That doesn't look like a clean approach to
>>>>>> such upgrades. One way of getting around this problem is to add the
>>>>>> resolution of host names to IPs in the "reconnect" logic in addition
>>>>>> to the constructor. Then, when such an upgrade happens and the
>>>>>> client reconnects, it will see the new list of IPs and won't need to
>>>>>> be restarted.
>>>>>>
>>>>>> Does this approach sound good, or am I missing something here?
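A hypothetical sketch of the proposal: keep hostnames rather than resolved addresses, and resolve at reconnect time, so a re-aliased name takes effect without a client restart (names and structure are illustrative, not the real client internals):

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical: store "host:port" strings and resolve lazily, so a name
// re-aliased to a new IP is picked up on the next reconnect attempt.
public class ReresolvingHostList {
    private final List<String> hosts;
    private int next = 0;

    public ReresolvingHostList(List<String> hostPortList) {
        this.hosts = new ArrayList<String>(hostPortList);
        Collections.shuffle(this.hosts); // preserve the usual shuffle
    }

    // Called on every (re)connect attempt: resolution happens here rather
    // than in the constructor, so DNS changes apply without a restart.
    public InetSocketAddress nextServer() throws UnknownHostException {
        String hostPort = hosts.get(next);
        next = (next + 1) % hosts.size();
        String[] parts = hostPort.split(":");
        InetAddress fresh = InetAddress.getByName(parts[0]);
        return new InetSocketAddress(fresh, Integer.parseInt(parts[1]));
    }
}
```

Note the shuffle is over hostnames, not IPs, which sidesteps the DNS RR fairness concern only when each name maps to a single server; the RR case (one name, many IPs) would still need separate handling.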
>>>>>>
>>>>>> Thanks,
>>>>>> Neha
>>>>>>
>>>>>> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <[email protected]>
>>>>>> wrote:
>>>>>>> DNS RR is good. I had good experiences using that for my client
>>>>>>> configs for exactly the reasons you are listing.
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <[email protected]> wrote:
>>>>>>>> Thanks for the responses!
>>>>>>>>
>>>>>>>>>> How are your clients configured to find the zks now?
>>>>>>>>
>>>>>>>> Our clients currently use the list of hostnames and ports that
>>>>>>>> comprise the zookeeper cluster. For example,
>>>>>>>> zoo1:port1,zoo2:port2,zoo3:port3
>>>>>>>>
>>>>>>>>>>> - switch DNS,
>>>>>>>>> - wait for caches to die,
>>>>>>>>
>>>>>>>> This is something we thought about. However, if I understand it
>>>>>>>> correctly, doesn't the JVM cache DNS entries forever until it is
>>>>>>>> restarted? We haven't specifically turned DNS caching off on our
>>>>>>>> clients, so this solution would require us to restart the clients
>>>>>>>> to see the new list of ZooKeeper hosts.
>>>>>>>>
>>>>>>>> Another thought is to use DNS RR and have the client ZK URL use
>>>>>>>> one name that resolves to a list of IPs for the ZooKeeper client.
>>>>>>>> This has the advantage of allowing future hardware migrations
>>>>>>>> without changing the client connection URL. Do people have
>>>>>>>> thoughts about using DNS RR?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Neha
>>>>>>>>
>>>>>>>> On Tue, Dec 20, 2011 at 1:06 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>> In particular, aren't you using DNS names?  If you are, then you can
>>>>>>>>>
>>>>>>>>> - expand the quorum with the new hardware on new IP addresses,
>>>>>>>>> - switch DNS,
>>>>>>>>> - wait for caches to die,
>>>>>>>>> - restart applications without reconfig or otherwise force new connections,
>>>>>>>>> - decrease quorum size again
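Concretely, the expand-then-shrink sequence above would be driven by rolling zoo.cfg edits, since this generation of ZooKeeper has no dynamic reconfiguration (hostnames below are hypothetical):

```
# zoo.cfg during the expanded phase: old ensemble (zoo1-3) plus new
# hardware (zoo4-6), applied with a rolling restart of each server.
# (An even-sized ensemble is normally avoided; it appears here only
# transiently, for symmetry of the example.)
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
server.4=zoo4:2888:3888
server.5=zoo5:2888:3888
server.6=zoo6:2888:3888
# After DNS points at zoo4-6 and client caches expire, drop server.1-3
# (another rolling restart) to shrink back to a small quorum.
```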
>>>>>>>>>
>>>>>>>>> On Tue, Dec 20, 2011 at 12:26 PM, Camille Fournier <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> How are your clients configured to find the ZKs now? How many
>>>>>>>>>> clients do you have?
>>>>>>>>>>
>>>>>>>>>> From my phone
>>>>>>>>>> On Dec 20, 2011 3:14 PM, "Neha Narkhede" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> As part of upgrading to ZooKeeper 3.3.4, we also have to
>>>>>>>>>>> migrate our ZooKeeper cluster to new hardware. I'm trying to
>>>>>>>>>>> figure out the best strategy to achieve that with no downtime.
>>>>>>>>>>> Here are some possible solutions I see at the moment (I could
>>>>>>>>>>> have missed a few, though):
>>>>>>>>>>>
>>>>>>>>>>> 1. Swap each machine out with a new machine, but with the same
>>>>>>>>>>> host/IP.
>>>>>>>>>>>
>>>>>>>>>>> Pros: No client side config needs to be changed.
>>>>>>>>>>> Cons: Relatively tedious task for Operations
>>>>>>>>>>>
>>>>>>>>>>> 2. Add new machines, with different host/IPs, to the existing
>>>>>>>>>>> cluster, and remove the older machines, taking care to maintain
>>>>>>>>>>> the quorum at all times.
>>>>>>>>>>>
>>>>>>>>>>> Pros: Easier for Operations
>>>>>>>>>>> Cons: Client-side configs need to be changed and clients need
>>>>>>>>>>> to be restarted/bounced. Another problem is having a large
>>>>>>>>>>> quorum for some time (potentially 9 nodes).
>>>>>>>>>>>
>>>>>>>>>>> 3. Hide the new cluster behind either a hardware load balancer
>>>>>>>>>>> or a DNS server resolving to all host IPs.
>>>>>>>>>>>
>>>>>>>>>>> Pros: Makes it easier to move hardware around in the future
>>>>>>>>>>> Cons: Possible timeout issues with load balancers messing with
>>>>>>>>>>> zookeeper functionality or performance
>>>>>>>>>>>
>>>>>>>>>>> Read this and found it helpful -
>>>>>>>>>>> http://apache.markmail.org/message/44tbj53q2jufplru?q=load+balancer+list:org%2Eapache%2Ehadoop%2Ezookeeper-user&page=1
>>>>>>>>>>> But I would like to hear from the authors and the users who
>>>>>>>>>>> might have tried this in a real production setup.
>>>>>>>>>>>
>>>>>>>>>>> I'm very interested in finding a long-term solution for masking
>>>>>>>>>>> the ZooKeeper host names. Any input here is appreciated!
>>>>>>>>>>>
>>>>>>>>>>> In addition to this, it will also be great to know what people
>>>>>>>>>>> think about options 1 and 2, as a solution for hardware changes
>>>>>>>>>>> in ZooKeeper.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Neha
>>>>>>>>>>>
