Patrick,

It looks like https://issues.apache.org/jira/browse/ZOOKEEPER-1356 is a
duplicate of ZOOKEEPER-338. If so, I'll mark it as a duplicate.

Thanks,
Neha

On Mon, Jan 9, 2012 at 5:36 PM, Patrick Hunt <ph...@apache.org> wrote:
> dup of https://issues.apache.org/jira/browse/ZOOKEEPER-338 ?
>
> Patrick
>
> On Mon, Jan 9, 2012 at 3:17 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> Neha
>>
>> Filing a jira is a great way to further the discussion.
>>
>> Sent from my iPhone
>>
>> On Jan 9, 2012, at 15:33, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>
>>>>> If you just have machine names in a list that you pass in, then yes, we
>>>>> could re-resolve on every reconnect and you could just re-alias that name
>>>>> to a new IP. But you'll have to put in logic that will do that but not
>>>>> break people using DNS RR.
>>>
>>> Having a list of machine names that can be changed to point to new IPs
>>> seems reasonable too. To be able to do the upgrade without having to
>>> restart all clients, besides turning off DNS caching in the JVM, we
>>> still have to solve the problem of zookeeper client caching the IPs in
>>> code. Having 2 levels of DNS caching, one in the JVM and one in code
>>> (which cannot be turned off) doesn't look like a good idea. Unless I'm
>>> missing the purpose of such IP caching in zookeeper ?
>>>
>>>>> I realize that moving machines is difficult when you have lots of clients.
>>>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>>>> machine move given a cluster of that complexity, though.
>>>
>>> It's not that it can't be done, but it definitely has quite some
>>> operational overhead. We are trying to brainstorm various approaches
>>> and come up with one that will involve the least overhead on such
>>> upgrades going forward.
>>>
>>> Having said that, re-resolving host names on reconnect doesn't look
>>> like a bad idea, provided it doesn't break the DNS RR use case. If
>>> that sounds good, can I go ahead and file a JIRA for this?
>>>
>>> Thanks,
>>> Neha
>>>
>>> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <cami...@apache.org> 
>>> wrote:
>>>> We don't shuffle IPs after the initial resolution of IP addresses.
>>>>
>>>> In DNS RR, you resolve to a list of IPs, shuffle these, and then we round
>>>> robin through them trying to connect. If you re-resolve on every
>>>> round-robin, you have to put in logic to know which ones have changed and
>>>> somehow maintain that shuffle order or you aren't doing a fair back end
>>>> round robin, which people using the ZK client against DNS RR are relying on
>>>> today.
>>>>
>>>> If you just have machine names in a list that you pass in, then yes, we
>>>> could re-resolve on every reconnect and you could just re-alias that name
>>>> to a new IP. But you'll have to put in logic that will do that but not
>>>> break people using DNS RR.
>>>>
>>>> I realize that moving machines is difficult when you have lots of clients.
>>>> I'm a bit surprised your admins can't maintain machine IP addresses on a
>>>> machine move given a cluster of that complexity, though. I also think that
>>>> if we're going to be putting special cases like this in we might just want
>>>> to go all the way to a pluggable reconnection scheme, but maybe that is too
>>>> aggressive.
>>>>
>>>> C
>>>>
>>>> On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>>
>>>>> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
>>>>> simplest implementation which resolves a hostname to multiple IPs.
>>>>>
>>>>> Whatever method you use to map host names to IPs, the problem is that
>>>>> the zookeeper client code will always cache the IPs. So to be able to
>>>>> swap out a machine, all clients would have to be restarted, which if
>>>>> you have 100s of clients, is a major pain. If you want to move the
>>>>> entire cluster to new machines, this becomes even harder.
>>>>>
>>>>> I don't see why re-resolving host names to IPs in the reconnect logic
>>>>> is a problem for zookeeper, since you shuffle the list of IPs anyways.
>>>>>
>>>>> Thanks,
>>>>> Neha
>>>>>
>>>>>
>>>>> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <cami...@apache.org>
>>>>> wrote:
>>>>>> You can't sensibly round robin within the client code if you re-resolve on
>>>>>> every reconnect, if you're using dns rr. If that's your goal you'd want a
>>>>>> list of dns alias names and re-resolve each hostname when you hit it on
>>>>>> reconnect. But that will break people using dns rr.
>>>>>> You can look into writing a pluggable reconnect logic into the zk client,
>>>>>> that's what would be required to do this but at the end of the day you'll
>>>>>> have to give your users special clients to make that work.
>>>>>>
>>>>>> C
>>>>>> On Jan 9, 2012 1:16 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>>>>>>
>>>>>>> I was reading through the client code and saw that the zookeeper client
>>>>>>> caches the server IPs during startup and keeps them for the rest of
>>>>>>> its lifetime. If we go with the DNS RR approach or a load balancer
>>>>>>> approach, and later swap out a server with a new one (with a new IP),
>>>>>>> all clients would have to be restarted to be able to "forget" the
>>>>>>> old IP and see the new one. That doesn't look like a clean approach to
>>>>>>> such upgrades. One way of getting around this problem is to resolve
>>>>>>> host names to IPs in the "reconnect" logic in addition to the
>>>>>>> constructor. So when such upgrades happen and the client reconnects,
>>>>>>> it will see the new list of IPs and won't need to be restarted.
>>>>>>>
>>>>>>> Does this approach sound good or am I missing something here ?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Neha
>>>>>>>
>>>>>>> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <cami...@apache.org>
>>>>>>> wrote:
>>>>>>>> DNS RR is good. I had good experiences using that for my client
>>>>>>>> configs for exactly the reasons you are listing.
>>>>>>>>
>>>>>>>> On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>>>>>>> Thanks for the responses!
>>>>>>>>>
>>>>>>>>>>> How are your clients configured to find the zks now?
>>>>>>>>>
>>>>>>>>> Our clients currently use the list of hostnames and ports that
>>>>>>>>> comprise the zookeeper cluster. For example,
>>>>>>>>> zoo1:port1,zoo2:port2,zoo3:port3
>>>>>>>>>
>>>>>>>>>>> - switch DNS,
>>>>>>>>>>> - wait for caches to die,
>>>>>>>>>
>>>>>>>>> This is something we thought about. However, if I understand it
>>>>>>>>> correctly, doesn't the JVM cache DNS entries forever until it is
>>>>>>>>> restarted? We haven't specifically turned DNS caching off on our
>>>>>>>>> clients, so this solution would require us to restart the clients to
>>>>>>>>> see the new list of zookeeper hosts.
>>>>>>>>>
>>>>>>>>> Another thought is to use DNS RR and have the client zk url contain one
>>>>>>>>> name that resolves to a list of IPs for the zookeeper client. This has
>>>>>>>>> the advantage of allowing future hardware migrations without changing
>>>>>>>>> the client connection url.
>>>>>>>>> Do people have thoughts about using DNS RR?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Neha
>>>>>>>>>
>>>>>>>>> On Tue, Dec 20, 2011 at 1:06 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>>>>>>>> In particular, aren't you using DNS names?  If you are, then you can
>>>>>>>>>>
>>>>>>>>>> - expand the quorum with the new hardware on new IP addresses,
>>>>>>>>>> - switch DNS,
>>>>>>>>>> - wait for caches to die,
>>>>>>>>>> - restart applications without reconfig or otherwise force new
>>>>>>>>>>   connections,
>>>>>>>>>> - decrease quorum size again
>>>>>>>>>>
>>>>>>>>>> On Tue, Dec 20, 2011 at 12:26 PM, Camille Fournier <cami...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> How are your clients configured to find the zks now? How many clients do
>>>>>>>>>>> you have?
>>>>>>>>>>>
>>>>>>>>>>> From my phone
>>>>>>>>>>> On Dec 20, 2011 3:14 PM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> As part of upgrading to Zookeeper 3.3.4, we also have to migrate our
>>>>>>>>>>>> zookeeper cluster to new hardware. I'm trying to figure out the best
>>>>>>>>>>>> strategy to achieve that with no downtime.
>>>>>>>>>>>> Here are some possible solutions I see at the moment, though I could
>>>>>>>>>>>> have missed a few:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Swap each machine out with a new machine, but with the same host/IP.
>>>>>>>>>>>>
>>>>>>>>>>>> Pros: No client side config needs to be changed.
>>>>>>>>>>>> Cons: Relatively tedious task for Operations
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Add new machines, with different host/IPs to the existing cluster,
>>>>>>>>>>>> and remove the older machines, taking care to maintain the quorum at
>>>>>>>>>>>> all times
>>>>>>>>>>>>
>>>>>>>>>>>> Pros: Easier for Operations
>>>>>>>>>>>> Cons: Client side configs need to be changed and clients need to be
>>>>>>>>>>>> restarted/bounced. Another problem is having a large quorum for
>>>>>>>>>>>> some time (potentially 9 nodes).
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Hide the new cluster behind either a hardware load balancer or a
>>>>>>>>>>>> DNS server resolving to all host IPs.
>>>>>>>>>>>>
>>>>>>>>>>>> Pros: Makes it easier to move hardware around in the future
>>>>>>>>>>>> Cons: Possible timeout issues with load balancers messing with
>>>>>>>>>>>> zookeeper functionality or performance
>>>>>>>>>>>>
>>>>>>>>>>>> Read this and found it helpful -
>>>>>>>>>>>>
>>>>>>>>>>>> http://apache.markmail.org/message/44tbj53q2jufplru?q=load+balancer+list:org%2Eapache%2Ehadoop%2Ezookeeper-user&page=1
>>>>>>>>>>>>
>>>>>>>>>>>> But would like to hear from the authors and the users who might have
>>>>>>>>>>>> tried this in a real production setup.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm very interested in finding a long term solution for masking the
>>>>>>>>>>>> zookeeper host names. Any inputs here are appreciated !
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to this, it will also be great to know what people think
>>>>>>>>>>>> about options 1 and 2, as a solution for hardware changes in
>>>>>>>>>>>> Zookeeper.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Neha
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>>>
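
For readers following the thread, the idea discussed above (re-resolve host names on reconnect, without breaking people who rely on DNS RR) might be sketched roughly as follows. This is only an illustration in Python, not ZooKeeper client code: ReResolvingHostProvider, system_resolver and their parameters are hypothetical names made up for this sketch, and it assumes every entry in the connect string has an explicit port, as in the zoo1:port1,zoo2:port2,zoo3:port3 example. It also does nothing about the JVM-level DNS cache mentioned in the thread.

```python
import random
import socket
from typing import Callable, List, Optional, Tuple

Resolver = Callable[[str], List[str]]


def system_resolver(host: str) -> List[str]:
    """Resolve a host name to its unique IP addresses via the OS resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    # Drop duplicates while keeping the order the resolver returned.
    return list(dict.fromkeys(info[4][0] for info in infos))


class ReResolvingHostProvider:
    """Hypothetical helper: cycles through host names and resolves each
    name again whenever it comes up, instead of caching IPs at startup."""

    def __init__(self, connect_string: str,
                 resolver: Resolver = system_resolver,
                 rng: Optional[random.Random] = None) -> None:
        self._resolver = resolver
        self._rng = rng or random.Random()
        self._servers: List[Tuple[str, int]] = []
        for entry in connect_string.split(","):
            # Assumption: every entry looks like "host:port".
            host, _, port = entry.strip().rpartition(":")
            self._servers.append((host, int(port)))
        # Shuffle the names once so clients do not all start on the same one.
        self._rng.shuffle(self._servers)
        self._index = -1
        self._pending: List[Tuple[str, int]] = []

    def next(self) -> Tuple[str, int]:
        """Return the next (ip, port) to try on a reconnect."""
        # DNS RR case: finish every IP of the current name before moving on,
        # so each back end behind a single name still gets its turn.
        if self._pending:
            return self._pending.pop()
        for _ in range(len(self._servers)):
            self._index = (self._index + 1) % len(self._servers)
            host, port = self._servers[self._index]
            try:
                ips = list(self._resolver(host))  # re-resolved on every visit
            except OSError:
                continue  # name currently unresolvable, try the next one
            if not ips:
                continue
            self._rng.shuffle(ips)
            self._pending = [(ip, port) for ip in ips]
            return self._pending.pop()
        raise OSError("none of the configured host names could be resolved")
```

A caller would construct the provider once with the usual connect string and call next() each time a connection attempt is needed. Passing a fake resolver (a function from host name to a list of IPs) makes it possible to try the behaviour without touching real DNS. Whether this ordering is "fair enough" for existing DNS RR users is exactly the open question in the thread.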
