Re: Dynamic adding/removing ZK servers on client
Another benefit of ZOOKEEPER-146 - we could use this for some sort of load balancing amongst the ensemble members. The first version could return a static list; however, I can see where the HTTPD might be updated to monitor the load on the servers/ensemble and prioritize the list for each client request...

Patrick

On 05/03/2010 09:34 AM, Patrick Hunt wrote:

On 05/03/2010 07:03 AM, Dave Wright wrote:

I've got a situation where I essentially need dynamic cluster membership, which has been talked about in ZOOKEEPER-107 but doesn't look like it's going to happen any time soon.

Could you provide some insight into why you need this? Just so we have addl background, I'm interested to know the use case.

For now, I'm planning on working around this by having a simple coordinator service on the server nodes that will re-write the configs and bounce the servers when membership changes. Clients may get an error or two and need to reconnect, but that should be handled by the normal error logic.

Are you expecting all of the servers to change each time, or just incremental changes (add/remove a single server, vs say move the entire cluster from 3 hosts a/b/c to x/y/z)?

On the client side, I'd really like to dynamically update the server list w/o having to re-create the entire Zookeeper object. Looking at the code, it seems like it would be pretty trivial to add RemoveServer()/AddServer() functions for Zookeeper that call down to ClientCnxn, where they are just maintained in a list. Of course if the server being removed is the one currently connected, we'd need to disconnect, but a simple call to disconnect() seems like it would resolve that and trigger the automatic re-connection logic.

You would hook this (add/remove) into JMX? That seems like a good option to provide.

Any chance you could use DNS for this? i.e. change the mapping for the hostname from a to x IP? Since the server a will go down anyway, this would cause the client to reconnect to b/c (eventually, when the DNS TTL expires, the client would also potentially connect to x). If this is an option be sure to see (a bit of work to do):

https://issues.apache.org/jira/browse/ZOOKEEPER-328
https://issues.apache.org/jira/browse/ZOOKEEPER-338

You might also look at this patch, we never committed it but it might be interesting to you:

https://issues.apache.org/jira/browse/ZOOKEEPER-146

The benefit is that you'd only have one place to make the change, esp given that clients might be down/unreachable when this change occurs. Clients would have to poll this service whenever they get disconnected from the ensemble. One drawback of this approach is that the HTTP service now becomes a potential SPOF (although I guess you could always fall back to something, or potentially have a list of HTTP hosts to do the lookup, etc...).

Does anyone see an issue with that approach? Were I to create the patch, do you think it would be interesting enough to merge? It seems like that functionality will eventually be needed for whatever full dynamic server support is eventually implemented.

It does sound interesting, however once we add something like this it's hard to change given that we try very hard to maintain b/w compatibility. If you did the testing and were able to verify I don't see why we couldn't add it - as it's optional in the sense that it would only be called in the use case you describe. I would feel more confident if we had more concrete detail on how we intend to do 107 (a basic functional/design doc that at least reviews all the issues), and how this would fit in. But I don't see that it should necessarily be a blocker (although others might feel differently).

(fyi it's good to discuss this sort of thing on zookeeper-dev, please move responses to that list)

Sounds like a useful project, I'm interested to hear what others think about it.

Regards,

Patrick
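To make the AddServer()/RemoveServer() idea from this message concrete, here is a minimal, self-contained sketch of a client-side server list that can change at runtime. It is not the ZooKeeper client code: the class name `ServerList`, its methods, and the `host:port` strings are all hypothetical, and the "reconnect" is only simulated by picking the next remaining server.

```python
import threading


class ServerList:
    """Hypothetical client-side server list that can change at runtime."""

    def __init__(self, servers):
        self._lock = threading.Lock()
        self._servers = list(servers)
        # Pretend we are connected to the first server in the list.
        self.current = self._servers[0] if self._servers else None
        self.reconnects = 0

    def add_server(self, server):
        with self._lock:
            if server not in self._servers:
                self._servers.append(server)

    def remove_server(self, server):
        with self._lock:
            if server in self._servers:
                self._servers.remove(server)
            if server == self.current:
                # Removing the connected server forces a disconnect; this
                # stands in for the automatic re-connection logic.
                self._reconnect_locked()

    def _reconnect_locked(self):
        # Caller must hold the lock. Simulated reconnect: take the first
        # remaining server, or None if the list is now empty.
        self.current = self._servers[0] if self._servers else None
        self.reconnects += 1

    def servers(self):
        with self._lock:
            return list(self._servers)


servers = ServerList(["a:2181", "b:2181", "c:2181"])
servers.add_server("x:2181")
servers.remove_server("a:2181")  # "a" was the connected server
print(servers.servers(), servers.current)
```

Removing a server the client is not connected to only edits the list; only removing the connected one triggers the (simulated) reconnect, which matches the behaviour described above.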
Re: Dynamic adding/removing ZK servers on client
Could you provide some insight into why you need this? Just so we have addl background, I'm interested to know the use case.

Sure, we're building a clustered application that will use zookeeper as part of it. We need to manage ZK ourselves. The cluster running the app and ZK may change over time (nodes added or removed) and we need to keep ZK itself in-sync with any changes. They won't be common, but we can't shut the app down to make the changes, it needs to be transparent.

Are you expecting all of the servers to change each time, or just incremental changes (add/remove a single server, vs say move the entire cluster from 3 hosts a/b/c to x/y/z)?

I'd expect a small number of changes at any time - a few nodes being added, a few nodes being removed. Most of the nodes will stay the same.

Any chance you could use DNS for this? i.e. change the mapping for the hostname from a to x IP? Since the server a will go down anyway, this would cause the client to reconnect to b/c (eventually, when the DNS TTL expires, the client would also potentially connect to x).

https://issues.apache.org/jira/browse/ZOOKEEPER-328
https://issues.apache.org/jira/browse/ZOOKEEPER-338

Well, there are a lot of issues with DNS (including security cache) so I'd prefer to avoid it. Also, the real issue is that the # of servers is changing, not just their IPs. Although we probably wouldn't use it, I do think it would be nice to support a single hostname for the ZK cluster with one A record for each member, and have the ZK client handle resolving that properly each time it connects.

You might also look at this patch, we never committed it but it might be interesting to you:

https://issues.apache.org/jira/browse/ZOOKEEPER-146

The benefit is that you'd only have one place to make the change, esp given that clients might be down/unreachable when this change occurs. Clients would have to poll this service whenever they get disconnected from the ensemble. One drawback of this approach is that the HTTP service now becomes a potential SPOF (although I guess you could always fall back to something, or potentially have a list of HTTP hosts to do the lookup, etc...).

Well, that just handles distribution of the list (which isn't really our problem), it doesn't help with restarting the ZK client when the list changes - it only pulls the list once, so you still have to completely shut down and restart the ZK client.

It does sound interesting, however once we add something like this it's hard to change given that we try very hard to maintain b/w compatibility. If you did the testing and were able to verify I don't see why we couldn't add it - as it's optional in the sense that it would only be called in the use case you describe. I would feel more confident if we had more concrete detail on how we intend to do 107 (a basic functional/design doc that at least reviews all the issues), and how this would fit in. But I don't see that it should necessarily be a blocker (although others might feel differently).

Have you ever considered adding features like this via a protected interface (i.e. they are useful but aren't fully standardized, so if a client wants to use them they can sub-class ZK and make them public)? The ability to dynamically modify the server list on the client side seems like it would be required no matter what approach were taken to dynamic clusters.

-Dave Wright
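The single-hostname idea mentioned in this message (one A record per ensemble member, resolved again on every connect) can be sketched as follows. This is an illustration only, not how any ZooKeeper client resolves hosts: `resolve_ensemble`, `fake_resolver`, the hostname `zk.example.internal`, and the 10.0.0.x addresses are all made up. The resolver is a parameter so the example runs offline; the default is the standard library's `socket.getaddrinfo`.

```python
import socket


def resolve_ensemble(hostname, port, resolver=socket.getaddrinfo):
    """Return every distinct (ip, port) the hostname currently maps to.

    Meant to be called on each connect attempt, so that records added or
    removed in DNS are picked up without recreating the client.
    """
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    endpoints = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        endpoint = (sockaddr[0], sockaddr[1])
        if endpoint not in endpoints:  # drop duplicate records
            endpoints.append(endpoint)
    return endpoints


def fake_resolver(host, port, type=0):
    # Stand-in for DNS: three records, one of them a duplicate.
    return [
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.1", port)),
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.2", port)),
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.1", port)),
    ]


endpoints = resolve_ensemble("zk.example.internal", 2181, resolver=fake_resolver)
print(endpoints)
```

Note that this only addresses how the list is obtained. The DNS caching concerns raised above still apply to whatever resolver sits underneath.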
Re: Dynamic adding/removing ZK servers on client
The ability to dynamically modify the server list on the client side seems like it would be required no matter what approach were taken to dynamic clusters.

Hasn't come up before, but yes I agree it's a useful feature.

I agree with Dave that this is quite important for a truly dynamic membership experience. I think I improperly imagined the two as being inherently part of the same problem before, but I see they could be split into different ones now that you mention it.

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/blog
http://niemeyer.net/identi.ca
http://niemeyer.net/twitter
Re: Dynamic adding/removing ZK servers on client
Well, that just handles distribution of the list (which isn't really our problem), it doesn't help with restarting the ZK client when the list changes - it only pulls the list once, so you still have to completely shut down and restart the ZK client.

Well, the old server is being shut down, right? If the client were connected to that server this would force the client to reconnect to another server. What I was suggesting is that the client would ping the server lookup service as part of this (so lookup on every disconnect, say).

Perhaps we should clarify what you mean by client (..would ping..). If you mean the ZK client library, then that would make sense - rather than use a static list of servers, each time it was disconnected it would refresh its list and pick one. I took it to mean the client application (using the ZK library). The issue is that the client application has no way to tell the ZK client lib to use a different list of servers, other than a complete teardown of the ZK object session, which I'm trying to avoid.

Hasn't come up before, but yes I agree it's a useful feature.

Ok, thanks. We don't have a specific ETA to implement it, I just wanted to explore the option a bit before we finalized some aspects of our design. Should we do the work, I'll submit patches for the Java and C clients.

-Dave
Re: Dynamic adding/removing ZK servers on client
On 05/03/2010 11:29 AM, Dave Wright wrote:

Well, that just handles distribution of the list (which isn't really our problem), it doesn't help with restarting the ZK client when the list changes - it only pulls the list once, so you still have to completely shut down and restart the ZK client.

Well, the old server is being shut down, right? If the client were connected to that server this would force the client to reconnect to another server. What I was suggesting is that the client would ping the server lookup service as part of this (so lookup on every disconnect, say).

Perhaps we should clarify what you mean by client (..would ping..). If you mean the ZK client library, then that would make sense - rather than use a static list of servers, each time it was disconnected it would refresh its list and pick one. I took it to mean the client application (using the ZK library). The issue is that the client application has no way to tell the ZK client lib to use a different list of servers, other than a complete teardown of the ZK object session, which I'm trying to avoid.

Yes, that's what I meant - we could update the ZK client lib to do this. It would be invisible to the client application (your code) itself.

Hasn't come up before, but yes I agree it's a useful feature.

Ok, thanks. We don't have a specific ETA to implement it, I just wanted to explore the option a bit before we finalized some aspects of our design. Should we do the work, I'll submit patches for the Java and C clients.

That would be great.

Patrick
Re: Dynamic adding/removing ZK servers on client
On 05/03/2010 12:07 PM, Dave Wright wrote:

Yes, that's what I meant - we could update the ZK client lib to do this. It would be invisible to the client application (your code) itself.

I don't think that's a bad idea, and the general approach in ZK-146 of using an interface that gets called to retrieve the list of hosts seems good (so that you aren't tied to a specific implementation of host lists, be it HTTP or DNS). That said, I don't think the actual implementation of ZK-146 is a good solution, since it only resolves the host list once. An implementation that resolved it on each disconnection would be better but would require deeper changes to the ClientCnxn.

You could update 146 as appropriate; handling changes to the ensemble members wasn't an original goal. Notice there was some discussion on how to do this in a way that would be as flexible as possible going forward, and so that we don't end up with all kinds of constructors (etc...) on top of the ZK client for the different schemes. That is still a concern, something that we should come to agreement on before implementation is started, I mean.

Patrick
Re: Dynamic adding/removing ZK servers on client
Yes, that's what I meant - we could update the ZK client lib to do this. It would be invisible to the client application (your code) itself.

I don't think that's a bad idea, and the general approach in ZK-146 of using an interface that gets called to retrieve the list of hosts seems good (so that you aren't tied to a specific implementation of host lists, be it HTTP or DNS). That said, I don't think the actual implementation of ZK-146 is a good solution, since it only resolves the host list once. An implementation that resolved it on each disconnection would be better but would require deeper changes to the ClientCnxn.

-Dave
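The difference between resolving the host list once and resolving it on each disconnection can be shown with a small sketch. This is not the ZK-146 patch or the real ClientCnxn: `ReconnectingClient`, `connect`, and `on_disconnect` are hypothetical names, the host provider is any callable returning a list of `host:port` strings (it could be backed by HTTP, DNS, or a manual list), and connecting is simulated by choosing a host.

```python
from typing import Callable, List, Optional


class ReconnectingClient:
    """Hypothetical client that asks a host provider for servers on every
    (re)connect instead of caching the list from construction time."""

    def __init__(self, host_provider: Callable[[], List[str]]):
        self._host_provider = host_provider
        self.current: Optional[str] = None

    def connect(self) -> str:
        hosts = self._host_provider()  # re-resolved on every call
        if not hosts:
            raise RuntimeError("host provider returned no servers")
        # Prefer a server other than the one we just lost, if there is one.
        candidates = [h for h in hosts if h != self.current] or hosts
        self.current = candidates[0]
        return self.current

    def on_disconnect(self) -> str:
        # A disconnect triggers a fresh lookup, so membership changes made
        # since the last connect are picked up here.
        return self.connect()


membership = ["a:2181", "b:2181", "c:2181"]
client = ReconnectingClient(lambda: list(membership))

first = client.connect()
membership[:] = ["b:2181", "x:2181"]  # a and c removed, x added
second = client.on_disconnect()
print(first, second)
```

The application code never touches the list; it only supplies the provider, which is what keeps the change invisible to the client application.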
Re: Dynamic adding/removing ZK servers on client
On 3 May 2010 16:40, Dave Wright wrig...@gmail.com wrote:

Should this be a znode in the privileged namespace?

I think having a znode for the current cluster members is part of the ZOOKEEPER-107 proposal, with the idea being that you could get/set the membership just by writing to that node. On the client side, you could watch that znode and update your server list when it changes.

This is tricky: what happens if the server your client is connected to is decommissioned by a view change, and you are unable to locate another server to connect to because other view changes committed while you are reconnecting have removed all the servers you knew about? We'd need to make sure that watches on this znode were fired before a view change, but it's hard to know how to avoid having to wait for a session timeout before a client that might just be migrating servers reappears in order to make sure it sees the view change.

Even then, the problem of 'locating' the cluster still exists in the case that there are no clients connected to tell anyone about it.

Henry

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
Re: Dynamic adding/removing ZK servers on client
Should this be a znode in the privileged namespace?

I think having a znode for the current cluster members is part of the ZOOKEEPER-107 proposal, with the idea being that you could get/set the membership just by writing to that node. On the client side, you could watch that znode and update your server list when it changes.

I think it would be a great solution, but I was thinking the ability to manually manage the server list would be useful in the interim, or if ZK-107 takes a different path.

-Dave
Re: Dynamic adding/removing ZK servers on client
This is tricky: what happens if the server your client is connected to is decommissioned by a view change, and you are unable to locate another server to connect to because other view changes committed while you are reconnecting have removed all the servers you knew about? We'd need to make sure that watches on this znode were fired before a view change, but it's hard to know how to avoid having to wait for a session timeout before a client that might just be migrating servers reappears in order to make sure it sees the view change. Even then, the problem of 'locating' the cluster still exists in the case that there are no clients connected to tell anyone about it.

Yes, this doesn't completely solve two issues:

1. Bootstrapping the cluster itself and clients
2. Major cluster reconfiguration (e.g. switching out every node before clients can pick up the changes)

That said, I think it gets close and could still be useful.

For #1, you could simply require that the initial servers in the cluster be manually configured, then servers could be added and removed as needed. New servers would just need the address of one other server to join and get the full server list. For clients, you'd have a similar situation - you still need a way to pass an initial server list (or at least 1 valid server) in to the client, but that could be via HTTP, DNS, or a manual list, then the clients themselves could stay in sync with changes.

For #2, you could simply document that there are limits to how fast you want to change the cluster, and that if you make too many changes too fast, clients or servers may not pick up the change fast enough and need to be restarted. In reality I don't think this will be much of an issue - as long as at least one server from the starting state stays up until everyone else gets reconnected, everyone should eventually find that node and get the full server list.

-Dave
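The condition in #2 above can be checked with a toy model. The sketch assumes, as a simplification that is not stated in the thread, that a client learns the full current membership whenever it reaches any live server, and that `views` lists only the cluster states the client actually gets a chance to observe (changes it misses in between are collapsed). The function name `client_can_follow` and the host letters are illustrative.

```python
def client_can_follow(known, views):
    """Replay the cluster views a client observes, in order.

    Returns True if, at every step, the client still knows at least one
    server in the new view (so it can reconnect and refresh its list),
    and False as soon as every server it knew about is gone.
    """
    known = set(known)
    for view in views:
        view = set(view)
        if not (known & view):
            return False  # no overlap: the client cannot locate the cluster
        known = view      # reconnected: the client now has the full list
    return True


start = ["a", "b", "c"]

# Incremental changes: each view shares a server with the previous one.
gradual = client_can_follow(start, [["b", "c", "x"], ["c", "x", "y"], ["x", "y", "z"]])

# Every node switched out before the client could pick up any change.
abrupt = client_can_follow(start, [["x", "y", "z"]])

print(gradual, abrupt)
```

In this model the gradual sequence ends at the same x/y/z cluster as the abrupt one, but only the gradual path keeps an overlap at every step, which is the "at least one server from the starting state stays up" requirement.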