Re: finding Cassandra servers
2010/3/3 Ted Zlatanov t...@lifelogs.com:
> On Mon, 01 Mar 2010 12:15:11 -0600 Ted Zlatanov t...@lifelogs.com wrote:
>
> TZ I need to find Cassandra servers on my network from several types of clients and platforms. The goal is to make adding and removing servers painless, assuming a leading window of at least 1 hour. The discovery should be automatic and distributed. I want to minimize management.
>
> TZ Round-robin DNS with a 1-hour TTL would work all right, but I was wondering if Bonjour/Zeroconf is a better idea and what else should I consider.
>
> So... is this a dumb question or is there no good answer currently to discovering Cassandra servers?
>
> Ted

Nothing in the current codebase meets these needs. But then again, Cassandra doesn't need the described functionality.

Zeroconf confines itself to a single subnet (it would require router configuration so that multicast goes through to work across subnets).

RRDNS would work, but something would need to keep it updated when servers go away (it wouldn't be automatic).

If you can count on one of your seed nodes to be up, RRDNS could be used to connect to one of them and fetch the token range list. To do this, create a Thrift client and call describe_ring. In older versions you can get a jsonified endpoint map by calling get_string_property('token map').

Hope that helps.

Gary.
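A minimal sketch of the lookup Gary describes, assuming a round-robin DNS name that resolves to the seed nodes. It is not Cassandra client code: fetch_ring is a hypothetical callable that would wrap whatever Thrift client you use (for example one that calls describe_ring), and the hostname and port are placeholders.

```python
import socket


def resolve_all(hostname, port, resolver=socket.getaddrinfo):
    """Return every distinct IP address a round-robin DNS name resolves to."""
    infos = resolver(hostname, port, 0, socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # the same address can be listed more than once
            addresses.append(ip)
    return addresses


def discover_ring(hostname, port, fetch_ring, resolver=socket.getaddrinfo):
    """Ask each resolved node in turn for the ring; return the first answer.

    fetch_ring(ip, port) is a hypothetical helper supplied by the caller.
    It should raise OSError (or a subclass) when the node is unreachable.
    """
    last_error = None
    for ip in resolve_all(hostname, port, resolver):
        try:
            return fetch_ring(ip, port)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next address
    raise ConnectionError(
        "no node behind %s answered: %s" % (hostname, last_error)
    )
```

The resolver is injectable only so the logic can be exercised without a real DNS zone; in normal use the default socket.getaddrinfo is enough.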
Re: finding Cassandra servers
On Wed, 3 Mar 2010 08:41:18 -0600 Gary Dusbabek gdusba...@gmail.com wrote:

GD It wouldn't be a lot of work for you to write a mdns service that would query the seeds for endpoints and publish it to interested clients. It could go in contrib.

This requires knowledge of the seeds so I need to at least look in storage-conf.xml to find them. Are you saying there's no chance of Cassandra nodes (or just seeds) announcing themselves, even if it's optional behavior that's off by default? If so I'll do the contrib mDNS service but it really seems like a backward way to do things.

Ted
Re: finding Cassandra servers
2010/3/3 Ted Zlatanov t...@lifelogs.com:
> On Wed, 3 Mar 2010 08:41:18 -0600 Gary Dusbabek gdusba...@gmail.com wrote:
>
> GD It wouldn't be a lot of work for you to write a mdns service that would query the seeds for endpoints and publish it to interested clients. It could go in contrib.
>
> This requires knowledge of the seeds so I need to at least look in storage-conf.xml to find them. Are you saying there's no chance of Cassandra nodes (or just seeds) announcing themselves, even if it's optional behavior that's off by default? If so I'll do the contrib mDNS service but it really seems like a backward way to do things.
>
> Ted

Nodes already announce themselves, but only to the cluster. That's what gossip is for. I don't see the point of making the announcement to the subnet at large.

The decision rests with the community. Obviously, if there is enough merit to this work, it will find its way into the codebase. I just think it falls into the realm of shiny-and-neat (mdns and automatic discovery is cool) and not in the realm of pragmatic (not reliable across subnets).

Gary.
Re: finding Cassandra servers
On Wed, 3 Mar 2010 09:32:33 -0600 Gary Dusbabek gdusba...@gmail.com wrote:

GD 2010/3/3 Ted Zlatanov t...@lifelogs.com:
> This requires knowledge of the seeds so I need to at least look in storage-conf.xml to find them. Are you saying there's no chance of Cassandra nodes (or just seeds) announcing themselves, even if it's optional behavior that's off by default? If so I'll do the contrib mDNS service but it really seems like a backward way to do things.

GD Nodes already announce themselves, only just to the cluster. That's what gossip is for. I don't see the point of making the announcement to the subnet at large.

GD The decision rests with the community. Obviously, if there is enough merit to this work, it will find its way into the codebase. I just think it falls into the realm of shiny-and-neat (mdns and automatic discovery is cool) and not in the realm of pragmatic (not reliable across subnets).

It's currently not possible to find a usable node without running centralized services like RRDNS or a special mDNS broadcaster as you suggested. I don't think this is shiny and neat, it's a matter of running in a true decentralized environment (which Cassandra is supposed to fit into).

The subnet limitation is not an issue in my environment (we forward much, much larger multicast volumes routinely) but I understand routing multicasts is not everyone's cup of tea. IMHO it's better than the current situation and, mDNS being a well-known standard, can at least be handled at the switch level without code changes.

I can do a patch+ticket for this in the core, making it optional and off by default, or do the same for a contrib/ service as you suggested. So I'd appreciate a +1/-1 quick vote on whether this can go in the core to save me from rewriting the patch later.

Ted
Re: finding Cassandra servers
On Wed, 2010-03-03 at 10:05 -0600, Ted Zlatanov wrote:
> I can do a patch+ticket for this in the core, making it optional and off by default, or do the same for a contrib/ service as you suggested. So I'd appreciate a +1/-1 quick vote on whether this can go in the core to save me from rewriting the patch later.

I don't think voting is going to help. Voting doesn't do anything to develop consensus and it seems pretty clear that no consensus exists here.

It's entirely possible that you've identified a problem that others can't see, or haven't yet encountered. I don't see it, but then maybe I'm just thick.

Either way, if you think this is important, the onus is on you to demonstrate the merit of your idea and contrib/ or a github project is one way to do that (the latter has the advantage of not needing to rely on anyone else).

--
Eric Evans
eev...@rackspace.com
Re: finding Cassandra servers
So is the current general practice to connect to a known node, e.g. by IP address?

If so, what happens if that node is down? Is the entire cluster effectively broken at that point?

Or do clients simply maintain a list of nodes and just connect to the first available in the list?

Thanks in advance.

Cheers
Chris

On 3 Mar 2010 16:43, Eric Evans eev...@rackspace.com wrote:

> On Wed, 2010-03-03 at 10:05 -0600, Ted Zlatanov wrote:
>> I can do a patch+ticket for this in the cor...
>
> I don't think voting is going to help. Voting doesn't do anything to develop consensus and it seems pretty clear that no consensus exists here.
>
> It's entirely possible that you've identified a problem that others can't see, or haven't yet encountered. I don't see it, but then maybe I'm just thick.
>
> Either way, if you think this is important, the onus is on you to demonstrate the merit of your idea and contrib/ or a github project is one way to do that (the latter has the advantage of not needing to rely on anyone else).
>
> --
> Eric Evans
> eev...@rackspace.com
Re: finding Cassandra servers
On Wed, 03 Mar 2010 10:43:19 -0600 Eric Evans eev...@rackspace.com wrote:

EE It's entirely possible that you've identified a problem that others can't see, or haven't yet encountered. I don't see it, but then maybe I'm just thick.

Getting back to my original question, how do you (and others) find usable Cassandra nodes from your clients? It's supposed to be a decentralized database and yet I only know of centralized ways (RRDNS) to locate nodes. Contacting the seeds is not a decentralized solution and sidesteps the issue. It also complicates the client unnecessarily.

EE Either way, if you think this is important, the onus is on you to demonstrate the merit of your idea and contrib/ or a github project is one way to do that (the latter has the advantage of not needing to rely on anyone else).

I'll submit a core patch in a jira ticket. It's much easier than writing a full application and IMHO much more useful because it just works. If it gets rejected I'll move to contrib/ as you and Gary suggested.

Ted
Re: finding Cassandra servers
+1 on Eric's comments.

We could create a branch or git fork where you guys could develop it, and if it reaches a usable state and others find it interesting it could get integrated then.

On 3/3/10, Eric Evans eev...@rackspace.com wrote:
> On Wed, 2010-03-03 at 10:05 -0600, Ted Zlatanov wrote:
>> I can do a patch+ticket for this in the core, making it optional and off by default, or do the same for a contrib/ service as you suggested. So I'd appreciate a +1/-1 quick vote on whether this can go in the core to save me from rewriting the patch later.
>
> I don't think voting is going to help. Voting doesn't do anything to develop consensus and it seems pretty clear that no consensus exists here.
>
> It's entirely possible that you've identified a problem that others can't see, or haven't yet encountered. I don't see it, but then maybe I'm just thick.
>
> Either way, if you think this is important, the onus is on you to demonstrate the merit of your idea and contrib/ or a github project is one way to do that (the latter has the advantage of not needing to rely on anyone else).
>
> --
> Eric Evans
> eev...@rackspace.com

--
Sent from my mobile device
Re: finding Cassandra servers
On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote:

RK Something like RRDNS is no more complex than managing a list of seed nodes.

How do your clients at Twitter find server nodes? Do you just run them local to each node?

My concern is that both RRDNS and seed node lists are vulnerable to individual node failure. Updating DNS when a node dies means you have to wait until the TTL expires, and if you lower the TTL too much your server will get killed.

With seed node lists, if I get unlucky I'd be trying to hit a downed node in which case I may as well just use RRDNS and deal with connection failure from the start.

Ted
Re: finding Cassandra servers
At Digg we have automated infrastructure. We use Puppet + our own in-house system that allows us to query pools of nodes for 'seeds'. Config files like storage-conf.xml are auto-generated on the fly, and we randomly pick a set of seeds. Seeds can be per datacenter as well. As soon as a machine is decommissioned, it no longer gets picked as a seed.

-Chris

On Mar 3, 2010, at 9:12 AM, Ted Zlatanov wrote:

> On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote:
>
> RK Something like RRDNS is no more complex than managing a list of seed nodes.
>
> How do your clients at Twitter find server nodes? Do you just run them local to each node?
>
> My concern is that both RRDNS and seed node lists are vulnerable to individual node failure. Updating DNS when a node dies means you have to wait until the TTL expires, and if you lower the TTL too much your server will get killed.
>
> With seed node lists, if I get unlucky I'd be trying to hit a downed node in which case I may as well just use RRDNS and deal with connection failure from the start.
>
> Ted
Re: finding Cassandra servers
2010/3/3 Ted Zlatanov t...@lifelogs.com
> On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote:
>
> RK Something like RRDNS is no more complex than managing a list of seed nodes.
>
> My concern is that both RRDNS and seed node lists are vulnerable to individual node failure.

They're not. That's why they're lists. If one doesn't work out, move along to the next.

> Updating DNS when a node dies means you have to wait until the TTL expires, and if you lower the TTL too much your server will get killed.

Don't do that. Make your clients keep trying. Any failure is likely to be transient anyway, so running around messing with DNS every time a machine is offline doesn't make much sense.

-Brandon
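Brandon's advice (walk the list, and keep trying when a failure is transient) could be sketched roughly as follows. This is an illustration under stated assumptions, not code from any Cassandra client: the function name, the number of passes, the timeout and the pause are all made up for the example.

```python
import socket
import time


def connect_first_available(nodes, port, passes=3, timeout=2.0, pause=1.0,
                            connect=socket.create_connection,
                            sleep=time.sleep):
    """Return a connection to the first node in the list that accepts one.

    The whole list is tried up to `passes` times, pausing between passes,
    so a node that is only briefly offline gets another chance.
    """
    for attempt in range(passes):
        for host in nodes:
            try:
                return connect((host, port), timeout)
            except OSError:
                continue  # this node is down right now; move along to the next
        if attempt < passes - 1:
            sleep(pause)  # every node failed; wait before the next pass
    raise ConnectionError(
        "all %d nodes unreachable after %d passes" % (len(nodes), passes)
    )
```

With this shape the client handles a single kind of failure (a node did not answer) no matter whether the list came from DNS, a seed list or the token map.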
Re: finding Cassandra servers
On Wed, 3 Mar 2010 12:08:06 -0500 Ian Holsman i...@holsman.net wrote:

IH We could create a branch or git fork where you guys could develop it, and if it reaches a usable state and others find it interesting it could get integrated in then

Thanks, Ian. Would it be OK to do it as a patch in http://issues.apache.org/jira/browse/CASSANDRA-846? Or is there a reason for using a branch/fork instead?

Ted
Re: finding Cassandra servers
We appear to be reaching consensus that this is solving a non-problem, so I have closed that ticket.

2010/3/3 Ted Zlatanov t...@lifelogs.com:
> On Wed, 3 Mar 2010 12:08:06 -0500 Ian Holsman i...@holsman.net wrote:
>
> IH We could create a branch or git fork where you guys could develop it, and if it reaches a usable state and others find it interesting it could get integrated in then
>
> Thanks, Ian. Would it be OK to do it as a patch in http://issues.apache.org/jira/browse/CASSANDRA-846? Or is there a reason for using a branch/fork instead?
>
> Ted
Re: finding Cassandra servers
On Wed, 2010-03-03 at 16:49 +, Christopher Brind wrote:
> So is the current general practice to connect to a known node, e.g. by ip address?

There are so many ways you could tackle this but...

If you're talking about provisioning/startup of new nodes, just use the IPs of 2-4 nodes in the seeds section of configs.

If you're talking about clients, then round-robin DNS is one option. Load-balancers are another. Either could be used with a subset of higher-capacity/higher-availability nodes, or for the entire cluster.

> If so, what happens if that node is down? Is the entire cluster effectively broken at that point?

You don't use just one node, see above.

> Or do clients simply maintain a list of nodes and just connect to the first available in the list?

It's possible to obtain a list of nodes over Thrift. So, yet another option would be to use a short-list of well-known nodes (discovered via round-robin DNS for example), to obtain a current node list and distribute among them.

--
Eric Evans
eev...@rackspace.com
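The last option Eric mentions (get the current node list from a well-known node, then spread requests over it) might look something like the sketch below. The ring is represented here as a list of plain dicts with an "endpoints" key, which is a stand-in chosen for the example; a real Thrift client would return its own token range objects, and the function names are hypothetical.

```python
import itertools


def endpoints_from_ring(token_ranges):
    """Collect the distinct endpoints mentioned in a list of token ranges.

    Each token range is assumed to be a dict with an "endpoints" list;
    a node that serves several ranges is only reported once.
    """
    seen = []
    for token_range in token_ranges:
        for endpoint in token_range["endpoints"]:
            if endpoint not in seen:
                seen.append(endpoint)
    return seen


def round_robin(endpoints):
    """Return an endless iterator that cycles through the endpoints in order."""
    return itertools.cycle(endpoints)
```

A client would call next() on the iterator before each request (or each new connection) and refresh the endpoint list from time to time so that added and removed nodes are picked up.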
Re: finding Cassandra servers
On Wed, 3 Mar 2010 09:19:28 -0800 Chris Goffinet goffi...@digg.com wrote:

CG At Digg we have automated infrastructure. We use Puppet + our own in-house system that allows us to query pools of nodes for 'seeds'. Config files like storage-conf.xml are auto generated on the fly, and we randomly pick a set of seeds.

CG Seeds can be per datacenter as well. As soon as a machine is decommissioned, it no longer gets picked as seed.

On Wed, 3 Mar 2010 11:20:07 -0600 Brandon Williams dri...@gmail.com wrote:

BW 2010/3/3 Ted Zlatanov t...@lifelogs.com
> My concern is that both RRDNS and seed node lists are vulnerable to individual node failure.

BW They're not. That's why they're lists. If one doesn't work out, move along to the next.

> Updating DNS when a node dies means you have to wait until the TTL expires, and if you lower the TTL too much your server will get killed.

BW Don't do that. Make your clients keep trying. Any failure is likely to be transient anyway, so running around messing with DNS every time a machine is offline doesn't make much sense.

Thanks for the advice. I am probably being paranoid about the connection timeout; we're using Puppet as well so I'll just use it to generate the seeds portion of the config file *and* a plain list of seed nodes that each client can retrieve (so they don't have to parse the XML).

On Wed, 3 Mar 2010 11:22:45 -0600 Jonathan Ellis jbel...@gmail.com wrote:

JE We appear to be reaching consensus that this is solving a non-problem, so I have closed that ticket.

Sure. Thanks for everyone's opinion, I really appreciate it.

Ted
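For the plain seed list Ted plans to publish, the client side could be as small as the sketch below. The file format is an assumption made for the example (one host per line, blank lines and '#' comments ignored); nothing in the thread specifies it, and the function names are hypothetical.

```python
import random


def parse_seed_list(text):
    """Parse a plain-text seed list: one host per line, '#' starts a comment."""
    seeds = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry:
            seeds.append(entry)
    return seeds


def pick_seeds(seeds, count, rng=random):
    """Pick up to `count` seeds at random, without repeating any of them."""
    return rng.sample(seeds, min(count, len(seeds)))
```

Picking at random mirrors what Chris describes for Digg's generated configs: no single seed becomes the node every client hits first.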
Re: finding Cassandra servers
2010/3/3 Ted Zlatanov t...@lifelogs.com:
> On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote:
>
> RK Something like RRDNS is no more complex than managing a list of seed nodes.
>
> How do your clients at Twitter find server nodes? Do you just run them local to each node?

RRDNS + loading the token map to discover more servers. Our implementation is open source:
http://github.com/fauna/cassandra/blob/master/lib/cassandra/cassandra.rb

> My concern is that both RRDNS and seed node lists are vulnerable to individual node failure. Updating DNS when a node dies means you have to wait until the TTL expires, and if you lower the TTL too much your server will get killed.

If you combine it with a fault-tolerant Thrift client and loading the token map, it works fine.

> With seed node lists, if I get unlucky I'd be trying to hit a downed node in which case I may as well just use RRDNS and deal with connection failure from the start.

Why would you not deal with connection failure?

-ryan
Re: finding Cassandra servers
On Wed, Mar 3, 2010 at 9:27 AM, Eric Evans eev...@rackspace.com wrote:
> On Wed, 2010-03-03 at 16:49 +, Christopher Brind wrote:
>> So is the current general practice to connect to a known node, e.g. by ip address?
>
> There are so many ways you could tackle this but...
>
> If you're talking about provisioning/startup of new nodes, just use the IPs of 2-4 nodes in the seeds section of configs.
>
> If you're talking about clients, then round-robin DNS is one option. Load-balancers are another. Either could be used with a subset of higher-capacity/higher-availability nodes, or for the entire cluster.
>
>> If so, what happens if that node is down? Is the entire cluster effectively broken at that point?
>
> You don't use just one node, see above.
>
>> Or do clients simply maintain a list of nodes and just connect to the first available in the list?
>
> It's possible to obtain a list of nodes over Thrift. So, yet another option would be to use a short-list of well-known nodes (discovered via round-robin DNS for example), to obtain a current node list and distribute among them.

This is exactly what we do.

-ryan
Re: finding Cassandra servers
On Wed, 3 Mar 2010 09:35:31 -0800 Ryan King r...@twitter.com wrote:

> With seed node lists, if I get unlucky I'd be trying to hit a downed node in which case I may as well just use RRDNS and deal with connection failure from the start.

RK Why would you not deal with connection failure?

I mean it's simpler to deal with one type of connection failure (to any node in RRDNS) than multiples (to seed node to get node list, then to random active node from that list). Sorry if my phrasing was confusing.

Ted