On Wed, 3 Mar 2010 07:57:32 -0600 Gary Dusbabek <gdusba...@gmail.com> wrote:

GD> 2010/3/3 Ted Zlatanov <t...@lifelogs.com>:
TZ> I need to find Cassandra servers on my network from several types of
TZ> clients and platforms.  The goal is to make adding and removing servers
TZ> painless, assuming a leading window of at least 1 hour.  The discovery
TZ> should be automatic and distributed.  I want to minimize management.

GD> Nothing in the current codebase currently meets these needs.  But then
GD> again, cassandra doesn't need the described functionality.  Zeroconf
GD> confines itself to a single subnet (would require router configuration
GD> to work across subnets so that multicast goes through).

I looked it up, and today mDNS seems to be the standard name for this
protocol (Bonjour/Rendezvous on Apple).  Zeroconf seems to be the older
name and there's a *lot* of name confusion, so I'll just stick to "mDNS."
Here's a decent Java implementation:

http://sourceforge.net/projects/jmdns/

I don't think routing multicasts across subnets is a burden.

GD> RRDNS would work, but something would need to keep that updated when
GD> servers go away (it wouldn't be automatic).

GD> If you can count on one of your (seed nodes) to be up, RRDNS could be
GD> used to connect to one of them and fetch the token range list.  To do
GD> this, create a thrift client and call describe_ring.  In older
GD> versions you can get a jsonified endpoint map by calling
GD> get_string_property('token map').

It would really be much more efficient if I didn't have to maintain
RRDNS, but could instead listen for the mDNS announcements of the
Cassandra service.  What you describe is a centralized model, no?  With
mDNS I wouldn't have to know which nodes are up or down, and I wouldn't
have to do extra queries; it would just work.

I don't see why Cassandra doesn't need that functionality.  How else
could you be guaranteed to find a live node if there is one on your
subnet?

Ted
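
For comparison, the client side of the RRDNS approach Gary describes
could be sketched roughly as below.  This is only an illustration under
stated assumptions, not anything from the Cassandra codebase: the helper
names resolve_all and first_live_node are hypothetical, and the Thrift
port is assumed to be 9160.  It only finds a node that accepts a TCP
connection; the describe_ring call over Thrift would still have to
follow.

```python
import socket


def resolve_all(hostname, port):
    """Return the distinct IP addresses behind a round-robin DNS name."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep resolver order, drop duplicates
            addresses.append(ip)
    return addresses


def first_live_node(addresses, port, timeout=1.0,
                    connect=socket.create_connection):
    """Return the first address that accepts a TCP connection, else None.

    `connect` is injectable so the probe can be exercised without a
    network; by default it is socket.create_connection.
    """
    for ip in addresses:
        try:
            conn = connect((ip, port), timeout)
        except OSError:
            continue  # node is down or unreachable, try the next record
        conn.close()
        return ip
    return None
```

A client would call first_live_node(resolve_all(name, 9160), 9160) with
whatever RRDNS name it has been given.  Note that this is exactly the
extra probing step, plus a DNS record someone has to keep current, that
the mDNS approach is meant to avoid.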