On Wed, 3 Mar 2010 07:57:32 -0600 Gary Dusbabek <gdusba...@gmail.com> wrote: 

GD> 2010/3/3 Ted Zlatanov <t...@lifelogs.com>:
TZ> I need to find Cassandra servers on my network from several types of
TZ> clients and platforms.  The goal is to make adding and removing servers
TZ> painless, assuming a leading window of at least 1 hour.  The discovery
TZ> should be automatic and distributed.  I want to minimize management.

GD> Nothing in the current codebase meets these needs.  But then
GD> again, Cassandra doesn't need the described functionality.  Zeroconf
GD> confines itself to a single subnet (would require router configuration
GD> to work across subnets so that multicast goes through).  

I looked it up, and today mDNS seems to be the standard name for this
protocol (Bonjour/Rendezvous on Apple).  Zeroconf seems to be the older
name, and there's a *lot* of name confusion, so I'll just stick to "mDNS."

Here's a decent Java implementation: http://sourceforge.net/projects/jmdns/

I don't think routing multicasts across subnets is a burden.

GD> RRDNS would work, but something would need to keep that updated when
GD> servers go away (it wouldn't be automatic).

GD> If you can count on one of your seed nodes to be up, RRDNS could be
GD> used to connect to one of them and fetch the token range list.  To do
GD> this, create a thrift client and call describe_ring.  In older
GD> versions you can get a jsonified endpoint map by calling
GD> get_string_property('token map').

It would really be much more efficient if I didn't have to maintain
RRDNS, but could instead listen for the mDNS multicast announcements of
the Cassandra service.  What you describe is a centralized model, no?
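
For comparison, here is roughly what the RRDNS approach looks like from
the client side, as I understand it.  This is only a sketch: the class
and method names are made up for illustration, and I'm assuming the
default Thrift port of 9160.  It resolves every address behind one name
and keeps the first that accepts a TCP connection; the describe_ring
call would follow and is left out because it needs the Thrift bindings.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.Optional;

// Hypothetical helper, not part of Cassandra or any library.
final class RrdnsProbe {
    // Assumption: Thrift is listening on its default port.
    static final int THRIFT_PORT = 9160;

    /** Returns the first candidate that accepts a TCP connection, if any. */
    static Optional<InetAddress> firstLive(InetAddress[] candidates,
                                           int port, int timeoutMillis) {
        for (InetAddress addr : candidates) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(addr, port), timeoutMillis);
                return Optional.of(addr);
            } catch (IOException e) {
                // Node is down or unreachable; try the next record.
            }
        }
        return Optional.empty();
    }

    /** Resolves all address records behind one round-robin name and probes them. */
    static Optional<InetAddress> findSeed(String rrdnsName)
            throws UnknownHostException {
        return firstLive(InetAddress.getAllByName(rrdnsName), THRIFT_PORT, 2000);
    }
}
```

Note that the client still pays a connect timeout for every stale
record, and somebody still has to keep the DNS zone up to date.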

With mDNS I wouldn't have to know which nodes are up or down, and I
wouldn't have to do extra queries; it would just work.  I don't see why
Cassandra doesn't need that functionality.  How else could you be
guaranteed to find a live node if there is one on your subnet?
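
To make the mDNS side concrete, here is a minimal sketch of the query a
client would send, using only the JDK.  The service type
"_cassandra._tcp.local" is my assumption (nothing registers it today),
and the class name is made up.  It builds a standard DNS PTR question
and sends it to the mDNS multicast group 224.0.0.251 on port 5353.
Reading and parsing the answers is omitted; a library such as JmDNS
would normally do all of this for you.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Cassandra or JmDNS.
final class MdnsQuery {
    static final String MDNS_GROUP = "224.0.0.251";
    static final int MDNS_PORT = 5353;
    // Assumption: an illustrative service type, not a registered one.
    static final String SERVICE_TYPE = "_cassandra._tcp.local";

    /** Builds a one-question DNS query asking for PTR records of name. */
    static byte[] buildPtrQuery(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0); out.write(0);                 // ID: zero for mDNS
        out.write(0); out.write(0);                 // flags: standard query
        out.write(0); out.write(1);                 // QDCOUNT = 1
        for (int i = 0; i < 6; i++) out.write(0);   // ANCOUNT, NSCOUNT, ARCOUNT = 0
        for (String label : name.split("\\.")) {
            if (label.isEmpty()) continue;          // tolerate a trailing dot
            byte[] bytes = label.getBytes(StandardCharsets.UTF_8);
            if (bytes.length > 63) {
                throw new IllegalArgumentException("label too long: " + label);
            }
            out.write(bytes.length);                // length-prefixed label
            out.write(bytes, 0, bytes.length);
        }
        out.write(0);                               // root label ends the name
        out.write(0); out.write(12);                // QTYPE = PTR
        out.write(0); out.write(1);                 // QCLASS = IN
        return out.toByteArray();
    }

    /** Sends the query to the mDNS multicast group from an ephemeral port. */
    static void send(String name) throws IOException {
        byte[] query = buildPtrQuery(name);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(query, query.length,
                    InetAddress.getByName(MDNS_GROUP), MDNS_PORT));
        }
    }
}
```

Calling MdnsQuery.send(MdnsQuery.SERVICE_TYPE) would put one question on
the wire; only nodes that are up can answer, which is the whole point.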

Ted
