On 04/01/2015 11:20 AM, Christos Kozyrakis wrote:
Service discovery is a topic where it's unlikely that a single solution
will satisfy every need and every constraint. It's also good for the
Mesos community to have multiple successful alternatives, even when they
overlap in some ways.
I will comment a little on Mesos-DNS since I designed it and currently
maintain it. The project is still in early stages and the functionality
is not where it should be. But it's a start and we are improving it
every day.
* Naming/ports info: the release of Mesos 0.22 allows frameworks to
provide rich service discovery info for tasks and executors, including
naming the ports (e.g., 80 is http, 90 is RPC, etc) and providing
interesting labels for environment (prod/test/...), location, version,
etc. Once frameworks start using this feature, we will use it in
Mesos-DNS to provide more intuitive names for tasks and services and
give you more meaningful info on service discovery requests.
* SRV records: you are correct that SRV records are not the most
convenient way to get port information. Very little software exists to
take advantage of these records and the fact that you need two requests
to get both a port and an IP address is annoying. We are adding
an HTTP interface to Mesos-DNS to allow you to get more compact and
useful port info. See the
https://github.com/mesosphere/mesos-dns/tree/http branch. It is not
ready for production use yet, but it will give you an idea.
* Namezone and coordination with other name servers: yes, your
suggestion makes a lot of sense. Looking into such a setup is on our
todo list. If you have time to investigate this and contribute the
changes/setup needed, that would be great.
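
To make the two-request pattern from the SRV bullet concrete, here is a
minimal Python sketch. It is only an illustration: the two dictionaries
stand in for real DNS answers, and every name, port, and address in them
is made up rather than taken from an actual Mesos-DNS deployment.

```python
# Stand-ins for DNS answers. All names, ports, and addresses are hypothetical.
SRV_RECORDS = {
    "_web._tcp.marathon.mesos": [("web-s1.marathon.mesos", 31045)],
}
A_RECORDS = {
    "web-s1.marathon.mesos": "10.0.0.12",
}


def resolve_endpoint(srv_name):
    """Resolve an SRV name to an (ip, port) pair using two lookups."""
    # Request 1: the SRV answer carries a target host name and a port,
    # but no IP address.
    target, port = SRV_RECORDS[srv_name][0]
    # Request 2: a separate A lookup turns the target name into an address.
    ip = A_RECORDS[target]
    return ip, port


endpoint = resolve_endpoint("_web._tcp.marathon.mesos")
print(endpoint)
```

A client that only understands host names stops after the A lookup and
never sees the port, which is the inconvenience described above.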
Regards
Well, here is an article that is good in that it's a partial summary,
but surely it needs to be updated?
http://jasonwilder.com/blog/2014/02/04/service-discovery-in-the-cloud/
What I'd like is to read a "survey article on service/resource
discovery" where some knowledgeable person expounds on building a
cluster up for general-purpose uses (this is common) vs
Big-Data:Big-Science. The different options from a macro point of view
would be keen.
Most technical organizations that I interact with on an ongoing basis
pretty much have the same goals that I do: heterogeneous hardware in the
local cluster and the ability to *seamlessly* rent outside (vendor)
resources for needs above the capability of the local cluster.
*Everyone* wants their own local cluster, and the dynamic ability to
supplement that in-house cluster with cloud services.
That's really the white paper I'm searching for; as are many others. The
more details the better. If mesos addresses that need, it's gonna
be a very big hit. In fact, it greatly behoves the newer projects to
explain (as precisely as possible) the project goals and what it
purports to achieve (what is fixed or enhanced) over the existing codes
(projects) in some very clear detail, imho.
James
Mesos-DNS
This project came to my attention this week, and I am looking to get
it installed today to have hands on time with it. Basically, it's a
binary that queries the mesos-master and develops A records that are
hostnames, based on the framework names, and SRV records based on
the assigned ports.
This is where I get confused. I can see the A records being useful,
however, you would have to have your entire network be able to
use the mesos-dns (including non-mesos systems). Otherwise how
would a client know to connect to a .mesos domain name? Perhaps
there should be a way to integrate mesos-dns as the authoritative
zone for .mesos in your standard enterprise DNS servers. This also
saves the configuration issues of having to add DNS services to all
the nodes. I need to research DNS a bit more, but couldn't you
set up, say in BIND, that any requests in .mesos are forwarded to the
mesos-dns service, and then sent through your standard dns back to
the client? Wouldn't this be preferable to setting the .mesos name
services as the first DNS server and then THAT forwards off to your
standard enterprise DNS servers?
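
The routing rule behind that setup can be sketched in a few lines of
Python. This is only a model of the decision the enterprise name server
would make, not a BIND configuration; both server addresses and the
function name pick_upstream are hypothetical.

```python
MESOS_DNS = ("10.0.0.5", 8053)      # hypothetical address of the mesos-dns service
ENTERPRISE_DNS = ("10.0.0.2", 53)   # hypothetical address of the existing resolver


def pick_upstream(qname, zone="mesos"):
    """Return the server that should answer a query for qname."""
    # Compare whole labels so that "notmesos" does not match the "mesos" zone.
    labels = qname.rstrip(".").lower().split(".")
    zone_labels = zone.lower().split(".")
    if labels[-len(zone_labels):] == zone_labels:
        return MESOS_DNS
    return ENTERPRISE_DNS


print(pick_upstream("hive.marathon.mesos."))
print(pick_upstream("www.example.com"))
```

With this arrangement clients keep pointing at the enterprise servers,
and only names under .mesos ever reach mesos-dns.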
Another issue I see with DNS is it works well for hostnames, but
what about ports? Yes, I see there are SRV records that will return
the ports, but how would that even be used? Consider the hive
thrift service example above. We could assume hive thrift would run
on port 10000 on all nodes in the cluster, and use the port, but
then you run into the same issues as haproxy. You can't really
specify a port via DNS in a JDBC connection URL, can you? How do you
get applications that want to connect to an integer port to do a DNS
lookup to resolve a port? Or are we back to you have one cluster,
and you get 65536 ports for all the services you could want on that
cluster? Basically hard coding the ports? This then loses
flexibility from a docker port bridging perspective too, in that in
my above haproxy example, all the docker containers would have to
expose port 10000 which would have caused a conflict on node2.
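
One possible answer to the JDBC question is a small wrapper that does
the SRV lookup itself and hands the application an ordinary URL. The
Python sketch below assumes the SRV answers have already been fetched
and parsed into (priority, weight, port, target) tuples. The record
values and the helper names pick_srv and hive_jdbc_url are hypothetical,
and the selection is a simplification of the procedure in RFC 2782.

```python
import random


def pick_srv(records, rng=random):
    """Pick one (priority, weight, port, target) record.

    The lowest priority value wins; ties are broken by a weighted random
    choice, which simplifies the selection described in RFC 2782.
    """
    best = min(r[0] for r in records)
    candidates = [r for r in records if r[0] == best]
    # Treat a zero weight as 1 so the weight list is never all zeros.
    weights = [max(r[1], 1) for r in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def hive_jdbc_url(records, database="default"):
    """Build a HiveServer2-style JDBC URL from parsed SRV records."""
    _, _, port, target = pick_srv(records)
    return f"jdbc:hive2://{target.rstrip('.')}:{port}/{database}"


# Hypothetical answers for a Hive thrift task running on random Mesos ports.
records = [
    (10, 5, 31002, "hive-s2.marathon.mesos."),
    (0, 1, 31876, "hive-s1.marathon.mesos."),
]
url = hive_jdbc_url(records)
print(url)
```

The application still needs this extra step before it connects, so this
works around the problem rather than making JDBC itself SRV-aware.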
Summary
So while I have a nice long email here, it seems I am either missing
something critical in how service discovery could work with a mesos
cluster, or there are still some pretty big difficulties that we
need to overcome for an enterprise. Haproxy seems cool, and seems to
work well except for those "long running TCP connections" like thrift. I
am at a loss how to handle that. Mesos DNS is neat too, except for
the port conflicts etc that would occur if you used native ports on
nodes, and if you didn't use native ports, (mesos random ports) how
do your applications know which port to connect to (yes it's in the
SRV record, however, how do you make apps aware to look up a DNS
record for a port?)
Am I missing something? How are others handling these issues?
--
Christos