OK, guys. Thanks for the input! Here is my proposal:

1) If the container uses the host network, the Mesos agent will set
LIBPROCESS_ADVERTISE_IP to the agent IP. This covers the case where DNS is
not configured properly on the host (it is unnecessary when DNS is
configured properly). With this set, libprocess will skip the hostname
lookup and advertise LIBPROCESS_ADVERTISE_IP directly.

2) If the container uses a non-host network and defines port mappings (e.g.,
bridge), the Mesos agent will not set any libprocess env variables. Given
that there could be multiple mapped ports, the Mesos agent doesn't know
which one to use for LIBPROCESS_ADVERTISE_PORT. So it's the framework's
responsibility to set LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT
properly in this case (through CommandInfo.environment).

3) If the container uses a non-host network and does not define port
mappings (e.g., IP per container), the Mesos agent will not set any
libprocess env variables. In this case, both the CNI isolator and the Docker
engine will properly set up DNS in the container, so the hostname lookup
should work properly.
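
To make the three cases concrete, here is a rough Python sketch of who sets
what. It is only an illustration, not Mesos code: the Network enum and the
two helper functions are hypothetical names I made up for this mail, and it
assumes the framework already knows the agent IP and which mapped host port
libprocess should advertise.

```python
from enum import Enum
from typing import Dict


class Network(Enum):
    """Hypothetical labels for the three cases in the proposal."""
    HOST = "host"                          # case 1
    PORT_MAPPED = "port_mapped"            # case 2, e.g., bridge
    IP_PER_CONTAINER = "ip_per_container"  # case 3, e.g., CNI


def agent_libprocess_env(network: Network, agent_ip: str) -> Dict[str, str]:
    """Env variables the agent would inject under this proposal."""
    if network is Network.HOST:
        # Case 1: advertise the agent IP so libprocess can skip the
        # hostname lookup when host DNS is misconfigured.
        return {"LIBPROCESS_ADVERTISE_IP": agent_ip}
    # Cases 2 and 3: the agent sets nothing.
    return {}


def framework_libprocess_env(agent_ip: str, mapped_host_port: int) -> Dict[str, str]:
    """Case 2 only: values the framework would put in CommandInfo.environment."""
    return {
        "LIBPROCESS_ADVERTISE_IP": agent_ip,
        "LIBPROCESS_ADVERTISE_PORT": str(mapped_host_port),
    }
```

In case 3 neither side sets anything, and libprocess falls back to the
hostname lookup inside the container.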

- Jie

On Sat, Oct 15, 2016 at 4:01 PM, tommy xiao <xia...@gmail.com> wrote:

> good point, +1
>
> 2016-10-13 0:27 GMT+08:00 Jie Yu <yujie....@gmail.com>:
>
> > Stephan,
> >
> > I think the only time the framework needs to set LIBPROCESS_ADVERTISE_IP is
> > when DNAT is necessary for the container (e.g., bridge). In that case,
> > LIBPROCESS_ADVERTISE_IP should always be the agent IP, and
> > LIBPROCESS_ADVERTISE_PORT the relevant host port allocated for the
> > container. For other cases, the framework should not do anything.
> >
> > - Jie
> >
> > On Wed, Oct 12, 2016 at 4:43 AM, Erb, Stephan <
> stephan....@blue-yonder.com
> > >
> > wrote:
> >
> > > >Framework should be the one that sets
> > > >LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT appropriately if it
> > > >tries to launch another Mesos framework so that Master can reach the new
> > > >framework.
> > >
> > > As a framework/executor author this is not possible in all scenarios:
> > > There is no way to discover IP addresses assigned via CNI before the first
> > > StatusUpdate has been received. It is therefore not possible to set
> > > LIBPROCESS_ADVERTISE_IP appropriately at launch time.
> > >
> > > Please see https://issues.apache.org/jira/browse/MESOS-6281 for details.
> > >
> > >
> > > On 12/10/16 06:42, "Avinash Sridharan" <avin...@mesosphere.io> wrote:
> > >
> > >     Valid point. Makes sense to drive this decision from the user and
> the
> > >     framework.
> > >
> > >     On Tue, Oct 11, 2016 at 9:32 PM, Jie Yu <yujie....@gmail.com>
> wrote:
> > >
> > >     > >
> > >     > > While I believe this particular logic of setting
> > > LIBPROCESS_ADVERTISE_IP
> > >     > > to agent IP can be done in the agent (it could look at the port
> > > mapping
> > >     > > as well)
> > >     >
> > >     >
> > >     > What if there are multiple port mappings? How can the agent decide
> > >     > which port should be used as LIBPROCESS_ADVERTISE_PORT?
> > >     >
> > >     > On Tue, Oct 11, 2016 at 9:27 PM, Avinash Sridharan <
> > > avin...@mesosphere.io>
> > >     > wrote:
> > >     >
> > >     > > Definitely a +1 for executor binding to 0.0.0.0, instead of doing a
> > >     > > `gethostname` and `getaddrinfo`. But I am assuming these semantics
> > >     > > would kick in only if LIBPROCESS_IP is not set, which should be the
> > >     > > norm.
> > >     > >
> > >     > > +1 for LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT and the
> > >     > > onus being on the frameworks to set these variables. I guess the
> > >     > > framework can set the LIBPROCESS_ADVERTISE_IP to the agent IP and
> > >     > > LIBPROCESS_ADVERTISE_PORT to the host port when it specifies a
> > >     > > port-mapping. While I believe this particular logic of setting
> > >     > > LIBPROCESS_ADVERTISE_IP to agent IP can be done in the agent (it
> > >     > > could look at the port mapping as well), when to actually set these
> > >     > > variables (whether the executors even need to advertise their IP
> > >     > > addresses) is a decision that the frameworks should be privy to and
> > >     > > not left to the agent.
> > >     > >
> > >     > > On Tue, Oct 11, 2016 at 7:31 PM, haosdent <haosd...@gmail.com>
> > > wrote:
> > >     > >
> > >     > > > > libprocess should always bind to 0.0.0.0
> > >     > > > + 1 for this
> > >     > > >
> > >     > > > On Wed, Oct 12, 2016 at 2:33 AM, Jie Yu <yujie....@gmail.com
> >
> > > wrote:
> > >     > > >
> > >     > > > > Hi folks,
> > >     > > > >
> > >     > > > > I was in the process of cleaning up some tech debt related to env
> > >     > > > > variables in our code base. I created an epic ticket
> > >     > > > > <https://issues.apache.org/jira/browse/MESOS-6341> to track. I
> > >     > > > > searched relevant tickets filed previously, and found MESOS-3740
> > >     > > > > <https://issues.apache.org/jira/browse/MESOS-3740>. I did some
> > >     > > > > digging on how we handle LIBPROCESS_IP currently, and here are my
> > >     > > > > findings:
> > >     > > > >
> > >     > > > > 1) We always set LIBPROCESS_IP in the executor environment variables:
> > >     > > > > https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L6793-L6796
> > >     > > > >
> > >     > > > > This is not an issue for an executor that runs on host network.
> > >     > > > > However, if the executor wants to run on non-host network (e.g.,
> > >     > > > > overlay), this might be problematic, because libprocess for the
> > >     > > > > executor will try to bind to LIBPROCESS_IP, but the IP is not valid
> > >     > > > > inside the container.
> > >     > > > >
> > >     > > > > 2) As mentioned in MESOS-3740
> > >     > > > > <https://issues.apache.org/jira/browse/MESOS-3740>, some users want
> > >     > > > > to run a Mesos framework in a Mesos container. The old style
> > >     > > > > framework driver assumes a 2 way communication channel between the
> > >     > > > > framework and the Mesos master. In order for the master to reach
> > >     > > > > the framework running inside a Mesos container, the framework's
> > >     > > > > libprocess should advertise its ip and port properly. This problem
> > >     > > > > gets tricky because of the networking for the Mesos container:
> > >     > > > >
> > >     > > > > 2.a) If the container uses host network, libprocess should bind to
> > >     > > > > 0.0.0.0, and advertise itself using the agent ip and the relevant
> > >     > > > > port
> > >     > > > > 2.b) If the container has a routable ip (e.g., using calico or
> > >     > > > > overlay), libprocess should still bind to 0.0.0.0, and advertise
> > >     > > > > itself using the container ip and the relevant port. Currently, it
> > >     > > > > binds to agent ip (which will fail), and advertises itself using
> > >     > > > > agent ip and the port in the container (which will fail as well)
> > >     > > > > 2.c) If the container has a private ip (e.g., bridge), libprocess
> > >     > > > > should still bind to 0.0.0.0, and advertise itself using the agent
> > >     > > > > ip and _mapped_ host port. Currently, it binds to agent ip (which
> > >     > > > > will fail), and advertises itself using agent ip and the port in
> > >     > > > > the container (which will fail as well)
> > >     > > > >
> > >     > > > > Therefore, the workaround
> > >     > > > > <https://github.com/mesosphere/mesos/commit/b9c622b53b3ffcc27911fcdcefc37a52ebe33bdd>
> > >     > > > > suggested in MESOS-3740
> > >     > > > > <https://issues.apache.org/jira/browse/MESOS-3740>
> > >     > > > > is not ideal. It does not consider 2.b) and 2.c)
> > >     > > > >
> > >     > > > > Libprocess now supports both LIBPROCESS_IP and
> > >     > > > > LIBPROCESS_ADVERTISE_IP so the bind address does not have to be the
> > >     > > > > address that is being advertised.
> > >     > > > >
> > >     > > > > For the 2.c) case, Mesos doesn't have a way to determine the
> > >     > > > > advertise port (mapped port). This information is only known to the
> > >     > > > > framework (which host port it'll use to serve as the mapped port
> > >     > > > > for the libprocess).
> > >     > > > >
> > >     > > > > Given that, I think Mesos should not blindly set LIBPROCESS_IP to
> > >     > > > > agent IP in executor environment variables. Framework should be the
> > >     > > > > one that sets LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT
> > >     > > > > appropriately if it tries to launch another Mesos framework so that
> > >     > > > > Master can reach the new framework. If the framework just wants to
> > >     > > > > launch a regular container that does not depend on libprocess, it
> > >     > > > > should simply not set these env variables.
> > >     > > > >
> > >     > > > > Also, I think libprocess should always bind to 0.0.0.0, rather than
> > >     > > > > doing a hostname lookup and binding to the IP found for the
> > >     > > > > hostname. LIBPROCESS_ADVERTISE_IP can be used to override the ip
> > >     > > > > address it wants to advertise to peers. If that's not specified,
> > >     > > > > it'll try to do a hostname lookup to guess a routable ip.
> > >     > > > >
> > >     > > > > Thoughts?
> > >     > > > > - Jie
> > >     > > > >
> > >     > > >
> > >     > > >
> > >     > > >
> > >     > > > --
> > >     > > > Best Regards,
> > >     > > > Haosdent Huang
> > >     > > >
> > >     > >
> > >     > >
> > >     > >
> > >     > > --
> > >     > > Avinash Sridharan, Mesosphere
> > >     > > +1 (323) 702 5245
> > >     > >
> > >     >
> > >
> > >
> > >
> > >     --
> > >     Avinash Sridharan, Mesosphere
> > >     +1 (323) 702 5245
> > >
> > >
> > >
> > >
> >
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>
