[ 
https://issues.apache.org/jira/browse/FLINK-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959444#comment-15959444
 ] 

ASF GitHub Bot commented on FLINK-5974:
---------------------------------------

GitHub user vijikarthi opened a pull request:

    https://github.com/apache/flink/pull/3692

    FLINK-5974 Added configurations to support mesos-dns hostname resolution

    This PR addresses FLINK-5974 by supporting dynamic hostname resolution 
for the JM and TM components, especially in deployment environments such as 
Mesos/DCOS.
    
    It adds two main features.
    
    a) Dynamic host name configuration
    
    Support for specifying the hostname for the JM/TM is already available 
through the `jobmanager.rpc.address` and `taskmanager.hostname` configurations.
    
    However, in a Mesos DC/OS type of environment, each task container can be 
looked up using a hostname alias of the form `<task>.<service>.mesos`, where 
service discovery is managed through `mesos-dns`. To support this dynamic 
hostname lookup, we have introduced a new configuration, 
`mesos.resourcemanager.tasks.hostname`, which takes the format 
`_TASK.<ANY_VALUE>`. 
    
    When this property is supplied, the `_TASK` token is replaced with the 
`TASK_ID` of the TM container, and the resulting string is used to populate 
the `taskmanager.hostname` configuration.
    
    For example, in a DCOS setup one could supply the configuration as 
`-Dmesos.resourcemanager.tasks.hostname=_TASK.{{FRAMEWORK_NAME}}.mesos`, where 
`FRAMEWORK_NAME` could be `flink`.
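
    The substitution described above can be sketched roughly as follows. This 
is only an illustration and not the code in the PR: the function name 
`derive_hostname` and the task ID `taskmanager-00001` are hypothetical, and 
replacing only the first `_TASK` occurrence is an assumption.

```python
# Illustrative sketch only; not the Flink implementation.
TASK_TOKEN = "_TASK"


def derive_hostname(template: str, task_id: str) -> str:
    """Replace the _TASK token in the configured template with the task ID.

    Assumption: only the first occurrence of the token is substituted.
    """
    if TASK_TOKEN not in template:
        raise ValueError(f"template {template!r} does not contain {TASK_TOKEN}")
    return template.replace(TASK_TOKEN, task_id, 1)


if __name__ == "__main__":
    # "taskmanager-00001" is a made-up task ID used for illustration.
    print(derive_hostname("_TASK.flink.mesos", "taskmanager-00001"))
```

    With the template `_TASK.flink.mesos`, the derived value has the 
`<task>.<service>.mesos` shape that would then be used for 
`taskmanager.hostname`.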
    
    Please refer to 
https://docs.mesosphere.com/1.9/usage/service-discovery/mesos-dns/service-naming/#a-records
 for more details on how Mesos service discovery works.
    
    b) Support for running *any* bootstrap script prior to executing the TM startup script
    
    Currently, the TM boot script `mesos-taskmanager.sh` is the only script 
that is passed to the Mesos launcher for booting the TM container. 
    
    In a DC/OS environment where service discovery is common, we need a 
mechanism to wait until the service discovery records are available and the 
hostname is actually resolvable before launching the TM boot script. 
    
    A DCOS deployment offers a way to validate and wait for the service 
discovery records to be available before launching the tasks. Please see the 
links below for more details on how it works.
    
    https://mesosphere.github.io/dcos-commons/developer-guide.html#task-bootstrap
    https://github.com/mesosphere/dcos-commons/blob/master/sdk/bootstrap/main.go
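
    As a rough idea of what such a bootstrap step might do, the sketch below 
polls until a hostname resolves or a timeout expires. It is not the 
dcos-commons bootstrap tool (which is written in Go); the function name, 
parameters, and defaults are all hypothetical.

```python
import socket
import time


def wait_until_resolvable(hostname, timeout_s=60.0, interval_s=1.0,
                          resolve=socket.gethostbyname, sleep=time.sleep,
                          clock=time.monotonic):
    """Poll DNS until `hostname` resolves; return False once the timeout passes.

    `resolve`, `sleep`, and `clock` are injectable so the loop can be
    exercised without real DNS lookups or real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            resolve(hostname)
            return True
        except OSError:  # socket.gaierror is a subclass of OSError
            if clock() >= deadline:
                return False
            sleep(interval_s)
```

    A wrapper script could call this with the derived TM hostname and exit 
with a non-zero status when it returns `False`, so that the TM boot script is 
not started against an unresolvable name.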
    
    To support this, we have introduced a new configuration, 
`mesos.resourcemanager.tasks.cmd-prefix` (for example 
`mesos.resourcemanager.tasks.cmd-prefix=$FLINK_HOME/bin/bootstrap`), which 
names an executable or script to run before the TM bootstrap command. 
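
    One plausible way to combine the prefix with the TM command is sketched 
below. The `&&` chaining, the function name `build_launch_command`, and the 
`$FLINK_HOME/bin/` location of `mesos-taskmanager.sh` are assumptions made for 
illustration, not details confirmed by this PR description.

```python
from typing import Optional


def build_launch_command(tm_command: str, cmd_prefix: Optional[str] = None) -> str:
    """Return the shell command line used to launch the TM container.

    Assumption: the prefix is chained with `&&`, so the TM command only
    runs if the prefix command exits successfully.
    """
    if not cmd_prefix:
        # No prefix configured: launch the TM boot script directly.
        return tm_command
    return f"{cmd_prefix} && {tm_command}"


if __name__ == "__main__":
    print(build_launch_command("$FLINK_HOME/bin/mesos-taskmanager.sh",
                               "$FLINK_HOME/bin/bootstrap"))
```

    Chaining with `&&` rather than `;` means a failed bootstrap check stops 
the launch instead of starting a TM whose hostname may not resolve.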
    
    This feature *currently* works *only for Docker-based images*, where the 
bootstrap script can be pre-baked into a specific location that is then used 
to configure `mesos.resourcemanager.tasks.cmd-prefix`.
    
    While both features help address Mesos/DCOS types of deployment, the 
implementation is agnostic of these environments and can be used for any 
deployment that needs such a facility.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vijikarthi/flink FLINK-5974-Master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3692.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3692
    
----
commit aeb432dc7fe8bcdd5faa49b8ad5dfb5630ea0747
Author: Vijay Srinivasaraghavan <vijayaraghavan.srinivasaragha...@emc.com>
Date:   2017-04-06T16:48:39Z

    FLINK-5974 Added configurations to support mesos-dns hostname resolution

----


> Support Mesos DNS
> -----------------
>
>                 Key: FLINK-5974
>                 URL: https://issues.apache.org/jira/browse/FLINK-5974
>             Project: Flink
>          Issue Type: Improvement
>          Components: Cluster Management, Mesos
>            Reporter: Eron Wright 
>            Assignee: Vijay Srinivasaraghavan
>
> In certain Mesos/DCOS environments, the slave hostnames aren't resolvable.  
> For this and other reasons, Mesos DNS names would ideally be used for 
> communication within the Flink cluster, not the hostname discovered via 
> `InetAddress.getLocalHost`.
> Some parts of Flink are already configurable in this respect, notably 
> `jobmanager.rpc.address`.  However, the Mesos AppMaster doesn't use that 
> setting for everything (e.g. the artifact server); it uses the hostname instead.
> Similarly, the `taskmanager.hostname` setting isn't used in Mesos deployment 
> mode.   To effectively use Mesos DNS, the TM should use 
> `<task-name>.<framework-name>.mesos` as its hostname.   This could be derived 
> from an interpolated configuration string.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
