GitHub user vijikarthi opened a pull request: https://github.com/apache/flink/pull/3692
FLINK-5974 Added configurations to support mesos-dns hostname resolution

This PR addresses FLINK-5974, which covers dynamic hostname resolution for the JM and TM components in deployment environments such as Mesos and DC/OS. It adds two main pieces of functionality.

a) Dynamic hostname configuration

Specifying the hostname for the JM and TM is already supported through the `jobmanager.rpc.address` and `taskmanager.hostname` configurations. However, in a Mesos or DC/OS environment, each task container can be looked up using a hostname alias of the form `<task>.<service>.mesos`, where service discovery is managed by `mesos-dns`. To support this dynamic hostname lookup, we have introduced a new configuration, `mesos.resourcemanager.tasks.hostname`, which takes the format `_TASK.<ANY_VALUE>`. When this property is supplied, the `_TASK` token is replaced with the `TASK_ID` of the TM container, and the resulting string is used to populate the `taskmanager.hostname` configuration.

For example, in a DC/OS setup one could supply the configuration as `-Dmesos.resourcemanager.tasks.hostname=_TASK.{{FRAMEWORK_NAME}}.mesos`, where `FRAMEWORK_NAME` could be `flink`.

Please refer to https://docs.mesosphere.com/1.9/usage/service-discovery/mesos-dns/service-naming/#a-records for more details on how Mesos service discovery works.

b) Support for running *any* bootstrap script before the TM startup script

Currently, the TM boot script `mesos-taskmanager.sh` is the only script passed to the Mesos launcher for booting the TM container. In a DC/OS environment, where service discovery is common, we need a mechanism that waits until the service discovery records are available and the hostname is actually resolvable before launching the TM boot script. DC/OS deployments offer a way to validate and wait for the service discovery records before launching tasks. Please see the links below for more details on how it works.
https://mesosphere.github.io/dcos-commons/developer-guide.html#task-bootstrap
https://github.com/mesosphere/dcos-commons/blob/master/sdk/bootstrap/main.go

To support this, we have introduced a new configuration, `mesos.resourcemanager.tasks.cmd-prefix` (for example, `mesos.resourcemanager.tasks.cmd-prefix=$FLINK_HOME/bin/bootstrap`), which names any executable or script to run before the TM bootstrap command. This feature *currently* works *only for Docker-based images*, where the bootstrap script can be pre-baked into a known location that is then used as the value of `mesos.resourcemanager.tasks.cmd-prefix`.

Although both changes were motivated by Mesos and DC/OS deployments, the implementation is agnostic of these environments and can be used for any deployment that needs such a facility.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vijikarthi/flink FLINK-5974-Master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3692.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3692

----
commit aeb432dc7fe8bcdd5faa49b8ad5dfb5630ea0747
Author: Vijay Srinivasaraghavan <vijayaraghavan.srinivasaragha...@emc.com>
Date:   2017-04-06T16:48:39Z

    FLINK-5974 Added configurations to support mesos-dns hostname resolution

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
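
As an illustration of the `_TASK` substitution described in part (a), here is a minimal Java sketch. It is not the code from this pull request: the class name `TaskHostname`, the method `resolve`, the check that the template starts with `_TASK.`, and the sample task ID `taskmanager-00001` are all assumptions made for the example.

```java
// Hypothetical helper, not part of the PR: expands the _TASK token in a
// hostname template such as "_TASK.flink.mesos".
final class TaskHostname {

    // Token name taken from the PR description.
    static final String TASK_TOKEN = "_TASK";

    // Returns the template with the _TASK token replaced by the task ID.
    // Assumption: the template must follow the documented form "_TASK.<ANY_VALUE>".
    static String resolve(String template, String taskId) {
        if (template == null || !template.startsWith(TASK_TOKEN + ".")) {
            throw new IllegalArgumentException(
                "expected a template of the form _TASK.<ANY_VALUE>, got: " + template);
        }
        // Replace only the leading token and keep the rest of the template as-is.
        return taskId + template.substring(TASK_TOKEN.length());
    }

    public static void main(String[] args) {
        // The task ID below is a made-up sample value.
        System.out.println(resolve("_TASK.flink.mesos", "taskmanager-00001"));
        // prints "taskmanager-00001.flink.mesos"
    }
}
```

Under these assumptions, the returned string is what would be written into `taskmanager.hostname` for that TM container.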