There are two issues with the 1.7.1 "job as a cluster" setup that I need guidance on:
1. In the HA setup, the TMs are not able to resolve the job manager's random port through the jobmanager.rpc.port <https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#jobmanager-rpc-port> setting. The setting does work in non-HA mode (the containerPort/TCP with the same port facilitates that), but then we lose the job if the JM reboots. This is a high priority for us. I am sure there is a workaround, but I would rather ask the experts.

2. The metrics on the JM are not visible, possibly due to https://issues.apache.org/jira/browse/FLINK-11127 . It is an open issue, and both the service-per-TM and the stateful set approaches appear not to be production ready (not scalable, and kludgey). Do you have a timeline for when these will be resolved?

Thanks.

Vishal