Andrey created FLINK-5757:
-----------------------------

             Summary: Initial JobManager address required during access from cli to HA cluster
                 Key: FLINK-5757
                 URL: https://issues.apache.org/jira/browse/FLINK-5757
             Project: Flink
          Issue Type: Bug
            Reporter: Andrey

Steps to reproduce:
* set up a Flink cluster in HA mode (recovery.mode: zookeeper, etc.)
* set up the cli on a separate server (or dev machine)
* configure the cli to discover the jobmanager in HA mode (recovery.mode: zookeeper, etc.)
* comment out the "jobmanager.rpc.address" and "jobmanager.rpc.port" parameters in ./conf/flink-conf.yaml. They are no longer needed, since service discovery is now performed using zookeeper. Also, the port is selected dynamically, so a static configuration makes no sense.
* execute ./bin/flink list

In logs:
{code}
java.lang.RuntimeException: The initial JobManager address has not been set correctly.
	at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:179)
	at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:646)
	at org.apache.flink.client.CliFrontend.getJobManagerGateway(CliFrontend.java:868)
	at org.apache.flink.client.CliFrontend.list(CliFrontend.java:387)
	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1008)
	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048)
{code}

If I put in a dummy configuration:
* jobmanager.rpc.address: localhost
* jobmanager.rpc.port: 6123

the connection is established successfully. However, there is a second issue:
* an unnecessary connection attempt is made using the dummy configuration

In logs:
{code}
2017-02-09 11:29:59,344 INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=zk1:10020,zk2:10020,zk3:10020 sessionTimeout=60000 watcher=org.apache.flink.shaded.org.apache.curator.ConnectionState@30aec673
2017-02-09 11:29:59,354 INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server zk1/ip/:10020. Will not attempt to authenticate using SASL (unknown error)
2017-02-09 11:29:59,356 INFO org.apache.zookeeper.ClientCnxn - Socket connection established to zk1/ip:10020, initiating session
2017-02-09 11:29:59,360 INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server zk1/ip:10020, sessionid = 0x35a228afd320011, negotiated timeout = 60000
2017-02-09 11:29:59,360 INFO org.apache.flink.shaded.org.apache.curator.framework.state.ConnectionStateManager - State change: CONNECTED
2017-02-09 11:29:59,363 INFO org.apache.flink.client.program.StandaloneClusterClient - Starting client actor system.
2017-02-09 11:30:00,815 INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to dummyhost/dummyip
2017-02-09 11:30:01,016 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'dummyhost/dummyip': connect timed out
2017-02-09 11:30:01,106 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Network is unreachable: connect
2017-02-09 11:30:01,107 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/0:0:0:0:0:0:0:1': Cannot assign requested address: connect
2017-02-09 11:30:02,207 WARN org.apache.flink.runtime.net.ConnectionUtils - Could not connect to dummyhost/dummyip:dummyport. Selecting a local address using heuristics.
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
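
The two symptoms described in this report can be checked for before running ./bin/flink list. The following is only a rough pre-flight sketch in Python, not part of Flink: the function names (parse_flat_conf, preflight_warnings) are hypothetical, the parser handles only flat "key: value" lines, and it looks at nothing beyond the three keys named in this report (recovery.mode, jobmanager.rpc.address, jobmanager.rpc.port).

```python
from typing import Dict, List

# Keys taken from the report; nothing else in flink-conf.yaml is inspected.
RPC_KEYS = ("jobmanager.rpc.address", "jobmanager.rpc.port")


def parse_flat_conf(text: str) -> Dict[str, str]:
    """Parse flat `key: value` lines, ignoring blank lines and `#` comments."""
    conf: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments, including commented-out keys
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)  # split on the first colon only
        conf[key.strip()] = value.strip()
    return conf


def preflight_warnings(conf: Dict[str, str]) -> List[str]:
    """Return warnings matching the two symptoms described in the report."""
    if conf.get("recovery.mode", "").lower() != "zookeeper":
        return []  # not in HA mode, so this report does not apply
    missing = [key for key in RPC_KEYS if key not in conf]
    if missing:
        return [
            "HA mode is enabled but " + ", ".join(missing) + " is not set; "
            "the cli described in this report fails with 'The initial "
            "JobManager address has not been set correctly.'"
        ]
    target = conf[RPC_KEYS[0]] + ":" + conf[RPC_KEYS[1]]
    return [
        "HA mode is enabled and a static address is set; expect an extra "
        "connection attempt to " + target + " on the client."
    ]


if __name__ == "__main__":
    example = """
recovery.mode: zookeeper
# jobmanager.rpc.address: localhost
# jobmanager.rpc.port: 6123
"""
    for warning in preflight_warnings(parse_flat_conf(example)):
        print("WARNING:", warning)
```

With the real file, the same check would be run on the contents of ./conf/flink-conf.yaml on the machine hosting the cli.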