Unable to start session cluster using Docker

Vinay Patil Thu, 04 Oct 2018 11:31:07 -0700

Hi,

I used the docker-compose file shown in the documentation to create the
cluster. The web UI starts successfully; however, the task managers are
unable to join.


Job Manager container logs:

2018-10-04 18:13:13,907 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint    - Rest
endpoint listening at cluster:8081

2018-10-04 18:13:13,907 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint    -
http://cluster:8081 was granted leadership with
leaderSessionID=00000000-0000-0000-0000-000000000000

2018-10-04 18:13:13,907 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint    - Web
frontend listening at http://cluster:8081

2018-10-04 18:13:14,012 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  -
ResourceManager akka.tcp://flink@cluster:6123/user/resourcemanager was
granted leadership with fencing token 00000000000000000000000000000000

2018-10-04 18:13:14,013 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  -
Starting the SlotManager.

2018-10-04 18:13:14,026 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Dispatcher
akka.tcp://flink@cluster:6123/user/dispatcher was granted leadership with
fencing token 00000000-0000-0000-0000-000000000000

I am not sure why it says the web frontend is listening at cluster:8081
when the job manager RPC address is set to jobmanager.

Task Manager Container Logs:

2018-10-04 18:19:18,818 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Connecting
to ResourceManager
akka.tcp://flink@jobmanager:6123/user/resourcemanager(00000000000000000000000000000000).

2018-10-04 18:19:18,818 INFO
org.apache.flink.runtime.filecache.FileCache                  - User file
cache uses directory
/tmp/flink-dist-cache-1bd95c51-3031-42ab-b782-14a0023921e5

2018-10-04 18:19:28,850 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not
resolve ResourceManager address
akka.tcp://flink@jobmanager:6123/user/resourcemanager,
retrying in 10000 ms: Ask timed out on
[ActorSelection[Anchor(akka.tcp://flink@jobmanager:6123/),
Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message
of type "akka.actor.Identify".


I have also tried setting JOB_MANAGER_RPC_ADDRESS=cluster in the
docker-compose file, but that does not work either.
Both "cluster" and "jobmanager" point to localhost in the /etc/hosts file.
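
For reference, here is a minimal sketch of how the name resolution and port
reachability described above could be checked from inside the task manager
container. It assumes Python 3 is available in that container, which may not
be the case for the stock image; the helper name check_endpoint is
hypothetical and not part of Flink. Port 6123 is the one that appears in the
logs above.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Resolve `host` and try a TCP connection to `port`.

    Returns a tuple (ip, reachable). `ip` is None if the name
    could not be resolved at all.
    """
    try:
        # Shows which address the hostname actually maps to
        # (for example 127.0.0.1 if /etc/hosts points it at localhost).
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        # Connection refused, timed out, or network unreachable.
        return ip, False


if __name__ == "__main__":
    # Hostnames and port taken from the logs in this message.
    for host in ("jobmanager", "cluster"):
        ip, ok = check_endpoint(host, 6123)
        print(f"{host}: resolved to {ip}, port 6123 reachable: {ok}")
```

If a name resolves to 127.0.0.1 inside the task manager container, the
connection attempt would go to that container itself rather than to the job
manager, which might be consistent with the timeout shown above.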

Can you please let me know what the issue is here?

Regards,
Vinay Patil
