The pull request https://github.com/apache/flink/pull/1758 should improve the TaskManager's network interface selection.
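
Until that is available, one rough way to check case 1.2 from inside the TaskManager container is to ask the operating system which local address it uses for a connection to the JobManager. The sketch below is only a diagnostic aid and is not Flink's interface selection logic; the function name `local_address_for` is made up for illustration, and the address and port are the ones from the quoted log.

```python
import socket


def local_address_for(host, port, timeout=2.0):
    """Return the local IP the OS binds when connecting to (host, port).

    Raises OSError if the connection cannot be established.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # getsockname() returns the local (address, port) of the connected socket.
        return conn.getsockname()[0]


if __name__ == "__main__":
    # JobManager address and port taken from the quoted TaskManager log.
    print(local_address_for("192.168.99.104", 6123))
```

If the printed address differs from the one that the hostname in the "TaskManager data connection information" log line (140efeb188cc here) resolves to, that would be consistent with the TaskManager having picked a different interface.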

On Fri, Mar 4, 2016 at 10:19 AM, Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
> This registration phase means that the TaskManager tries to tell the
> JobManager that it is available.
> If that fails, there can be two reasons
>
> 1) Network communication not possible to the port
>    1.1) JobManager IP really not reachable (not the case, as you
>         described)
>    1.2) TaskManager selected a wrong network interface to work with
> 2) JobManager not listening
>
> To look into 1.2, can you check the TaskManager log at the beginning,
> where it says what interface/hostname the TaskManager selected to use?
>
> Thanks,
> Stephan
>
> On Fri, Mar 4, 2016 at 2:48 AM, Deepak Jha <dkjhan...@gmail.com> wrote:
>
>> Hi All,
>> I've created 2 docker containers on my local machine, one running
>> JM(192.168.99.104) and other running TM. I was expecting to see TM in the
>> JM UI but it did not happen. On looking into the TM logs I see following
>> lines
>>
>> 01:29:50,862 DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager process reaper
>> 01:29:50,868 INFO org.apache.flink.runtime.filecache.FileCache - User file cache uses directory /tmp/flink-dist-cache-be63f351-2bce-48ef-bbc4-fb0f40fecd49
>> 01:29:51,093 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager actor at akka://flink/user/taskmanager#1222392284.
>> 01:29:51,095 INFO org.apache.flink.runtime.taskmanager.TaskManager - TaskManager data connection information: 140efeb188cc (dataPort=6122)
>> 01:29:51,096 INFO org.apache.flink.runtime.taskmanager.TaskManager - TaskManager has 1 task slot(s).
>> 01:29:51,097 INFO org.apache.flink.runtime.taskmanager.TaskManager - Memory usage stats: [HEAP: 386/494/494 MB, NON HEAP: 30/31/-1 MB (used/committed/max)]
>> 01:29:51,104 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@192.168.99.104:6123/user/jobmanager (attempt 1, timeout: 500 milliseconds)
>> 01:29:51,633 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@192.168.99.104:6123/user/jobmanager (attempt 2, timeout: 1000 milliseconds)
>> 01:29:52,652 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@192.168.99.104:6123/user/jobmanager (attempt 3, timeout: 2000 milliseconds)
>> 01:29:54,672 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@192.168.99.104:6123/user/jobmanager (attempt 4, timeout: 4000 milliseconds)
>> 01:29:58,693 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@192.168.99.104:6123/user/jobmanager (attempt 5, timeout: 8000 milliseconds)
>> 01:30:06,702 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink@192.168.99.104:6123/user/jobmanager (attempt 6, timeout: 16000 milliseconds)
>>
>> However, from TM i am able to reach JM on port 6123
>> root@140efeb188cc:/# nc -v 192.168.99.104 6123
>> Connection to 192.168.99.104 6123 port [tcp/*] succeeded!
>>
>> masters file on TM contains
>> 192.168.99.104:8080
>>
>> Did anyone face this issue with remote JM/TM combination ?
>>
>> --
>> Thanks,
>> Deepak Jha