Boris Lublinsky
FDP Architect
boris.lublin...@lightbend.com
https://www.lightbend.com/

> On Feb 21, 2019, at 2:05 AM, Konstantin Knauf <konstan...@ververica.com> wrote:
>
> Hi Boris,
>
> the exact command depends on the docker-entrypoint.sh script and the image
> you are using. For the example contained in the Flink repository it is
> "task-manager", I think. The important thing is to pass "taskmanager.host" to
> the Taskmanager process. You can verify by checking the Taskmanager logs.
> These should contain lines like the ones below:
>
> 2019-02-21 08:03:00,004 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - Program Arguments:
> 2019-02-21 08:03:00,008 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - -Dtaskmanager.host=10.12.10.173
>
> In the Jobmanager logs you should see that the Taskmanager is registered
> under the IP above in a line similar to:
>
> 2019-02-21 08:03:26,874 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Registering TaskManager with ResourceID a0513ba2c472d2d1efc07626da9c1bda (akka.tcp://flink@10.12.10.173:46531/user/taskmanager_0) at ResourceManager
>
> A service per Taskmanager is not required. The purpose of the config
> parameter is that the Jobmanager addresses the taskmanagers by IP instead of
> hostname.
>
> Hope this helps!
>
> Cheers,
>
> Konstantin
>
> On Wed, Feb 20, 2019 at 4:37 PM Boris Lublinsky <boris.lublin...@lightbend.com> wrote:
> Also, the suggested workaround does not quite work.
> 2019-02-20 15:27:43,928 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-1:6170] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink-metrics@flink-taskmanager-1:6170]] Caused by: [flink-taskmanager-1: No address associated with hostname]
> 2019-02-20 15:27:48,750 ERROR org.apache.flink.runtime.rest.handler.legacy.files.StaticFileServerHandler - Caught exception
>
> I think the problem is that it's trying to connect to flink-taskmanager-1.
>
> Using busybox to experiment with nslookup, I can see
> / # nslookup flink-taskmanager-1.flink-taskmanager
> Server: 10.0.11.151
> Address 1: 10.0.11.151 ip-10-0-11-151.us-west-2.compute.internal
>
> Name: flink-taskmanager-1.flink-taskmanager
> Address 1: 10.131.2.136 flink-taskmanager-1.flink-taskmanager.flink.svc.cluster.local
> / # nslookup flink-taskmanager-1
> Server: 10.0.11.151
> Address 1: 10.0.11.151 ip-10-0-11-151.us-west-2.compute.internal
>
> nslookup: can't resolve 'flink-taskmanager-1'
> / # nslookup flink-taskmanager-0.flink-taskmanager
> Server: 10.0.11.151
> Address 1: 10.0.11.151 ip-10-0-11-151.us-west-2.compute.internal
>
> Name: flink-taskmanager-0.flink-taskmanager
> Address 1: 10.131.0.111 flink-taskmanager-0.flink-taskmanager.flink.svc.cluster.local
> / # nslookup flink-taskmanager-0
> Server: 10.0.11.151
> Address 1: 10.0.11.151 ip-10-0-11-151.us-west-2.compute.internal
>
> nslookup: can't resolve 'flink-taskmanager-0'
> / #
>
> So the name should be postfixed with the service name. How do I force it? I
> suspect I am missing a config parameter.
>
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com
> https://www.lightbend.com/
>
>> On Feb 19, 2019, at 4:33 AM, Konstantin Knauf <konstan...@ververica.com> wrote:
>>
>> Hi Boris,
>>
>> the solution is actually simpler than it sounds from the ticket.
>> The only thing you need to do is to set "taskmanager.host" to the Pod's IP
>> address in the Flink configuration. The easiest way to do this is to pass
>> this config dynamically via a command-line parameter.
>>
>> The Deployment spec could look something like this:
>> containers:
>> - name: taskmanager
>>   [...]
>>   args:
>>   - "taskmanager.sh"
>>   - "start-foreground"
>>   - "-Dtaskmanager.host=$(K8S_POD_IP)"
>>   [...]
>>   env:
>>   - name: K8S_POD_IP
>>     valueFrom:
>>       fieldRef:
>>         fieldPath: status.podIP
>>
>> Hope this helps and let me know if this works.
>>
>> Best,
>>
>> Konstantin
>>
>> On Sun, Feb 17, 2019 at 9:51 PM Boris Lublinsky <boris.lublin...@lightbend.com> wrote:
>> I was looking at this issue
>> https://issues.apache.org/jira/browse/FLINK-11127
>> Apparently there is a workaround for it.
>> Is it possible to provide the complete Helm chart for it?
>> Bits and pieces are in the ticket, but it would be nice to see the full chart.
>>
>> Boris Lublinsky
>> FDP Architect
>> boris.lublin...@lightbend.com
>> https://www.lightbend.com/
>>
>> --
>> Konstantin Knauf | Solutions Architect
>> +49 160 91394525
>>
>> <https://www.ververica.com/>
>> Follow us @VervericaData
>> --
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference
>> Stream Processing | Event Driven | Real Time
>> --
>> Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>> --
>> Data Artisans GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
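
The log check suggested in the Feb 21 reply (look for "-Dtaskmanager.host=..." in the Taskmanager log, then confirm that the Jobmanager registered the Taskmanager under the same IP) could be scripted roughly as follows. This is only a sketch, not part of Flink: the function names and regular expressions are assumptions for illustration, based solely on the log lines quoted in this thread, and other Flink versions may format these lines differently.

```python
import re
from typing import List, Optional

# Assumed pattern: the dynamic property as it appears in the Taskmanager log.
HOST_ARG = re.compile(r"-Dtaskmanager\.host=(\S+)")
# Assumed pattern: host part of the akka address in the Jobmanager
# "Registering TaskManager" line.
REGISTERED = re.compile(
    r"Registering TaskManager .*?akka\.tcp://flink@([^:/\s]+):\d+"
)

# Sample lines copied from the thread.
TM_LOG = (
    "2019-02-21 08:03:00,008 INFO "
    "org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - "
    "-Dtaskmanager.host=10.12.10.173\n"
)
JM_LOG = (
    "2019-02-21 08:03:26,874 INFO "
    "org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - "
    "Registering TaskManager with ResourceID a0513ba2c472d2d1efc07626da9c1bda "
    "(akka.tcp://flink@10.12.10.173:46531/user/taskmanager_0) at ResourceManager\n"
)


def configured_host(taskmanager_log: str) -> Optional[str]:
    """Return the value passed as taskmanager.host, or None if it is absent."""
    match = HOST_ARG.search(taskmanager_log)
    return match.group(1) if match else None


def registered_hosts(jobmanager_log: str) -> List[str]:
    """Return the hosts under which Taskmanagers registered, in log order."""
    return REGISTERED.findall(jobmanager_log)


def registered_under_configured_host(taskmanager_log: str,
                                     jobmanager_log: str) -> bool:
    """True if the Jobmanager saw a registration from the configured host."""
    host = configured_host(taskmanager_log)
    return host is not None and host in registered_hosts(jobmanager_log)


if __name__ == "__main__":
    print(configured_host(TM_LOG))
    print(registered_hosts(JM_LOG))
    print(registered_under_configured_host(TM_LOG, JM_LOG))
```

If the Taskmanager still registers under its hostname (for example flink-taskmanager-1), the last check returns False, which would point to the "-Dtaskmanager.host" argument not reaching the Taskmanager process.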