Re: Flink on Kubernetes - Hostname resolution between job/tasks-managers

2019-01-15 Thread bastien dine
Never mind.
This problem was already discussed in the thread
"Flink 1.7 jobmanager tries to lookup taskmanager by its hostname in k8s
environment"


--

Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io


On Tue, Jan 15, 2019 at 15:16, bastien dine wrote:

> Hello,
> I am trying to install Flink on Kubernetes, and it is almost working.
> I am using the Kubernetes files from the Flink 1.7.1 documentation.
>
> My cluster starts fine, and my 2 taskmanagers register
> successfully with the jobmanager.
> In the web UI, I see them:
> akka.tcp://flink@dev-flink-taskmanager-3717639837-gvwh4:37057/user/taskmanager_0
>
> I can submit a job too.
> But when I open the job details or try to load the logs, I get
> nothing, and the jobmanager log gives me plenty of errors like:
>
> 2019-01-15 14:12:40.111 [flink-metrics-96] WARN
> akka.remote.ReliableDeliverySupervisor
> flink-metrics-akka.remote.default-remote-dispatcher-113 - Association with
> remote system
> [akka.tcp://flink-metrics@dev-flink-taskmanager-3717639837-gvwh4:40508]
> has failed, address is now gated for [50] ms. Reason: [Association failed
> with [akka.tcp://flink-metrics@dev-flink-taskmanager-3717639837-gvwh4:40508]]
> Caused by: [dev-flink-taskmanager-3717639837-gvwh4: Name does not resolve]
>
> -> Name does not resolve.
> So I tried to ping the pod hostname, and it does not work.
> However, pinging the pod's IP works.
>
> So, my questions are:
> - Can we force the use of IPv4 addresses instead of hostname resolution?
>   (It would also be better for performance.)
> - If not, do I need to add a service or something to make it work?
>
> Best Regards,
> Bastien
>
> --
>
> Bastien DINE
> Data Architect / Software Engineer / Sysadmin
> bastiendine.io
>


Flink on Kubernetes - Hostname resolution between job/tasks-managers

2019-01-15 Thread bastien dine
Hello,
I am trying to install Flink on Kubernetes, and it is almost working.
I am using the Kubernetes files from the Flink 1.7.1 documentation.

My cluster starts fine, and my 2 taskmanagers register
successfully with the jobmanager.
In the web UI, I see them:
akka.tcp://flink@dev-flink-taskmanager-3717639837-gvwh4:37057/user/taskmanager_0

I can submit a job too.
But when I open the job details or try to load the logs, I get
nothing, and the jobmanager log gives me plenty of errors like:

2019-01-15 14:12:40.111 [flink-metrics-96] WARN
akka.remote.ReliableDeliverySupervisor
flink-metrics-akka.remote.default-remote-dispatcher-113 - Association with
remote system
[akka.tcp://flink-metrics@dev-flink-taskmanager-3717639837-gvwh4:40508] has
failed, address is now gated for [50] ms. Reason: [Association failed with
[akka.tcp://flink-metrics@dev-flink-taskmanager-3717639837-gvwh4:40508]]
Caused by: [dev-flink-taskmanager-3717639837-gvwh4: Name does not resolve]

-> Name does not resolve.
So I tried to ping the pod hostname, and it does not work.
However, pinging the pod's IP works.
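
For anyone who wants to reproduce the check without ping, here is a minimal
sketch in Python, assuming it is run from inside the jobmanager pod. The
helper name resolve is hypothetical, and the hostname is the one from the log
above. It only asks the resolver for an IPv4 address and reports whether the
lookup succeeds, so it is a diagnostic and not a fix.

```python
import socket
from typing import Optional


def resolve(host: str) -> Optional[str]:
    """Return the IPv4 address for host, or None if the name does not resolve."""
    try:
        return socket.gethostbyname(host)
    except socket.gaierror:
        # Raised when the resolver cannot map the name to an address.
        return None


if __name__ == "__main__":
    # Pod hostname copied from the jobmanager log above.
    pod_hostname = "dev-flink-taskmanager-3717639837-gvwh4"
    address = resolve(pod_hostname)
    if address is None:
        print(f"{pod_hostname}: name does not resolve")
    else:
        print(f"{pod_hostname} -> {address}")
```

If the pod IP answers but this lookup returns None, the problem is most
likely name resolution between the pods and not network connectivity.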

So, my questions are:
- Can we force the use of IPv4 addresses instead of hostname resolution?
  (It would also be better for performance.)
- If not, do I need to add a service or something to make it work?

Best Regards,
Bastien

--

Bastien DINE
Data Architect / Software Engineer / Sysadmin
bastiendine.io