Hmm, this is indeed strange. Could you share the TaskManager logs with us? Ideally, set the log level to DEBUG first. Thanks a lot.
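
In case it helps, here is a rough sketch of how the logs could be collected. It assumes the resource names from your `kubectl get all` output (`flink-taskmanager`, `flink-jobmanager`) and the default namespace; the function names are only illustrative. For DEBUG output, I would expect that changing `rootLogger.level` from INFO to DEBUG in the `log4j-console.properties` entry of flink-configuration-configmap.yaml and recreating the ConfigMap and pods is enough, but please treat that as an assumption about your setup.

```shell
#!/usr/bin/env bash
# Sketch only: helper functions for gathering TaskManager diagnostics.
# Assumption: names match the `kubectl get all` output quoted in this thread.

TM_DEPLOYMENT="deployment/flink-taskmanager"
JM_SERVICE="flink-jobmanager"

# Print the logs of a pod belonging to the TaskManager deployment.
tm_logs() {
  kubectl logs "$TM_DEPLOYMENT"
}

# Print the logs of the previous container instance, which is useful
# when the TaskManager has been restarted.
tm_previous_logs() {
  kubectl logs "$TM_DEPLOYMENT" --previous
}

# Try to resolve the JobManager service name from inside the TaskManager pod.
# Assumption: `getent` is available in the image. It checks name resolution
# only, whereas ping may fail for a ClusterIP service even when DNS works.
check_jm_dns() {
  kubectl exec "$TM_DEPLOYMENT" -- getent hosts "$JM_SERVICE"
}
```

For example, after sourcing the file, `tm_logs > taskmanager.log` would write the logs to a file that can be attached to a reply.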

Cheers,
Till

On Wed, Sep 2, 2020 at 12:45 PM art <superainbo...@163.com> wrote:

> Hi Till,
>
> This is the full output when I run 'kubectl get all':
>
> NAME                                     READY   STATUS    RESTARTS   AGE
> pod/flink-jobmanager-85bdbd98d8-ppjmf    1/1     Running   0          2m34s
> pod/flink-taskmanager-74c68c6f48-6jb5v   1/1     Running   0          2m34s
>
> NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
> service/flink-jobmanager   ClusterIP   10.103.207.75   <none>        6123/TCP,6124/TCP,8081/TCP   2m34s
> service/kubernetes         ClusterIP   10.96.0.1       <none>        443/TCP                      5d2h
>
> NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
> deployment.apps/flink-jobmanager    1/1     1            1           2m34s
> deployment.apps/flink-taskmanager   1/1     1            1           2m34s
>
> NAME                                           DESIRED   CURRENT   READY   AGE
> replicaset.apps/flink-jobmanager-85bdbd98d8    1         1         1       2m34s
> replicaset.apps/flink-taskmanager-74c68c6f48   1         1         1       2m34s
>
> I can open the Flink UI, but it shows 0 TaskManagers, so the JobManager
> itself works well.
> I think the problem is that the TaskManager cannot register itself with
> the JobManager. Did I miss some configuration?
>
> On Sep 2, 2020, at 5:24 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>
> Hi art,
>
> could you check what `kubectl get services` returns? Usually if you run
> `kubectl get all` you should also see the services. But in your case there
> are no services listed. You should see something like
> service/flink-jobmanager; otherwise the flink-jobmanager service (K8s
> service) is not running.
>
> Cheers,
> Till
>
> On Wed, Sep 2, 2020 at 11:15 AM art <superainbo...@163.com> wrote:
>
>> Hi Till,
>>
>> I'm sure the jobmanager-service has been started; I can find it in the
>> Kubernetes Dashboard.
>>
>> When I run 'kubectl get deployment', I get this:
>> flink-jobmanager    1/1   1   1   33s
>> flink-taskmanager   1/1   1   1   33s
>>
>> When I run 'kubectl get all', I get this:
>> NAME                                     READY   STATUS    RESTARTS   AGE
>> pod/flink-jobmanager-85bdbd98d8-ppjmf    1/1     Running   0          2m34s
>> pod/flink-taskmanager-74c68c6f48-6jb5v   1/1     Running   0          2m34s
>>
>> So I think flink-jobmanager works well, but the TaskManager is restarted
>> every few minutes.
>>
>> My Minikube version: v1.12.3
>> Flink version: v1.11.1
>>
>> On Sep 2, 2020, at 4:27 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>> Hi art,
>>
>> could you verify that the jobmanager-service has been started? It looks
>> as if the name flink-jobmanager is not resolvable. It could also help to
>> know the Minikube and K8s version you are using.
>>
>> Cheers,
>> Till
>>
>> On Wed, Sep 2, 2020 at 9:50 AM art <superainbo...@163.com> wrote:
>>
>>> Hi, I'm going to deploy Flink on Minikube, following
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/ops/deployment/kubernetes.html
>>>
>>> kubectl create -f flink-configuration-configmap.yaml
>>> kubectl create -f jobmanager-service.yaml
>>> kubectl create -f jobmanager-session-deployment.yaml
>>> kubectl create -f taskmanager-session-deployment.yaml
>>>
>>> But I got this:
>>>
>>> 2020-09-02 06:45:42,664 WARN akka.remote.ReliableDeliverySupervisor
>>> [] - Association with remote system [
>>> akka.tcp://flink@flink-jobmanager:6123] has failed, address is now
>>> gated for [50] ms.
>>> Reason: [Association failed with [
>>> akka.tcp://flink@flink-jobmanager:6123]] Caused by:
>>> [java.net.UnknownHostException: flink-jobmanager: Temporary failure in name
>>> resolution]
>>> 2020-09-02 06:45:42,691 INFO
>>> org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could
>>> not resolve ResourceManager address
>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*,
>>> retrying in 10000 ms: Could not connect to rpc endpoint under address
>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
>>> 2020-09-02 06:46:02,731 INFO
>>> org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could
>>> not resolve ResourceManager address
>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*,
>>> retrying in 10000 ms: Could not connect to rpc endpoint under address
>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
>>> 2020-09-02 06:46:12,731 INFO akka.remote.transport.ProtocolStateActor
>>> [] - No response from remote for outbound association.
>>> Associate timed out after [20000 ms].
>>>
>>> And when I run 'kubectl exec -ti flink-taskmanager-74c68c6f48-9tkvd --
>>> /bin/bash' and then 'ping flink-jobmanager', I find that I cannot ping
>>> flink-jobmanager from the TaskManager.
>>>
>>> I am new to K8s; can anyone point me to a tutorial? Thanks a lot!