Hi Till,
I found something that may be helpful.
The Kubernetes Dashboard shows the JobManager pod IP as 172.18.0.5 and the TaskManager pod IP as 172.18.0.6.
When I run 'kubectl exec -ti flink-taskmanager-74c68c6f48-jqpbn -- /bin/bash' and then 'ping 172.18.0.5', I get a response.
But when I ping flink-jobmanager, there is no response.
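
A ping test mixes two separate questions: whether the name flink-jobmanager resolves at all, and whether the RPC port behind it accepts connections. A Service ClusterIP is a virtual IP and, depending on the kube-proxy mode, may not answer ICMP even when the Service is healthy. The following is only a minimal sketch of a check that separates the two cases. It assumes a Python 3 interpreter is available inside the pod (or in a debug pod in the same namespace), which may not be true for the Flink image; the function name check_endpoint and its return strings are made up for illustration.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify the reachability of a TCP endpoint.

    Returns "dns_failure" if the name does not resolve,
    "connect_failure" if it resolves but no TCP connection can be opened,
    and "ok" if a TCP connection succeeds.
    """
    try:
        # Step 1: name resolution only.
        addrinfos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: try a TCP connect to each resolved address in turn.
    for family, socktype, proto, _canonname, sockaddr in addrinfos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "connect_failure"
```

Calling check_endpoint("flink-jobmanager", 6123) from the TaskManager pod would then point either at cluster DNS ("dns_failure", which would match the UnknownHostException in the log) or at the Service and its ports ("connect_failure").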




On 09/3/2020 09:03,superainbower<superainbo...@163.com> wrote:
Hi Till,
This is the TaskManager log.
As you can see, the log prints 'line 92 -- Could not connect to flink-jobmanager:6123',
then prints 'line 128 -- Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.', and keeps repeating this.


A few minutes later, the TaskManager shuts down and restarts.


These are my YAML files. Could you help me confirm whether I omitted something?
Thanks a lot!
---------------------------------------------------
flink-configuration-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-config
  labels:
    app: flink
data:
  flink-conf.yaml: |+
    jobmanager.rpc.address: flink-jobmanager
    taskmanager.numberOfTaskSlots: 1
    blob.server.port: 6124
    jobmanager.rpc.port: 6123
    taskmanager.rpc.port: 6122
    queryable-state.proxy.ports: 6125
    jobmanager.memory.process.size: 1024m
    taskmanager.memory.process.size: 1024m
    parallelism.default: 1
  log4j-console.properties: |+
    rootLogger.level = INFO
    rootLogger.appenderRef.console.ref = ConsoleAppender
    rootLogger.appenderRef.rolling.ref = RollingFileAppender
    logger.akka.name = akka
    logger.akka.level = INFO
    logger.kafka.name= org.apache.kafka
    logger.kafka.level = INFO
    logger.hadoop.name = org.apache.hadoop
    logger.hadoop.level = INFO
    logger.zookeeper.name = org.apache.zookeeper
    logger.zookeeper.level = INFO
    appender.console.name = ConsoleAppender
    appender.console.type = CONSOLE
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    appender.rolling.name = RollingFileAppender
    appender.rolling.type = RollingFile
    appender.rolling.append = false
    appender.rolling.fileName = ${sys:log.file}
    appender.rolling.filePattern = ${sys:log.file}.%i
    appender.rolling.layout.type = PatternLayout
    appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    appender.rolling.policies.type = Policies
    appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
    appender.rolling.policies.size.size=100MB
    appender.rolling.strategy.type = DefaultRolloverStrategy
    appender.rolling.strategy.max = 10
    logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
    logger.netty.level = OFF
---------------------------------------------------
jobmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
spec:
  type: ClusterIP
  ports:
  - name: rpc
    port: 6123
  - name: blob-server
    port: 6124
  - name: webui
    port: 8081
  selector:
    app: flink
    component: jobmanager
--------------------------------------------------
jobmanager-session-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
      - name: jobmanager
        image: registry.cn-hangzhou.aliyuncs.com/superainbower/flink:1.11.1
        args: ["jobmanager"]
        ports:
        - containerPort: 6123
          name: rpc
        - containerPort: 6124
          name: blob-server
        - containerPort: 8081
          name: webui
        livenessProbe:
          tcpSocket:
            port: 6123
          initialDelaySeconds: 30
          periodSeconds: 60
        volumeMounts:
        - name: flink-config-volume
          mountPath: /opt/flink/conf
        securityContext:
          runAsUser: 9999  # refers to user _flink_ from official flink image, change if necessary
      volumes:
      - name: flink-config-volume
        configMap:
          name: flink-config
          items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j-console.properties
            path: log4j-console.properties
      imagePullSecrets:
        - name: regcred
---------------------------------------------------
taskmanager-session-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      containers:
      - name: taskmanager
        image: registry.cn-hangzhou.aliyuncs.com/superainbower/flink:1.11.1
        args: ["taskmanager"]
        ports:
        - containerPort: 6122
          name: rpc
        - containerPort: 6125
          name: query-state
        livenessProbe:
          tcpSocket:
            port: 6122
          initialDelaySeconds: 30
          periodSeconds: 60
        volumeMounts:
        - name: flink-config-volume
          mountPath: /opt/flink/conf/
        securityContext:
          runAsUser: 9999  # refers to user _flink_ from official flink image, change if necessary
      volumes:
      - name: flink-config-volume
        configMap:
          name: flink-config
          items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j-console.properties
            path: log4j-console.properties
      imagePullSecrets:
        - name: regcred
       




On 09/2/2020 20:38,Till Rohrmann<trohrm...@apache.org> wrote:
Hmm, this is indeed strange. Could you share the logs of the TaskManager with 
us? Ideally you set the log level to debug. Thanks a lot.


Cheers,
Till


On Wed, Sep 2, 2020 at 12:45 PM art <superainbo...@163.com> wrote:

Hi Till,
  
Here is the full output of 'kubectl get all':


NAME                                     READY   STATUS    RESTARTS   AGE
pod/flink-jobmanager-85bdbd98d8-ppjmf    1/1     Running   0          2m34s
pod/flink-taskmanager-74c68c6f48-6jb5v   1/1     Running   0          2m34s


NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/flink-jobmanager   ClusterIP   10.103.207.75   <none>        6123/TCP,6124/TCP,8081/TCP   2m34s
service/kubernetes         ClusterIP   10.96.0.1       <none>        443/TCP                      5d2h


NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/flink-jobmanager    1/1     1            1           2m34s
deployment.apps/flink-taskmanager   1/1     1            1           2m34s


NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/flink-jobmanager-85bdbd98d8    1         1         1       2m34s
replicaset.apps/flink-taskmanager-74c68c6f48   1         1         1       2m34s


I can open the Flink UI, so the JobManager seems to be working, but it shows 0 TaskManagers.
I think the problem is that the TaskManager cannot register itself with the JobManager. Did I miss some configuration?
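
The "0 TaskManagers" observation from the UI can also be checked from a script through the JobManager REST endpoint /taskmanagers, which is served on the same port as the web UI (8081 here). This is only a rough sketch: the base URL is an assumption (for example http://localhost:8081 after running 'kubectl port-forward service/flink-jobmanager 8081:8081'), and the helper names are hypothetical.

```python
import json
import urllib.request


def count_taskmanagers(payload: dict) -> int:
    """Return how many TaskManagers a /taskmanagers response lists."""
    return len(payload.get("taskmanagers", []))


def fetch_taskmanagers(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/taskmanagers and decode the JSON body."""
    url = f"{base_url}/taskmanagers"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the port-forward in place, count_taskmanagers(fetch_taskmanagers("http://localhost:8081")) should stay at 0 for as long as the TaskManager fails to register.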




On Sep 2, 2020, at 5:24 PM, Till Rohrmann <trohrm...@apache.org> wrote:


Hi art,


could you check what `kubectl get services` returns? Usually, if you run 
`kubectl get all`, you should also see the services, but in your case no 
services are listed. You should see something like service/flink-jobmanager; 
otherwise the flink-jobmanager service (the K8s service) is not running.


Cheers,
Till


On Wed, Sep 2, 2020 at 11:15 AM art <superainbo...@163.com> wrote:

Hi Till,


I'm sure the jobmanager-service has been started; I can find it in the Kubernetes Dashboard.


When I run 'kubectl get deployment', I get this:
flink-jobmanager    1/1     1            1           33s
flink-taskmanager   1/1     1            1           33s


When I run 'kubectl get all', I get this:
NAME                                     READY   STATUS    RESTARTS   AGE
pod/flink-jobmanager-85bdbd98d8-ppjmf    1/1     Running   0          2m34s
pod/flink-taskmanager-74c68c6f48-6jb5v   1/1     Running   0          2m34s


So I think flink-jobmanager is working fine, but the TaskManager is restarted every few minutes.


Minikube version: v1.12.3
Flink version: v1.11.1



On Sep 2, 2020, at 4:27 PM, Till Rohrmann <trohrm...@apache.org> wrote:


Hi art,


could you verify that the jobmanager-service has been started? It looks as if 
the name flink-jobmanager is not resolvable. It could also help to know the 
Minikube and K8s version you are using.


Cheers,
Till


On Wed, Sep 2, 2020 at 9:50 AM art <superainbo...@163.com> wrote:

Hi, I'm trying to deploy Flink on Minikube, following 
https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/ops/deployment/kubernetes.html
and ran these commands:
kubectl create -f flink-configuration-configmap.yaml
kubectl create -f jobmanager-service.yaml
kubectl create -f jobmanager-session-deployment.yaml
kubectl create -f taskmanager-session-deployment.yaml


But the TaskManager log shows this:


2020-09-02 06:45:42,664 WARN  akka.remote.ReliableDeliverySupervisor                       [] - Association with remote system [akka.tcp://flink@flink-jobmanager:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@flink-jobmanager:6123]] Caused by: [java.net.UnknownHostException: flink-jobmanager: Temporary failure in name resolution]
2020-09-02 06:45:42,691 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
2020-09-02 06:46:02,731 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
2020-09-02 06:46:12,731 INFO  akka.remote.transport.ProtocolStateActor                     [] - No response from remote for outbound association. Associate timed out after [20000 ms].


And when I run 'kubectl exec -ti flink-taskmanager-74c68c6f48-9tkvd -- /bin/bash' 
and then 'ping flink-jobmanager', I find that I cannot ping flink-jobmanager 
from the TaskManager.


I am new to K8s. Can anyone point me to a tutorial? Thanks a lot!


