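This is my first attempt at deploying Flink on k8s. I am using Flink 1.12.0, JDK 1.8.0_275, and Scala 2.12.12, with a single-node minikube environment installed on my Mac. These are the steps I followed: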


git clone https://github.com/apache/flink-docker
cd flink-docker/1.12/scala_2.12-java8-debian
docker build --tag flink:1.12.0-scala_2.12-java8 .


cd flink-1.12.0
./bin/kubernetes-session.sh \
-Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
-Dkubernetes.rest-service.exposed.type=NodePort \
-Dtaskmanager.numberOfTaskSlots=2 \
-Dkubernetes.cluster-id=flink-session-cluster


The output shows that the JobManager started, but I cannot reach it through the web UI:

2020-12-27 22:08:12,387 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.99.100:8081




Running `kubectl get pods` shows that the pod stays in the ContainerCreating state:

NAME                                               READY   STATUS              RESTARTS   AGE
flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h




So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2` to see the details. The output is:




Name:         flink-session-cluster-858bd55dff-bzjk2
Namespace:    default
Priority:     0
Node:         minikube/192.168.99.100
Start Time:   Sun, 27 Dec 2020 22:21:56 +0800
Labels:       app=flink-session-cluster
              component=jobmanager
              pod-template-hash=858bd55dff
              type=flink-native-kubernetes
Annotations:  <none>
Status:       Pending
IP:           172.17.0.4
IPs:
  IP:           172.17.0.4
Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
Containers:
  flink-job-manager:
    Container ID:
    Image:         flink:1.12.0-scala_2.12-java8
    Image ID:
    Ports:         8081/TCP, 6123/TCP, 6124/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /docker-entrypoint.sh
    Args:
      native-k8s
      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  1600Mi
    Requests:
      cpu:     1
      memory:  1600Mi
    Environment:
      _POD_IP_ADDRESS:   (v1:status.podIP)
      HADOOP_CONF_DIR:  /opt/hadoop/conf
    Mounts:
      /opt/flink/conf from flink-config-volume (rw)
      /opt/hadoop/conf from hadoop-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s47ht (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  hadoop-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hadoop-config-flink-session-cluster
    Optional:  false
  flink-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      flink-config-flink-session-cluster
    Optional:  false
  default-token-s47ht:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-s47ht
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
  Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
  Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
  Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
  Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
  Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff
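Since the describe output is long, the Warning rows in the Events table are the part that matters here. A minimal sketch for filtering them out follows; `warning_events` is a hypothetical helper name of my own, not part of kubectl, and it assumes each event sits on a single line as kubectl prints it.

```shell
#!/bin/sh
# Hypothetical helper (not a kubectl feature): read `kubectl describe pod`
# output on stdin and print only the Warning rows of the Events table.
warning_events() {
  awk '/^Events:/ { in_events = 1; next }        # start of the Events table
       in_events && $1 == "Warning" { print }'   # keep rows whose Type is Warning
}

# Usage against a live cluster (requires kubectl; pod name as in this post):
#   kubectl describe pod flink-session-cluster-858bd55dff-bzjk2 | warning_events
```

Only rows after the `Events:` header are considered, so a "Warning" string appearing elsewhere in the output is ignored.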




At first I suspected that the local image had not been built, so I checked with `docker images`:

REPOSITORY                                             TAG                       IMAGE ID       CREATED        SIZE
flink                                                  1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB




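So the image does exist, which is puzzling. Why does pulling the local image fail? Is something misconfigured? And with minikube, how can I reach the dashboard of the Flink cluster running on k8s from a browser on my local machine?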
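One check that might narrow this down (a sketch under assumptions, not a confirmed fix): minikube usually runs its own Docker daemon, separate from the one on the Mac, so an image listed by `docker images` on the host may not be visible to the kubelet. The script below assumes minikube uses the Docker runtime; `image_in_minikube` is a hypothetical helper name, while `minikube docker-env` and `docker images -q` are existing commands.

```shell
#!/bin/sh
# Sketch only: check whether the image tag is visible to the Docker daemon
# inside minikube. Assumes minikube uses the Docker container runtime.
IMAGE="flink:1.12.0-scala_2.12-java8"

# Hypothetical helper. The body runs in a subshell, so the DOCKER_* variables
# set by `minikube docker-env` do not leak into the calling shell.
image_in_minikube() (
  env_cmds=$(minikube docker-env) || exit 2
  eval "$env_cmds"
  # `docker images -q` prints the image ID, or nothing if the tag is absent.
  [ -n "$(docker images -q "$1")" ]
)

if command -v minikube >/dev/null 2>&1 && command -v docker >/dev/null 2>&1; then
  if image_in_minikube "$IMAGE"; then
    echo "$IMAGE is visible inside minikube"
  else
    echo "$IMAGE is not visible inside minikube; rebuilding after 'eval \$(minikube docker-env)' may help"
  fi
else
  echo "minikube or docker not found on PATH; skipping check"
fi
```

For the dashboard question, `kubectl get svc` lists the services in the namespace along with the node ports they expose, which may help confirm the address to open once the pod is running.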

This is my first time using k8s, so any pointers would be appreciated. Thanks!







