Environment: a single MacBook Pro with minikube v1.15.1 and Kubernetes v1.19.4 installed.

I ran the following commands from the Flink v1.11.3 distribution:

kubectl create namespace flink-session-cluster
kubectl create serviceaccount flink -n flink-session-cluster

kubectl create clusterrolebinding flink-role-binding-flink \
  --clusterrole=edit \
  --serviceaccount=flink-session-cluster:flink

./bin/kubernetes-session.sh \
  -Dkubernetes.namespace=flink-session-cluster \
  -Dkubernetes.jobmanager.service-account=flink \
  -Dkubernetes.cluster-id=session001 \
  -Dtaskmanager.memory.process.size=8192m \
  -Dkubernetes.taskmanager.cpu=1 \
  -Dtaskmanager.numberOfTaskSlots=4 \
  -Dresourcemanager.taskmanager-timeout=3600000

The output shows the Flink web UI started at http://192.168.64.2:8081 rather than at something like http://192.168.50.135:31753 with a five-digit port. What is wrong here? Is the host IP here supposed to be the minikube IP? I cannot reach http://192.168.64.2:8081 from my local browser.

2021-01-02 10:28:04,177 INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2021-01-02 10:28:04,907 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.64.2:8081

I checked the pods, service and deployment. All of them started normally and show green.

Next I submitted a job:

./bin/flink run -d \
  -e kubernetes-session \
  -Dkubernetes.namespace=flink-session-cluster \
  -Dkubernetes.cluster-id=session001 \
  examples/streaming/WindowJoin.jar

Using windowSize=2000, data rate=3
To customize example, use: WindowJoin [--windowSize <window-size-in-millis>] [--rate <elements-per-second>]
2021-01-02 10:21:48,658 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Retrieve flink cluster session001 successfully, JobManager Web Interface: http://10.106.136.236:8081

I can reach the http://10.106.136.236:8081 address shown here from my browser. It shows the job as running, but "available slots" shows 0, and the JM log contains the following error:

Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441) ~[flink-dist_2.12-1.11.3.jar:1.11.3]
    ... 47 more
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) ~[?:1.8.0_275]
    ... 27 more
Caused by: java.util.concurrent.TimeoutException
    ... 25 more

Why does it report this insufficient-resources error? Thanks for your help!

On 2020-12-29 09:53:48, "Yang Wang" <danrtsey...@gmail.com> wrote:
>The ConfigMaps do not need to be created in advance. That Warning can be ignored; it is expected, because the deployment is created first and the ConfigMap afterwards.
>You can follow the community documentation [1] to send the JM log to the console and take a look.
>
>I suspect this is caused by not having created a service account [2].
>
>[1].
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#log-files
>[2].
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#rbac
>
>Best,
>Yang
>
>陈帅 <casel_c...@126.com> wrote on Mon, Dec 28, 2020 at 5:54 PM:
>
>> Today I switched to the newly released official Flink image, version 1.11.3, and it still will not start.
>> This is my command:
>> ./bin/kubernetes-session.sh \
>>   -Dkubernetes.cluster-id=rtdp \
>>   -Dtaskmanager.memory.process.size=4096m \
>>   -Dkubernetes.taskmanager.cpu=2 \
>>   -Dtaskmanager.numberOfTaskSlots=4 \
>>   -Dresourcemanager.taskmanager-timeout=3600000 \
>>   -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8 \
>>   -Dkubernetes.namespace=rtdp
>>
>> Events:
>>   Type     Reason          Age                From               Message
>>   ----     ------          ----               ----               -------
>>   Normal   Scheduled       88s                default-scheduler  Successfully assigned rtdp/rtdp-6d7794d65d-g6mb5 to cn-shanghai.192.168.16.130
>>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-rtdp" not found
>>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-rtdp" not found
>>   Normal   AllocIPSucceed  87s                terway-daemon      Alloc IP 192.168.32.25/22 for Pod
>>   Normal   Pulling         87s                kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
>>   Normal   Pulled          31s                kubelet            Successfully pulled image "flink:1.11.3-scala_2.12-java8"
>>   Normal   Created         18s (x2 over 26s)  kubelet            Created container flink-job-manager
>>   Normal   Started         18s (x2 over 26s)  kubelet            Started container flink-job-manager
>>   Normal   Pulled          18s                kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
>>   Warning  BackOff         10s                kubelet            Back-off restarting failed container
>>
>> Two ConfigMaps could not be found here. Do they need to be created in advance? The official documentation does not mention it. Or did I miss something?
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#start-flink-session
>>
>> On 2020-12-27 22:50:32, "陈帅" <casel_c...@126.com> wrote:
>>
>> >This is my first attempt at deploying Flink on k8s. I am using Flink 1.12.0, JDK 1.8.0_275 and Scala 2.12.12, with a single-node minikube environment on my Mac. These are the steps I followed:
>> >
>> >git clone https://github.com/apache/flink-docker
>> >cd flink-docker/1.12/scala_2.12-java8-debian
>> >docker build --tag flink:1.12.0-scala_2.12-java8 .
>> >
>> >cd flink-1.12.0
>> >./bin/kubernetes-session.sh \
>> >  -Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
>> >  -Dkubernetes.rest-service.exposed.type=NodePort \
>> >  -Dtaskmanager.numberOfTaskSlots=2 \
>> >  -Dkubernetes.cluster-id=flink-session-cluster
>> >
>> >The output says the JM has started, but I cannot reach it through the web UI:
>> >
>> >2020-12-27 22:08:12,387 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.99.100:8081
>> >
>> >`kubectl get pods` shows the pod stuck in the ContainerCreating state:
>> >
>> >NAME                                               READY   STATUS              RESTARTS   AGE
>> >flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
>> >kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h
>> >
>> >So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2` to see the details. The result:
>> >
>> >Name:           flink-session-cluster-858bd55dff-bzjk2
>> >Namespace:      default
>> >Priority:       0
>> >Node:           minikube/192.168.99.100
>> >Start Time:     Sun, 27 Dec 2020 22:21:56 +0800
>> >Labels:         app=flink-session-cluster
>> >                component=jobmanager
>> >                pod-template-hash=858bd55dff
>> >                type=flink-native-kubernetes
>> >Annotations:    <none>
>> >Status:         Pending
>> >IP:             172.17.0.4
>> >IPs:
>> >  IP:           172.17.0.4
>> >Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
>> >Containers:
>> >  flink-job-manager:
>> >    Container ID:
>> >    Image:          flink:1.12.0-scala_2.12-java8
>> >    Image ID:
>> >    Ports:          8081/TCP, 6123/TCP, 6124/TCP
>> >    Host Ports:     0/TCP, 0/TCP, 0/TCP
>> >    Command:
>> >      /docker-entrypoint.sh
>> >    Args:
>> >      native-k8s
>> >      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
>> >    State:          Waiting
>> >      Reason:       ImagePullBackOff
>> >    Ready:          False
>> >    Restart Count:  0
>> >    Limits:
>> >      cpu:     1
>> >      memory:  1600Mi
>> >    Requests:
>> >      cpu:     1
>> >      memory:  1600Mi
>> >    Environment:
>> >      _POD_IP_ADDRESS:   (v1:status.podIP)
>> >      HADOOP_CONF_DIR:  /opt/hadoop/conf
>> >    Mounts:
>> >      /opt/flink/conf from flink-config-volume (rw)
>> >      /opt/hadoop/conf from hadoop-config-volume (rw)
>> >      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s47ht (ro)
>> >Conditions:
>> >  Type              Status
>> >  Initialized       True
>> >  Ready             False
>> >  ContainersReady   False
>> >  PodScheduled      True
>> >Volumes:
>> >  hadoop-config-volume:
>> >    Type:      ConfigMap (a volume populated by a ConfigMap)
>> >    Name:      hadoop-config-flink-session-cluster
>> >    Optional:  false
>> >  flink-config-volume:
>> >    Type:      ConfigMap (a volume populated by a ConfigMap)
>> >    Name:      flink-config-flink-session-cluster
>> >    Optional:  false
>> >  default-token-s47ht:
>> >    Type:        Secret (a volume populated by a Secret)
>> >    SecretName:  default-token-s47ht
>> >    Optional:    false
>> >QoS Class:       Guaranteed
>> >Node-Selectors:  <none>
>> >Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
>> >                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
>> >Events:
>> >  Type     Reason       Age                  From               Message
>> >  ----     ------       ----                 ----               -------
>> >  Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
>> >  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
>> >  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
>> >  Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
>> >  Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
>> >  Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
>> >  Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
>> >  Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff
>> >
>> >At first I suspected the local image had not been built, so I checked with `docker images`:
>> >
>> >REPOSITORY   TAG                       IMAGE ID       CREATED        SIZE
>> >flink        1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB
>> >
>> >The image clearly exists, which is strange. Why does pulling the image from local fail? Is something wrong somewhere? And under minikube, how do I reach the dashboard of the Flink cluster running on k8s from my local browser?
>> >
>> >This is my first time using k8s, so any pointers are appreciated. Thanks!
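
A note on the numbers in the pod description above and in the ProcessMemoryUtils log line at the top of the thread: they can be reproduced with a few lines of shell arithmetic. This is only a sketch. It assumes a JobManager process size of 1600 MiB (read off the pod memory limit of 1600Mi) and an overhead fraction of 0.1, which is an assumption that matches the 160 MB figure in the log. The variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: JobManager memory breakdown from the values quoted in the thread.
MIB=$((1024 * 1024))

process_bytes=$((1600 * MIB))   # assumption: process size equals the pod limit (1600Mi)
metaspace_bytes=268435456       # jobmanager.memory.jvm-metaspace.size from the container args
off_heap_bytes=134217728        # jobmanager.memory.off-heap.size from the container args
overhead_min_bytes=201326592    # jobmanager.memory.jvm-overhead.min from the container args

# Assumption: JVM overhead is derived as 10% of the process size.
overhead_bytes=$((process_bytes / 10))

# Clamp to the minimum, as the log message describes.
if [ "$overhead_bytes" -lt "$overhead_min_bytes" ]; then
  overhead_bytes=$overhead_min_bytes
fi

# Whatever is left after overhead, metaspace and off-heap goes to the heap.
heap_bytes=$((process_bytes - overhead_bytes - metaspace_bytes - off_heap_bytes))

echo "jvm overhead: $((overhead_bytes / MIB)) MiB"
echo "heap: $heap_bytes bytes"
```

With these inputs the script reports a JVM overhead of 192 MiB and a heap of 1073741824 bytes, which agrees with -Xmx1073741824 in the container args. The log prints 167772162 bytes for the fraction-derived value rather than the 167772160 computed here; the two-byte difference presumably comes from floating-point rounding. In other words, that INFO line looks like normal behaviour, not an error.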
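
On the NoResourceAvailableException in the newest message: with the native Kubernetes integration, TaskManager pods are requested on demand, so "available slots 0" followed by a slot request timeout may simply mean that the TaskManager pod was never scheduled. One thing worth checking is whether the TaskManager request (8192m of memory, 1 CPU) fits on the minikube node at all. The sketch below only illustrates that comparison. The node figures are placeholders, since the thread does not state the size of the minikube VM, and `fits_on_node` is a hypothetical helper, not part of kubectl or Flink.

```shell
#!/bin/sh
# Requested by the session in the newest message:
#   -Dtaskmanager.memory.process.size=8192m  -Dkubernetes.taskmanager.cpu=1
tm_mem_mib=8192
tm_cpu_milli=1000

# PLACEHOLDER node capacity, NOT taken from the thread.
# Substitute the "Allocatable" values reported by `kubectl describe node`.
node_mem_mib=4096
node_cpu_milli=2000

# Hypothetical helper: succeeds only if both memory and CPU fit.
# $1 = requested MiB, $2 = requested millicores,
# $3 = allocatable MiB, $4 = allocatable millicores
fits_on_node() {
  [ "$1" -le "$3" ] && [ "$2" -le "$4" ]
}

if fits_on_node "$tm_mem_mib" "$tm_cpu_milli" "$node_mem_mib" "$node_cpu_milli"; then
  result="fits"
else
  result="does not fit"
fi

echo "TaskManager request $result"
```

The check is deliberately simplistic: it ignores what other pods on the node already reserve (the JobManager pod earlier in the thread requests 1 CPU and 1600Mi by itself). If `kubectl get pods -n flink-session-cluster` shows a TaskManager pod stuck in Pending, the Events section of `kubectl describe pod` for that pod should say whether insufficient memory or CPU is the reason.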