Environment: a single MacBook Pro with minikube v1.15.1 and Kubernetes v1.19.4 installed.
I ran the following commands from the Flink v1.11.3 distribution:
kubectl create namespace flink-session-cluster


kubectl create serviceaccount flink -n flink-session-cluster


kubectl create clusterrolebinding flink-role-binding-flink \
  --clusterrole=edit \
  --serviceaccount=flink-session-cluster:flink


./bin/kubernetes-session.sh \
  -Dkubernetes.namespace=flink-session-cluster \
  -Dkubernetes.jobmanager.service-account=flink \
  -Dkubernetes.cluster-id=session001 \
  -Dtaskmanager.memory.process.size=8192m \
  -Dkubernetes.taskmanager.cpu=1 \
  -Dtaskmanager.numberOfTaskSlots=4 \
  -Dresourcemanager.taskmanager-timeout=3600000


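The console output shows the Flink web UI at http://192.168.64.2:8081 rather than at a five-digit port such as http://192.168.50.135:31753. What is wrong here? Should the host IP be the minikube IP? I cannot reach http://192.168.64.2:8081 from my local browser.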
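
For comparison, the earlier session command quoted further down in this thread passed -Dkubernetes.rest-service.exposed.type=NodePort, while the command above does not set it. A minimal sketch of how the REST endpoint could be inspected, assuming Flink names the external REST service <cluster-id>-rest (an assumption to verify with `kubectl get svc`) and that a port-forward is acceptable for local access. The helper names are hypothetical, and the script only prints the commands so they can be reviewed before running:

```shell
#!/usr/bin/env bash
# Hypothetical helpers; nothing here talks to the cluster.

# Assumption: the external REST service is named "<cluster-id>-rest".
rest_service_name() {
  printf '%s-rest\n' "$1"
}

# Print the commands for listing services and forwarding local port 8081.
# Usage: print_ui_commands NAMESPACE CLUSTER_ID
print_ui_commands() {
  local namespace="$1" cluster_id="$2"
  local svc
  svc="$(rest_service_name "$cluster_id")"
  echo "kubectl get svc -n ${namespace}"
  echo "kubectl port-forward -n ${namespace} service/${svc} 8081:8081"
}

print_ui_commands flink-session-cluster session001
```

With the port-forward running, the UI would be reachable at http://localhost:8081. If the service type turns out to be NodePort, `minikube service <service-name> -n <namespace> --url` should print a URL that is reachable from the host.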



2021-01-02 10:28:04,177 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead

2021-01-02 10:28:04,907 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.64.2:8081
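
The first INFO line reflects a clamp: the JVM overhead derived from a fraction of the total process memory is raised to the configured minimum. A minimal sketch of that arithmetic, assuming a total JobManager process size of 1600 MiB and an overhead fraction of 10% (both assumptions, chosen because they reproduce the 160 MB in the log) and an assumed upper bound of 1024 MiB. The 192 MiB minimum comes from the log itself:

```shell
#!/usr/bin/env bash
# Clamp the fraction-derived JVM overhead to [min, max]; all values in MiB.
# Usage: derive_overhead_mib TOTAL_MIB FRACTION_PERCENT MIN_MIB MAX_MIB
derive_overhead_mib() {
  local total="$1" percent="$2" min="$3" max="$4"
  local derived=$(( total * percent / 100 ))
  if [ "$derived" -lt "$min" ]; then
    derived="$min"      # too small: the minimum is used instead
  elif [ "$derived" -gt "$max" ]; then
    derived="$max"      # too large: capped at the maximum
  fi
  echo "$derived"
}

# Assumed inputs: 1600 MiB total, 10% fraction, 192 MiB min, 1024 MiB max.
derive_overhead_mib 1600 10 192 1024   # prints 192
```

The JobManager arguments from the earlier 1.12.0 attempt quoted later in this thread are consistent with that figure: heap 1073741824 + off-heap 134217728 + metaspace 268435456 + overhead 201326592 bytes add up to 1677721600 bytes, which is 1600 MiB.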




I checked the pods, service, and deployment; they all started normally and all show green.


Next, I submitted a job:
./bin/flink run -d \
  -e kubernetes-session \
  -Dkubernetes.namespace=flink-session-cluster \
  -Dkubernetes.cluster-id=session001 \
  examples/streaming/WindowJoin.jar



Using windowSize=2000, data rate=3

To customize example, use: WindowJoin [--windowSize <window-size-in-millis>] [--rate <elements-per-second>]

2021-01-02 10:21:48,658 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Retrieve flink cluster session001 successfully, JobManager Web Interface: http://10.106.136.236:8081




I can open the http://10.106.136.236:8081 address shown here in my browser. It shows the job as running, but the "available slots" field shows 0, and the JM log contains the following error:




Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441) ~[flink-dist_2.12-1.11.3.jar:1.11.3]
    ... 47 more
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) ~[?:1.8.0_275]
    ... 27 more
Caused by: java.util.concurrent.TimeoutException
    ... 25 more


Why does it report insufficient resources? Thanks for your help!
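
One thing that could be checked here, offered as an assumption and not a confirmed diagnosis: whether a TaskManager pod requesting 8192m (the value of -Dtaskmanager.memory.process.size above) can be scheduled on the minikube node at all. The sketch below compares a requested size against a node's allocatable memory. The 4096 MiB figure is a placeholder; the real value would come from `kubectl describe node minikube`:

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when a pod requesting REQUEST_MIB fits into ALLOCATABLE_MIB.
# Usage: fits_on_node REQUEST_MIB ALLOCATABLE_MIB
fits_on_node() {
  [ "$1" -le "$2" ]
}

requested_mib=8192     # from -Dtaskmanager.memory.process.size=8192m
allocatable_mib=4096   # placeholder; read the real value from the node description

if fits_on_node "$requested_mib" "$allocatable_mib"; then
  echo "TaskManager request fits on the node"
else
  echo "TaskManager request exceeds allocatable memory; the pod would stay Pending"
fi
```

`kubectl get pods -n flink-session-cluster` would show whether a TaskManager pod exists and is Pending, and `kubectl describe pod` on that pod lists the scheduler events explaining why.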








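On 2020-12-29 09:53:48, "Yang Wang" <danrtsey...@gmail.com> wrote:
>The ConfigMap does not need to be created in advance. That Warning can be ignored; it is expected, because the deployment is created first and the ConfigMap afterwards.
>You can follow the community documentation [1] to send the JM log to the console and take a look.
>
>I suspect this is caused by not having created a service account [2].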
>
>[1].
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#log-files
>[2].
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#rbac
>
>Best,
>Yang
>
>陈帅 <casel_c...@126.com> wrote on Mon, Dec 28, 2020 at 5:54 PM:
>
>> Today I switched to the latest official Flink image, version 1.11.3, and it still fails to start.
>> This is my command:
>> ./bin/kubernetes-session.sh \
>>   -Dkubernetes.cluster-id=rtdp \
>>   -Dtaskmanager.memory.process.size=4096m \
>>   -Dkubernetes.taskmanager.cpu=2 \
>>   -Dtaskmanager.numberOfTaskSlots=4 \
>>   -Dresourcemanager.taskmanager-timeout=3600000 \
>>   -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8 \
>>   -Dkubernetes.namespace=rtdp
>>
>>
>>
>> Events:
>>   Type     Reason          Age                From               Message
>>   ----     ------          ----               ----               -------
>>   Normal   Scheduled       88s                default-scheduler  Successfully assigned rtdp/rtdp-6d7794d65d-g6mb5 to cn-shanghai.192.168.16.130
>>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-rtdp" not found
>>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-rtdp" not found
>>   Normal   AllocIPSucceed  87s                terway-daemon      Alloc IP 192.168.32.25/22 for Pod
>>   Normal   Pulling         87s                kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
>>   Normal   Pulled          31s                kubelet            Successfully pulled image "flink:1.11.3-scala_2.12-java8"
>>   Normal   Created         18s (x2 over 26s)  kubelet            Created container flink-job-manager
>>   Normal   Started         18s (x2 over 26s)  kubelet            Started container flink-job-manager
>>   Normal   Pulled          18s                kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
>>   Warning  BackOff         10s                kubelet            Back-off restarting failed container
>>
>>
>>
>>
>>
>>
>>
>> Two ConfigMaps were not found here. Do they need to be created in advance? The official documentation does not mention this, or did I miss it?
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#start-flink-session
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 2020-12-27 22:50:32, "陈帅" <casel_c...@126.com> wrote:
>>
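>> >This is my first attempt at deploying Flink on k8s. I am using Flink 1.12.0, JDK 1.8.0_275, and Scala 2.12.12, with a single-node minikube environment installed on my Mac. These are the steps I followed: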
>> >
>> >
>> >git clone https://github.com/apache/flink-docker
>> >cd flink-docker/1.12/scala_2.12-java8-debian
>> >docker build --tag flink:1.12.0-scala_2.12-java8 .
>> >
>> >
>> >cd flink-1.12.0
>> >./bin/kubernetes-session.sh \
>> >  -Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
>> >  -Dkubernetes.rest-service.exposed.type=NodePort \
>> >  -Dtaskmanager.numberOfTaskSlots=2 \
>> >  -Dkubernetes.cluster-id=flink-session-cluster
>> >
>> >
>> >The output shows that the JM started, but I cannot reach it from the web:
>> >
>> >2020-12-27 22:08:12,387 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.99.100:8081
>> >
>> >
>> >
>> >
>> >`kubectl get pods` shows the pod stuck in the ContainerCreating state:
>> >
>> >NAME                                               READY   STATUS              RESTARTS   AGE
>> >flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
>> >kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h
>> >
>> >
>> >
>> >
>> >So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2` for details, with the following result:
>> >
>> >
>> >
>> >
>> >Name:         flink-session-cluster-858bd55dff-bzjk2
>> >
>> >Namespace:    default
>> >
>> >Priority:     0
>> >
>> >Node:         minikube/192.168.99.100
>> >
>> >Start Time:   Sun, 27 Dec 2020 22:21:56 +0800
>> >
>> >Labels:       app=flink-session-cluster
>> >
>> >              component=jobmanager
>> >
>> >              pod-template-hash=858bd55dff
>> >
>> >              type=flink-native-kubernetes
>> >
>> >Annotations:  <none>
>> >
>> >Status:       Pending
>> >
>> >IP:           172.17.0.4
>> >
>> >IPs:
>> >
>> >  IP:           172.17.0.4
>> >
>> >Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
>> >
>> >Containers:
>> >
>> >  flink-job-manager:
>> >
>> >    Container ID:
>> >
>> >    Image:         flink:1.12.0-scala_2.12-java8
>> >
>> >    Image ID:
>> >
>> >    Ports:         8081/TCP, 6123/TCP, 6124/TCP
>> >
>> >    Host Ports:    0/TCP, 0/TCP, 0/TCP
>> >
>> >    Command:
>> >
>> >      /docker-entrypoint.sh
>> >
>> >    Args:
>> >
>> >      native-k8s
>> >
>> >      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
>> >
>> >    State:          Waiting
>> >
>> >      Reason:       ImagePullBackOff
>> >
>> >    Ready:          False
>> >
>> >    Restart Count:  0
>> >
>> >    Limits:
>> >
>> >      cpu:     1
>> >
>> >      memory:  1600Mi
>> >
>> >    Requests:
>> >
>> >      cpu:     1
>> >
>> >      memory:  1600Mi
>> >
>> >    Environment:
>> >
>> >      _POD_IP_ADDRESS:   (v1:status.podIP)
>> >
>> >      HADOOP_CONF_DIR:  /opt/hadoop/conf
>> >
>> >    Mounts:
>> >
>> >      /opt/flink/conf from flink-config-volume (rw)
>> >
>> >      /opt/hadoop/conf from hadoop-config-volume (rw)
>> >
>> >      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s47ht (ro)
>> >
>> >Conditions:
>> >
>> >  Type              Status
>> >
>> >  Initialized       True
>> >
>> >  Ready             False
>> >
>> >  ContainersReady   False
>> >
>> >  PodScheduled      True
>> >
>> >Volumes:
>> >
>> >  hadoop-config-volume:
>> >
>> >    Type:      ConfigMap (a volume populated by a ConfigMap)
>> >
>> >    Name:      hadoop-config-flink-session-cluster
>> >
>> >    Optional:  false
>> >
>> >  flink-config-volume:
>> >
>> >    Type:      ConfigMap (a volume populated by a ConfigMap)
>> >
>> >    Name:      flink-config-flink-session-cluster
>> >
>> >    Optional:  false
>> >
>> >  default-token-s47ht:
>> >
>> >    Type:        Secret (a volume populated by a Secret)
>> >
>> >    SecretName:  default-token-s47ht
>> >
>> >    Optional:    false
>> >
>> >QoS Class:       Guaranteed
>> >
>> >Node-Selectors:  <none>
>> >
>> >Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
>> >                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
>> >
>> >Events:
>> >  Type     Reason       Age                  From               Message
>> >  ----     ------       ----                 ----               -------
>> >  Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
>> >  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
>> >  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
>> >  Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
>> >  Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
>> >  Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
>> >  Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
>> >  Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff
>> >
>> >
>> >
>> >
>> >At first I suspected that the local image had not been built, so I checked with `docker images`:
>> >
>> >REPOSITORY   TAG                       IMAGE ID       CREATED        SIZE
>> >flink        1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB
>> >
>> >
>> >
>> >
>>
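>> >The image clearly exists, which is strange. Why does pulling the image locally fail? Is something misconfigured? And with minikube, how do I reach the dashboard of the Flink cluster running on k8s from my local browser?
>> >
>> >This is my first time using k8s, so any pointers are appreciated. Thanks!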
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
