Re: Cannot start native K8s
A quick update on this issue.

The root cause is an incompatibility between the fabric8 kubernetes-client and Java 8u252 [1]. We have bumped the fabric8 kubernetes-client version from 4.5.2 to 4.9.2 in the master and release-1.11 branches, so users can now deploy Flink on K8s natively with Java 8u252.

If you really cannot use the latest Flink version, you can set the environment variable "HTTP2_DISABLE=true" on the Flink client, jobmanager, and taskmanager side.

[1]. https://github.com/fabric8io/kubernetes-client/issues/2212

Best,
Yang
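The HTTP2_DISABLE workaround mentioned above could look roughly like the sketch below. It assumes that the `containerized.master.env.` and `containerized.taskmanager.env.` configuration prefixes forward environment variables to the jobmanager and taskmanager pods in your Flink version; the thread itself only says the variable must be set on all three sides. The `FLINK_K8S_ARGS` variable name is made up for illustration, and the command is printed rather than executed.

```shell
# Client side: the fabric8 kubernetes-client in the Flink CLI process reads this.
export HTTP2_DISABLE=true

# Assumption: these config prefixes pass the variable on to the
# jobmanager and taskmanager containers.
FLINK_K8S_ARGS="-Dcontainerized.master.env.HTTP2_DISABLE=true -Dcontainerized.taskmanager.env.HTTP2_DISABLE=true"

# Print the resulting command instead of running it.
echo "./bin/kubernetes-session.sh ${FLINK_K8S_ARGS}"
```

Once the printed command looks right for your setup, it can be run from the Flink distribution directory.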
Re: Cannot start native K8s
Glad to hear that you could deploy the Flink cluster on K8s natively. Thanks for trying the in-preview feature and giving your feedback.

Moreover, I want to give a very simple conclusion here. Currently, because of the compatibility issue in the fabric8 kubernetes-client, the native K8s integration has the following known limitations.

* For JDK 8u252, the native K8s integration only works on Kubernetes v1.16 and lower versions.
* For other JDK versions (e.g. 8u242, JDK 11), I am not aware of the same issue. The native K8s integration works well.

Best,
Yang
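To see whether a machine runs the JDK build named in the limitation above, a check along these lines may help. It is only a sketch: `is_affected_jdk` is a hypothetical helper, and it matches just the single build (1.8.0_252) that this thread reports, not any other build that might behave the same way.

```shell
# Returns 0 when the given Java version string is the build reported in this thread.
is_affected_jdk() {
  case "$1" in
    1.8.0_252*) return 0 ;;
    *)          return 1 ;;
  esac
}

# If a JVM is on PATH, extract the quoted version from `java -version` and report.
if command -v java >/dev/null 2>&1; then
  ver="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  if is_affected_jdk "$ver"; then
    echo "JDK $ver: native K8s integration may fail on Kubernetes newer than v1.16"
  else
    echo "JDK $ver: not the build reported in this thread"
  fi
fi
```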
Re: Cannot start native K8s
Hi Yang,

Oops, I forgot to copy /etc/kube/admin.conf to $HOME/.kube/config so that the current user account can access K8s. Now that I have copied it, kubernetes-session.sh is working fine. Thanks very much!

Best,
Dongwon

[flink@DAC-E04-W06 ~]$ kubernetes-session.sh
2020-05-09 12:43:49,961 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, DAC-E04-W06
2020-05-09 12:43:49,962 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-09 12:43:49,962 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-09 12:43:49,962 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 24g
2020-05-09 12:43:49,963 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 24
2020-05-09 12:43:49,963 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2020-05-09 12:43:49,963 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability, zookeeper
2020-05-09 12:43:49,963 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.path.root, /flink
2020-05-09 12:43:49,964 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
2020-05-09 12:43:49,964 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
2020-05-09 12:43:49,965 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-09 12:43:49,965 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.port, 8082
2020-05-09 12:43:51,122 INFO org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The derived from fraction jvm overhead memory (2.400gb (2576980416 bytes)) is greater than its max value 1024.000mb (1073741824 bytes), max value will be used instead
2020-05-09 12:43:51,123 INFO org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The derived from fraction network memory (2.291gb (2459539902 bytes)) is greater than its max value 1024.000mb (1073741824 bytes), max value will be used instead
2020-05-09 12:43:51,131 INFO org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124
2020-05-09 12:43:51,131 INFO org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122
2020-05-09 12:43:51,134 INFO org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes deployment requires a fixed port. Configuration high-availability.jobmanager.port will be set to 6123
2020-05-09 12:43:52,167 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor - Create flink session cluster flink-cluster-4a82d41b-af15-4205-8a44-62351e270242 successfully, JobManager Web Interface: http://cluster-endpoint:31513
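The fix described in this message, copying the cluster admin kubeconfig to the per-user location, might be scripted as follows. This is a sketch only: `install_kubeconfig` is a hypothetical helper, the source path /etc/kube/admin.conf is the one quoted in the thread and may differ on other clusters, and reading it may require root.

```shell
# Copy a kubeconfig to the per-user path that kubectl reads by default.
install_kubeconfig() {
  src="$1"                          # e.g. /etc/kube/admin.conf (path from the thread)
  dest="${2:-$HOME/.kube/config}"   # default per-user kubeconfig location
  [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
  mkdir -p "$(dirname "$dest")" || return 1
  cp "$src" "$dest" || return 1
  chmod 600 "$dest"                 # keep the credentials private to this user
}

# Example invocation, left commented out so nothing is overwritten by accident:
# install_kubeconfig /etc/kube/admin.conf
```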
Re: Cannot start native K8s
Hi Dongwon Kim,

Thanks a lot for the information. I will dig into this issue.

I think the "UnknownHostException" is caused by an incorrectly set Kubernetes ApiServer address. Maybe you are using "kubernetes.default.svc", which cannot be accessed from outside the Kubernetes cluster. You need to configure a correct IP/hostname for the ApiServer address that is reachable from your local environment. You can use `kubectl auth can-i create pods` to verify whether the kube config is correct.

BTW, so far we have only found that Flink on native K8s does not work on 8u252. For 8u242 and lower versions, it works well.

Best,
Yang
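The kube config check suggested in this message can be wrapped in a small helper, sketched below. `check_kube_access` is a hypothetical name; the `kubectl config view` query is an addition that is not from the thread and only prints which ApiServer address the current context points to, so that a value such as kubernetes.default.svc is easy to spot.

```shell
# Print the ApiServer address of the current context, then check pod-creation rights.
check_kube_access() {
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found on PATH" >&2
    return 127
  fi
  # Address the client will try to reach; it must resolve from this machine.
  kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
  echo
  # The check suggested in the thread.
  kubectl auth can-i create pods
}

# Call it manually from the machine that runs kubernetes-session.sh:
# check_kube_access
```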
Re: Cannot start native K8s
Hello Yang,

I'm using K8s v1.18.2 installed by kubeadm on a cluster of 5 nodes (not minikube).
Previously, as you pointed out, openjdk version "1.8.0_252" was installed.
I bumped the Java version up to openjdk 11.0.7 but got something different:

[flink@DAC-E04-W06 bin]$ ./kubernetes-session.sh
2020-05-09 11:39:36,737 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, DAC-E04-W06
2020-05-09 11:39:36,739 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-09 11:39:36,739 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-09 11:39:36,739 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 24g
2020-05-09 11:39:36,739 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 24
2020-05-09 11:39:36,739 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2020-05-09 11:39:36,740 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability, zookeeper
2020-05-09 11:39:36,740 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.path.root, /flink
2020-05-09 11:39:36,740 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
2020-05-09 11:39:36,740 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
2020-05-09 11:39:36,741 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-09 11:39:36,741 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.port, 8082
2020-05-09 11:39:36,817 WARN io.fabric8.kubernetes.client.Config - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:36,823 WARN io.fabric8.kubernetes.client.Config - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:37,080 WARN io.fabric8.kubernetes.client.Config - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:37,082 WARN io.fabric8.kubernetes.client.Config - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:37,334 ERROR org.apache.flink.kubernetes.cli.KubernetesSessionCli - Error while running the Flink session.
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Service] with name: [flink-cluster-6adb7c62-8940-4828-990c-a87379102d61] in namespace: [default] failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
    at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
    at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
    at org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
    at org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
    at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
    at org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Name or service not known
    at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929)
    at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1515)
    at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
    at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1505)
    at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1364)
    at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1298)
    at
Re: Cannot start native K8s
Hi Dongwon Kim,

Are you running Flink on minikube or on a real Kubernetes cluster? I could not reproduce it on a real Kubernetes cluster with Java 8u252. On minikube, I get the same exception as you.

Best,
Yang
Re: Cannot start native K8s
Hi Dongwon Kim,

I think this is a known issue. The native Kubernetes integration does not work with JDK 8u252 because of an okhttp issue [1]. For now, you can work around it by upgrading your JDK to a newer version.

[1] https://issues.apache.org/jira/browse/FLINK-17416

On Wed, May 6, 2020 at 7:15 AM, Dongwon Kim wrote:

> Hi,
>
> I'm using Flink-1.10 and tested everything [1] successfully.
> While trying [2], I got the following message.
> Can anyone help please?
>
> [root@DAC-E04-W06 bin]# ./kubernetes-session.sh
>> 2020-05-06 08:10:50,386 ERROR org.apache.flink.kubernetes.cli.KubernetesSessionCli - Error while running the Flink session.
>> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Service] with name: [flink-cluster-5c12bd50-a540-4614-96d0-549785a8bc62] in namespace: [default] failed.
>> Caused by: java.net.SocketException: Broken pipe (Write failed)
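For anyone who wants to check whether a machine runs the JDK build named in the reply, here is a minimal sketch. It assumes the JVM reports 8u252 as `1.8.0_252` in the `java.version` system property, possibly followed by a build suffix. The class `JdkVersionCheck` and the method `isAffected` are hypothetical names used for illustration and are not part of Flink. The check covers only 8u252, since that is the only build this thread reports as affected.

```java
public class JdkVersionCheck {

    // Returns true for Java 8 update 252, the only build this thread names as affected.
    // An optional suffix after the update number (for example "-b09") is tolerated.
    // Assumption: the version string follows the usual "1.8.0_<update>" layout.
    public static boolean isAffected(String javaVersion) {
        return javaVersion != null && javaVersion.matches("1\\.8\\.0_252(-.*)?");
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        if (isAffected(version)) {
            System.out.println("JDK " + version
                    + " is the build reported in this thread; consider switching to another JDK.");
        } else {
            System.out.println("JDK " + version + " is not the build named in this thread.");
        }
    }
}
```

Running the class with the same `java` binary that launches `kubernetes-session.sh` shows which JDK the Flink client actually picks up, which may differ from the one on an interactive shell's `PATH`.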
Cannot start native K8s
Hi,

I'm using Flink-1.10 and tested everything [1] successfully.
While trying [2], I got the following message.
Can anyone help please?

[root@DAC-E04-W06 bin]# ./kubernetes-session.sh
> 2020-05-06 08:10:49,411 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: jobmanager.rpc.address, DAC-E04-W06
> 2020-05-06 08:10:49,412 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: jobmanager.rpc.port, 6123
> 2020-05-06 08:10:49,412 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: jobmanager.heap.size, 1024m
> 2020-05-06 08:10:49,412 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: taskmanager.memory.process.size, 24g
> 2020-05-06 08:10:49,413 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: taskmanager.numberOfTaskSlots, 24
> 2020-05-06 08:10:49,413 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: parallelism.default, 1
> 2020-05-06 08:10:49,413 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: high-availability, zookeeper
> 2020-05-06 08:10:49,413 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: high-availability.zookeeper.path.root, /flink
> 2020-05-06 08:10:49,414 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
> 2020-05-06 08:10:49,414 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
> 2020-05-06 08:10:49,414 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: jobmanager.execution.failover-strategy, region
> 2020-05-06 08:10:49,415 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: rest.port, 8082
> 2020-05-06 08:10:50,386 ERROR org.apache.flink.kubernetes.cli.KubernetesSessionCli - Error while running the Flink session.
> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Service] with name: [flink-cluster-5c12bd50-a540-4614-96d0-549785a8bc62] in namespace: [default] failed.
>     at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
>     at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
>     at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
>     at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
>     at org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
>     at org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
>     at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>     at org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
> Caused by: java.net.SocketException: Broken pipe (Write failed)
>     at java.net.SocketOutputStream.socketWrite0(Native Method)
>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
>     at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
>     at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
>     at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
>     at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:894)
>     at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:865)
>     at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
>     at org.apache.flink.kubernetes.shadded.okio.Okio$1.write(Okio.java:79)
>     at org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$1.write(AsyncTimeout.java:180)
>     at org.apache.flink.kubernetes.shadded.okio.RealBufferedSink.flush(RealBufferedSink.java:224)
>     at org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Writer.settings(Http2Writer.java:203)
>     at org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Connection.start(Http2Connection.java:515)
>     at org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Connection.start(Http2Connection.java:505)
>     at org.apache.flink.kubernetes.shadded.okhttp3.internal.connection.RealConnection.startHttp2(RealConnection.java:298)
>     at >