The root cause is probably that the LoadBalancer service type does not work in your environment. We already have a ticket tracking this [1] and will try to resolve it in the next release.
For now, could you please try adding "-Dkubernetes.rest-service.exposed.type=NodePort" to your session and submission commands?

You may also be interested in the new flink-kubernetes-operator project [2], which should make it easier to run Flink applications on K8s.

[1]. https://issues.apache.org/jira/browse/FLINK-17231
[2]. https://github.com/apache/flink-kubernetes-operator

Best,
Yang

On Fri, Mar 25, 2022 at 21:39, Burcu Gul POLAT EGRI <be...@sdt.com.tr> wrote:

> I am getting the following error when I try to execute the sample in the Flink documentation, Native Kubernetes
> <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/native_kubernetes/>.
>
> I succeeded in executing the first command in the documentation by adding some extra parameters with the help of this post
> <https://cloudolife.com/2020/12/12/Cloud-Native/BIg-Data/Flink/Deploy-a-Apache-Flink-session-cluster-natively-on-Kubernetes-K8S/>.
>
> user@local:~/flink-1.14.4$ ./bin/kubernetes-session.sh \
>     -Dkubernetes.cluster-id=dproc-example-flink-cluster-id \
>     -Dtaskmanager.memory.process.size=4096m \
>     -Dkubernetes.taskmanager.cpu=2 \
>     -Dtaskmanager.numberOfTaskSlots=4 \
>     -Dresourcemanager.taskmanager-timeout=3600000 \
>     -Dkubernetes.namespace=sdt-dproc-flink-test \
>     -Dkubernetes.config.file=/home/devuser/.kube/config \
>     -Dkubernetes.jobmanager.service-account=flink-service-account
>
> After executing the above command, I listed the new pod as shown below.
>
> user@local:~/flink-1.14.4$ kubectl get pods
> NAME                                             READY   STATUS    RESTARTS   AGE
> dproc-example-flink-cluster-id-68c79bf67-mwh52   1/1     Running   0          1m
>
> Then, I executed the command below to submit the example job.
>
> user@local:~/flink-1.14.4$ ./bin/flink run --target kubernetes-session \
>     -Dkubernetes.service-account=flink-service-account \
>     -Dkubernetes.cluster-id=dproc-example-flink-cluster-id \
>     -Dkubernetes.namespace=sdt-dproc-flink-test \
>     -Dkubernetes.config.file=/home/devuser/.kube/config \
>     examples/batch/WordCount.jar --input /home/user/sometexts.txt --output /tmp/flinksample
>
> After a while, I received the logs below:
>
> 2022-03-25 12:38:00,538 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Retrieve flink cluster dproc-example-flink-cluster-id successfully, JobManager Web Interface: http://10.150.140.248:8081
>
> ------------------------------------------------------------
> The program finished with the following exception:
>
> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
>     at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
>     at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
>     at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
>     at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
>     at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
>     at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
> Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>     at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:316)
>     at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061)
>     at org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:131)
>     at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:70)
>     at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:93)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
>     ... 8 more
> Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>     at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
>     at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
>     at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056)
>     ... 16 more
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
>     at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$11(RestClusterClient.java:433)
>     at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
>     at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:399)
>     at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>     at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:476)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:262)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
>     at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:395)
>     ... 21 more
> Caused by: java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException: connection timed out: /10.150.140.248:8081
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>     at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
>     at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1063)
>     ... 19 more
> Caused by: org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException: connection timed out: /10.150.140.248:8081
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261)
>     ... 8 more
>
> I understand from the last part of this error that the JobManager Web Interface URL is wrong, because when I check the Kubernetes services, the port is different.
>
> user@local:~/flink-1.14.4$ kubectl get svc
> NAME                                  TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
> dproc-example-flink-cluster-id        ClusterIP      None            <none>        6123/TCP,6124/TCP   6h32m
> dproc-example-flink-cluster-id-rest   LoadBalancer   10.97.100.197   <pending>     8081:30976/TCP      6h32m
>
> The port should be 30976 rather than 8081. I have already tried setting rest.port in flink-conf.yaml to this value, and also passing it as a parameter on the command line, but nothing changed. I always get this error.
>
> How can I force the Flink client to access the correct JobManager URL?
>
> Burcu
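
For reference, here is a rough sketch of the NodePort workaround suggested at the top of this reply, applied to the two commands quoted above. It is a dry run that only builds and prints the commands; nothing is sent to the cluster. The variable names (EXPOSE_OPT, COMMON_OPTS, SESSION_CMD, SUBMIT_CMD) are made up for illustration, and all option values are copied from the quoted message, so please adjust them to your environment.

```shell
#!/bin/sh
# Dry-run sketch: builds and prints both commands with the NodePort option added.
# Nothing is executed against a cluster. Variable names are illustrative only;
# option values are copied from the quoted message.

# The suggested workaround: expose the REST service as NodePort, not LoadBalancer.
EXPOSE_OPT="-Dkubernetes.rest-service.exposed.type=NodePort"

# Options shared by the session and submission commands.
COMMON_OPTS="-Dkubernetes.cluster-id=dproc-example-flink-cluster-id \
-Dkubernetes.namespace=sdt-dproc-flink-test \
-Dkubernetes.config.file=/home/devuser/.kube/config \
$EXPOSE_OPT"

# Session command: the original options plus the NodePort setting.
SESSION_CMD="./bin/kubernetes-session.sh $COMMON_OPTS \
-Dtaskmanager.memory.process.size=4096m \
-Dkubernetes.taskmanager.cpu=2 \
-Dtaskmanager.numberOfTaskSlots=4 \
-Dresourcemanager.taskmanager-timeout=3600000 \
-Dkubernetes.jobmanager.service-account=flink-service-account"

# Submission command: the same NodePort setting, so the client resolves the same endpoint.
SUBMIT_CMD="./bin/flink run --target kubernetes-session $COMMON_OPTS \
-Dkubernetes.service-account=flink-service-account \
examples/batch/WordCount.jar --input /home/user/sometexts.txt --output /tmp/flinksample"

# Print the commands for review instead of running them.
printf '%s\n' "$SESSION_CMD"
printf '%s\n' "$SUBMIT_CMD"
```

After reviewing the printed commands, run them from the Flink distribution directory. The existing session cluster probably has to be deleted and started again for the new service type to take effect; that is an assumption on my side, based on the REST service being created together with the session. Afterwards, "kubectl get svc" should list the "-rest" service with type NodePort, and the client should report a JobManager Web Interface address that uses the node port rather than 8081.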