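If you expose the REST service with the ClusterIP type, jobs can only be submitted from inside the K8s cluster,
because an environment outside the cluster cannot resolve the K8s-internal service (that is, tuiwen-flink-rest.flink).

You can start a Pod inside the K8s cluster to act as the Flink client, and then submit jobs from within that Pod.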
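
To confirm that name resolution is what separates the two environments, a small check like the one below can be run both on the machine where the client currently runs and inside a Pod. This is only a sketch: the host name is taken from the error in this thread, and the helper `can_resolve` is a hypothetical name introduced for illustration, not part of Flink or Kubernetes.

```python
import socket

# Service name copied from the UnknownHostException quoted in this thread.
REST_SERVICE_HOST = "tuiwen-flink-rest.flink"


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address on this machine.

    `resolver` defaults to the system resolver and can be swapped out for testing.
    """
    try:
        return len(resolver(host, None)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved ("Name or service not known").
        return False


# Example call (performs a real DNS lookup on the current machine):
#   print(can_resolve(REST_SERVICE_HOST))
```

If the call returns False on your current machine but True inside a Pod, that matches the explanation above. A True result only shows that the name resolves, not that the REST port is reachable.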


Best,
Yang

吴松 <wus...@funstory.ai> wrote on Tue, Nov 24, 2020 at 4:23 PM:

> Sorry, that error is probably a memory problem. The error I actually meant to ask about is the one below.
>
> 2020-11-24 16:19:33,569 ERROR
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient [] - A
> Kubernetes exception occurred.
> java.net.UnknownHostException: tuiwen-flink-rest.flink: Name or service
> not known
>         at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
> ~[?:1.8.0_252]
>         at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
> ~[?:1.8.0_252]
>         at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getByName(InetAddress.java:1077)
> ~[?:1.8.0_252]
>         at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.getWebMonitorAddress(HighAvailabilityServicesUtils.java:193)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:113)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deploySessionCluster(KubernetesClusterDescriptor.java:142)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:109)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:188)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
> [flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:188)
> [flink-dist_2.12-1.11.2.jar:1.11.2]
> 2020-11-24 16:19:33,606 ERROR
> org.apache.flink.kubernetes.cli.KubernetesSessionCli [] - Error while
> running the Flink session.
> java.lang.RuntimeException:
> org.apache.flink.client.deployment.ClusterRetrieveException: Could not
> create the RestClusterClient.
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:117)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deploySessionCluster(KubernetesClusterDescriptor.java:142)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:109)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:188)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:188)
> [flink-dist_2.12-1.11.2.jar:1.11.2]
> Caused by: org.apache.flink.client.deployment.ClusterRetrieveException:
> Could not create the RestClusterClient.
>         ... 6 more
> Caused by: java.net.UnknownHostException: tuiwen-flink-rest.flink: Name or
> service not known
>         at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
> ~[?:1.8.0_252]
>         at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
> ~[?:1.8.0_252]
>         at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
> ~[?:1.8.0_252]
>         at java.net.InetAddress.getByName(InetAddress.java:1077)
> ~[?:1.8.0_252]
>         at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.getWebMonitorAddress(HighAvailabilityServicesUtils.java:193)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:113)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         ... 5 more
>
>
> ------------------------------------------------------------
> The program finished with the following exception:
>
>
> java.lang.RuntimeException:
> org.apache.flink.client.deployment.ClusterRetrieveException: Could not
> create the RestClusterClient.
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:117)
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deploySessionCluster(KubernetesClusterDescriptor.java:142)
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:109)
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:188)
>         at
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:188)
> Caused by: org.apache.flink.client.deployment.ClusterRetrieveException:
> Could not create the RestClusterClient.
>         ... 6 more
> Caused by: java.net.UnknownHostException: tuiwen-flink-rest.flink: Name or
> service not known
>         at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>         at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
>         at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>         at java.net.InetAddress.getByName(InetAddress.java:1077)
>         at
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.getWebMonitorAddress(HighAvailabilityServicesUtils.java:193)
>         at
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:113)
>         ... 5 more
>
>
>
>
> ------------------ Original ------------------
> From: "吴松" <wus...@funstory.ai>;
> Date: Tue, Nov 24, 2020 03:51 PM
> To: "user-zh" <user-zh@flink.apache.org>;
>
> Subject: flink on native k8s deploy issue
>
>
>
>
> Starting Flink with the option -Dkubernetes.rest-service.exposed.type=ClusterIP
> fails with the following error:
>
>
>
> 2020-11-24 15:49:19,796 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> jobmanager.rpc.address, 0.0.0.0
> 2020-11-24 15:49:19,800 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> jobmanager.rpc.port, 6123
> 2020-11-24 15:49:19,801 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> jobmanager.memory.process.size, 1600m
> 2020-11-24 15:49:19,801 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> taskmanager.memory.process.size, 1800m
> 2020-11-24 15:49:19,801 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> taskmanager.numberOfTaskSlots, 1
> 2020-11-24 15:49:19,802 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> parallelism.default, 1
> 2020-11-24 15:49:19,802 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> high-availability, zookeeper
> 2020-11-24 15:49:19,803 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> high-availability.cluster-id, /tuiwen-flink
> 2020-11-24 15:49:19,803 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> high-availability.storageDir, file:/usr/flink/tuiwen-flink
> 2020-11-24 15:49:19,804 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> high-availability.zookeeper.quorum, data-kafka-zookeeper-headless.tuiwen-public:2181
> 2020-11-24 15:49:19,804 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> state.backend, rocksdb
> 2020-11-24 15:49:19,805 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> state.checkpoints.dir, file:/usr/flink/flink-checkpoints
> 2020-11-24 15:49:19,805 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> state.checkpoints.num-retained, 100
> 2020-11-24 15:49:19,805 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> state.savepoints.dir, file:/usr/flink/flink-savepoints
> 2020-11-24 15:49:19,806 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> jobmanager.execution.failover-strategy, region
> 2020-11-24 15:49:19,806 INFO
> org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property:
> web.upload.dir, /usr/flink
> 2020-11-24 15:49:19,990 INFO
> org.apache.flink.client.deployment.DefaultClusterClientServiceLoader [] -
> Could not load factory due to missing dependencies.
> 2020-11-24 15:49:22,366 INFO
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The
> derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is
> less than its min value 192.000mb (201326592 bytes), min value will be used
> instead
> 2020-11-24 15:49:22,399 INFO
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The
> derived from fraction jvm overhead memory (70.000mb (73400321 bytes)) is
> less than its min value 192.000mb (201326592 bytes), min value will be used
> instead
> 2020-11-24 15:49:22,401 INFO
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The
> derived from fraction network memory (25.200mb (26424115 bytes)) is less
> than its min value 64.000mb (67108864 bytes), min value will be used instead
> 2020-11-24 15:49:22,405 ERROR
> org.apache.flink.kubernetes.cli.KubernetesSessionCli [] - Error while
> running the Flink session.
> org.apache.flink.configuration.IllegalConfigurationException: Sum of
> configured Framework Heap Memory (128.000mb (134217728 bytes)), Framework
> Off-Heap Memory (128.000mb (134217728 bytes)), Task Off-Heap Memory (0
> bytes), Managed Memory (100.800mb (105696462 bytes)) and Network Memory
> (64.000mb (67108864 bytes)) exceed configured Total Flink Memory (252.000mb
> (264241152 bytes)).
>         at
> org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.deriveFromTotalFlinkMemory(TaskExecutorFlinkMemoryUtils.java:136)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.deriveFromTotalFlinkMemory(TaskExecutorFlinkMemoryUtils.java:42)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.deriveProcessSpecWithTotalProcessMemory(ProcessMemoryUtils.java:105)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:79)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:109)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.client.deployment.AbstractContainerizedClusterClientFactory.getClusterSpecification(AbstractContainerizedClusterClientFactory.java:47)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:110)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:188)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
> ~[flink-dist_2.12-1.11.2.jar:1.11.2]
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:188)
> [flink-dist_2.12-1.11.2.jar:1.11.2]
>
>
> ------------------------------------------------------------
> The program finished with the following exception:
>
>
> org.apache.flink.configuration.IllegalConfigurationException: Sum of
> configured Framework Heap Memory (128.000mb (134217728 bytes)), Framework
> Off-Heap Memory (128.000mb (134217728 bytes)), Task Off-Heap Memory (0
> bytes), Managed Memory (100.800mb (105696462 bytes)) and Network Memory
> (64.000mb (67108864 bytes)) exceed configured Total Flink Memory (252.000mb
> (264241152 bytes)).
>         at
> org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.deriveFromTotalFlinkMemory(TaskExecutorFlinkMemoryUtils.java:136)
>         at
> org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.deriveFromTotalFlinkMemory(TaskExecutorFlinkMemoryUtils.java:42)
>         at
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.deriveProcessSpecWithTotalProcessMemory(ProcessMemoryUtils.java:105)
>         at
> org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:79)
>         at
> org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:109)
>         at
> org.apache.flink.client.deployment.AbstractContainerizedClusterClientFactory.getClusterSpecification(AbstractContainerizedClusterClientFactory.java:47)
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:110)
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:188)
>         at
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>         at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:188)
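
As for the first (memory) error quoted above, the message can be checked with simple arithmetic. The sketch below only adds up the byte values printed in that IllegalConfigurationException and compares them with the reported Total Flink Memory; the variable names are made up for illustration and nothing here calls into Flink.

```python
# Component sizes in bytes, copied from the IllegalConfigurationException quoted above.
components = {
    "Framework Heap Memory": 134217728,
    "Framework Off-Heap Memory": 134217728,
    "Task Off-Heap Memory": 0,
    "Managed Memory": 105696462,
    "Network Memory": 67108864,
}
# "Total Flink Memory" as reported in the same message.
total_flink_memory = 264241152

MB = 1024 * 1024


def to_mb(num_bytes):
    """Convert bytes to megabytes (1 MB = 1024 * 1024 bytes, as in the Flink log)."""
    return num_bytes / MB


required = sum(components.values())
shortfall = required - total_flink_memory

print(f"required:  {to_mb(required):.1f} MB")
print(f"available: {to_mb(total_flink_memory):.1f} MB")
print(f"shortfall: {to_mb(shortfall):.1f} MB")
```

The listed components alone need more than the 252 MB of Total Flink Memory, so the TaskManager memory budget has to be larger before the session can start. The derived JVM overhead of 70 MB in the log suggests (assuming the default overhead fraction of 0.1) that the effective process size was about 700 MB rather than the 1800m in the loaded configuration, so it may be worth checking whether it was overridden on the command line.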
