Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-20 post by macdoor
Did you get it? Did you find anything?



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-19 post by macdoor
https://pan.baidu.com/s/1GHdfeF2y8RUW_Htgdn4KbQ extraction code: piaf





Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-19 post by Yang Wang
Send it as an attachment, or upload it to some third-party storage and share the link here.

macdoor <[hidden email]> wrote on Tue, Jan 19, 2021 at 12:44 PM:

> Sure, how should I send it to you?


Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-18 post by macdoor
Sure, how should I send it to you?





Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-18 post by Yang Wang
The log shows a lot of "Connecting websocket" and "Scheduling reconnect task" entries, so I still suspect the network between your Pod and the APIServer is not very stable.

Also, if possible, please send the complete DEBUG-level JobManager log.
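For reference, in Flink 1.12 the DEBUG level can be turned on by editing conf/log4j-console.properties (the variant the container entrypoint uses); a minimal sketch, assuming the stock file layout:

```properties
# conf/log4j-console.properties (Flink 1.12 ships log4j2-style properties).
# Raise the root logger from the default INFO to DEBUG so the fabric8
# watcher activity shows up in the JobManager log:
rootLogger.level = DEBUG
rootLogger.appenderRef.console.ref = ConsoleAppender
```

The TaskManager and JobManager pods then need to be restarted to pick up the change.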

Best,
Yang

macdoor <[hidden email]> wrote on Tue, Jan 19, 2021 at 9:31 AM:

> Thanks! I enabled DEBUG logging. There is still only that one final ERROR,
> but before it there are quite a few entries containing
> kubernetes.client.dsl.internal.WatchConnectionManager. I grepped out a
> portion of them. Can you tell anything from these?

Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-18 post by macdoor
Thanks! I enabled DEBUG logging. There is still only that one final ERROR, but before it there are quite a few entries containing kubernetes.client.dsl.internal.WatchConnectionManager. I grepped out a portion of them. Can you tell anything from these?

job-debug-0118.log:2021-01-19 02:12:25,551 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket successfully opened
job-debug-0118.log:2021-01-19 02:12:25,646 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@2553d42c
job-debug-0118.log:2021-01-19 02:12:25,647 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket successfully opened
job-debug-0118.log:2021-01-19 02:12:30,128 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@5a9fa83e
job-debug-0118.log:2021-01-19 02:12:30,176 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket successfully opened
job-debug-0118.log:2021-01-19 02:12:39,028 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Force
closing the watch
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@2553d42c
job-debug-0118.log:2021-01-19 02:12:39,028 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Closing websocket
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket@15b15029
job-debug-0118.log:2021-01-19 02:12:39,030 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket close received. code: 1000, reason: 
job-debug-0118.log:2021-01-19 02:12:39,030 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Ignoring onClose for already closed/closing websocket
job-debug-0118.log:2021-01-19 02:12:39,031 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Force
closing the watch
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@2cdbe5a0
job-debug-0118.log:2021-01-19 02:12:39,031 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Closing websocket
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket@1e3f5396
job-debug-0118.log:2021-01-19 02:12:39,033 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket close received. code: 1000, reason: 
job-debug-0118.log:2021-01-19 02:12:39,033 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Ignoring onClose for already closed/closing websocket
job-debug-0118.log:2021-01-19 02:12:42,677 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@210aab4b
job-debug-0118.log:2021-01-19 02:12:42,678 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket successfully opened
job-debug-0118.log:2021-01-19 02:12:42,920 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@278d8398
job-debug-0118.log:2021-01-19 02:12:42,921 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket successfully opened
job-debug-0118.log:2021-01-19 02:12:45,130 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Connecting websocket ...
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@4b318628
job-debug-0118.log:2021-01-19 02:12:45,132 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket successfully opened
job-debug-0118.log:2021-01-19 02:13:05,927 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Force
closing the watch
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@278d8398
job-debug-0118.log:2021-01-19 02:13:05,927 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Closing websocket
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket@69d1ebd2
job-debug-0118.log:2021-01-19 02:13:05,930 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket close received. code: 1000, reason: 
job-debug-0118.log:2021-01-19 02:13:05,930 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Ignoring onClose for already closed/closing websocket
job-debug-0118.log:2021-01-19 02:13:05,940 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Force
closing the watch
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@210aab4b
job-debug-0118.log:2021-01-19 02:13:05,940 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
Closing websocket
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket@3db9d8d8
job-debug-0118.log:2021-01-19 02:13:05,942 DEBUG
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] -
WebSocket close received. code: 

Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-18 post by Yang Wang
You can use iperf to test the network; you will need to install it in the image in advance.

Also, you could turn on debug logging to see whether the Watch only fails after many reconnect retries that never get through.
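As a sketch of that iperf test (the pod names, server address, and port below are hypothetical, and iperf3 must already be installed in both images):

```shell
# Run an iperf3 server in a helper pod placed close to the apiserver
# ("netperf-server" is a made-up name):
kubectl exec netperf-server -- iperf3 -s -p 5201

# In a second terminal, drive traffic from the JobManager pod for 60 seconds;
# frequent retransmits (the "Retr" column) or stalls would support the
# unstable-network theory:
kubectl exec test-flink-jobmanager -- iperf3 -c netperf-server-ip -p 5201 -t 60
```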

Best,
Yang

macdoor <[hidden email]> wrote on Mon, Jan 18, 2021 at 7:08 PM:

> I went back through the earlier logs and did not find "too old resource
> version". Several consecutive logs show no other errors either: it goes
> straight to this error, restarts, and then a new log begins.
>
> The k8s cluster I am using does seem to have a somewhat unstable network.
> How should I test the network between the Pod and the APIServer so as to
> demonstrate the problem? ping? Or some other tool?
>


Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-18 post by macdoor
I went back through the earlier logs and did not find "too old resource version". Several consecutive logs show no other errors either: it goes straight to this error, restarts, and then a new log begins.

The k8s cluster I am using does seem to have a somewhat unstable network. How should I test the network between the Pod and the APIServer so as to demonstrate the problem? ping? Or some other tool?




Re: Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-18 post by Yang Wang
Search your logs for "too old resource version" errors.
Also, test the network between the Pod and the APIServer to see whether the connection drops frequently.
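That search can be scripted; a self-contained sketch that greps a saved copy of the JobManager log (the sample file and its path are made up for illustration):

```shell
# Write a one-line sample of the JobManager log to a scratch file:
cat > /tmp/jobmanager-sample.log <<'EOF'
2021-01-17 04:16:46,116 ERROR org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Fatal error occurred in ResourceManager.
EOF

# Look for the stale-watch symptom Yang Wang mentions; print a clear verdict:
if grep -q "too old resource version" /tmp/jobmanager-sample.log; then
  echo "found 'too old resource version'"
else
  echo "no 'too old resource version' in this log"
fi
```

Against a real deployment the same grep would be pointed at the full log directory instead of the scratch file.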

Best,
Yang

macdoor <[hidden email]> wrote on Mon, Jan 18, 2021 at 9:45 AM:

> It restarts roughly every few tens of minutes. Could anyone suggest an
> approach for investigating this? The error thrown is identical every time,
> and a large number of ConfigMaps also accumulate as it keeps running. The
> full error is in the original message below.
>

Periodic jobmanager restarts on 1.12.1 in K8s HA Session mode

2021-01-17 post by macdoor
It restarts roughly every few tens of minutes. Could anyone suggest an approach for investigating this? The error thrown is identical every time, and a large number of ConfigMaps also accumulate as it keeps running. A concrete example of the error is below.
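The ConfigMap buildup can be watched directly; a sketch using kubectl (the "jobmanager-leader" suffix is taken from the ConfigMap name in the error below; adjust the namespace for your setup):

```shell
# List the HA/leader-election ConfigMaps the session cluster has accumulated:
kubectl get configmaps | grep jobmanager-leader

# Count them over time; steady growth across restarts matches the symptom:
kubectl get configmaps | grep -c jobmanager-leader
```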

Error details:

2021-01-17 04:16:46,116 ERROR
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
Fatal error occurred in ResourceManager.
org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Error
while watching the ConfigMap
test-flink-etl-42557c3f6325ffc876958430859178cd-jobmanager-leader
at
org.apache.flink.kubernetes.highavailability.KubernetesLeaderRetrievalDriver$ConfigMapCallbackHandlerImpl.handleFatalError(KubernetesLeaderRetrievalDriver.java:120)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.kubeclient.resources.AbstractKubernetesWatcher.onClose(AbstractKubernetesWatcher.java:48)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.utils.WatcherToggle.onClose(WatcherToggle.java:56)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.closeEvent(WatchConnectionManager.java:367)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$700(WatchConnectionManager.java:50)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:259)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_275]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_275]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
2021-01-17 04:16:46,117 ERROR
org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Fatal
error occurred in the cluster entrypoint.
org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Error
while watching the ConfigMap
test-flink-etl-42557c3f6325ffc876958430859178cd-jobmanager-leader
at
org.apache.flink.kubernetes.highavailability.KubernetesLeaderRetrievalDriver$ConfigMapCallbackHandlerImpl.handleFatalError(KubernetesLeaderRetrievalDriver.java:120)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.kubeclient.resources.AbstractKubernetesWatcher.onClose(AbstractKubernetesWatcher.java:48)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.utils.WatcherToggle.onClose(WatcherToggle.java:56)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.closeEvent(WatchConnectionManager.java:367)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$700(WatchConnectionManager.java:50)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:259)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
[flink-dist_2.11-1.12.1.jar:1.12.1]
at
org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)