[ https://issues.apache.org/jira/browse/FLINK-29535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614104#comment-17614104 ]
Gyula Fora commented on FLINK-29535:
------------------------------------

I think this might be already fixed in 1.2.0: https://issues.apache.org/jira/browse/FLINK-28272

> Flink Operator Certificate renew issue
> --------------------------------------
>
> Key: FLINK-29535
> URL: https://issues.apache.org/jira/browse/FLINK-29535
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Reporter: Sebastian Struß
> Priority: Major
>
> It seems that there is an issue with the Kubernetes Operator (at least in
> version 1.1.0) when it comes to certificates for the webhook.
> We've seen this error message pop up in the logs:
>
> An exceptionCaught() event was fired, and it reached at the tail of the
> pipeline. It usually means the last handler in the pipeline did not handle
> the exception.
>
> and
>
> javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate
>     at sun.security.ssl.Alert.createSSLException(Unknown Source) ~[?:?]
>     at sun.security.ssl.Alert.createSSLException(Unknown Source) ~[?:?]
>     at sun.security.ssl.TransportContext.fatal(Unknown Source) ~[?:?]
>     at sun.security.ssl.Alert$AlertConsumer.consume(Unknown Source) ~[?:?]
>     at sun.security.ssl.TransportContext.dispatch(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLTransport.decode(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.decode(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) ~[?:?]
>     at javax.net.ssl.SSLEngine.unwrap(Unknown Source) ~[?:?]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:296) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1342) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>
> It happens when our fluxcd is trying to update the FlinkDeployment resource.
> This seems to trigger a webhook to an endpoint (in the operator) which is
> serving a (then) invalid certificate.
> We've noticed this after 18 days of it running, so maybe something shortlived
> was not renewed correctly?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)