[ 
https://issues.apache.org/jira/browse/FLINK-36059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18059265#comment-18059265
 ] 

Martijn Visser commented on FLINK-36059:
----------------------------------------

This test has started failing because confluentinc/cp-kafka:7.2.2 
ships with JDK 11.0.16.1, which has a known bug (JDK-8287073) where 
CgroupV2Subsystem.getInstance() throws a NullPointerException. The fix was 
backported to JDK 11.0.17. Docker 29 changed cgroup handling in a way that 
triggers this bug, causing the JVM inside the Kafka container to crash on 
startup.

The minimal fix is to upgrade from confluentinc/cp-kafka:7.2.2 to
confluentinc/cp-kafka:7.2.9 (the latest 7.2.x patch release, which ships with
JDK 11.0.21).
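
As a rough illustration of the version reasoning above, the following sketch
checks whether a JDK 11 version string predates the 11.0.17 backport. The class
and method names are hypothetical and not part of Flink or Testcontainers, and
the check assumes only the JDK 11 line is of interest; it says nothing about
other feature releases.

```java
import java.util.List;

/**
 * Sketch only: decides whether a JDK 11 version string is older than 11.0.17,
 * the release that received the backported fix according to the comment above.
 * Class and method names are hypothetical.
 */
class CgroupFixCheck {
    // First JDK 11 update containing the backport, as stated in the comment.
    private static final Runtime.Version FIRST_FIXED_11 =
            Runtime.Version.parse("11.0.17");

    /** Returns true only for JDK 11 versions older than 11.0.17. */
    static boolean predatesCgroupFix(String versionString) {
        Runtime.Version v = Runtime.Version.parse(versionString);
        List<Integer> numbers = v.version();
        // Assumption: only the JDK 11 line is judged; anything else returns false.
        if (numbers.get(0) != 11) {
            return false;
        }
        return v.compareTo(FIRST_FIXED_11) < 0;
    }

    public static void main(String[] args) {
        // JDK shipped in cp-kafka:7.2.2 and cp-kafka:7.2.9 respectively.
        System.out.println(predatesCgroupFix("11.0.16.1"));
        System.out.println(predatesCgroupFix("11.0.21"));
    }
}
```

With the two versions named in this comment, the check distinguishes the JDK in
the 7.2.2 image (older than the fix) from the one in 7.2.9 (newer than the fix).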

The same applies to Schema Registry, which uses
confluentinc/cp-schema-registry:7.2.2 and likely has the same JDK issue.

CC [~fcsaky]

> SqlClientITCase failed due to could not create/start container
> --------------------------------------------------------------
>
>                 Key: FLINK-36059
>                 URL: https://issues.apache.org/jira/browse/FLINK-36059
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / CI
>    Affects Versions: 1.20.0
>            Reporter: Weijie Guo
>            Assignee: Martijn Visser
>            Priority: Critical
>
> {code:java}
> Aug 14 04:57:22 Caused by: org.rnorth.ducttape.RetryCountExceededException: Retry limit hit with exception
> Aug 14 04:57:22       at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:88)
> Aug 14 04:57:22       at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:344)
> Aug 14 04:57:22       ... 14 more
> Aug 14 04:57:22 Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
> Aug 14 04:57:22       at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:564)
> Aug 14 04:57:22       at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:354)
> Aug 14 04:57:22       at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
> Aug 14 04:57:22       ... 15 more
> Aug 14 04:57:22 Caused by: java.lang.IllegalStateException: Wait strategy failed. Container exited with code 126
> Aug 14 04:57:22       at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:534)
> Aug 14 04:57:22       ... 17 more
> Aug 14 04:57:22 Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.*\[KafkaServer id=\d+\] started.*'
> Aug 14 04:57:22       at org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:47)
> Aug 14 04:57:22       at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)
> Aug 14 04:57:22       at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:977)
> Aug 14 04:57:22       at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:501)
> Aug 14 04:57:22       ... 17 more
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=61408&view=logs&j=dc1bf4ed-4646-531a-f094-e103042be549&t=fb3d654d-52f8-5b98-fe9d-b18dd2e2b790&l=14752



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
