[
https://issues.apache.org/jira/browse/HIVE-29009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17966273#comment-17966273
]
László Bodor edited comment on HIVE-29009 at 6/12/25 4:31 PM:
--------------------------------------------------------------
thanks [~zratkai] for the heads-up and [~zabetak] for the jira!
I took 2 consecutive jstacks of a surefire process from a hanging PR run, for
later reference:
[^jstack.txt] [^jstack2.txt]
analysis to follow
found the same stack in another hanging pod
this is definitely the call that hangs, while creating a container:
{code}
at org.apache.hive.service.auth.saml.TestHttpSamlAuthentication.setupIDP(TestHttpSamlAuthentication.java:178)
{code}
{code}
"main" #1 prio=5 os_prio=0 cpu=10515.69ms elapsed=14800.79s tid=0x00007e17fc026fe0 nid=0x2452 waiting on condition [0x00007e180171e000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@17.0.9/Native Method)
    - parking to wait for <0x0000000088200000> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(java.base@17.0.9/LockSupport.java:211)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.9/AbstractQueuedSynchronizer.java:715)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.9/AbstractQueuedSynchronizer.java:938)
    at java.util.concurrent.locks.ReentrantLock$Sync.lock(java.base@17.0.9/ReentrantLock.java:153)
    at java.util.concurrent.locks.ReentrantLock.lock(java.base@17.0.9/ReentrantLock.java:322)
    at sun.security.ssl.SSLSocketImpl$AppInputStream.read(java.base@17.0.9/SSLSocketImpl.java:1042)
    at org.testcontainers.shaded.okio.Okio$2.read(Okio.java:140)
    at org.testcontainers.shaded.okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
    at org.testcontainers.shaded.okio.RealBufferedSource.indexOf(RealBufferedSource.java:358)
    at org.testcontainers.shaded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230)
    at org.testcontainers.shaded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:224)
    at org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.readChunkSize(Http1ExchangeCodec.java:489)
    at org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.read(Http1ExchangeCodec.java:471)
    at org.testcontainers.shaded.okhttp3.internal.Util.skipAll(Util.java:204)
    at org.testcontainers.shaded.okhttp3.internal.Util.discard(Util.java:186)
    at org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.close(Http1ExchangeCodec.java:511)
    at org.testcontainers.shaded.okio.ForwardingSource.close(ForwardingSource.java:43)
    at org.testcontainers.shaded.okhttp3.internal.connection.Exchange$ResponseBodySource.close(Exchange.java:313)
    at org.testcontainers.shaded.okio.RealBufferedSource.close(RealBufferedSource.java:476)
    at org.testcontainers.shaded.okhttp3.internal.Util.closeQuietly(Util.java:139)
    at org.testcontainers.shaded.okhttp3.ResponseBody.close(ResponseBody.java:192)
    at org.testcontainers.shaded.okhttp3.Response.close(Response.java:290)
    at org.testcontainers.shaded.com.github.dockerjava.okhttp.OkDockerHttpClient$OkResponse.close(OkDockerHttpClient.java:285)
    at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.lambda$null$0(DefaultInvocationBuilder.java:272)
    at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder$$Lambda$508/0x0000000100c15750.close(Unknown Source)
    at com.github.dockerjava.api.async.ResultCallbackTemplate.close(ResultCallbackTemplate.java:77)
    at org.testcontainers.utility.ResourceReaper.start(ResourceReaper.java:205)
    at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:205)
    - locked <0x0000000088202cf8> (a [Ljava.lang.Object;)
    at org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14)
    at org.testcontainers.LazyDockerClient.authConfig(LazyDockerClient.java:12)
    at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:310)
    at org.apache.hive.service.auth.saml.TestHttpSamlAuthentication.setupIDP(TestHttpSamlAuthentication.java:178)
    at org.apache.hive.service.auth.saml.TestHttpSamlAuthentication.testGroupNameFiltering2(TestHttpSamlAuthentication.java:490)
{code}
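for reference, the "two consecutive dumps" check can also be approximated inside a JVM. The following is only a minimal sketch, not what was run on the CI pod: the class {{StuckThreadCheck}}, its methods and the {{parked-demo}} thread are hypothetical names made up for illustration. It takes two snapshots with {{Thread.getAllStackTraces()}} and reports the threads whose stack did not change in between.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper, for illustration only.
class StuckThreadCheck {

  /** Names of threads whose non-empty stack is identical in both snapshots. */
  static List<String> unchangedThreads(Map<Thread, StackTraceElement[]> first,
                                       Map<Thread, StackTraceElement[]> second) {
    List<String> names = new ArrayList<>();
    for (Map.Entry<Thread, StackTraceElement[]> e : first.entrySet()) {
      StackTraceElement[] later = second.get(e.getKey());
      if (later != null && later.length > 0 && Arrays.equals(e.getValue(), later)) {
        names.add(e.getKey().getName());
      }
    }
    return names;
  }

  /** Starts a thread that blocks forever, then compares two snapshots. */
  static List<String> demo() {
    CountDownLatch never = new CountDownLatch(1); // never counted down
    Thread parked = new Thread(() -> {
      try {
        never.await();
      } catch (InterruptedException ignored) {
        // demo thread simply exits when interrupted
      }
    }, "parked-demo");
    parked.setDaemon(true); // do not keep the JVM alive
    parked.start();
    try {
      Thread.sleep(200); // give the thread time to block
      Map<Thread, StackTraceElement[]> first = Thread.getAllStackTraces();
      Thread.sleep(200); // interval between the two snapshots
      Map<Thread, StackTraceElement[]> second = Thread.getAllStackTraces();
      return unchangedThreads(first, second);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return List.of();
    }
  }

  public static void main(String[] args) {
    System.out.println("threads with unchanged stacks: " + demo());
  }
}
```

idle JVM housekeeping threads will usually show up in the result as well, so an unchanged stack is only a hint. In the dump above it is the {{elapsed=14800.79s}} on the main thread that makes the hang unambiguous.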
pulling the image used by the test manually on the same pod is successful:
{code}
jenkins@hive-precommit-pr-5846-8-7kcn4-kk8j8-09sbl:~$ docker pull vihangk1/docker-test-saml-idp
Using default tag: latest
latest: Pulling from vihangk1/docker-test-saml-idp
000eee12ec04: Pull complete
8ae4f9fcfeea: Pull complete
60f22fbbd07a: Pull complete
ccc7a63ad75f: Pull complete
a2427b8dd6e7: Pull complete
91cac3b30184: Pull complete
d6e40015fc10: Pull complete
54695fdb10a7: Pull complete
500ca11be45f: Pull complete
86b2805859cf: Pull complete
c61685fa4f4f: Pull complete
0bf989f9dbbb: Pull complete
01848ea209b5: Pull complete
39c7bca1ade4: Pull complete
86b21261d0f9: Pull complete
c7c15aaefb4f: Pull complete
97bb9720c19d: Pull complete
4167b21c4b10: Pull complete
72c3c76084bc: Pull complete
5cb6a996b51f: Pull complete
369cd8a9d59b: Pull complete
2a0b5c2345a5: Pull complete
105aeb2326c3: Pull complete
67b31997154f: Pull complete
6b4de3e17eff: Pull complete
1758f12ba411: Pull complete
Digest: sha256:25309538648518dce92b423944a02b60254baf313880a192063cedfd23a7a668
Status: Downloaded newer image for vihangk1/docker-test-saml-idp:latest
docker.io/vihangk1/docker-test-saml-idp:latest
jenkins@hive-precommit-pr-5846-8-7kcn4-kk8j8-09sbl:~$
{code}
Java version for reference:
{code}
jenkins@hive-precommit-pr-5846-8-7kcn4-kk8j8-09sbl:~$ java --version
openjdk 17.0.9 2023-10-17 LTS
OpenJDK Runtime Environment Zulu17.46+19-CA (build 17.0.9+8-LTS)
OpenJDK 64-Bit Server VM Zulu17.46+19-CA (build 17.0.9+8-LTS, mixed mode, sharing)
{code}
> Intermittent CI timeouts while running tests
> --------------------------------------------
>
> Key: HIVE-29009
> URL: https://issues.apache.org/jira/browse/HIVE-29009
> Project: Hive
> Issue Type: Bug
> Components: Build Infrastructure, Testing Infrastructure
> Reporter: Stamatis Zampetakis
> Priority: Major
> Attachments: jstack.txt, jstack2.txt
>
>
> Recently various CI runs in master and PRs are timing out while executing
> tests. The problem is intermittent but rather frequent. The first and last
> (at the time of logging this ticket) timeout failure in master are outlined
> below:
> First: https://ci.hive.apache.org/job/hive-precommit/job/master/2532/
> Last: https://ci.hive.apache.org/job/hive-precommit/job/master/2546/
> Unfortunately, due to HIVE-29008 the CI logs do not contain enough information
> to easily determine which test is hanging and whether it is the same every time.