Alexander Belyak created IGNITE-22123: -----------------------------------------
Summary: Critical thread blocked on 2 node cluster with intensive create tables Key: IGNITE-22123 URL: https://issues.apache.org/jira/browse/IGNITE-22123 Project: Ignite Issue Type: Bug Affects Versions: 3.0 Reporter: Alexander Belyak # Start 2 nodes cluster (I use single host for both nodes) # Connect to the first node and do in cycle: ## create table (different 8 columns PK) ## insert a few rows in it ## select a row from table Expected result: test pass without errors Actual result: client get: {noformat} 16:45:20.253 [junit-timeout-thread-119] INFO o.g.a.t.teststeps.ThinClientSteps - Query: drop table if exists eight_different_types_TINYINT_INTEGER_FLOAT_TINYINT_TINYINT_TINYINT_TINYINT_TINYINT Apr 26, 2024 4:45:20 PM org.apache.ignite.internal.logger.IgniteLogger logInternal INFO: Partition assignment change notification received [remoteAddress=localhost:10800] 16:45:20.266 [junit-timeout-thread-119] INFO o.g.a.t.teststeps.ThinClientSteps - Query: create table eight_different_types_TINYINT_INTEGER_FLOAT_TINYINT_TINYINT_TINYINT_TINYINT_TINYINT(keyTINYINT0 TINYINT not null, keyINTEGER1 INTEGER not null, keyFLOAT2 FLOAT not null, keyTINYINT3 TINYINT not null, keyTINYINT4 TINYINT not null, keyTINYINT5 TINYINT not null, keyTINYINT6 TINYINT not null, keyTINYINT7 TINYINT not null, val INTEGER not null, primary key (keyTINYINT0, keyINTEGER1, keyFLOAT2, keyTINYINT3, keyTINYINT4, keyTINYINT5, keyTINYINT6, keyTINYINT7)) 16:45:28.570 [junit-timeout-thread-119] INFO o.g.a.t.teststeps.ThinClientSteps - Query: insert into eight_different_types_TINYINT_INTEGER_FLOAT_TINYINT_TINYINT_TINYINT_TINYINT_TINYINT(keyTINYINT0, keyINTEGER1, keyFLOAT2, keyTINYINT3, keyTINYINT4, keyTINYINT5, keyTINYINT6, keyTINYINT7, val) values (-96, 25781810, 5.0, -93, -92, -91, -90, -89, 116513)org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:e21f4eb1-4ecb-4ea2-aaf7-62a08b677f20 Failed to get the primary replica [tablePartitionId=85_part_18, awaitTimestamp=HybridTimestamp [physical=2024-04-26 16:45:28:599 +0300, logical=0, composite=112337821931864064]] at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) at org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:765) at org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:699) at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525) at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:634) at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:476) at org.apache.ignite.internal.client.sql.ClientSql.execute(ClientSql.java:94) at org.gridgain.ai3tests.tests.teststeps.ThinClientSteps.lambda$executeQuery$0(ThinClientSteps.java:61) at io.qameta.allure.Allure.lambda$step$1(Allure.java:127) at io.qameta.allure.Allure.step(Allure.java:181) at io.qameta.allure.Allure.step(Allure.java:125) at org.gridgain.ai3tests.tests.teststeps.ThinClientSteps.executeQuery(ThinClientSteps.java:61) at org.gridgain.ai3tests.tests.PrimaryKeyConstraintsTest.testTypes(PrimaryKeyConstraintsTest.java:167) at org.gridgain.ai3tests.tests.PrimaryKeyConstraintsTest.test8Columns(PrimaryKeyConstraintsTest.java:120) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at io.qameta.allure.junit5.AllureJunit5.interceptTestTemplateMethod(AllureJunit5.java:59) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: java.util.concurrent.CompletionException: org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:e21f4eb1-4ecb-4ea2-aaf7-62a08b677f20 Failed to get the primary replica [tablePartitionId=85_part_18, awaitTimestamp=HybridTimestamp [physical=2024-04-26 16:45:28:599 +0300, logical=0, composite=112337821931864064]] at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870) at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2094) at org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:419) at org.apache.ignite.internal.client.TcpClientChannel.lambda$onMessage$3(TcpClientChannel.java:238) at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426) at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) Caused by: org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:e21f4eb1-4ecb-4ea2-aaf7-62a08b677f20 Failed to get the primary replica [tablePartitionId=85_part_18, awaitTimestamp=HybridTimestamp [physical=2024-04-26 16:45:28:599 +0300, logical=0, composite=112337821931864064]] at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) at org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:765) at org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:699) at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525) at org.apache.ignite.internal.client.TcpClientChannel.readError(TcpClientChannel.java:508) at org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:397) ... 7 more{noformat} in a server logs: {noformat} 2024-04-26 16:46:42:196 +0300 [ERROR][%PrimaryKeyConstraintsTest_cluster_0%common-scheduler-0][CriticalWorkerWatchdog] A critical thread is blocked for 580 ms that is more than the allowed 500 ms, it is "PrimaryKeyConstraintsTest_cluster_0-srv-worker-15" prio=10 Id=49 RUNNABLE at java.base@18.0.1.1/sun.nio.ch.FileDispatcherImpl.write0(Native Method) at java.base@18.0.1.1/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:62) at java.base@18.0.1.1/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:137) at java.base@18.0.1.1/sun.nio.ch.IOUtil.write(IOUtil.java:81) at java.base@18.0.1.1/sun.nio.ch.IOUtil.write(IOUtil.java:58) at java.base@18.0.1.1/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:532) at app//io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:415) at app//io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:931) at app//io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:359) at app//io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:895) at app//io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1372) at app//io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:935) at app//io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:957) at app//io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:982) at app//io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:950) at app//io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:1000) at app//io.netty.handler.stream.ChunkedWriteHandler.doFlush(ChunkedWriteHandler.java:286) at app//io.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:146) at app//io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:937) at app//io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:921) at app//io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:907) at app//io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:125) at app//io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:939) at app//io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:957) at app//io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:982) at app//io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:950) at app//io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:1000) at app//io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1025) at app//io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:306) at app//org.apache.ignite.internal.network.netty.NettySender.writeWithRecovery(NettySender.java:173) at app//org.apache.ignite.internal.network.netty.NettySender.lambda$send$1(NettySender.java:113) at app//org.apache.ignite.internal.network.netty.NettySender$$Lambda$1226/0x000000080135b9d8.run(Unknown Source) at app//io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) at app//io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) at app//io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base@18.0.1.1/java.lang.Thread.run(Thread.java:833) Number of locked synchronizers = 1 - java.util.concurrent.locks.ReentrantLock$NonfairSync@3d96d0f5 {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)