[ https://issues.apache.org/jira/browse/IOTDB-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633427#comment-17633427 ]
Jinrui Zhang commented on IOTDB-4873:
-------------------------------------

It seems that some requests are fetched from both the queue and the WAL when preparing a batch, which leads to the `merge` operation on the receiver side. See the snapshot below:

!image-2022-11-14-09-19-47-544.png|width=1117,height=351!

> Multi-user concurrent write and query + [ select into ] : ERROR o.a.i.c.m.t.MultiLeaderConsensusIService$AsyncProcessor$syncLog$1:903 - Exception inside handler
> ----------------------------------------------------------------
>
>                 Key: IOTDB-4873
>                 URL: https://issues.apache.org/jira/browse/IOTDB-4873
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: mpp-cluster
>   Affects Versions: 0.14.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Haiming Zhu
>            Priority: Major
>         Attachments: 4873.conf, image-2022-11-14-09-17-48-992.png, image-2022-11-14-09-18-10-120.png, image-2022-11-14-09-19-47-544.png, screenshot-1.png, select_into.sh
>
>
> master_1107_523e82a
> 1. Start a 3-replica cluster with 3 ConfigNodes and 3 DataNodes (3C 3D).
> 2. Start the benchmark with concurrent writes and queries.
> 3. After 16 hours, execute "select into" on ip62.
> About 1000 SQL statements, executed by a single user:
> "select s_0,s_1,s_2,s_3,s_4,s_5,s_6,s_7,s_8,s_9,s_10 into root.test.g_1.::(::) from root.test.g_1.d_ip62_660"
> !screenshot-1.png!
> ip62 datanode displays the following error log :
> 2022-11-08 09:27:31,366 [pool-20-IoTDB-MultiLeaderConsensusRPC-Processor-72] ERROR o.a.i.c.m.t.MultiLeaderConsensusIService$AsyncProcessor$syncLog$1:903 - Exception inside handler
> java.lang.NullPointerException: null
>     at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.mergeInsertNodes(DataRegionStateMachine.java:376)
>     at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.grabInsertNode(DataRegionStateMachine.java:295)
>     at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.deserializeAndWrap(DataRegionStateMachine.java:272)
>     at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.write(DataRegionStateMachine.java:325)
>     at org.apache.iotdb.consensus.multileader.service.MultiLeaderRPCServiceProcessor.syncLog(MultiLeaderRPCServiceProcessor.java:132)
>     at org.apache.iotdb.consensus.multileader.thrift.MultiLeaderConsensusIService$AsyncProcessor$syncLog.start(MultiLeaderConsensusIService.java:922)
>     at org.apache.iotdb.consensus.multileader.thrift.MultiLeaderConsensusIService$AsyncProcessor$syncLog.start(MultiLeaderConsensusIService.java:865)
>     at org.apache.thrift.TBaseAsyncProcessor.process(TBaseAsyncProcessor.java:103)
>     at org.apache.thrift.server.AbstractNonblockingServer$AsyncFrameBuffer.invoke(AbstractNonblockingServer.java:603)
>     at org.apache.thrift.server.Invocation.run(Invocation.java:18)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2022-11-08 09:27:50,962 [Query-Worker-Thread-48$20221108_012730_15774_3.1.0] ERROR o.a.i.d.m.e.o.p.AbstractIntoOperator:123 - Error occurred while inserting tablets in SELECT INTO: can't connect to node {}TEndPoint(ip:192.168.10.68, port:9003)
> 2022-11-08 09:27:50,962 [Query-Worker-Thread-48$20221108_012730_15774_3.1.0] ERROR o.a.i.d.m.e.s.AbstractDriverThread:80 - [ExecuteFailed]
> org.apache.iotdb.db.exception.IntoProcessException: Error occurred while inserting tablets in SELECT INTO: can't connect to node {}TEndPoint(ip:192.168.10.68, port:9003)
>     at org.apache.iotdb.db.mpp.execution.operator.process.AbstractIntoOperator.insertMultiTabletsInternally(AbstractIntoOperator.java:124)
>     at org.apache.iotdb.db.mpp.execution.operator.process.IntoOperator.next(IntoOperator.java:73)
>     at org.apache.iotdb.db.mpp.execution.driver.Driver.processInternal(Driver.java:186)
>     at org.apache.iotdb.db.mpp.execution.driver.Driver.lambda$processFor$1(Driver.java:125)
>     at org.apache.iotdb.db.mpp.execution.driver.Driver.tryWithLock(Driver.java:270)
>     at org.apache.iotdb.db.mpp.execution.driver.Driver.processFor(Driver.java:118)
>     at org.apache.iotdb.db.mpp.execution.schedule.DriverTaskThread.execute(DriverTaskThread.java:64)
>     at org.apache.iotdb.db.mpp.execution.schedule.AbstractDriverThread.run(AbstractDriverThread.java:74)
> 2022-11-08 09:27:50,966 [Query-Worker-Thread-48$20221108_012730_15774_3.1.0] WARN o.a.i.d.m.e.s.DriverScheduler$Scheduler:387 - The task 20221108_012730_15774_3.1.0 is aborted. All other tasks in the same query will be cancelled
>
> TEST ENV:
> 1. 192.168.10.62 66 64 72CPU 256GB
> ConfigNode :
> MAX_HEAP_SIZE="12G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> Common :
> config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> schema_replication_factor=3
> schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> data_replication_factor=3
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.MultiLeaderConsensus
> query_timeout_threshold=36000000
> multi_leader_throttle_threshold_in_byte=536870912000
> DataNode :
> MAX_HEAP_SIZE="192G"
> MAX_DIRECT_MEMORY_SIZE="32G"
> 2. benchmark configuration
> 192.168.10.64 : /data/liuzhen_test/weektest/benchmark_tool
> DEVICE_NUMBER=1000
> SENSOR_NUMBER=3000
> CLIENT_NUMBER=100
> DEVICE_NAME_PREFIX=d_ip62_
> SG_STRATEGY=mod
> GROUP_NUMBER=1
> OPERATION_PROPORTION=70:1:1:1:1:0:1:1:1:1:1
> 3. select into is executed after the Benchmark runs for 16 hours (It's still running)
> The file is attached.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)