刘珍 created IOTDB-4350:
-------------------------

             Summary: [MultiLeader Throttle Down] Performance does not return to normal after "Throttle Down"
                 Key: IOTDB-4350
                 URL: https://issues.apache.org/jira/browse/IOTDB-4350
             Project: Apache IoTDB
          Issue Type: Bug
          Components: mpp-cluster
    Affects Versions: 0.14.0-SNAPSHOT
            Reporter: 刘珍
            Assignee: 张洪胤

m_0905_0095eb3: 3 replicas, 3C3D, 3 data regions, 1 leader on each node.
Disconnect ip72 from the network for 2 minutes and check the cluster status. Once the leader switchover succeeds, disconnect ip73 for 2 minutes. No further fault operations are performed after that.
Synchronization is slow and MultiLeader keeps throttling writes, but write performance never recovers from the throttled level. The per-minute write volume (batches in the benchmark, bm) is shown below:

IoTDB> select count(latency) from root.result.moresession_2022_09_06_04_47_03.INGESTION where okPoint>0 group by ([1662454041076000186,1662459764764000179),1m);
+-----------------------------------+--------------------------------------------------------------------+
|                               Time|count(root.result.moresession_2022_09_06_04_47_03.INGESTION.latency)|
+-----------------------------------+--------------------------------------------------------------------+
|2022-09-06T16:47:21.076000186+08:00|                                                                5544|
|2022-09-06T16:48:21.076000186+08:00|                                                                6282|
|2022-09-06T16:49:21.076000186+08:00|                                                                5671|
|2022-09-06T16:50:21.076000186+08:00|                                                                4589|
|2022-09-06T16:51:21.076000186+08:00|                                                                5350|
|2022-09-06T16:52:21.076000186+08:00|                                                                1121|
|2022-09-06T16:53:21.076000186+08:00|                                                                 901|
|2022-09-06T16:54:21.076000186+08:00|                                                                 201|
|2022-09-06T16:55:21.076000186+08:00|                                                                 334|
|2022-09-06T16:56:21.076000186+08:00|                                                                3501|
|2022-09-06T16:57:21.076000186+08:00|                                                                3677|
|2022-09-06T16:58:21.076000186+08:00|                                                                3111|
|2022-09-06T16:59:21.076000186+08:00|                                                                1948|
|2022-09-06T17:00:21.076000186+08:00|                                                                3889|
|2022-09-06T17:01:21.076000186+08:00|                                                                2982|
|2022-09-06T17:02:21.076000186+08:00|                                                                4465|
|2022-09-06T17:03:21.076000186+08:00|                                                                4871|
|2022-09-06T17:04:21.076000186+08:00|                                                                4478|
|2022-09-06T17:05:21.076000186+08:00|                                                                3242|
|2022-09-06T17:06:21.076000186+08:00|                                                                2545|
|2022-09-06T17:07:21.076000186+08:00|                                                                2579|
|2022-09-06T17:08:21.076000186+08:00|                                                                 133|
|2022-09-06T17:09:21.076000186+08:00|                                                                 488|
|2022-09-06T17:10:21.076000186+08:00|                                                                 253|
|2022-09-06T17:11:21.076000186+08:00|                                                                 445|
|2022-09-06T17:12:21.076000186+08:00|                                                                2122|
|2022-09-06T17:13:21.076000186+08:00|                                                                1799|
|2022-09-06T17:14:21.076000186+08:00|                                                                1568|
|2022-09-06T17:15:21.076000186+08:00|                                                                 355|
|2022-09-06T17:16:21.076000186+08:00|                                                                1127|
|2022-09-06T17:17:21.076000186+08:00|                                                                 803|
|2022-09-06T17:18:21.076000186+08:00|                                                                 674|
|2022-09-06T17:19:21.076000186+08:00|                                                                 621|
|2022-09-06T17:20:21.076000186+08:00|                                                                 361|
|2022-09-06T17:21:21.076000186+08:00|                                                                 367|
|2022-09-06T17:22:21.076000186+08:00|                                                                 999|
|2022-09-06T17:23:21.076000186+08:00|                                                                1119|
|2022-09-06T17:24:21.076000186+08:00|                                                                1113|
|2022-09-06T17:25:21.076000186+08:00|                                                                1737|
|2022-09-06T17:26:21.076000186+08:00|                                                                1282|
|2022-09-06T17:27:21.076000186+08:00|                                                                4454|
|2022-09-06T17:28:21.076000186+08:00|                                                                2013|
|2022-09-06T17:29:21.076000186+08:00|                                                                 623|
|2022-09-06T17:30:21.076000186+08:00|                                                                 313|
|2022-09-06T17:31:21.076000186+08:00|                                                                 455|
|2022-09-06T17:32:21.076000186+08:00|                                                                 353|
|2022-09-06T17:33:21.076000186+08:00|                                                                 347|
|2022-09-06T17:34:21.076000186+08:00|                                                                 587|
|2022-09-06T17:35:21.076000186+08:00|                                                                1370|
|2022-09-06T17:36:21.076000186+08:00|                                                                 341|
|2022-09-06T17:37:21.076000186+08:00|                                                                1555|
|2022-09-06T17:38:21.076000186+08:00|                                                                3266|
|2022-09-06T17:39:21.076000186+08:00|                                                                1344|
|2022-09-06T17:40:21.076000186+08:00|                                                                1057|
|2022-09-06T17:41:21.076000186+08:00|                                                                 682|
|2022-09-06T17:42:21.076000186+08:00|                                                                 231|
|2022-09-06T17:43:21.076000186+08:00|                                                                 170|
|2022-09-06T17:44:21.076000186+08:00|                                                                 729|
|2022-09-06T17:45:21.076000186+08:00|                                                                 118|
|2022-09-06T17:46:21.076000186+08:00|                                                                 135|
|2022-09-06T17:47:21.076000186+08:00|                                                                 109|
|2022-09-06T17:48:21.076000186+08:00|                                                                 167|
|2022-09-06T17:49:21.076000186+08:00|                                                                 139|
|2022-09-06T17:50:21.076000186+08:00|                                                                 138|
|2022-09-06T17:51:21.076000186+08:00|                                                                 321|
|2022-09-06T17:52:21.076000186+08:00|                                                                 138|
|2022-09-06T17:53:21.076000186+08:00|                                                                 326|
|2022-09-06T17:54:21.076000186+08:00|                                                                 166|
|2022-09-06T17:55:21.076000186+08:00|                                                                  70|
|2022-09-06T17:56:21.076000186+08:00|                                                                 302|
|2022-09-06T17:57:21.076000186+08:00|                                                                 587|
|2022-09-06T17:58:21.076000186+08:00|                                                                  25|
|2022-09-06T17:59:21.076000186+08:00|                                                                 427|
|2022-09-06T18:00:21.076000186+08:00|                                                                   2|
|2022-09-06T18:01:21.076000186+08:00|                                                                  96|
|2022-09-06T18:02:21.076000186+08:00|                                                                  72|
|2022-09-06T18:03:21.076000186+08:00|                                                                  94|
|2022-09-06T18:04:21.076000186+08:00|                                                                  99|
|2022-09-06T18:05:21.076000186+08:00|                                                                  66|
|2022-09-06T18:06:21.076000186+08:00|                                                                 230|
|2022-09-06T18:07:21.076000186+08:00|                                                                  10|
|2022-09-06T18:08:21.076000186+08:00|                                                                 335|
|2022-09-06T18:09:21.076000186+08:00|                                                                  25|
|2022-09-06T18:10:21.076000186+08:00|                                                                  10|
|2022-09-06T18:11:21.076000186+08:00|                                                                  18|
|2022-09-06T18:12:21.076000186+08:00|                                                                 142|
|2022-09-06T18:13:21.076000186+08:00|                                                                 281|
|2022-09-06T18:14:21.076000186+08:00|                                                                  30|
|2022-09-06T18:15:21.076000186+08:00|                                                                  14|
|2022-09-06T18:16:21.076000186+08:00|                                                                   8|
|2022-09-06T18:17:21.076000186+08:00|                                                                   7|
|2022-09-06T18:18:21.076000186+08:00|                                                                  38|
|2022-09-06T18:19:21.076000186+08:00|                                                                  13|
|2022-09-06T18:20:21.076000186+08:00|                                                                  40|
|2022-09-06T18:21:21.076000186+08:00|                                                                  12|
|2022-09-06T18:22:21.076000186+08:00|                                                                  10|
+-----------------------------------+--------------------------------------------------------------------+
Total line number = 96

Steps to reproduce:
1. Machine configuration
192.168.10.72/73/74, 48 cores, 386 GB
The benchmark (bm) runs on ip71.

2. Database configuration
ConfigNode
MAX_HEAP_SIZE="8G"
schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.MultiLeaderConsensus
schema_replication_factor=3
data_replication_factor=3

DataNode
MAX_HEAP_SIZE="256G"
MAX_DIRECT_MEMORY_SIZE="32G"
max_connection_for_internal_service=1100
max_waiting_time_when_insert_blocked=3600000
query_timeout_threshold=3600000

3. Benchmark: see the attachment.

4. Network disconnection
Disconnect ip72:
cat restart_network.sh
#!/bin/bash
sudo ifconfig enp129s0f1 down
sleep $1
sudo ifconfig enp129s0f1 up

nohup sh -x restart_network.sh "120" > a.log &

Check the region status. Once the leader switchover succeeds, run the same disconnection on ip73.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)