[ 
https://issues.apache.org/jira/browse/IOTDB-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

刘珍 reopened IOTDB-5061:
-----------------------

rc/1.0.1 2023-01-07_09dd173
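This has to wait for a Ratis release. With the current IoTDB version, scale-in (removing a DataNode) fails with the following error: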


> Failed to rename mtree.snapshot.tmp to mtree.snapshot while creating mtree 
> snapshot
> -----------------------------------------------------------------------------------
>
>                 Key: IOTDB-5061
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5061
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: mpp-cluster
>    Affects Versions: 0.14.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Song Ziyang
>            Priority: Blocker
>              Labels: pull-request-available
>         Attachments: image-2022-12-17-08-39-00-520.png, iotdb_4593.conf, 
> screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> m_1127_ffbdaf3
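> 1. Start a 3-replica 3C5D cluster (3 ConfigNodes, 5 DataNodes).
> 2. Write data with BM; after 1 hour, remove the IP72 datanode (scale-in).
> 3. Scale-in starts; 1 hour 40 minutes in, IP72 floods the log with ERRORs (308 ERROR log files, NPE)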
> 2022-11-27 15:42:25,876 [3@group-000200000006-StateMachineUpdater] ERROR 
> o.a.i.d.m.m.s.MemMTreeSnapshotUtil:89 - {color:#DE350B}Failed to rename 
> mtree.snapshot.tmp to mtree.snapshot while creating mtree snapshot.{color}
> 2022-11-27 15:42:26,157 [3@group-000200000006-StateMachineUpdater] ERROR 
> o.a.r.s.i.StateMachineUpdater:194 - 3@group-000200000006-StateMachineUpdater 
> caught a Throwable.
> {color:#DE350B}java.lang.NullPointerException: null{color}
>         at 
> org.apache.iotdb.db.metadata.tag.TagManager.createSnapshot(TagManager.java:79)
>         at 
> org.apache.iotdb.db.metadata.schemaregion.SchemaRegionMemoryImpl.createSnapshot(SchemaRegionMemoryImpl.java:456)
>         at 
> org.apache.iotdb.db.consensus.statemachine.SchemaRegionStateMachine.takeSnapshot(SchemaRegionStateMachine.java:62)
>         at 
> org.apache.iotdb.consensus.IStateMachine.takeSnapshot(IStateMachine.java:82)
>         at 
> org.apache.iotdb.consensus.ratis.ApplicationStateMachineProxy.takeSnapshot(ApplicationStateMachineProxy.java:212)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.takeSnapshot(StateMachineUpdater.java:270)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.checkAndTakeSnapshot(StateMachineUpdater.java:262)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:186)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> 2022-11-27 15:42:26,158 [3@group-000200000006-StateMachineUpdater] ERROR 
> o.a.r.s.i.StateMachineUpdater:194 - 3@group-000200000006-StateMachineUpdater 
> caught a Throwable.
> java.lang.NullPointerException: null
>         at 
> org.apache.iotdb.db.metadata.tag.TagManager.createSnapshot(TagManager.java:79)
>         at 
> org.apache.iotdb.db.metadata.schemaregion.SchemaRegionMemoryImpl.createSnapshot(SchemaRegionMemoryImpl.java:456)
>         at 
> org.apache.iotdb.db.consensus.statemachine.SchemaRegionStateMachine.takeSnapshot(SchemaRegionStateMachine.java:62)
>         at 
> org.apache.iotdb.consensus.IStateMachine.takeSnapshot(IStateMachine.java:82)
>         at 
> org.apache.iotdb.consensus.ratis.ApplicationStateMachineProxy.takeSnapshot(ApplicationStateMachineProxy.java:212)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.takeSnapshot(StateMachineUpdater.java:270)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.checkAndTakeSnapshot(StateMachineUpdater.java:262)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:183)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> Test environment
> 1. 192.168.10.72~76
> ConfigNode
> MAX_HEAP_SIZE="8G"
> cn_connection_timeout_ms=120000
> Common
> schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.iot.IoTConsensus
> schema_replication_factor=3
> data_replication_factor=3
> connection_timeout_ms=120000
> max_connection_for_internal_service=200
> max_waiting_time_when_insert_blocked=600000
> query_timeout_threshold=36000000
> DataNode
> MAX_HEAP_SIZE="256G"
> MAX_DIRECT_MEMORY_SIZE="32G"
> 2. For the BM configuration, see the attachment.
> 3. Script under ${iotdb_dir} on ip72:
> sleep 1h
> ./sbin/start-cli.sh -h 192.168.10.72 -e "show cluster" > bef_remove.out
> ./sbin/start-cli.sh -h 192.168.10.72 -e "show regions" >> bef_remove.out
> ./sbin/start-cli.sh -h 192.168.10.72 -e "show storage group" >> bef_remove.out
> ./sbin/remove-datanode.sh "192.168.10.72:6667" >> remove_ip72.out
> 4. Check the scale-in result and the logs on each node.
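
The log line "Failed to rename mtree.snapshot.tmp to mtree.snapshot" points at a write-to-temporary-file-then-rename step during snapshot creation. The following is only a minimal sketch of that general pattern, not IoTDB's actual code: the class `SnapshotCommit` and the method `writeSnapshot` are hypothetical names, and the fallback behavior is an assumption. It shows one way to surface a failed rename as an exception so that later snapshot steps do not continue on a half-written state.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; this is not the IoTDB implementation.
class SnapshotCommit {

  // Writes content to "<name>.tmp" in dir, then renames it to "<name>".
  // Any failure is reported as an IOException instead of a boolean result.
  static Path writeSnapshot(Path dir, String name, byte[] content) throws IOException {
    Path tmp = dir.resolve(name + ".tmp");
    Path target = dir.resolve(name);
    Files.write(tmp, content);
    try {
      // Ask for an atomic rename; whether an existing target is replaced
      // is implementation specific for ATOMIC_MOVE.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } catch (AtomicMoveNotSupportedException e) {
      // Fall back to a non-atomic move that replaces an existing target.
      Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    } catch (IOException e) {
      // Clean up the temporary file, then let the caller see the failure.
      Files.deleteIfExists(tmp);
      throw e;
    }
    return target;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("snapshot-demo");
    Path result =
        writeSnapshot(dir, "mtree.snapshot", "demo".getBytes(StandardCharsets.UTF_8));
    System.out.println("snapshot written to " + result);
  }
}
```

Throwing on a failed rename, rather than logging and continuing, lets the caller abort the remaining snapshot steps in one place.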



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
