[ 
https://issues.apache.org/jira/browse/IOTDB-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641047#comment-17641047
 ] 

刘珍 commented on IOTDB-4830:
---------------------------

rel/1.0  1130_40de3ad
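Private cloud, 3 replicas, 3C5D (3 ConfigNodes, 5 DataNodes)
1. Start a 3-replica 3C5D cluster.
2. Stop the datanode on ip3.
3. Write data with BM (benchmark); the write completes.
4. Scale in (remove) the datanode on ip3; the removal succeeds.
ConfigNode Leader log: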
2022-11-30 10:52:40,849 [ForkJoinPool.commonPool-worker-5] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 10:54:41,161 [ForkJoinPool.commonPool-worker-1] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 10:56:41,474 [ForkJoinPool.commonPool-worker-1] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 10:58:41,789 [ForkJoinPool.commonPool-worker-0] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 11:00:42,105 [ForkJoinPool.commonPool-worker-6] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 11:02:42,401 [ForkJoinPool.commonPool-worker-0] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 11:04:42,686 [0@group-000000000000-StateMachineUpdater] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 11:06:42,972 [ForkJoinPool.commonPool-worker-5] ERROR 
o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file 
[/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin]
 is already exist.
2022-11-30 11:11:48,561 [ProcExecWorker-2] ERROR 
o.a.i.c.c.s.SyncDataNodeClientPool:97 - {color:#DE350B}SET_SYSTEM_STATUS failed 
on DataNode TEndPoint(ip:172.20.70.3, port:9003)
java.io.IOException: Borrow client from pool for node TEndPoint(ip:172.20.70.3, 
port:9003) failed, you need to increase 
dn_max_connection_for_internal_service.{color}
        at 
org.apache.iotdb.commons.client.ClientManager.borrowClient(ClientManager.java:64)
        at 
org.apache.iotdb.confignode.client.sync.SyncDataNodeClientPool.sendSyncRequestToDataNodeWithGivenRetry(SyncDataNodeClientPool.java:87)
        at 
org.apache.iotdb.confignode.procedure.env.ConfigNodeProcedureEnv.markDataNodeAsRemovingAndBroadcast(ConfigNodeProcedureEnv.java:373)
        at 
org.apache.iotdb.confignode.procedure.impl.node.RemoveDataNodeProcedure.executeFromState(RemoveDataNodeProcedure.java:86)
        at 
org.apache.iotdb.confignode.procedure.impl.node.RemoveDataNodeProcedure.executeFromState(RemoveDataNodeProcedure.java:47)
        at 
org.apache.iotdb.confignode.procedure.impl.statemachine.StateMachineProcedure.execute(StateMachineProcedure.java:186)
        at 
org.apache.iotdb.confignode.procedure.Procedure.doExecute(Procedure.java:365)
        at 
org.apache.iotdb.confignode.procedure.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:414)
        at 
org.apache.iotdb.confignode.procedure.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:373)
        at 
org.apache.iotdb.confignode.procedure.ProcedureExecutor.access$300(ProcedureExecutor.java:50)
        at 
org.apache.iotdb.confignode.procedure.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:741)
Caused by: net.sf.cglib.core.CodeGenerationException: 
org.apache.thrift.transport.TTransportException-->java.net.ConnectException: 
Connection refused (Connection refused)
        at net.sf.cglib.core.ReflectUtils.newInstance(ReflectUtils.java:235)
        at net.sf.cglib.core.ReflectUtils.newInstance(ReflectUtils.java:220)
        at net.sf.cglib.proxy.Enhancer.createUsingReflection(Enhancer.java:639)
        at net.sf.cglib.proxy.Enhancer.firstInstance(Enhancer.java:538)
        at 
net.sf.cglib.core.AbstractClassGenerator.create(AbstractClassGenerator.java:225)
        at net.sf.cglib.proxy.Enhancer.createHelper(Enhancer.java:377)
        at net.sf.cglib.proxy.Enhancer.create(Enhancer.java:304)
        at 
org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.newErrorHandler(SyncThriftClientWithErrorHandler.java:48)
        at 
org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$Factory.makeObject(SyncDataNodeInternalServiceClient.java:127)
        at 
org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$Factory.makeObject(SyncDataNodeInternalServiceClient.java:105)
        at 
org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:780)
        at 
org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:439)
        at 
org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
        at 
org.apache.iotdb.commons.client.ClientManager.borrowClient(ClientManager.java:50)
        ... 10 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: 
java.net.ConnectException: Connection refused (Connection refused)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:243)
        at 
org.apache.iotdb.rpc.TElasticFramedTransport.open(TElasticFramedTransport.java:91)
        at 
org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient.<init>(SyncDataNodeInternalServiceClient.java:63)
        at 
org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$b73d1a05.<init>(<generated>)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at net.sf.cglib.core.ReflectUtils.newInstance(ReflectUtils.java:228)
        ... 23 common frames omitted
Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:238)
        ... 31 common frames omitted
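
Note on reading the trace above: the top-level message suggests increasing dn_max_connection_for_internal_service, but the innermost cause is java.net.ConnectException: Connection refused, which is consistent with the target DataNode having been stopped in step 2 rather than with an exhausted client pool. A minimal sketch of telling the two cases apart by walking the cause chain is below. The class and method names (RootCauseSketch, rootCause, isTargetUnreachable) are hypothetical and are not part of IoTDB, and the RuntimeException is only a stand-in for the cglib and thrift wrappers.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical helper, not IoTDB code: classifies a failure by its innermost cause.
class RootCauseSketch {

    /** Walks the cause chain and returns the innermost Throwable. */
    public static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        // Stop at the end of the chain; also guard against a self-referential cause.
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    /** True when the innermost cause is a refused or failed TCP connect. */
    public static boolean isTargetUnreachable(Throwable t) {
        return rootCause(t) instanceof ConnectException;
    }

    public static void main(String[] args) {
        // Rebuild a chain shaped like the trace above, in simplified form.
        Throwable refused = new ConnectException("Connection refused (Connection refused)");
        Throwable wrapped = new RuntimeException("transport wrapper stand-in", refused);
        Throwable top = new IOException("Borrow client from pool failed", wrapped);
        System.out.println("target unreachable: " + isTargetUnreachable(top));
    }
}
```

An exception with no ConnectException at the bottom of its chain would return false here, which is the case where raising the connection limit might actually be relevant.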


> [SchemaRegion migrated failed] remove datanode that has stopped ,confignode 
> executes “DELETE_OLD_REGION_PEER” on this datanode
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IOTDB-4830
>                 URL: https://issues.apache.org/jira/browse/IOTDB-4830
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: mpp-cluster
>    Affects Versions: 0.14.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: 陈哲涵
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0-SNAPSHOT
>
>         Attachments: image-2022-11-02-14-55-28-013.png, 
> image-2022-11-15-14-35-54-026.png, image-2022-11-15-14-37-38-147.png, 
> image-2022-11-15-15-10-58-501.png, image-2022-11-15-15-12-05-884.png, 
> iotdb_4830.conf, screenshot-1.png
>
>
> m_1102_09e2566
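> 1. Start a 3-replica 3C5D cluster.
> 2. Stop the datanode on ip76 normally with the stop-datanode.sh script.
> 3. Benchmark data write completes.
> 4. Scale in (remove) the stopped datanode on ip76.
> The confignode keeps retrying the connection to ip76 and also retries 
> DELETE_OLD_REGION_PEER. DELETE_OLD_REGION_PEER does not need to be executed, 
> because it is not a retry that began after the removal started: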
> 2022-11-02 14:34:23,637 [ProcExecWorker-9] ERROR 
> o.a.i.c.c.s.SyncDataNodeClientPool:113 - 
> {color:#DE350B}*DELETE_OLD_REGION_PEER*{color} failed on DataNode 
> TEndPoint(ip:192.168.10.76, port:9003) 
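> 5. Start the ip76 datanode. The remove operation can be seen starting to 
> execute on ip76, but at this point the node's status is Running; it should 
> be Removing.
> ip76 datanode log (the remove is already executing):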
> 2022-11-02 14:38:45,611 [pool-53-IoTDB-Region-Migrate-Pool-1] INFO  
> o.a.i.d.s.RegionMigrateService$DeleteOldRegionPeerTask:493 - succeed to 
> remove region DataRegion[12] consensus group
> Cluster node status at this point:
>  !image-2022-11-02-14-55-28-013.png! 
> TEST ENV
> 192.168.10.72~76



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
