刘珍 created IOTDB-4526:
-------------------------

             Summary: [ remove datanode ] ERROR o.a.i.d.s.t.i.DataNodeInternalRPCServiceImpl:806 - change region DataRegion[xx] leader failed
                 Key: IOTDB-4526
                 URL: https://issues.apache.org/jira/browse/IOTDB-4526
             Project: Apache IoTDB
          Issue Type: Bug
          Components: mpp-cluster
    Affects Versions: 0.14.0-SNAPSHOT
            Reporter: 刘珍
            Assignee: Jinrui Zhang
         Attachments: image-2022-09-27-09-54-40-156.png

m_0924_04d9a4a
schemaregion: ratis
dataregion: multiLeader
Both use 3 replicas, 3C3D. After the benchmark write completed (50,000 devices, 600 sensors per device, 10,000 points per sensor), two nodes (ip75 and ip76) were added first, and then ip72 was removed. The removal failed.

Error on the ip72 datanode:
2022-09-27 09:33:37,558 [pool-21-IoTDB-DataNodeInternalRPC-Processor-150] ERROR o.a.i.d.s.t.i.DataNodeInternalRPCServiceImpl:806 - change region DataRegion[13] leader failed
2022-09-27 09:33:37,562 [pool-21-IoTDB-DataNodeInternalRPC-Processor-150] ERROR o.a.t.ProcessFunction:47 - Internal error processing changeRegionLeader
java.lang.NullPointerException: null
    at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.transferLeader(DataNodeInternalRPCServiceImpl.java:808)
    at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.changeRegionLeader(DataNodeInternalRPCServiceImpl.java:790)
    at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$changeRegionLeader.getResult(IDataNodeRPCService.java:3212)
    at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$changeRegionLeader.getResult(IDataNodeRPCService.java:3192)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)

Error on the ip72 (leader) confignode:
2022-09-27 09:33:37,585 [ProcExecWorker-8] ERROR o.a.i.c.c.s.d.SyncDataNodeClientPool:147 - Change regions leader error on Date node: TEndPoint(ip:192.168.10.72, port:9003)
org.apache.thrift.TException: Error in calling method changeRegionLeader
    at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:94)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1.changeRegionLeader(<generated>)
    at org.apache.iotdb.confignode.client.sync.datanode.SyncDataNodeClientPool.changeRegionLeader(SyncDataNodeClientPool.java:141)
    at org.apache.iotdb.confignode.procedure.env.DataNodeRemoveHandler.changeRegionLeader(DataNodeRemoveHandler.java:540)
    at org.apache.iotdb.confignode.procedure.impl.RegionMigrateProcedure.executeFromState(RegionMigrateProcedure.java:104)
    at org.apache.iotdb.confignode.procedure.impl.RegionMigrateProcedure.executeFromState(RegionMigrateProcedure.java:46)
    at org.apache.iotdb.confignode.procedure.StateMachineProcedure.execute(StateMachineProcedure.java:185)
    at org.apache.iotdb.confignode.procedure.Procedure.doExecute(Procedure.java:365)
    at org.apache.iotdb.confignode.procedure.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:414)
    at org.apache.iotdb.confignode.procedure.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:373)
    at org.apache.iotdb.confignode.procedure.ProcedureExecutor.access$300(ProcedureExecutor.java:50)
    at org.apache.iotdb.confignode.procedure.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:741)
Caused by: org.apache.thrift.TException: Error in calling method recv_changeRegionLeader
    at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:94)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1.recv_changeRegionLeader(<generated>)
    at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Client.changeRegionLeader(IDataNodeRPCService.java:741)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1.CGLIB$changeRegionLeader$133(<generated>)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1$$FastClassByCGLIB$$bb86de5d.invoke(<generated>)
    at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
    at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
    ... 11 common frames omitted
Caused by: org.apache.thrift.TException: Error in calling method receiveBase
    at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:94)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1.receiveBase(<generated>)
    at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Client.recv_changeRegionLeader(IDataNodeRPCService.java:754)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1.CGLIB$recv_changeRegionLeader$59(<generated>)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1$$FastClassByCGLIB$$bb86de5d.invoke(<generated>)
    at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
    at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
    ... 17 common frames omitted
Caused by: org.apache.thrift.TApplicationException: Internal error processing changeRegionLeader
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1.CGLIB$receiveBase$139(<generated>)
    at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$986af3c1$$FastClassByCGLIB$$bb86de5d.invoke(<generated>)
    at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
    at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
    ... 23 common frames omitted

Test environment:
1. 192.168.10.72/73/74/75/76, 48 CPU, 384 GB
2. Run the benchmark write and wait for it to complete
3. Start the datanode service on ip75 and ip76
4. Remove ip72
5. Check the logs of the removed node ip72, the new peer ip75, and the confignode leader ip72
6. The consensus folder on the new peer ip75 is large
!image-2022-09-27-09-54-40-156.png!
7. After ip72 is set to the removing state during removal, new compactions are still executed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)