刘珍 created IOTDB-5467:
-------------------------

             Summary: Execute query : ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - Fetch Schema failed
                 Key: IOTDB-5467
                 URL: https://issues.apache.org/jira/browse/IOTDB-5467
             Project: Apache IoTDB
          Issue Type: Bug
          Components: mpp-cluster
    Affects Versions: 1.0.1
            Reporter: 刘珍
            Assignee: Minghui Liu
         Attachments: image-2023-02-03-14-39-21-109.png, image-2023-02-03-14-39-33-195.png

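Test version: rc/1.0.1  20230129 573097a
Problem description:
Start a 3-replica cluster with 3 ConfigNodes and 21 DataNodes (3C21D).
At 2023-01-31 16:40:00, start one Benchmark instance connected to ip2 (operation interval OP_INTERVAL=1000) to run reads and writes,
and set the TTL to 1 hour (script in the attachment).
At 2023-02-02 02:21:08 the ConfigNode Leader (ip23) reported an error: it could not connect to the DataNode on ip7 (status: Unknown).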
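To put the timeline in perspective, the sketch below computes how long the Benchmark had been running when each error appeared. The timestamps are copied from this report; the event labels and the helper name `elapsed_since_start` are only illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from this report; labels are illustrative.
benchmark_start = datetime.strptime("2023-01-31 16:40:00", FMT)
events = {
    "ConfigNode Leader cannot reach ip7": "2023-02-02 02:21:08",
    "Fetch Schema failed on ip2": "2023-02-02 02:51:04",
    "Not enough memory on ip2": "2023-02-02 04:00:52",
}


def elapsed_since_start(timestamp: str) -> str:
    """Return the time elapsed since the Benchmark start as H:MM:SS."""
    delta = datetime.strptime(timestamp, FMT) - benchmark_start
    seconds = int(delta.total_seconds())
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


for label, ts in events.items():
    print(f"{label}: {elapsed_since_start(ts)} after start")
```

All three errors occur more than 33 hours into the run, so with a 1-hour TTL the cluster had already gone through many TTL periods before the first failure.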
IP2 datanode log:
2023-02-02 02:51:04,625 [pool-26-IoTDB-DataNodeInternalRPC-Processor-9]{color:red}* ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - Fetch Schema failed.
java.lang.RuntimeException: Fetch Schema failed.*{color}
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:202)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:156)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:98)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchemaWithAutoCreate(ClusterSchemaFetcher.java:265)
        at org.apache.iotdb.db.mpp.plan.analyze.SchemaValidator.validate(SchemaValidator.java:56)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.executeDataInsert(RegionWriteExecutor.java:202)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:174)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:128)
        at org.apache.iotdb.db.mpp.plan.planner.plan.node.write.InsertTabletNode.accept(InsertTabletNode.java:1086)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor.execute(RegionWriteExecutor.java:86)
        at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.sendPlanNode(DataNodeInternalRPCServiceImpl.java:288)
        at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3607)
        at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3587)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.iotdb.commons.exception.IoTDBException: org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceFailureInfo$FailureException
        at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.dealWithException(QueryExecution.java:428)
        at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.getResult(QueryExecution.java:411)
        at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.getBatchResult(QueryExecution.java:437)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:200)
        ... 18 common frames omitted
Caused by: org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceFailureInfo$FailureException: null
        at org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceManager.lambda$cancelTimeoutFlushingInstances$8(FragmentInstanceManager.java:288)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
        at java.base/java.util.concurrent.ConcurrentHashMap$EntrySpliterator.forEachRemaining(ConcurrentHashMap.java:3645)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
        at org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceManager.cancelTimeoutFlushingInstances(FragmentInstanceManager.java:288)
        at org.apache.iotdb.commons.concurrent.threadpool.ScheduledExecutorUtil.lambda$scheduleWithFixedDelay$1(ScheduledExecutorUtil.java:177)
        at org.apache.iotdb.commons.concurrent.WrappedRunnable$1.runMayThrow(WrappedRunnable.java:44)
        at org.apache.iotdb.commons.concurrent.WrappedRunnable.run(WrappedRunnable.java:29)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)

2023-02-02 04:00:52,056 [pool-26-IoTDB-DataNodeInternalRPC-Processor-16] ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - cannot fetch schema, status is: 301, msg is: {color:red}*There is not enough memory to execute current fragment instance, current remaining free memory is 10409, estimated memory usage for current fragment instance is 131072*{color}
java.lang.RuntimeException: cannot fetch schema, status is: 301, msg is: There is not enough memory to execute current fragment instance, current remaining free memory is 10409, estimated memory usage for current fragment instance is 131072
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:188)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:156)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:98)
        at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchemaWithAutoCreate(ClusterSchemaFetcher.java:265)
        at org.apache.iotdb.db.mpp.plan.analyze.SchemaValidator.validate(SchemaValidator.java:56)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.executeDataInsert(RegionWriteExecutor.java:202)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:174)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:128)
        at org.apache.iotdb.db.mpp.plan.planner.plan.node.write.InsertTabletNode.accept(InsertTabletNode.java:1086)
        at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor.execute(RegionWriteExecutor.java:86)
        at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.sendPlanNode(DataNodeInternalRPCServiceImpl.java:288)
        at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3607)
        at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3587)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
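
For triage, the two numbers in the "not enough memory" message can be pulled out and compared directly. The following is a minimal sketch that works on the message text as logged above; the function name `parse_memory_rejection` is hypothetical, and the log does not state the unit of the values (presumably bytes).

```python
import re
from typing import Tuple

# Message text copied from the DataNode log in this report.
message = (
    "There is not enough memory to execute current fragment instance, "
    "current remaining free memory is 10409, estimated memory usage for "
    "current fragment instance is 131072"
)

PATTERN = re.compile(
    r"remaining free memory is (\d+), "
    r"estimated memory usage for current fragment instance is (\d+)"
)


def parse_memory_rejection(text: str) -> Tuple[int, int]:
    """Extract (free, estimated) from a 'not enough memory' message."""
    # Collapse whitespace so hard-wrapped log lines still match.
    normalized = " ".join(text.split())
    match = PATTERN.search(normalized)
    if match is None:
        raise ValueError("message does not match the expected format")
    return int(match.group(1)), int(match.group(2))


free, estimated = parse_memory_rejection(message)
shortfall = estimated - free
print(f"free={free} estimated={estimated} shortfall={shortfall}")
```

The estimate of 131072 equals 128 * 1024, while only 10409 was reported as free, so the pool used for fragment instances was almost fully exhausted when the schema fetch was rejected.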

Test details:
1. Start the 3C21D cluster
3C: 172.16.2.23/24/25  /data/iotdb/r_0129_573097a/logs
21D: 172.16.2.2 ~ 172.16.2.22     /data1/iotdb/r_0129_573097a
Configuration parameters:
ConfigNode configuration:
MAX_HEAP_SIZE="20G"
MAX_DIRECT_MEMORY_SIZE="6G"
cn_target_config_node_list=172.16.2.23:10710

DataNode configuration:
MAX_HEAP_SIZE="20G"
MAX_DIRECT_MEMORY_SIZE="6G"
dn_target_config_node_list=172.16.2.23:10710,172.16.2.24:10710,172.16.2.25:10710

Common configuration:
schema_replication_factor=3
data_replication_factor=3

2. Start the Benchmark. The full configuration is in the attachment; the main parameters are:
DEVICE_NUMBER=4200
SENSOR_NUMBER=600
CLIENT_NUMBER=210
GROUP_NUMBER=1
OPERATION_PROPORTION=91:1:1:1:1:0:1:1:1:1:1
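
The scale of this workload follows from simple arithmetic on the parameters above. The sketch below assumes that every device carries SENSOR_NUMBER sensors, that devices are spread evenly over the clients, and that the first weight in OPERATION_PROPORTION is the write (ingestion) operation with the remaining weights being query types; these are assumptions about the Benchmark, not something stated in this report.

```python
# Values copied from the Benchmark parameters in this report.
DEVICE_NUMBER = 4200
SENSOR_NUMBER = 600
CLIENT_NUMBER = 210
OPERATION_PROPORTION = "91:1:1:1:1:0:1:1:1:1:1"

# Assumption: each device has SENSOR_NUMBER sensors.
total_series = DEVICE_NUMBER * SENSOR_NUMBER

# Assumption: devices are distributed evenly across clients.
devices_per_client = DEVICE_NUMBER / CLIENT_NUMBER

# Assumption: the first weight is the write operation.
weights = [int(w) for w in OPERATION_PROPORTION.split(":")]
write_share = weights[0] / sum(weights)

print(f"time series: {total_series}")
print(f"devices per client: {devices_per_client:.0f}")
print(f"write share: {write_share:.0%}")
```

Under these assumptions the test covers 2,520,000 time series, with 20 devices per client and 91% of operations being writes, so schema validation on the insert path (where both errors above were raised) is exercised almost constantly.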

3. After the Benchmark starts writing data, run the script that sets the TTL (see attachment).

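4. Check the logs
At 2023-02-02 02:21:08 the ConfigNode Leader (ip23) reported an error: it could not connect to the DataNode on ip7 (status: Unknown), and ip7 could not be pinged.
For the DataNode error logs, see the problem description.
Cluster status:
 !image-2023-02-03-14-39-21-109.png! 
Region status:
 !image-2023-02-03-14-39-33-195.png! 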



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
