[jira] [Assigned] (IOTDB-4986) Too many IoTDB-DataNodeInternalRPC-Processor threads are open
[ https://issues.apache.org/jira/browse/IOTDB-4986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haiming Zhu reassigned IOTDB-4986: -- Sprint: Catalyst-Backlog (was: 2023-1-Storage) Assignee: Xinyu Tan (was: Haiming Zhu) > Too many IoTDB-DataNodeInternalRPC-Processor threads are open > - > > Key: IOTDB-4986 > URL: https://issues.apache.org/jira/browse/IOTDB-4986 > Project: Apache IoTDB > Issue Type: Improvement > Components: mpp-cluster >Affects Versions: 0.14.0-SNAPSHOT >Reporter: 刘珍 >Assignee: Xinyu Tan >Priority: Critical > > m_1118_3d5eeae > 1. Start a 3-replica 3C21D cluster > 2. Start 7 Benchmark instances in sequence > 3. On one node, the datanode opens a very large number of IoTDB-DataNodeInternalRPC-Processor threads, 2k+ > (the count slowly drops again), but OOM occurs occasionally > 2022-11-18 14:26:48,320 > [pool-22-IoTDB-DataNodeInternalRPC-Processor-374$20221118_062422_29227_16.1.0] > ERROR o.a.i.d.m.p.s.FragmentInstanceDispatcherImpl:234 - write locally > failed. TSStatus: TSStatus(code:506, subStatus:[]), message: null > 2022-11-18 14:29:44,568 [DataNodeInternalRPC-Service]{color:red}* ERROR > o.a.i.c.c.IoTDBDefaultThreadExceptionHandler:31 - Exception in thread > DataNodeInternalRPC-Service-40 > java.lang.OutOfMemoryError: unable to create native thread: possibly out of > memory or process/resource limits reached*{color} > at java.base/java.lang.Thread.start0(Native Method) > at java.base/java.lang.Thread.start(Thread.java:803) > at > java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937) > at > java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1354) > at > org.apache.thrift.server.TThreadPoolServer.execute(TThreadPoolServer.java:155) > at > org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:139) > at > org.apache.iotdb.commons.service.AbstractThriftServiceThread.run(AbstractThriftServiceThread.java:258) > 2022-11-18 14:29:53,751 [ClientRPC-Service] ERROR > o.a.i.c.c.IoTDBDefaultThreadExceptionHandler:31 - Exception in thread > ClientRPC-Service-42 >
java.lang.OutOfMemoryError: unable to create native thread: possibly out of > memory or process/resource limits reached > at java.base/java.lang.Thread.start0(Native Method) > at java.base/java.lang.Thread.start(Thread.java:803) > at > java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937) > at > java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1354) > at > org.apache.thrift.server.TThreadPoolServer.execute(TThreadPoolServer.java:155) > at > org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:139) > at > org.apache.iotdb.commons.service.AbstractThriftServiceThread.run(AbstractThriftServiceThread.java:258) > 2022-11-18 14:30:11,736 [pool-6-IoTDB-Flush-4] ERROR > o.a.i.d.e.s.TsFileProcessor:1095 - root.test.g_0-6: > /data/iotdb/m_1118_3d5eeae/sbin/../data/datanode/data/unsequence/root.test.g_0/6/2538/1668752675355-5-0-0.tsfile > meet error when flushing a memtable, change system mode to error > java.lang.OutOfMemoryError: unable to create native thread: possibly out of > memory or process/resource limits reached > at java.base/java.lang.Thread.start0(Native Method) > at java.base/java.lang.Thread.start(Thread.java:803) > at > java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937) > at > java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1354) > at > java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118) > at > org.apache.iotdb.db.rescon.AbstractPoolManager.submit(AbstractPoolManager.java:56) > at > org.apache.iotdb.db.engine.flush.MemTableFlushTask.(MemTableFlushTask.java:88) > at > org.apache.iotdb.db.engine.storagegroup.TsFileProcessor.flushOneMemTable(TsFileProcessor.java:1082) > at > org.apache.iotdb.db.engine.flush.FlushManager$FlushThread.runMayThrow(FlushManager.java:108) > at > org.apache.iotdb.commons.concurrent.WrappedRunnable.run(WrappedRunnable.java:29) > at > 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > 2022-11-18 14:30:11,736 [pool-6-IoTDB-Flush-4] ERROR > o.a.i.c.e.HandleSystemErrorStrategy:37 - Unrecoverable error occurs! Change
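The traces above all fail inside `ThreadPoolExecutor.addWorker` because the JVM could not start another native thread. As a rough illustration of one way to avoid reaching that point, the following sketch caps a pool's size and rejects excess work. It is not IoTDB's actual fix; the class name `BoundedRpcPool` and the parameter `maxThreads` are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a pool that refuses work instead of growing without bound. */
final class BoundedRpcPool {

  /** Builds a pool that never holds more than maxThreads worker threads. */
  static ThreadPoolExecutor create(int maxThreads) {
    return new ThreadPoolExecutor(
        0, maxThreads,                          // no core threads, hard upper bound
        60L, TimeUnit.SECONDS,                  // idle workers are reclaimed after 60 s
        new SynchronousQueue<>(),               // direct hand-off, no task queue
        new ThreadPoolExecutor.AbortPolicy());  // saturated pool throws RejectedExecutionException
  }

  /** Submits blocking tasks until the pool rejects one; returns how many were accepted. */
  static int acceptedBeforeRejection(int maxThreads, int attempts) {
    ThreadPoolExecutor pool = create(maxThreads);
    CountDownLatch release = new CountDownLatch(1);
    int accepted = 0;
    try {
      for (int i = 0; i < attempts; i++) {
        pool.execute(() -> awaitQuietly(release)); // each task parks until released
        accepted++;
      }
    } catch (RejectedExecutionException e) {
      // The pool is saturated: the caller can back off instead of spawning a thread.
    } finally {
      release.countDown();
      pool.shutdown();
    }
    return accepted;
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With a direct hand-off queue, the number of concurrently running tasks equals the number of threads, so the bound on threads is also the bound on in-flight requests. Whether rejecting requests is acceptable for an internal RPC service is a design decision the issue itself does not settle.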
[jira] [Commented] (IOTDB-5132) 【Need reproduce】 Create aligned timeseries about 50W sensors with benchmark, failed with 301 null
[ https://issues.apache.org/jira/browse/IOTDB-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654312#comment-17654312 ] changxue commented on IOTDB-5132: - With the latest master branch code from yesterday evening, this problem still occurs. The error log of datanode 44 is below; the other logs have been uploaded: {code} 2023-01-04 14:32:31,692 [pool-25-IoTDB-ClientRPC-Processor-5$20230104_063229_5_3.1.0] ERROR o.a.i.d.m.e.e.RegionWriteExecutor$WritePlanNodeExecutionVisitor:146 - Something wrong happened while calling consensus layer's write API. org.apache.iotdb.consensus.exception.RatisRequestFailedException: Ratis request failed org.apache.ratis.server.raftlog.RaftLogIOException from Server 2@group-0002: Log entry size 7388963 exceeds the max buffer limit of 4194304 at org.apache.iotdb.consensus.ratis.RatisConsensus.write(RatisConsensus.java:286) at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.executePlanNodeInConsensusLayer(RegionWriteExecutor.java:161) at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitPlan(RegionWriteExecutor.java:138) at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitPlan(RegionWriteExecutor.java:128) at org.apache.iotdb.db.mpp.plan.planner.plan.node.PlanVisitor.visitCreateAlignedTimeSeries(PlanVisitor.java:219) at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitCreateAlignedTimeSeries(RegionWriteExecutor.java:320) at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitCreateAlignedTimeSeries(RegionWriteExecutor.java:128) at org.apache.iotdb.db.mpp.plan.planner.plan.node.metedata.write.CreateAlignedTimeSeriesNode.accept(CreateAlignedTimeSeriesNode.java:191) at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor.execute(RegionWriteExecutor.java:86) at
org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchLocally(FragmentInstanceDispatcherImpl.java:246) at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchOneInstance(FragmentInstanceDispatcherImpl.java:142) at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchWriteSync(FragmentInstanceDispatcherImpl.java:124) at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatch(FragmentInstanceDispatcherImpl.java:94) at org.apache.iotdb.db.mpp.plan.scheduler.ClusterScheduler.start(ClusterScheduler.java:112) at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.schedule(QueryExecution.java:287) at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.start(QueryExecution.java:211) at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:152) at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:166) at org.apache.iotdb.db.service.thrift.impl.ClientRPCServiceImpl.createAlignedTimeseries(ClientRPCServiceImpl.java:667) at org.apache.iotdb.service.rpc.thrift.IClientRPCService$Processor$createAlignedTimeseries.getResult(IClientRPCService.java:3984) at org.apache.iotdb.service.rpc.thrift.IClientRPCService$Processor$createAlignedTimeseries.getResult(IClientRPCService.java:3964) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) at org.apache.iotdb.db.service.thrift.ProcessorWithMetrics.process(ProcessorWithMetrics.java:64) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.ratis.protocol.exceptions.StateMachineException: org.apache.ratis.server.raftlog.RaftLogIOException from Server 2@group-0002: Log entry size 7388963 exceeds the max buffer limit of 4194304 at 
org.apache.ratis.server.raftlog.RaftLogBase.appendImpl(RaftLogBase.java:184) at org.apache.ratis.server.raftlog.RaftLogBase.lambda$append$2(RaftLogBase.java:161) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:69) at org.apache.ratis.server.raftlog.RaftLogBase.append(RaftLogBase.java:161) at org.apache.ratis.server.impl.ServerState.appendLog(ServerState.java:366) at org.apache.ratis.server.impl.RaftServerImpl.appendTransaction(RaftServerImpl.java:770) at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:878) at org.apache.ratis.server.impl.RaftServerImpl.lambda$null$12(RaftServerImpl.java:815) at o
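The RaftLogIOException above reports a single log entry of 7388963 bytes against a limit of 4194304 bytes (4 MiB). The arithmetic can be sketched as follows: a payload of that size needs at least two entries, and a generic helper can split a list of items into smaller batches. This is only an illustration; the class `EntrySizeBudget` is hypothetical, and whether an aligned-timeseries creation request can actually be split this way is an assumption the thread does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the size arithmetic behind the error message. */
final class EntrySizeBudget {

  /** Smallest number of entries such that each stays within limitBytes (ceiling division). */
  static long minEntries(long totalBytes, long limitBytes) {
    if (limitBytes <= 0) {
      throw new IllegalArgumentException("limitBytes must be positive");
    }
    return (totalBytes + limitBytes - 1) / limitBytes;
  }

  /** Splits items into consecutive batches of at most batchSize elements each. */
  static <T> List<List<T>> partition(List<T> items, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int from = 0; from < items.size(); from += batchSize) {
      int to = Math.min(from + batchSize, items.size());
      batches.add(new ArrayList<>(items.subList(from, to))); // copy so batches are independent
    }
    return batches;
  }
}
```

For the sizes in the log, `EntrySizeBudget.minEntries(7388963L, 4194304L)` evaluates to 2. The alternative of raising the server-side buffer limit is a configuration question that this sketch does not address.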
[jira] [Commented] (IOTDB-5353) [Metric]Data of ‘The Time Consumed Of Operation’ is missing in new version
[ https://issues.apache.org/jira/browse/IOTDB-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654310#comment-17654310 ] Qingxin Feng commented on IOTDB-5353: - The incompatibility arises because the Histogram metric item formerly named operation has become a Timer-type metric item named statement_execution > [Metric]Data of ‘The Time Consumed Of Operation’ is missing in new version > -- > > Key: IOTDB-5353 > URL: https://issues.apache.org/jira/browse/IOTDB-5353 > Project: Apache IoTDB > Issue Type: Bug > Components: Core/Cluster >Affects Versions: 1.0.1 >Reporter: Qingxin Feng >Assignee: Minghui Liu >Priority: Major > Attachments: image-2023-01-04-14-26-23-349.png > > > version 1.0.1-SNAPSHOT (Build: e42478c) > > !image-2023-01-04-14-26-23-349.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IOTDB-5354) Implement `IoTool list`、`IoTool deploy`、`IoTool status`
[ https://issues.apache.org/jira/browse/IOTDB-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaofei Cao reassigned IOTDB-5354: - Assignee: 伊丹翔 > Implement `IoTool list`、`IoTool deploy`、`IoTool status` > - > > Key: IOTDB-5354 > URL: https://issues.apache.org/jira/browse/IOTDB-5354 > Project: Apache IoTDB > Issue Type: New Feature >Reporter: Gaofei Cao >Assignee: 伊丹翔 >Priority: Major > Original Estimate: 336h > Remaining Estimate: 336h >
[jira] [Created] (IOTDB-5354) Implement `IoTool list`、`IoTool deploy`、`IoTool status`
Gaofei Cao created IOTDB-5354: - Summary: Implement `IoTool list`、`IoTool deploy`、`IoTool status` Key: IOTDB-5354 URL: https://issues.apache.org/jira/browse/IOTDB-5354 Project: Apache IoTDB Issue Type: New Feature Reporter: Gaofei Cao
[jira] [Created] (IOTDB-5353) [Metric]Data of ‘The Time Consumed Of Operation’ is lost in new version
Qingxin Feng created IOTDB-5353: --- Summary: [Metric]Data of ‘The Time Consumed Of Operation’ is lost in new version Key: IOTDB-5353 URL: https://issues.apache.org/jira/browse/IOTDB-5353 Project: Apache IoTDB Issue Type: Bug Components: Core/Cluster Affects Versions: 1.0.1 Reporter: Qingxin Feng Assignee: Minghui Liu Attachments: image-2023-01-04-14-26-23-349.png
[jira] [Commented] (IOTDB-5143) 【Need reproduce】[confignode]Couldn't load the configuration iotdb-common.properties from any of the known sources
[ https://issues.apache.org/jira/browse/IOTDB-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654307#comment-17654307 ] changxue commented on IOTDB-5143: - With IoTDB built from last night's master branch code, this problem still occurs. New logs have also been uploaded. > 【Need reproduce】[confignode]Couldn't load the configuration > iotdb-common.properties from any of the known sources > - > > Key: IOTDB-5143 > URL: https://issues.apache.org/jira/browse/IOTDB-5143 > Project: Apache IoTDB > Issue Type: Bug > Components: Core/Server >Affects Versions: 1.0.0 >Reporter: changxue >Assignee: Gaofei Cao >Priority: Major > Attachments: IOTDB-5143_allnodes-log.tar.gz, _allnodes-log.tar.gz, > benchmark-log.out, config.properties > > > [2023.1.3 Apply: There are no useful logs; this needs to be reproduced on the 1.0.1 > branch.] > > environment: > 3C3D cluster, master source code as of Dec. 8th > iot-benchmark 1.0 is OK > reproduction: > 1. run iot-benchmark with 1 device and 30 thousand sensors; for the configuration, see the > attached config.properties > 2. the benchmark logs show > {code:java} > 2022-12-08 11:09:54,288 WARN > org.apache.iotdb.tsfile.common.conf.TSFileDescriptor:132 - not found > iotdb-common.properties, use the default configs. > {code} > This message also appears in the confignode logs. > For the benchmark configuration, see the attached config.properties (using 1.0) > Problem: > No data is written at all: show regions displays only schema region information and nothing for data regions
[jira] [Created] (IOTDB-5352) Function list of IoTool
Gaofei Cao created IOTDB-5352: - Summary: Function list of IoTool Key: IOTDB-5352 URL: https://issues.apache.org/jira/browse/IOTDB-5352 Project: Apache IoTDB Issue Type: Task Reporter: Gaofei Cao Assignee: 伊丹翔
[jira] [Created] (IOTDB-5351) [Trigger] Add StatisticsUpdaterTrigger as trigger example and fix possible IT failures
liaolanyu created IOTDB-5351: Summary: [Trigger] Add StatisticsUpdaterTrigger as trigger example and fix possible IT failures Key: IOTDB-5351 URL: https://issues.apache.org/jira/browse/IOTDB-5351 Project: Apache IoTDB Issue Type: Improvement Reporter: liaolanyu Assignee: liaolanyu
[jira] [Created] (IOTDB-5350) A bug about `show timeseries` while using offset
Xingyu Liu created IOTDB-5350: - Summary: A bug about `show timeseries` while using offset Key: IOTDB-5350 URL: https://issues.apache.org/jira/browse/IOTDB-5350 Project: Apache IoTDB Issue Type: Bug Reporter: Xingyu Liu Assignee: Xingyu Liu Attachments: A bug about `show timeseries` while using offset.jpg A bug about `show timeseries` while using offset.
[jira] [Created] (IOTDB-5349) Unify cluster parameters management
Yongzao Dan created IOTDB-5349: -- Summary: Unify cluster parameters management Key: IOTDB-5349 URL: https://issues.apache.org/jira/browse/IOTDB-5349 Project: Apache IoTDB Issue Type: New Feature Reporter: Yongzao Dan Assignee: Yongzao Dan Fix For: 1.0.1 We need to unify cluster parameter management in the node-commons package instead of spreading the parameters throughout the project.
[jira] [Created] (IOTDB-5348) A tool for upgrading v0.13 data contains unsupported path name to v1.0
Haonan Hou created IOTDB-5348: - Summary: A tool for upgrading v0.13 data contains unsupported path name to v1.0 Key: IOTDB-5348 URL: https://issues.apache.org/jira/browse/IOTDB-5348 Project: Apache IoTDB Issue Type: New Feature Reporter: Haonan Hou
[jira] [Assigned] (IOTDB-5348) A tool for upgrading v0.13 data contains unsupported path name to v1.0
[ https://issues.apache.org/jira/browse/IOTDB-5348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haonan Hou reassigned IOTDB-5348: - Assignee: yusicheng > A tool for upgrading v0.13 data contains unsupported path name to v1.0 > --- > > Key: IOTDB-5348 > URL: https://issues.apache.org/jira/browse/IOTDB-5348 > Project: Apache IoTDB > Issue Type: New Feature >Reporter: Haonan Hou >Assignee: yusicheng >Priority: Major >
[jira] [Assigned] (IOTDB-5211) Verify the memory usage and performance of CDN
[ https://issues.apache.org/jira/browse/IOTDB-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaofei Cao reassigned IOTDB-5211: - Assignee: Yongzao Dan (was: 陈哲涵) > Verify the memory usage and performance of CDN > -- > > Key: IOTDB-5211 > URL: https://issues.apache.org/jira/browse/IOTDB-5211 > Project: Apache IoTDB > Issue Type: Improvement >Reporter: Gaofei Cao >Assignee: Yongzao Dan >Priority: Major > Attachments: image-2022-12-14-16-30-45-267.png, > image-2022-12-15-12-36-25-226.png > > > Is it possible to start ConfigNode and DataNode in one process, as shown below? > And is it possible to use this mode as the default in standalone (1C1D)? > > !image-2022-12-14-16-30-45-267.png|width=522,height=475! > > User feedback: > !image-2022-12-15-12-36-25-226.png|width=616,height=205!
[jira] [Assigned] (IOTDB-5077) [SHOW REGION]New command formats need to be supported
[ https://issues.apache.org/jira/browse/IOTDB-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaofei Cao reassigned IOTDB-5077: - Assignee: 伊丹翔 (was: Gaofei Cao) > [SHOW REGION]New command formats need to be supported > - > > Key: IOTDB-5077 > URL: https://issues.apache.org/jira/browse/IOTDB-5077 > Project: Apache IoTDB > Issue Type: New Feature > Components: Core/Cluster >Reporter: FengQingxin >Assignee: 伊丹翔 >Priority: Major > > Hi, > As a DBA, when I use SHOW REGION to maintain a cluster that has more > than 20 Regions, I want to show only the regions on one datanode. > It would look like: show regions on 172.20.70.22 or show regions on > NodeID1,NodeID2 > > Thanks, > B.R.
[jira] [Reopened] (IOTDB-4897) [Design] DataRegion Load balance based on Disk Usage
[ https://issues.apache.org/jira/browse/IOTDB-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaofei Cao reopened IOTDB-4897: --- > [Design] DataRegion Load balance based on Disk Usage > > > Key: IOTDB-4897 > URL: https://issues.apache.org/jira/browse/IOTDB-4897 > Project: Apache IoTDB > Issue Type: Improvement >Reporter: Gaofei Cao >Assignee: 陈哲涵 >Priority: Major >
[jira] [Assigned] (IOTDB-5347) Implement iterating query for devices and timeseries schema query
[ https://issues.apache.org/jira/browse/IOTDB-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yukun Zhou reassigned IOTDB-5347: - Assignee: Yukun Zhou > Implement iterating query for devices and timeseries schema query > - > > Key: IOTDB-5347 > URL: https://issues.apache.org/jira/browse/IOTDB-5347 > Project: Apache IoTDB > Issue Type: Improvement >Reporter: Yukun Zhou >Assignee: Yukun Zhou >Priority: Major > Fix For: master branch > > > Replace the original implementation, which retrieves all results before returning them, > with an iterating query based on SchemaReader in > TimeSeriesSchemaScanOperator, DeviceSchemaScanOperator and > PathsUsingTemplateSchemaScanOperator, which has a much lower peak memory footprint.
[jira] [Created] (IOTDB-5347) Implement iterating query for devices and timeseries schema query
Yukun Zhou created IOTDB-5347: - Summary: Implement iterating query for devices and timeseries schema query Key: IOTDB-5347 URL: https://issues.apache.org/jira/browse/IOTDB-5347 Project: Apache IoTDB Issue Type: Improvement Reporter: Yukun Zhou Fix For: master branch Replace the original implementation, which retrieves all results before returning them, with an iterating query based on SchemaReader in TimeSeriesSchemaScanOperator, DeviceSchemaScanOperator and PathsUsingTemplateSchemaScanOperator, which has a much lower peak memory footprint.
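The difference between collecting every result up front and iterating can be sketched roughly as below. The class `BatchedSchemaIterator` and its use of plain path strings are hypothetical and do not reflect the real SchemaReader interface; the point is only that each call materializes at most one batch, so peak memory depends on the batch size rather than on the total number of series.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; not IoTDB's SchemaReader interface. */
final class BatchedSchemaIterator implements Iterator<List<String>> {
  private final Iterator<String> source; // lazily produces one schema path at a time
  private final int batchSize;           // upper bound on entries held per call

  BatchedSchemaIterator(Iterator<String> source, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    this.source = source;
    this.batchSize = batchSize;
  }

  @Override
  public boolean hasNext() {
    return source.hasNext();
  }

  /** Pulls at most batchSize entries from the source and returns them as one batch. */
  @Override
  public List<String> next() {
    if (!source.hasNext()) {
      throw new NoSuchElementException();
    }
    List<String> batch = new ArrayList<>(batchSize);
    while (batch.size() < batchSize && source.hasNext()) {
      batch.add(source.next());
    }
    return batch;
  }
}
```

An operator built this way would emit one batch per invocation and stop pulling from the source as soon as the consumer stops asking.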
[jira] [Reopened] (IOTDB-4686) SchemaReader for schema query operator-refactor traverser
[ https://issues.apache.org/jira/browse/IOTDB-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yukun Zhou reopened IOTDB-4686: --- > SchemaReader for schema query operator-refactor traverser > - > > Key: IOTDB-4686 > URL: https://issues.apache.org/jira/browse/IOTDB-4686 > Project: Apache IoTDB > Issue Type: Sub-task >Reporter: xieqijun >Assignee: Yukun Zhou >Priority: Major > Labels: pull-request-available > > Refactor the traverser in MTree to support iterable schema reads
[jira] [Created] (IOTDB-5346) Log error in MemtableFlushTask when recovering
Haonan Hou created IOTDB-5346: - Summary: Log error in MemtableFlushTask when recovering Key: IOTDB-5346 URL: https://issues.apache.org/jira/browse/IOTDB-5346 Project: Apache IoTDB Issue Type: Bug Affects Versions: 0.13.3, 1.0.0 Reporter: Haonan Hou Attachments: 52241672732815_.pic.jpg !52241672732815_.pic.jpg|width=1005,height=31!
[jira] [Assigned] (IOTDB-5346) Log error in MemtableFlushTask when recovering
[ https://issues.apache.org/jira/browse/IOTDB-5346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haonan Hou reassigned IOTDB-5346: - Assignee: Haonan Hou > Log error in MemtableFlushTask when recovering > -- > > Key: IOTDB-5346 > URL: https://issues.apache.org/jira/browse/IOTDB-5346 > Project: Apache IoTDB > Issue Type: Bug >Affects Versions: 0.13.3, 1.0.0 >Reporter: Haonan Hou >Assignee: Haonan Hou >Priority: Minor > Attachments: 52241672732815_.pic.jpg > > > !52241672732815_.pic.jpg|width=1005,height=31!
[jira] [Created] (IOTDB-5345) Use the logical clock to identify the snapshot version of IoTConsensus
Xinyu Tan created IOTDB-5345: Summary: Use the logical clock to identify the snapshot version of IoTConsensus Key: IOTDB-5345 URL: https://issues.apache.org/jira/browse/IOTDB-5345 Project: Apache IoTDB Issue Type: Improvement Reporter: Xinyu Tan Assignee: huxiangpeng Attachments: image-2023-01-03-23-45-07-397.png The current IoTConsensus uses physical clocks to identify different snapshot versions. In some operation scenarios, the physical clock of the machine may be rolled back. This may cause IoTConsensus to label the latest snapshot as the old snapshot version. Therefore, we need to use logical timestamps to mark different snapshot versions. For example, use a self-maintaining increment index. In addition, this work needs to ensure forward compatibility with 1.0.0 !image-2023-01-03-23-45-07-397.png!
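A self-maintained increment index of the kind the issue proposes could look roughly like the sketch below. The class `SnapshotVersionCounter` and the `recoveredIndex` parameter are hypothetical and not taken from IoTConsensus; the sketch also leaves out the forward-compatibility requirement with 1.0.0 that the issue mentions.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a logical snapshot version; not IoTConsensus code. */
final class SnapshotVersionCounter {
  private final AtomicLong lastIndex;

  /** recoveredIndex is the highest index found at startup (assumed to be persisted), or 0 if none. */
  SnapshotVersionCounter(long recoveredIndex) {
    this.lastIndex = new AtomicLong(recoveredIndex);
  }

  /** Returns a strictly increasing index that never consults the wall clock. */
  long nextVersion() {
    return lastIndex.incrementAndGet();
  }

  /** A snapshot is newer exactly when its logical index is larger. */
  static boolean isNewer(long candidate, long other) {
    return candidate > other;
  }
}
```

Because the index only ever increases, rolling the machine's physical clock back cannot make a newer snapshot compare as older, provided the last index survives restarts.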
[jira] [Created] (IOTDB-5344) Catch exception thrown by ClientManager and give some friendly tips
Gaofei Cao created IOTDB-5344: - Summary: Catch exception thrown by ClientManager and give some friendly tips Key: IOTDB-5344 URL: https://issues.apache.org/jira/browse/IOTDB-5344 Project: Apache IoTDB Issue Type: Improvement Reporter: Gaofei Cao Assignee: Gaofei Cao
[jira] [Commented] (IOTDB-5143) [confignode]Couldn't load the configuration iotdb-common.properties from any of the known sources
[ https://issues.apache.org/jira/browse/IOTDB-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653994#comment-17653994 ] Gaofei Cao commented on IOTDB-5143: --- This issue is similar to https://issues.apache.org/jira/browse/IOTDB-5132. There are no useful logs; it needs to be reproduced on the 1.0.1 branch. > [confignode]Couldn't load the configuration iotdb-common.properties from any > of the known sources > - > > Key: IOTDB-5143 > URL: https://issues.apache.org/jira/browse/IOTDB-5143 > Project: Apache IoTDB > Issue Type: Bug > Components: Core/Server >Affects Versions: 1.0.0 >Reporter: changxue >Assignee: Gaofei Cao >Priority: Major > Attachments: _allnodes-log.tar.gz, config.properties > > > environment: > 3C3D cluster, master source code as of Dec. 8th > iot-benchmark 1.0 is OK > reproduction: > 1. run iot-benchmark with 1 device and 30 thousand sensors; for the configuration, see the > attached config.properties > 2. the benchmark logs show > {code} > 2022-12-08 11:09:54,288 WARN > org.apache.iotdb.tsfile.common.conf.TSFileDescriptor:132 - not found > iotdb-common.properties, use the default configs. > {code} > This message also appears in the confignode logs. > For the benchmark configuration, see the attached config.properties (using 1.0) > Problem: > No data is written at all: show regions displays only schema region information and nothing for data regions
[jira] [Commented] (IOTDB-5132) create aligned timeseries about 50W sensors with benchmark, failed with 301 null
[ https://issues.apache.org/jira/browse/IOTDB-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653985#comment-17653985 ] Gaofei Cao commented on IOTDB-5132: --- There are no useful logs; this needs to be reproduced on the 1.0.1 branch. > create aligned timeseries about 50W sensors with benchmark, failed with 301 > null > - > > Key: IOTDB-5132 > URL: https://issues.apache.org/jira/browse/IOTDB-5132 > Project: Apache IoTDB > Issue Type: Bug > Components: Core/Schema Manager >Affects Versions: 1.0.0 >Reporter: changxue >Assignee: Gaofei Cao >Priority: Major > Attachments: allnodes-log.tar.gz, config.properties, nohup.out > > > Creating aligned timeseries with about 50W (500,000) sensors using the benchmark failed with 301 > null > environment: > 3C3D cluster, the 1.0.0 all-in-one release binary > benchmark: > 1.0 commit: 25c1f742 > For the config, see the attached config.properties; nohup.out is the full log. > I'm going to do performance testing on IoTDB with the benchmark. One > scenario: create 50W (500,000) timeseries on 1 device. > reproduction: > 1. start the IoTDB cluster successfully > 2. 
4 minutes later start the iot-benchmark > error log of benchmark: > {code} > 2022-12-06 19:37:30,280 ERROR > cn.edu.tsinghua.iot.benchmark.iotdb100.IoTDB:359 - Register IoTDB schema > failed because > org.apache.iotdb.rpc.StatementExecutionException: 301: null > at org.apache.iotdb.rpc.RpcUtils.verifySuccess(RpcUtils.java:96) > at > org.apache.iotdb.session.SessionConnection.createAlignedTimeseries(SessionConnection.java:293) > at > org.apache.iotdb.session.Session.createAlignedTimeseries(Session.java:552) > at > cn.edu.tsinghua.iot.benchmark.iotdb100.IoTDB.registerTimeseries(IoTDB.java:332) > at > cn.edu.tsinghua.iot.benchmark.iotdb100.IoTDB.registerSchema(IoTDB.java:208) > at > cn.edu.tsinghua.iot.benchmark.tsdb.DBWrapper.registerSchema(DBWrapper.java:517) > at > cn.edu.tsinghua.iot.benchmark.client.SchemaClient.run(SchemaClient.java:94) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > error log of datanode: > {code} > 2022-12-06 19:37:23,184 [grpc-default-executor-0] WARN > o.a.ratis.util.LogUtils:124 - 1: Failed requestVote 5->1#0 > org.apache.ratis.protocol.exceptions.GroupMismatchException: 1: > group-0002 not found. 
> at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:150) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:351) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:360) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:355) > at > org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:618) > at > org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:175) > at > org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:382) > at > org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182) > at > org.apache.ratis.thirdparty.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:354) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > Notes: > 1. Creating timeseries individually works without problems. > 2. My guess is that the problem is caused by too many columns; the benchmark creates aligned series. > 3. show timeseries root.** shows that creation did not succeed; not a single series exists
[jira] [Commented] (IOTDB-5115) [ ConfigNode ] “ConfigNodeRPC-Processor” thread leaks
[ https://issues.apache.org/jira/browse/IOTDB-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653982#comment-17653982 ] Gaofei Cao commented on IOTDB-5115: --- Waiting for the optimization and test results of https://issues.apache.org/jira/browse/IOTDB-5312 > [ ConfigNode ] “ConfigNodeRPC-Processor” thread leaks > - > > Key: IOTDB-5115 > URL: https://issues.apache.org/jira/browse/IOTDB-5115 > Project: Apache IoTDB > Issue Type: Bug > Components: mpp-cluster >Affects Versions: 1.0.0 >Reporter: 刘珍 >Assignee: Gaofei Cao >Priority: Major > Attachments: image-2022-12-05-09-13-14-368.png > > > rel/1.0 fbbca3f, 1 replica, CUSTOM policy for schema/data regions > Start 1 datanode, then start 1 BM connected to that datanode to create metadata, writing 1 point per series. > Continue until the 15th datanode is started; when the 15 BMs create metadata, metadata creation cannot continue (144799 devices were actually created; 15 were expected), and the number of ConfigNodeRPC-Processor threads on the ConfigNode > Leader keeps growing: > !image-2022-12-05-09-13-14-368.png|width=831,height=444! > Test environment: private cloud phase 1 > 1. 172.16.2.2~22 are DataNodes (21 machines in total, 8C32GB) > BM: 172.16.2.26~46 run BM (21 machines in total, 4C16GB) > ConfigNode: 172.16.2.23~25 (3 machines in total, 8C32GB) > Configuration parameters > # > config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus > # > schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus > # > data_region_consensus_protocol_class=org.apache.iotdb.consensus.iot.IoTConsensus > # schema_replication_factor=1 > # data_replication_factor=1 > schema_region_group_extension_policy=CUSTOM > schema_region_group_per_database=6 > data_region_group_extension_policy=CUSTOM > data_region_group_per_database=8 > time_partition_interval=6048000 > ConfigNode configuration > MAX_HEAP_SIZE="20G" > MAX_DIRECT_MEMORY_SIZE="6G" > cn_target_config_node_list=172.16.2.23:22277 > DataNode configuration > MAX_HEAP_SIZE="20G" > MAX_DIRECT_MEMORY_SIZE="6G" > dn_data_dirs=data/datanode/data,/data1/iotdb/datanode/data > dn_target_config_node_list=172.16.2.23:22277,172.16.2.24:22277,172.16.2.25:22277 > Common configuration > # > 
config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus > # > schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus > # > data_region_consensus_protocol_class=org.apache.iotdb.consensus.iot.IoTConsensus > # schema_replication_factor=1 > # data_replication_factor=1 > schema_region_group_extension_policy=CUSTOM > schema_region_group_per_database=6 > data_region_group_extension_policy=CUSTOM > data_region_group_per_database=8 > time_partition_interval=6048000 > 2. BM configuration > DB_SWITCH=IoTDB-014-SESSION_BY_TABLET > GROUP_NUMBER=1 > DEVICE_NUMBER=1 > REAL_INSERT_RATE=1.0 > SENSOR_NUMBER=100 > IS_SENSOR_TS_ALIGNMENT=true > IS_OUT_OF_ORDER=false > OUT_OF_ORDER_RATIO=0.5 > OPERATION_PROPORTION=1:0:0:0:0:0:0:0:0:0:0 > CLIENT_NUMBER=15 > LOOP=1 > BATCH_SIZE_PER_WRITE=1 > START_TIME=2018-8-29T00:00:00+08:00 > POINT_STEP=100 > OP_MIN_INTERVAL=0 > OP_MIN_INTERVAL_RANDOM=false > INSERT_DATATYPE_PROPORTION=1:1:1:1:1:1 > ENCODINGS=PLAIN/PLAIN/PLAIN/PLAIN/PLAIN/PLAIN > COMPRESSOR=SNAPPY > IS_DELETE_DATA=false > CREATE_SCHEMA=true > 3. Logs > ConfigNode Leader > 172.16.2.23 > /data/iotdb/rel_1202_fbbca3f/logs_confignode_thread_leak
[jira] [Reopened] (IOTDB-5244) [ratis][remove datanode]installSnapshot failed
[ https://issues.apache.org/jira/browse/IOTDB-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Song Ziyang reopened IOTDB-5244:
--------------------------------

> [ratis][remove datanode]installSnapshot failed
> ----------------------------------------------
>
> Key: IOTDB-5244
> URL: https://issues.apache.org/jira/browse/IOTDB-5244
> Project: Apache IoTDB
> Issue Type: Bug
> Components: mpp-cluster
> Affects Versions: master branch, 1.0.0
> Reporter: 刘珍
> Assignee: Song Ziyang
> Priority: Major
> Attachments: iotdb_5244.conf
>
>
> rel/1.0 1216_c92440f
> 1. Start a 3-replica 3C5D cluster; config/schema/data all use the Ratis protocol.
> 2. BM writes data; the write completes.
> See the attachment for the configuration.
> 3. On the node to be removed (ip73), run stop-datanode.sh, then start it again;
> run stop-datanode.sh once more, then start it again.
> Execute the removal.
> 4. The ip68 DataNode reports an error:
> 2022-12-19 20:25:22,705 [grpc-default-executor-4936] ERROR o.a.r.s.i.SnapshotInstallationHandler:96 - 5@group-0001001E: installSnapshot failed
> org.apache.ratis.io.CorruptedFileException: File /data/liuzhen_test/master_1216_d426f7a/data/datanode/data/snapshot/.tmp.group-0001001E/snapshot-c01d9ca8-3f9a-4b02-9fb4-fa680eae89e0/66_158230/sequence/root.test.g_3/30/2538/1671443330163-38-0-0.tsfile.resource (exist? false, length=0) is corrupted: MD5 mismatch for snapshot-158230 installation. Renamed temporary snapshot file /data/liuzhen_test/master_1216_d426f7a/data/datanode/data/snapshot/.tmp.group-0001001E/snapshot-c01d9ca8-3f9a-4b02-9fb4-fa680eae89e0/66_158230/sequence/root.test.g_3/30/2538/1671443330163-38-0-0.tsfile.resource to /data/liuzhen_test/master_1216_d426f7a/data/datanode/data/snapshot/.tmp.group-0001001E/snapshot-c01d9ca8-3f9a-4b02-9fb4-fa680eae89e0/66_158230/sequence/root.test.g_3/30/2538/1671443330163-38-0-0.tsfile.resource.corrupt20221219-202522_690
>     at org.apache.ratis.server.storage.SnapshotManager.installSnapshot(SnapshotManager.java:155)
>     at org.apache.ratis.server.impl.ServerState.installSnapshot(ServerState.java:480)
>     at org.apache.ratis.server.impl.SnapshotInstallationHandler.checkAndInstallSnapshot(SnapshotInstallationHandler.java:181)
>     at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshotImpl(SnapshotInstallationHandler.java:120)
>     at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshot(SnapshotInstallationHandler.java:94)
>     at org.apache.ratis.server.impl.RaftServerImpl.installSnapshot(RaftServerImpl.java:1517)
>     at org.apache.ratis.server.impl.RaftServerProxy.installSnapshot(RaftServerProxy.java:640)
>     at org.apache.ratis.grpc.server.GrpcServerProtocolService$2.process(GrpcServerProtocolService.java:242)
>     at org.apache.ratis.grpc.server.GrpcServerProtocolService$2.process(GrpcServerProtocolService.java:239)
>     at org.apache.ratis.grpc.server.GrpcServerProtocolService$ServerRequestStreamObserver.onNext(GrpcServerProtocolService.java:124)
>     at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
>     at org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:332)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:315)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:834)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>     at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Test environment
> 1. 192.168.10.62/66/68: 3 ConfigNodes, 72 CPUs, 256GB
> 192.168.10.62/66/68/64/73: 5 DataNodes
> Machine 73: 48 CPUs, 384GB
> 2. Database configuration parameters
> COMMON configuration
> schema_replication_factor=3
> data_replication_factor=3
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> query_timeout_threshold=360
> ConfigNode configuration
> cn_connection_timeout_ms=12
> MAX_HEAP_SIZE="8G"
> DataNode configuration
> MAX_HEAP_SIZE="192G"
> MAX_DIRECT_MEMORY_SIZE="32G"
> dn_max_connection_for_internal_service=300
> 3. See the attachment for the BM configuration.
> Writing completed.
> 4. On ip73:
> stop-datanode.sh
> clear the cache, start the DataNode
> stop-datanode.sh
> start the DataNode
> Execute the removal. Check the node status and logs.
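The "MD5 mismatch" in the error above comes from comparing a received file against an expected digest. The sketch below illustrates that kind of check with the standard `java.security.MessageDigest` API. It is not the Ratis implementation; the class name and the `matches` helper are hypothetical, and a missing file is simply treated as a mismatch, which corresponds to the "exist? false, length=0" case in the log.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SnapshotFileCheck {

    // Streams the file through MD5 and returns the digest as lowercase hex.
    static String md5Hex(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest()) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns false when the file is missing or its digest differs.
    static boolean matches(Path file, String expectedHex) {
        if (!Files.exists(file)) {
            return false; // a file that never arrived cannot match any digest
        }
        return md5Hex(file).equalsIgnoreCase(expectedHex);
    }
}
```

Checking for existence first separates "file never arrived" from "file arrived but differs", which are different failure modes when diagnosing a snapshot transfer.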
[jira] [Commented] (IOTDB-4830) [SchemaRegion migrated failed] remove datanode that has stopped ,confignode executes “DELETE_OLD_REGION_PEER” on this datanode
[ https://issues.apache.org/jira/browse/IOTDB-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653980#comment-17653980 ]

Gaofei Cao commented on IOTDB-4830:
-----------------------------------

Closing this issue because the description is too long. The remaining problem will be resolved in https://issues.apache.org/jira/browse/IOTDB-5343

> [SchemaRegion migrated failed] remove datanode that has stopped ,confignode executes “DELETE_OLD_REGION_PEER” on this datanode
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: IOTDB-4830
> URL: https://issues.apache.org/jira/browse/IOTDB-4830
> Project: Apache IoTDB
> Issue Type: Bug
> Components: mpp-cluster
> Affects Versions: 0.14.0-SNAPSHOT
> Reporter: 刘珍
> Assignee: 陈哲涵
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.0-SNAPSHOT
>
> Attachments: image-2022-11-02-14-55-28-013.png, image-2022-11-15-14-35-54-026.png, image-2022-11-15-14-37-38-147.png, image-2022-11-15-15-10-58-501.png, image-2022-11-15-15-12-05-884.png, iotdb_4830.conf, screenshot-1.png
>
>
> m_1102_09e2566
> 1. Start a 3-replica 3C5D cluster.
> 2. Run the stop-datanode.sh script to stop the ip76 DataNode normally.
> 3. The benchmark finishes writing data.
> 4. Remove the offline ip76 DataNode.
> The ConfigNode retries connecting to ip76 and also retries DELETE_OLD_REGION_PEER. DELETE_OLD_REGION_PEER does not need to be executed here, because this is not a retry issued after the removal started:
> 2022-11-02 14:34:23,637 [ProcExecWorker-9] ERROR o.a.i.c.c.s.SyncDataNodeClientPool:113 - {color:#DE350B}*DELETE_OLD_REGION_PEER*{color} failed on DataNode TEndPoint(ip:192.168.10.76, port:9003)
> 5. Start the ip76 DataNode. The remove operation can be seen starting on ip76, but the node's status at this point is Running when it should be Removing.
> ip76 DataNode log (the remove is already executing):
> 2022-11-02 14:38:45,611 [pool-53-IoTDB-Region-Migrate-Pool-1] INFO o.a.i.d.s.RegionMigrateService$DeleteOldRegionPeerTask:493 - succeed to remove region DataRegion[12] consensus group
> Cluster node status at this point:
> !image-2022-11-02-14-55-28-013.png!
> TEST ENV
> 192.168.10.72~76
[jira] [Reopened] (IOTDB-5111) [ ratis ] Data is distributed across disks ,after the cluster is restarted, all data is lost
[ https://issues.apache.org/jira/browse/IOTDB-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyu Tan reopened IOTDB-5111:
------------------------------

> [ ratis ] Data is distributed across disks ,after the cluster is restarted, all data is lost
> --------------------------------------------------------------------------------------------
>
> Key: IOTDB-5111
> URL: https://issues.apache.org/jira/browse/IOTDB-5111
> Project: Apache IoTDB
> Issue Type: Bug
> Components: mpp-cluster
> Affects Versions: 1.0.0
> Reporter: 刘珍
> Assignee: Song Ziyang
> Priority: Major
> Attachments: image-2022-12-02-17-58-45-096.png, image-2022-12-02-17-59-05-010.png
>
>
> rel/1.0
> All 3 protocols (config/schema/data) are Ratis,
> dn_data_dirs=data/datanode/data,/data1/iotdb/datanode/data
> so data is stored across disks.
> Write data, restart the cluster: {color:#DE350B}*all data is lost*{color}.
> One more problem: {color:#DE350B}the snapshot directory still contains folders named .tmp.{color}:
> !image-2022-12-02-17-59-05-010.png!
> Test environment: private cloud, phase 1, 8C32GB
> 1. 3 replicas, 3C7D
> Common
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> schema_replication_factor=3
> data_replication_factor=3
> wal_buffer_size_in_byte=1048576
> max_waiting_time_when_insert_blocked=360
> query_timeout_threshold=3600
> ConfigNode
> MAX_HEAP_SIZE="20G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> DataNode
> MAX_HEAP_SIZE="20G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> dn_data_dirs=data/datanode/data,/data1/iotdb/datanode/data
> 2. Start BM to write data
> GROUP_NUMBER=1
> DEVICE_NUMBER=1000
> REAL_INSERT_RATE=1.0
> SENSOR_NUMBER=1000
> IS_SENSOR_TS_ALIGNMENT=true
> IS_OUT_OF_ORDER=false
> OUT_OF_ORDER_RATIO=0.5
> OPERATION_PROPORTION=1:0:0:0:0:0:0:0:0:0:0
> CLIENT_NUMBER=50
> LOOP=1
> BATCH_SIZE_PER_WRITE=10
> START_TIME=2018-8-30T00:00:00+08:00
> POINT_STEP=200
> OP_MIN_INTERVAL=0
> OP_MIN_INTERVAL_RANDOM=false
> INSERT_DATATYPE_PROPORTION=1:1:1:1:1:1
> ENCODINGS=PLAIN/PLAIN/PLAIN/PLAIN/PLAIN/PLAIN
> COMPRESSOR=SNAPPY
> IS_DELETE_DATA=false
> CREATE_SCHEMA=true
> BENCHMARK_CLUSTER=false
> !image-2022-12-02-17-58-45-096.png!
> 3. Restart the cluster
[jira] [Created] (IOTDB-5343) Verify the trigger_info.bin error and "you need to increase dn_max_connection_for_internal_service" when remove DataNode
Gaofei Cao created IOTDB-5343:
---------------------------------

Summary: Verify the trigger_info.bin error and "you need to increase dn_max_connection_for_internal_service" when remove DataNode
Key: IOTDB-5343
URL: https://issues.apache.org/jira/browse/IOTDB-5343
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Gaofei Cao
Assignee: Gaofei Cao

This issue is a mirror of https://issues.apache.org/jira/browse/IOTDB-4830.

rel/1.0 2022-11-29_a7a1738: {color:#de350b}the following 2 kinds of problems need to be confirmed.{color}
The rel/1.0 2022-12-01_84c01ae version also shows "Problem 1".

{*}Problem 1{*}: {color:#de350b}[ForkJoinPool.commonPool-worker-5] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist. This error needs to be confirmed.{color}

{*}Problem 2{*}: SET_SYSTEM_STATUS failed on DataNode TEndPoint(ip:172.20.70.3, port:9003)
java.io.IOException: Borrow client from pool for node TEndPoint(ip:172.20.70.3, port:9003) failed, you need to increase dn_max_connection_for_internal_service.
Because ip3 is already offline, the ConfigNode's attempt to set ip3's status during the removal fails, {color:#de350b}but the "you need to increase dn_max_connection_for_internal_service." text in the error message is not appropriate.{color}

Private cloud, 3 replicas, 3C5D
1. Start a 3-replica 3C5D cluster
2. Stop the ip3 DataNode
3. BM writes data; the write completes
4. Remove the ip3 DataNode; the removal succeeds.

Check the ConfigNode Leader's log:
2022-11-30 10:52:40,849 [ForkJoinPool.commonPool-worker-5] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 10:54:41,161 [ForkJoinPool.commonPool-worker-1] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 10:56:41,474 [ForkJoinPool.commonPool-worker-1] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 10:58:41,789 [ForkJoinPool.commonPool-worker-0] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 11:00:42,105 [ForkJoinPool.commonPool-worker-6] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 11:02:42,401 [ForkJoinPool.commonPool-worker-0] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 11:04:42,686 [0@group--StateMachineUpdater] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 11:06:42,972 [ForkJoinPool.commonPool-worker-5] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-/sm/.tmp.1_20583/trigger_info.bin] is already exist.
2022-11-30 11:11:48,561 [ProcExecWorker-2] ERROR o.a.i.c.c.s.SyncDataNodeClientPool:97 - {color:#de350b}SET_SYSTEM_STATUS failed on DataNode TEndPoint(ip:172.20.70.3, port:9003)
java.io.IOException: Borrow client from pool for node TEndPoint(ip:172.20.70.3, port:9003) failed, you need to increase dn_max_connection_for_internal_service.{color}
    at org.apache.iotdb.commons.client.ClientManager.borrowClient(ClientManager.java:64)
    at org.apache.iotdb.confignode.client.sync.SyncDataNodeClientPool.sendSyncRequestToDataNodeWithGivenRetry(SyncDataNodeClientPool.java:87)
    at org.apache.iotdb.confignode.procedure.env.ConfigNodeProcedureEnv.markDataNodeAsRemovingAndBroadcast(ConfigNodeProcedureEnv.java:373)
    at org.apache.iotdb.confignode.procedure.impl.node.RemoveDataNodeProcedure.executeFromState(RemoveDataNodeProcedure.java:86)
    at org.apache.iotdb.confignode.procedure.impl.node.RemoveDataNodeProcedure.executeFromState(RemoveDataNodeProcedure.java:47)
    at org.apache.iotdb.confignode.procedure.impl.statemachine.StateMachineProcedure.execute(StateMachineProcedure.java:186)
    at org.apache.iotdb.confignode.procedure.Procedure.doExecute(Procedure.java:365)
    at org.apache.iotdb.confignod
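The second problem above is that an unreachable node and an exhausted client pool produce the same error text. One possible way to separate the two cases is sketched below. This is not the IoTDB `ClientManager` code: the class and method are hypothetical, and it assumes that pool exhaustion surfaces as a `NoSuchElementException` while an offline node surfaces as a `ConnectException`.

```java
import java.net.ConnectException;
import java.util.NoSuchElementException;

public class BorrowFailureMessage {

    // Builds an error message that depends on why borrowing a client failed.
    // Assumption: the caller passes the root cause of the failed borrow.
    static String describe(String endpoint, Exception cause) {
        String prefix = "Borrow client from pool for node " + endpoint + " failed";
        if (cause instanceof NoSuchElementException) {
            // Only pool exhaustion justifies the tuning hint.
            return prefix + ", you need to increase dn_max_connection_for_internal_service.";
        }
        if (cause instanceof ConnectException) {
            // The node is down or unreachable; a larger pool would not help.
            return prefix + ", the node is unreachable: " + cause.getMessage();
        }
        return prefix + ": " + cause;
    }

    public static void main(String[] args) {
        System.out.println(describe("TEndPoint(ip:172.20.70.3, port:9003)",
                new ConnectException("Connection refused")));
    }
}
```

With this split, the tuning hint appears only when raising the connection limit could actually help.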
[jira] [Created] (IOTDB-5342) Separate filtering and business logic in AbstractTreeVisitor
yanze chen created IOTDB-5342:
---------------------------------

Summary: Separate filtering and business logic in AbstractTreeVisitor
Key: IOTDB-5342
URL: https://issues.apache.org/jira/browse/IOTDB-5342
Project: Apache IoTDB
Issue Type: Improvement
Reporter: yanze chen
Assignee: yanze chen

Separate filtering and business logic in AbstractTreeVisitor
[jira] [Created] (IOTDB-5341) The front end plan of eventWindow in aggregation query
yang caiyin created IOTDB-5341:
----------------------------------

Summary: The front end plan of eventWindow in aggregation query
Key: IOTDB-5341
URL: https://issues.apache.org/jira/browse/IOTDB-5341
Project: Apache IoTDB
Issue Type: New Feature
Reporter: yang caiyin
Assignee: yang caiyin

Introduce eventWindow into the complete query procedure, which requires SQL parsing, statement analysis, and logical planning.
[jira] [Created] (IOTDB-5340) Backend of eventWindow in RawDataAggregationOperator
yang caiyin created IOTDB-5340:
----------------------------------

Summary: Backend of eventWindow in RawDataAggregationOperator
Key: IOTDB-5340
URL: https://issues.apache.org/jira/browse/IOTDB-5340
Project: Apache IoTDB
Issue Type: New Feature
Reporter: yang caiyin
Assignee: yang caiyin
[jira] [Assigned] (IOTDB-5164) [disk]datanode takes too much disk space, should improve
[ https://issues.apache.org/jira/browse/IOTDB-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaofei Cao reassigned IOTDB-5164:
---------------------------------

Assignee: Gaofei Cao (was: 陈哲涵)

> [disk]datanode takes too much disk space, should improve
> --------------------------------------------------------
>
> Key: IOTDB-5164
> URL: https://issues.apache.org/jira/browse/IOTDB-5164
> Project: Apache IoTDB
> Issue Type: Improvement
> Reporter: changxue
> Assignee: Gaofei Cao
> Priority: Major
> Attachments: iotdb-common.properties, iotdb-confignode.properties, iotdb-datanode.properties
>
>
> [disk]datanode takes too much disk space, should improve
> Here is the disk usage of one node. It shows that 124G of data takes 230G on one node, and there are 3 nodes with 3 replicas, so 124G of data takes about 6 times its real size in total. This is too much.
> {code}
> 124G ./datanode/data/sequence
> 51M ./datanode/data/unsequence
> 104G ./datanode/data/snapshot
> 228G ./datanode/data
> 414M ./datanode/wal/root.test-0
> 401M ./datanode/wal/root.test-3
> 394M ./datanode/wal/root.test-1
> 410M ./datanode/wal/root.test-2
> 394M ./datanode/wal/root.test-4
> 2.0G ./datanode/wal
> 4.0K ./datanode/system/compression_ratio
> 16K ./datanode/system/schema
> 4.0K ./datanode/system/roles
> 8.0K ./datanode/system/users
> 48K ./datanode/system/databases
> 4.0K ./datanode/system/upgrade
> 8.0K ./datanode/system/udf
> 100K ./datanode/system
> 5.2M ./datanode/consensus/schema_region
> 356K ./datanode/consensus/data_region
> 5.6M ./datanode/consensus
> 230G ./datanode
> 4.0K ./confignode/system/roles
> 8.0K ./confignode/system/users
> 4.0K ./confignode/system/procedure
> 24K ./confignode/system
> 4.1M ./confignode/consensus/47474747-4747-4747-4747-
> 4.1M ./confignode/consensus
> 4.1M ./confignode
> 230G .
> {code}
> 124G of data occupies 230G on a single node. This is a 3-node cluster configured with 3 replicas, so in total it takes 6 times the disk space. That is far too much, and I think it needs to be optimized. Does our snapshot design store some data twice? Can that space be reused?
> Note: this may be because insufficient disk space caused read-only mode, which was then followed by a snapshot.
[jira] [Created] (IOTDB-5339) What about executing migrate sql when all the datanodes of given regiongroup is Unknown?
Gaofei Cao created IOTDB-5339:
---------------------------------

Summary: What about executing migrate sql when all the datanodes of given regiongroup is Unknown?
Key: IOTDB-5339
URL: https://issues.apache.org/jira/browse/IOTDB-5339
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Gaofei Cao
Assignee: Gaofei Cao
[jira] [Created] (IOTDB-5338) WAL buffer flush threshold optimaztion
Jinrui Zhang created IOTDB-5338:
-----------------------------------

Summary: WAL buffer flush threshold optimaztion
Key: IOTDB-5338
URL: https://issues.apache.org/jira/browse/IOTDB-5338
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Jinrui Zhang
Assignee: Haiming Zhu
[jira] [Reopened] (IOTDB-5231) [monitor]datanode could not start when binding 9091 error
[ https://issues.apache.org/jira/browse/IOTDB-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyu Tan reopened IOTDB-5231:
------------------------------

> [monitor]datanode could not start when binding 9091 error
> ---------------------------------------------------------
>
> Key: IOTDB-5231
> URL: https://issues.apache.org/jira/browse/IOTDB-5231
> Project: Apache IoTDB
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: changxue
> Assignee: Hongyin Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: config.tar.gz, monitor_error_log.tar.gz
>
>
> [monitor]datanode could not start when binding 9091 error
> environment:
> 3C3D cluster, rel/1.0 branch
> 1. enable prometheus monitor
> 2. the prometheus service has not been started
> problem:
> 1. Monitoring is an add-on feature. If it is enabled and does not work properly, a warning is acceptable, but it should not raise an error and should not affect the startup of the RPC service and other services.
> 2. In this situation stop-datanode.sh cannot stop the DataNode; it has to be killed.
> 3. The ConfigNode started successfully and bound 9091; the DataNode then tried to bind 9091 as well and failed. This needs to be made to succeed.
> {code}
> 2022-12-19 10:26:23,574 [main] INFO o.a.i.m.AbstractMetricService:130 - Detect more than one MetricManager, will use org.apache.iotdb.metrics.micrometer.MicrometerMetricManager
> 2022-12-19 10:26:23,574 [main] INFO o.a.i.m.AbstractMetricService:137 - Load metric reporters, type: [PROMETHEUS]
> 2022-12-19 10:26:23,939 [main] ERROR o.a.i.c.s.m.MetricService:52 - Failed to start Metrics ServerService because:
> reactor.netty.ChannelBindException: Failed to bind on [0.0.0.0:9091]
> Suppressed: java.lang.Exception: #block terminated with an error
>     at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:139)
>     at reactor.core.publisher.Mono.block(Mono.java:1731)
>     at reactor.netty.transport.ServerTransport.bindNow(ServerTransport.java:145)
>     at reactor.netty.transport.ServerTransport.bindNow(ServerTransport.java:130)
>     at org.apache.iotdb.metrics.reporter.prometheus.PrometheusReporter.start(PrometheusReporter.java:81)
>     at org.apache.iotdb.metrics.CompositeReporter.startAll(CompositeReporter.java:38)
>     at org.apache.iotdb.metrics.AbstractMetricService.startAllReporter(AbstractMetricService.java:193)
>     at org.apache.iotdb.metrics.AbstractMetricService.startCoreModule(AbstractMetricService.java:98)
>     at org.apache.iotdb.metrics.AbstractMetricService.startService(AbstractMetricService.java:76)
>     at org.apache.iotdb.commons.service.metric.MetricService.start(MetricService.java:49)
>     at org.apache.iotdb.commons.service.RegisterManager.register(RegisterManager.java:51)
>     at org.apache.iotdb.db.service.DataNode.doAddNode(DataNode.java:162)
>     at org.apache.iotdb.db.service.DataNodeServerCommandLine.run(DataNodeServerCommandLine.java:95)
>     at org.apache.iotdb.commons.ServerCommandLine.doMain(ServerCommandLine.java:58)
>     at org.apache.iotdb.db.service.DataNode.main(DataNode.java:131)
> 2022-12-19 10:26:23,940 [main] ERROR o.a.i.db.service.DataNode:178 - Fail to start server
> {code}
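The first point of the report asks that a failing monitor produce a warning instead of aborting startup. The general pattern can be sketched as below, using a plain `java.net.ServerSocket` in place of the Prometheus reporter. The class and method names are hypothetical and this is not the IoTDB `MetricService` code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class OptionalReporterStart {

    // Tries to bind the reporter port. Returns the bound socket, or null when
    // the port cannot be bound; the failure is logged as a warning only.
    static ServerSocket tryBind(int port) {
        ServerSocket socket = null;
        try {
            socket = new ServerSocket();
            socket.bind(new InetSocketAddress(port));
            return socket;
        } catch (IOException e) {
            System.err.println("WARN: metrics reporter could not bind port " + port
                    + ", continuing without it: " + e.getMessage());
            if (socket != null) {
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing more to do for an optional component
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        ServerSocket reporter = tryBind(9091);
        // Core services start regardless of whether the reporter is available.
        System.out.println("reporter enabled: " + (reporter != null));
    }
}
```

Returning `null` rather than throwing keeps the decision with the caller, which can continue starting the RPC services either way.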
[jira] [Assigned] (IOTDB-5337) Parallelization of write operation in FragmentInstanceDispatcher
[ https://issues.apache.org/jira/browse/IOTDB-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinrui Zhang reassigned IOTDB-5337:
-----------------------------------

Sprint: 2023-1-Storage
Assignee: Haiming Zhu
Remaining Estimate: 72h
Original Estimate: 72h

> Parallelization of write operation in FragmentInstanceDispatcher
> ----------------------------------------------------------------
>
> Key: IOTDB-5337
> URL: https://issues.apache.org/jira/browse/IOTDB-5337
> Project: Apache IoTDB
> Issue Type: Improvement
> Reporter: Jinrui Zhang
> Assignee: Haiming Zhu
> Priority: Major
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> In the current implementation, the splits of a write operation are dispatched one by one.
>
> We can try dispatching them in parallel to improve the speed.
[jira] [Created] (IOTDB-5337) Parallelization of write operation in FragmentInstanceDispatcher
Jinrui Zhang created IOTDB-5337:
-----------------------------------

Summary: Parallelization of write operation in FragmentInstanceDispatcher
Key: IOTDB-5337
URL: https://issues.apache.org/jira/browse/IOTDB-5337
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Jinrui Zhang

In the current implementation, the splits of a write operation are dispatched one by one.

We can try dispatching them in parallel to improve the speed.
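The proposal above can be sketched with `CompletableFuture`: submit every split to an executor first, then wait for all results. This is only an illustration of the idea, not the `FragmentInstanceDispatcher` code; the class name, the generic split type, and the `dispatchOne` callback are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class ParallelDispatchSketch {

    // Dispatches all splits concurrently and returns results in input order.
    // join() rethrows the first failure wrapped in a CompletionException.
    static <T, R> List<R> dispatchAll(List<T> splits,
                                      Function<T, R> dispatchOne,
                                      ExecutorService pool) {
        List<CompletableFuture<R>> futures = new ArrayList<>();
        for (T split : splits) {
            // Submit everything before waiting on anything.
            futures.add(CompletableFuture.supplyAsync(() -> dispatchOne.apply(split), pool));
        }
        List<R> results = new ArrayList<>();
        for (CompletableFuture<R> future : futures) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> status = dispatchAll(
                    Arrays.asList("split-0", "split-1", "split-2"),
                    split -> split + ":OK",
                    pool);
            System.out.println(status);
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures before joining is what makes the dispatch parallel; joining inside the first loop would serialize it again.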
[jira] [Created] (IOTDB-5336) Investigation regarding write interface used by TSBS in IoTDB
Jinrui Zhang created IOTDB-5336:
-----------------------------------

Summary: Investigation regarding write interface used by TSBS in IoTDB
Key: IOTDB-5336
URL: https://issues.apache.org/jira/browse/IOTDB-5336
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Jinrui Zhang
Assignee: Haiming Zhu
[jira] [Created] (IOTDB-5335) InsertRecords performance optimization
Jinrui Zhang created IOTDB-5335:
-----------------------------------

Summary: InsertRecords performance optimization
Key: IOTDB-5335
URL: https://issues.apache.org/jira/browse/IOTDB-5335
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Jinrui Zhang
Assignee: Haiming Zhu