[jira] [Commented] (IOTDB-5557) [ metadata ] The metadata query results are inconsistent
[ https://issues.apache.org/jira/browse/IOTDB-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751560#comment-17751560 ]

Xinyu Tan commented on IOTDB-5557:
----------------------------------

The root-cause analysis of the only abnormal log entry from the first test is given in https://issues.apache.org/jira/browse/IOTDB-6102. It is unrelated to this PR. The problem that this PR addresses has been resolved.

> [ metadata ] The metadata query results are inconsistent
> ---------------------------------------------------------
>
>                 Key: IOTDB-5557
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5557
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: Core/Schema Manager, mpp-cluster
>    Affects Versions: 1.1.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Song Ziyang
>            Priority: Blocker
>              Labels: pull-request-available
>         Attachments: IOTDB_5557.conf, image-2023-02-20-14-04-32-611.png, image-2023-07-29-08-21-43-740.png, screenshot-1.png
>
>
> master : 0219_0cd4461
> Start the cluster. After "enjoy" appears in log_datanode_all.log, query the metadata. The query results are inconsistent: the count keeps growing until all metadata has been loaded into memory.
> Expected: once the cluster has started serving queries, query results must be consistent.
> Test environment:
> 1. 192.168.10.76, 48 CPUs, 384 GB memory
> Metadata: 1 database, 10,000 devices, 600 series per device.
> ConfigNode:
> MAX_HEAP_SIZE="8G"
> DataNode:
> MAX_HEAP_SIZE="256G"
> MAX_DIRECT_MEMORY_SIZE="32G"
> Common configuration:
> time_partition_interval=6048000
> query_timeout_threshold=3600
> enable_seq_space_compaction=false
> enable_unseq_space_compaction=false
> enable_cross_space_compaction=false
> 2. Clear the operating system cache, start the database, and once "enjoy" appears, run count devices and check the result.
> cat check_device_count.sh
> while true
> do
>     v_start=`grep enjoy logs/log_datanode_all.log|wc -l`
>     if [[ ${v_start} = "1" ]];then
>         for i in {1..100}
>         do
>             ./sbin/start-cli.sh -h 192.168.10.76 -e "count devices;" >> dev_count_during_start.out
>         done
>         break
>     fi
> done
> The figure below shows that the result of count devices grows dynamically, up to 1, at which point the metadata is fully loaded into memory:
> !image-2023-02-20-14-04-32-611.png!

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IOTDB-5557) [ metadata ] The metadata query results are inconsistent
[ https://issues.apache.org/jira/browse/IOTDB-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748765#comment-17748765 ]

刘珍 commented on IOTDB-5557:
---------------------------

Kill the DataNode so that no snapshot is produced, restart the DataNode, and run the query script:

    #!/bin/bash
    # Start IoTDB in standalone mode in the background.
    nohup sudo ./sbin/start-standalone.sh >/dev/null 2>&1 &
    # Wait for the DataNode log file to be created.
    while true
    do
        if [[ -f "./logs/log_datanode_all.log" ]];then
            break
        fi
    done
    # Wait until exactly one log line contains "enjoy".
    while true
    do
        v_start=`grep enjoy logs/log_datanode_all.log|wc -l`
        if [[ ${v_start} = "1" ]];then
            break
        fi
    done
    # Query immediately, then 21 more times, collecting all output in 1.out.
    ./sbin/start-cli.sh -e "count devices" > ./1.out
    for i in {0..20}
    do
        ./sbin/start-cli.sh -e "count devices" >> ./1.out
    done

The first 9 queries fail with an error:
!screenshot-1.png!

> [ metadata ] The metadata query results are inconsistent
> ---------------------------------------------------------
>
>                 Key: IOTDB-5557
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5557
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: Core/Schema Manager, mpp-cluster
>    Affects Versions: 1.1.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Song Ziyang
>            Priority: Blocker
>              Labels: pull-request-available
>         Attachments: IOTDB_5557.conf, image-2023-02-20-14-04-32-611.png, screenshot-1.png
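Both polling loops in the script above spin without sleeping and never give up if the marker does not appear. A minimal sketch of a bounded wait follows. The function name wait_for_log_marker and the 120-second default are illustrative assumptions, not part of IoTDB. Unlike the original check, which requires exactly one matching line, this version returns as soon as at least one line matches.

```shell
# Hypothetical helper (not part of IoTDB): block until PATTERN appears in
# FILE, polling once per second, and give up after TIMEOUT seconds.
wait_for_log_marker() {
  local file="$1" pattern="$2" timeout="${3:-120}" waited=0
  # The file may not exist yet, so test for it before grepping.
  until [ -f "$file" ] && grep -q -- "$pattern" "$file"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for '$pattern' in $file" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

With the paths used in this thread, it could be called as: wait_for_log_marker logs/log_datanode_all.log enjoy 300 && ./sbin/start-cli.sh -e "count devices"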
[jira] [Commented] (IOTDB-5557) [ metadata ] The metadata query results are inconsistent
[ https://issues.apache.org/jira/browse/IOTDB-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747338#comment-17747338 ]

刘珍 commented on IOTDB-5557:
---------------------------

192.168.10.71, 48 CPUs, 256 GB
Test on 20230706
Clean 1C1D deployment (1 ConfigNode, 1 DataNode).
Use the benchmark configuration in the attachment. After the data has been written, run flush.
Clear the OS cache and restart the cluster.

iotdb master 0219_0cd4461: querying the metadata immediately after "enjoy" appears shows the metadata still being replayed. The number of devices returned keeps increasing until the replay is complete. DataNode startup takes 8 seconds.

iotdb master 0725_1be3e0d: querying the metadata immediately after "enjoy" appears, count devices returns the expected result. DataNode startup is slower, at 50 seconds. This can be considered normal, since it is the price of guaranteeing consistency.

Restart script:

    #!/bin/bash
    # Start IoTDB in standalone mode in the background.
    nohup sudo ./sbin/start-standalone.sh >/dev/null 2>&1 &
    # Wait for the DataNode log file to be created.
    while true
    do
        if [[ -f "./logs/log_datanode_all.log" ]];then
            break
        fi
    done
    # Wait until exactly one log line contains "enjoy".
    while true
    do
        v_start=`grep enjoy logs/log_datanode_all.log|wc -l`
        if [[ ${v_start} = "1" ]];then
            break
        fi
    done
    # Query immediately, then 11 more times, collecting all output in 1.out.
    ./sbin/start-cli.sh -e "count devices" > ./1.out
    for i in {0..10}
    do
        ./sbin/start-cli.sh -e "count devices" >> ./1.out
    done

Configuration parameters that need to be changed:

    liuzhen@fit-71:/data/mpp_test/i_m_0725_1be3e0d$ conf/confignode-env.sh
    MAX_HEAP_SIZE="8G"
    liuzhen@fit-71:/data/mpp_test/i_m_0725_1be3e0d$ conf/datanode-env.sh
    MAX_HEAP_SIZE="256G"
    MAX_DIRECT_MEMORY_SIZE="32G"
    liuzhen@fit-71:/data/mpp_test/i_m_0725_1be3e0d$ conf/iotdb-common.properties
    data_region_group_extension_policy=CUSTOM
    default_data_region_group_num_per_database=48

The number of data regions is set to 48 because iotdb master 0219_0cd4461 creates 48 data regions with this benchmark configuration. Using the same number keeps the two runs comparable.

> [ metadata ] The metadata query results are inconsistent
> ---------------------------------------------------------
>
>                 Key: IOTDB-5557
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5557
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: Core/Schema Manager, mpp-cluster
>    Affects Versions: 1.1.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Song Ziyang
>            Priority: Blocker
>              Labels: pull-request-available
>         Attachments: image-2023-02-20-14-04-32-611.png
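The comparison above depends on reading 1.out by eye. A small sketch of an automated check follows, assuming that each result row printed by the CLI has the form "|   10000|" (digits between two pipe characters). That row format and the function names extract_counts and counts_are_consistent are assumptions made for illustration and should be adjusted to the actual CLI output.

```shell
# Assumption: a result row looks like "|     10000|". Header, border, and
# timing lines do not match this pattern and are skipped.
extract_counts() {
  grep -E '^\|[[:space:]]*[0-9]+\|[[:space:]]*$' "$1" | tr -cd '0-9\n'
}

# Succeeds only if the file contains exactly one distinct count value.
counts_are_consistent() {
  local distinct
  distinct=$(extract_counts "$1" | sort -u | wc -l | tr -d ' ')
  [ "$distinct" -eq 1 ]
}
```

For example, counts_are_consistent ./1.out || extract_counts ./1.out | sort -un would list the distinct values seen when the results disagree.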
[jira] [Commented] (IOTDB-5557) [ metadata ] The metadata query results are inconsistent
[ https://issues.apache.org/jira/browse/IOTDB-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714493#comment-17714493 ]

Jinrui Zhang commented on IOTDB-5557:
-------------------------------------

The DataNode should not become ready (visible to clients) until all replays of metadata operations have finished.

> [ metadata ] The metadata query results are inconsistent
> ---------------------------------------------------------
>
>                 Key: IOTDB-5557
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5557
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: Core/Schema Manager, mpp-cluster
>    Affects Versions: 1.1.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Song Ziyang
>            Priority: Blocker
>         Attachments: image-2023-02-20-14-04-32-611.png
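The ordering requested here can be illustrated with a toy shell model. This is only a sketch of the idea, not IoTDB's startup code: the replay command and the ready file are hypothetical stand-ins for metadata replay and for the point at which the node becomes visible to clients.

```shell
# Toy model (not IoTDB code): run a replay command in the background and
# announce readiness only after it has completed successfully.
# $1 is a hypothetical ready-marker file; the remaining arguments are the
# replay command to run.
start_when_replayed() {
  local ready_file="$1" replay_pid
  shift
  "$@" &
  replay_pid=$!
  # Block here; nothing is announced while the replay is still running.
  if ! wait "$replay_pid"; then
    echo "replay failed; not announcing readiness" >&2
    return 1
  fi
  : > "$ready_file"
}
```

The cost of this ordering is a longer startup, because the caller waits for the whole replay before it reports ready.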
[jira] [Commented] (IOTDB-5557) [ metadata ] The metadata query results are inconsistent
[ https://issues.apache.org/jira/browse/IOTDB-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691188#comment-17691188 ]

Yukun Zhou commented on IOTDB-5557:
-----------------------------------

The root cause is that Ratis redoes (replays) the Raft log asynchronously.

> [ metadata ] The metadata query results are inconsistent
> ---------------------------------------------------------
>
>                 Key: IOTDB-5557
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5557
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: Core/Schema Manager, mpp-cluster
>    Affects Versions: 1.1.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Yukun Zhou
>            Priority: Major
>         Attachments: image-2023-02-20-14-04-32-611.png
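The effect of an asynchronous replay on a concurrent reader can be reproduced with a toy shell model. It does not use Ratis or IoTDB: a background job appends one line per replayed entry to a temporary file, and a reader counts the lines before and after the job finishes. All names, the entry count of 3, and the one-second delay are illustrative assumptions.

```shell
# Toy model of asynchronous log replay: append one entry per second.
replay_async() {
  local state="$1" entries="$2" i=1
  while [ "$i" -le "$entries" ]; do
    echo "device_$i" >> "$state"
    sleep 1
    i=$((i + 1))
  done
}

# Stand-in for "count devices": the number of entries applied so far.
count_entries() {
  wc -l < "$1" | tr -d ' '
}

# Read once while the replay is still running and once after it is done.
demo_async_replay() {
  local state pid
  state=$(mktemp)
  replay_async "$state" 3 &
  pid=$!
  echo "early: $(count_entries "$state")"
  wait "$pid"
  echo "final: $(count_entries "$state")"
  rm -f "$state"
}
```

Calling demo_async_replay prints an early count that is lower than the final count, which mirrors the growing count devices results described in this issue.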