[jira] [Created] (IOTDB-5835) Fix wal accumulation caused by datanode restart
Xinyu Tan created IOTDB-5835:

Summary: Fix wal accumulation caused by datanode restart
Key: IOTDB-5835
URL: https://issues.apache.org/jira/browse/IOTDB-5835
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Xinyu Tan
Assignee: Xinyu Tan
Attachments: image-2023-04-28-11-08-43-542.png, image-2023-04-28-11-08-51-622.png, image-2023-04-28-11-08-57-549.png, image-2023-04-28-11-09-03-902.png

When the cluster is running properly and replica A of a consensus group is the Leader, it continuously sends logs to the followers and updates the WAL's safelyDeletedSearchIndex after sending them. WAL files are deleted asynchronously, so if a restart occurs, some logs that have already been synchronized to other nodes may not yet have been deleted.

After the restart, another replica B may become the Leader while replica A becomes a Follower that receives logs. Because IoTConsensus currently does not use its recovered syncIndex to set the safelyDeletedSearchIndex of the underlying WAL node at startup, replica A cannot delete WAL files in this state, so WAL files accumulate. Write requests for all regions on the node are affected.

!image-2023-04-28-11-08-43-542.png|thumbnail! !image-2023-04-28-11-08-51-622.png|thumbnail! !image-2023-04-28-11-08-57-549.png|thumbnail! !image-2023-04-28-11-09-03-902.png|thumbnail!

The solution is to update the safelyDeletedSearchIndex of the reader at startup.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
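The effect described in this issue can be illustrated with a toy model. The sketch below is not IoTDB code: `ToyWalNode`, `restart`, and the choice of `Long.MIN_VALUE` as the initial "nothing may be deleted" value are all assumptions made for illustration. It only shows why a follower that never receives a safe-delete index keeps every WAL file, and how seeding that index from the recovered sync index at startup lets already-synchronized files go.

```java
import java.util.ArrayList;
import java.util.List;

public class WalStartupSketch {
    /** Hypothetical stand-in for a WAL node; not IoTDB's real class. */
    static class ToyWalNode {
        // Assumption: until someone sets it, no entry is considered safe to delete.
        private long safelyDeletedSearchIndex = Long.MIN_VALUE;
        // Each element is the largest search index contained in one WAL file.
        private final List<Long> fileMaxSearchIndexes = new ArrayList<>();

        void addFile(long maxSearchIndex) {
            fileMaxSearchIndexes.add(maxSearchIndex);
        }

        void setSafelyDeletedSearchIndex(long index) {
            safelyDeletedSearchIndex = index;
        }

        /** Deletes files whose entries are all at or below the safe index; returns how many were deleted. */
        int deleteOutdatedFiles() {
            int before = fileMaxSearchIndexes.size();
            fileMaxSearchIndexes.removeIf(max -> max <= safelyDeletedSearchIndex);
            return before - fileMaxSearchIndexes.size();
        }

        int fileCount() {
            return fileMaxSearchIndexes.size();
        }
    }

    /** Simulates a restart that finds three leftover WAL files on disk. */
    static ToyWalNode restart(long recoveredSyncIndex, boolean seedFromSyncIndex) {
        ToyWalNode node = new ToyWalNode();
        node.addFile(100);
        node.addFile(200);
        node.addFile(300);
        if (seedFromSyncIndex) {
            // The proposed fix, in miniature: reuse the recovered sync index at startup.
            node.setSafelyDeletedSearchIndex(recoveredSyncIndex);
        }
        return node;
    }

    public static void main(String[] args) {
        ToyWalNode unfixed = restart(200, false);
        System.out.println("deleted without seeding: " + unfixed.deleteOutdatedFiles());
        ToyWalNode seeded = restart(200, true);
        System.out.println("deleted with seeding: " + seeded.deleteOutdatedFiles());
    }
}
```

In this model the unseeded node can never delete anything while it stays a follower, which mirrors the accumulation reported above.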
[jira] [Assigned] (IOTDB-5814) IoTDBSchemaTemplateIT unexpected failure
[ https://issues.apache.org/jira/browse/IOTDB-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yukun Zhou reassigned IOTDB-5814:

Assignee: yanze chen (was: Yukun Zhou)

> IoTDBSchemaTemplateIT unexpected failure
>
> Key: IOTDB-5814
> URL: https://issues.apache.org/jira/browse/IOTDB-5814
> Project: Apache IoTDB
> Issue Type: Bug
> Reporter: Yukun Zhou
> Assignee: yanze chen
> Priority: Major
> Attachments: screenshot-1.png
>
> There are occasional failures, as shown in the following picture.
> !image-2023-04-25-09-18-00-851.png!
[jira] [Created] (IOTDB-5834) Unclear error msg when querying nonexistent schema template
Yukun Zhou created IOTDB-5834:

Summary: Unclear error msg when querying nonexistent schema template
Key: IOTDB-5834
URL: https://issues.apache.org/jira/browse/IOTDB-5834
Project: Apache IoTDB
Issue Type: Bug
Reporter: Yukun Zhou
Assignee: Yukun Zhou
[jira] [Created] (IOTDB-5833) Concurrent bug caused by non-atomic operation in QueryExecution
Yuan Tian created IOTDB-5833:

Summary: Concurrent bug caused by non-atomic operation in QueryExecution
Key: IOTDB-5833
URL: https://issues.apache.org/jira/browse/IOTDB-5833
Project: Apache IoTDB
Issue Type: Bug
Components: Core/Query
Reporter: Yuan Tian
Assignee: Yuan Tian
Attachments: image-2023-04-28-09-24-08-392.png

!image-2023-04-28-09-24-08-392.png!
[jira] [Created] (IOTDB-5832) Bug: The size of readyQueue is negative incorrectly
Alima777 created IOTDB-5832:

Summary: Bug: The size of readyQueue is negative incorrectly
Key: IOTDB-5832
URL: https://issues.apache.org/jira/browse/IOTDB-5832
Project: Apache IoTDB
Issue Type: Bug
Reporter: Alima777
Assignee: Alima777
Attachments: image-2023-04-27-22-07-07-356.png

!image-2023-04-27-22-07-07-356.png!
[jira] [Created] (IOTDB-5831) Drop database won't delete totally files in disk
Yuan Tian created IOTDB-5831:

Summary: Drop database won't delete totally files in disk
Key: IOTDB-5831
URL: https://issues.apache.org/jira/browse/IOTDB-5831
Project: Apache IoTDB
Issue Type: Bug
Components: Core/Schema Manager
Reporter: Yuan Tian
Assignee: Yukun Zhou

While dropping a database and inserting into it concurrently, I found that stale directories related to its data regions may remain on disk, for example under the consensus subdirectory and the wal subdirectory.
[jira] [Created] (IOTDB-5830) WAL file size is not right
Yuan Tian created IOTDB-5830:

Summary: WAL file size is not right
Key: IOTDB-5830
URL: https://issues.apache.org/jira/browse/IOTDB-5830
Project: Apache IoTDB
Issue Type: Bug
Components: Core/WAL
Reporter: Yuan Tian
Assignee: Hongyin Zhang
Attachments: image-2023-04-27-20-05-36-019.png, image-2023-04-27-20-05-45-373.png

!image-2023-04-27-20-05-36-019.png! !image-2023-04-27-20-05-45-373.png!

The actual WAL size is less than 1 GB, but the monitor reports that it occupies more than 50 GB. The writing thread also uses this reported value to decide whether to throttle the insert rate.
[jira] [Created] (IOTDB-5829) Query with limit clause will cause other concurrent query break down
Yuan Tian created IOTDB-5829:

Summary: Query with limit clause will cause other concurrent query break down
Key: IOTDB-5829
URL: https://issues.apache.org/jira/browse/IOTDB-5829
Project: Apache IoTDB
Issue Type: Bug
Components: Core/Query
Reporter: Yuan Tian
Assignee: Yuan Tian
Attachments: image-2023-04-27-20-01-33-737.png

!image-2023-04-27-20-01-33-737.png!

This exception occurs because Thread-A was doing I/O when another thread interrupted it. There are many possible reasons for the interrupt, such as a timeout, or the downstream limit having been satisfied so that the upstream scan task must end early. The JVM then automatically closes the FileChannel. However, our file handles are shared, which means that if Thread-B is using the same file handle at that moment, it exits abnormally because the handle has been closed.
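The JVM behaviour behind this report can be reproduced in isolation. The following is a minimal sketch using only standard `java.nio` classes; the class and method names are made up for the demonstration and none of it is taken from IoTDB. A thread whose interrupt status is set attempts a read on a shared `FileChannel`, which closes the channel and raises `ClosedByInterruptException` in that thread; a later read from a different thread on the same channel then fails with `ClosedChannelException`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedChannelInterruptDemo {
    /** Plays the role of Thread-A: it is interrupted around the time it performs I/O. */
    static void readWhileInterrupted(FileChannel shared) {
        // Setting the flag first makes the demo deterministic; a real interrupt
        // arriving during a blocking read has the same effect on the channel.
        Thread.currentThread().interrupt();
        try {
            shared.read(ByteBuffer.allocate(4));
        } catch (ClosedByInterruptException expected) {
            // The channel has been closed as a side effect of the interrupt.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the second reader fails because the shared channel was closed. */
    static boolean threadBFailsAfterThreadAInterrupted() {
        try {
            Path file = Files.createTempFile("shared-channel", ".bin");
            try {
                Files.write(file, new byte[] {1, 2, 3, 4});
                try (FileChannel shared = FileChannel.open(file, StandardOpenOption.READ)) {
                    Thread threadA = new Thread(() -> readWhileInterrupted(shared), "Thread-A");
                    threadA.start();
                    threadA.join();
                    try {
                        // The calling thread plays Thread-B and reuses the same channel.
                        shared.read(ByteBuffer.allocate(4));
                        return false;
                    } catch (ClosedChannelException e) {
                        return true;
                    }
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Thread-B failed: " + threadBFailsAfterThreadAInterrupted());
    }
}
```

The demo says nothing about how the issue was eventually resolved; it only isolates the interaction between thread interruption and a shared channel.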
[jira] [Created] (IOTDB-5828) Optimize the implementation of some metric items in the metric module to prevent Prometheus pull timeouts
Xinyu Tan created IOTDB-5828:

Summary: Optimize the implementation of some metric items in the metric module to prevent Prometheus pull timeouts
Key: IOTDB-5828
URL: https://issues.apache.org/jira/browse/IOTDB-5828
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Xinyu Tan
Assignee: Liuxuxin
Attachments: image-2023-04-27-17-01-37-144.png, image-2023-04-27-17-03-29-978.png

!image-2023-04-27-17-03-29-978.png! !image-2023-04-27-17-01-37-144.png!

Under high write pressure, even without Full GC, the time taken by individual monitor items in the monitoring framework can cause Prometheus pull sampling to time out. This results in missing monitoring data, which ultimately hampers the troubleshooting of performance problems. The three most time-consuming items found by jprofile sampling are the number of file handles, the number of concurrent clients, and the number of threads. Their implementation needs to be optimized.
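The issue does not say how the metric items should be optimized. One common way to keep a slow gauge from delaying a scrape is to cache its value for a short interval, sketched below. `CachedGauge` and its constructor parameters are hypothetical names invented for this example, and the thread count from the standard `ThreadMXBean` is used only as a stand-in for an expensive source.

```java
import java.lang.management.ManagementFactory;
import java.util.function.LongSupplier;

public class CachedGaugeSketch {
    /** Hypothetical gauge wrapper: recomputes the value at most once per ttlMillis. */
    static class CachedGauge {
        private final LongSupplier expensiveSource;
        private final LongSupplier clockMillis;
        private final long ttlMillis;
        private long cachedValue;
        private long lastRefreshMillis;
        private boolean initialized;

        CachedGauge(LongSupplier expensiveSource, LongSupplier clockMillis, long ttlMillis) {
            this.expensiveSource = expensiveSource;
            this.clockMillis = clockMillis;
            this.ttlMillis = ttlMillis;
        }

        /** Returns the cached value, refreshing it only when it is missing or older than the TTL. */
        synchronized long value() {
            long now = clockMillis.getAsLong();
            if (!initialized || now - lastRefreshMillis >= ttlMillis) {
                cachedValue = expensiveSource.getAsLong();
                lastRefreshMillis = now;
                initialized = true;
            }
            return cachedValue;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a slow metric: the JVM's live thread count.
        CachedGauge threads = new CachedGauge(
            () -> ManagementFactory.getThreadMXBean().getThreadCount(),
            System::currentTimeMillis,
            5_000L);
        System.out.println("threads: " + threads.value());
        System.out.println("threads (served from cache): " + threads.value());
    }
}
```

The trade-off is staleness: a scrape may observe a value up to one TTL old, which is usually acceptable for counts of this kind.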
[jira] [Created] (IOTDB-5827) Change default multi_dir_strategy to SequenceStrategy and fix original bug
Jinrui Zhang created IOTDB-5827:

Summary: Change default multi_dir_strategy to SequenceStrategy and fix original bug
Key: IOTDB-5827
URL: https://issues.apache.org/jira/browse/IOTDB-5827
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Jinrui Zhang
Fix For: 1.1.1

# change the default multi_dir_strategy to SequenceStrategy
# fix the original bug in SequenceStrategy where one folder won't be used if the others' space is limited
# use {{diskSpaceWarningThreshold}} to decide whether a folder is full or not
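The three points above can be pictured with a small round-robin selector. This is a sketch under assumptions, not the actual SequenceStrategy: `SequencePicker` is a hypothetical class, and `diskSpaceWarningThreshold` is treated here as a minimum free-space ratio, so a folder counts as full once its free ratio is at or below that value.

```java
import java.util.function.IntToDoubleFunction;

public class SequenceDirSketch {
    /** Hypothetical round-robin folder selector that skips folders considered full. */
    static class SequencePicker {
        private final int folderCount;
        private final IntToDoubleFunction freeRatioOf;
        // Assumption: a folder is "full" when its free-space ratio is at or below this value.
        private final double diskSpaceWarningThreshold;
        private int next = 0;

        SequencePicker(int folderCount, IntToDoubleFunction freeRatioOf, double diskSpaceWarningThreshold) {
            this.folderCount = folderCount;
            this.freeRatioOf = freeRatioOf;
            this.diskSpaceWarningThreshold = diskSpaceWarningThreshold;
        }

        /** Returns the next non-full folder in sequence, trying each folder at most once. */
        int nextFolderIndex() {
            for (int tried = 0; tried < folderCount; tried++) {
                int candidate = (next + tried) % folderCount;
                if (freeRatioOf.applyAsDouble(candidate) > diskSpaceWarningThreshold) {
                    next = (candidate + 1) % folderCount;
                    return candidate;
                }
            }
            throw new IllegalStateException("all folders are full");
        }
    }

    public static void main(String[] args) {
        double[] freeRatios = {0.50, 0.01, 0.50}; // folder 1 is nearly full
        SequencePicker picker = new SequencePicker(freeRatios.length, i -> freeRatios[i], 0.05);
        for (int i = 0; i < 4; i++) {
            System.out.println("write to folder " + picker.nextFolderIndex());
        }
    }
}
```

Because the scan always resumes from the folder after the last one used, a full folder is skipped without starving the remaining ones.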
[jira] [Assigned] (IOTDB-5777) When writing data using non-root users, the permission authentication module takes too long
[ https://issues.apache.org/jira/browse/IOTDB-5777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyu Tan reassigned IOTDB-5777:

Assignee: Hongyin Zhang

> When writing data using non-root users, the permission authentication module takes too long
>
> Key: IOTDB-5777
> URL: https://issues.apache.org/jira/browse/IOTDB-5777
> Project: Apache IoTDB
> Issue Type: Improvement
> Reporter: Liuxuxin
> Assignee: Hongyin Zhang
> Priority: Major
> Fix For: master branch, 1.1.0
> Attachments: 20230414-162617.html, image-2023-04-17-11-27-41-532.png
>
> When writing data using non-root users, the time consumption of the permission authentication module is too high, accounting for about 2/3 of the total write time. The flame graph shows that the time consumption is mainly concentrated on the initialization of PartialPath.
> !image-2023-04-17-11-27-41-532.png!