15767714253 opened a new issue, #63117: URL: https://github.com/apache/doris/issues/63117
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

4.1.0, compute-storage decoupled mode

### What's Wrong?

After running for a long time, or after executing clear_file_cache, the Doris file_cache leaves more than ten thousand empty hash directories on disk. These directories consume inodes, but none of GC, LRU, TTL, recycle_cached_data, clear_file_cache_async, or even the leak cleaner (PR #59269) removes them. As a result, `df -i` usage keeps climbing while `df -h` still shows plenty of free space.

I set file_cache_leak_scan_interval_seconds to 1 minute and file_cache_leak_fs_to_meta_ratio_threshold to 1.05, but the empty directories were still not deleted and inode usage kept growing. Eventually the disk became unusable, and new data could only be fetched from S3.

### What You Expected?

Inode usage should be tracked and empty directories should be deleted. This matters a lot for workloads with frequent updates and large data volumes.

Proposed Fix

Minimal fix: add directory cleanup in remove():

```cpp
// be/src/io/cache/block_file_cache.cpp
void BlockFileCache::remove(FileBlockSPtr file_block, ...) {
    auto hash = file_block->get_hash_value();
    // ... existing logic: remove from queue, _files, delete block file ...

    // Clean up empty parent directories once no block of this hash remains.
    auto hash_it = _files.find(hash);
    if (hash_it == _files.end() || hash_it->second.empty()) {
        // Use the non-throwing overloads: the throwing remove() raises
        // filesystem_error on a non-empty directory instead of being a no-op.
        std::error_code ec;
        auto cache_dir = file_block->get_cache_dir(); // hash dir path
        std::filesystem::remove(cache_dir, ec); // leaves the dir in place if not empty
        auto prefix_dir = cache_dir.parent_path();
        if (std::filesystem::is_empty(prefix_dir, ec)) { // false on error
            std::filesystem::remove(prefix_dir, ec);
        }
    }
}
```

This automatically covers all eviction paths: GC, LRU, TTL, recycle_cached_data, and clear_file_cache_async.

### How to Reproduce?

Run a heavy write and update workload until a sizeable part of the file_cache is in use, then clear the file cache and watch inode usage.

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
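
The directory cleanup proposed in this issue can also be tried outside Doris. The following is a minimal standalone sketch, not Doris code: `remove_empty_dirs_upward` and its `cache_root` parameter are hypothetical names. It assumes the layout implied by the snippet in the issue (cache root, then a prefix directory, then a hash directory) and that both paths are written without a trailing slash. It relies only on the non-throwing `std::filesystem` overloads, so a directory that is non-empty or has already disappeared is skipped rather than raising an exception.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of Doris): remove `hash_dir` if it is an
// empty directory, then its parent prefix directory if that became empty.
// Never removes `cache_root` itself. Returns how many directories were removed.
int remove_empty_dirs_upward(const fs::path& hash_dir, const fs::path& cache_root) {
    int removed = 0;
    std::error_code ec;
    fs::path dir = hash_dir;
    // Walk at most two levels: the hash dir, then the prefix dir.
    for (int level = 0; level < 2 && dir != cache_root; ++level) {
        // Both checks return false on error (e.g. the path no longer exists).
        if (!fs::is_directory(dir, ec) || !fs::is_empty(dir, ec)) {
            break;
        }
        // If another thread created a file in between, removal fails and
        // reports the error through `ec`; the directory is left in place.
        if (!fs::remove(dir, ec)) {
            break;
        }
        ++removed;
        dir = dir.parent_path();
    }
    return removed;
}
```

Bounding the walk to two levels and stopping at the cache root keeps a mistaken path from deleting anything above the cache directory. Whether a helper like this belongs inside `remove()` or in a periodic sweep is a design decision for the maintainers.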
