[ https://issues.apache.org/jira/browse/FLINK-38344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei Zhong reassigned FLINK-38344:
---------------------------------
Assignee: RocMarshal
> The local files of the HistoryServer may risk never being deleted.
> ------------------------------------------------------------------
>
> Key: FLINK-38344
> URL: https://issues.apache.org/jira/browse/FLINK-38344
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Web Frontend
> Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.1.1
> Reporter: RocMarshal
> Assignee: RocMarshal
> Priority: Minor
> Labels: pull-request-available
> Fix For: 2.2.0, 2.1.1, 1.20.4
>
> Attachments: image-2025-09-11-00-31-26-595.png,
> image-2025-09-11-00-32-25-793.png, image-2025-09-11-00-34-54-580.png
>
>
> When the {{historyserver.web.tmpdir}} configuration points to a non-system
> temporary directory, the contents of this directory are cleaned up only
> if they are explicitly deleted.
> Under the current cleanup logic, files in this directory are removed only in
> the following two scenarios:
> 1. {*}When the HistoryServer encounters an exception{*}, it actively cleans
> up this directory. However, if the HistoryServer process is forcibly
> terminated from outside (for example, killed with SIGKILL), this cleanup
> logic is never triggered.
> !image-2025-09-11-00-31-26-595.png!
>
> !image-2025-09-11-00-32-25-793.png!
>
> 2. {*}The {{{}HistoryServerArchiveFetcher{}}}{*} builds
> {{{}cachedArchivesPerRefreshDirectory{}}} based on the job information still
> present in the remote directory and uses it to determine which local job
> files need cleanup. Consequently, local job files whose archives no longer
> exist in remote storage are not tracked and will never be deleted. If the
> HistoryServer retains a large number of such files, this may lead to
> excessive file handle usage on the local node, resulting in file descriptor
> leaks.
> !image-2025-09-11-00-34-54-580.png!
>
> A relatively straightforward fix would be:
> In the HistoryServer constructor, first clear all files in the
> {{{}historyserver.web.tmpdir{}}} directory before proceeding with the original
> initialization logic. This ensures that the local files present on disk are
> exactly those tracked for cleanup in
> {{{}HistoryServerArchiveFetcher#cachedArchivesPerRefreshDirectory{}}}, so
> no stale files are leaked.
> I'd like to fix it.
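
The proposed fix could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name TmpDirCleaner and the method clearDirectory are hypothetical and are not part of Flink, and the actual patch may reuse existing Flink utilities instead. The idea is that the HistoryServer constructor would call such a helper on the directory configured by historyserver.web.tmpdir before the archive fetcher is created.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not an existing Flink class. */
final class TmpDirCleaner {

    private TmpDirCleaner() {}

    /**
     * Deletes everything below {@code dir} while keeping {@code dir} itself.
     *
     * @return the number of files and directories that were deleted
     */
    static int clearDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            // Nothing to clean, e.g. on the very first start.
            return 0;
        }
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories,
            // so every directory is already empty when it is deleted.
            entries = walk.filter(p -> !p.equals(dir))
                    .sorted(Comparator.reverseOrder())
                    .collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
        return entries.size();
    }
}
```

Starting from an empty directory means every local file is created by the current process and is therefore known to the fetcher's in-memory cache. The trade-off is that archives still present in remote storage have to be fetched again after each restart.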
--
This message was sent by Atlassian Jira
(v8.20.10#820010)