[ https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426554#comment-17426554 ]
zlzhang0122 commented on FLINK-24122:
-------------------------------------

[~trohrmann] [~pnowojski] [~jark] [~gyfora] what do you think? Any suggestions are much appreciated!

> Add support to do clean in history server
> -----------------------------------------
>
>                 Key: FLINK-24122
>                 URL: https://issues.apache.org/jira/browse/FLINK-24122
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / REST
>    Affects Versions: 1.12.3, 1.13.2
>            Reporter: zlzhang0122
>            Priority: Minor
>             Fix For: 1.14.1
>
>
> Currently, the history server can clean up history jobs in two ways:
> # If users have configured
> {code:java}
> historyserver.archive.clean-expired-jobs: true{code}
> , the history server compares the files in HDFS across two clean intervals,
> finds the deleted ones, and cleans their local cache files.
> # If users have set
> {code:java}
> historyserver.archive.retained-jobs:{code}
> to a positive number, the history server cleans the oldest files in HDFS and
> in the local cache.
> But the retained-jobs number is difficult to determine.
> For example, users may want to check yesterday's history jobs, but if many
> jobs fail today and exceed the retained-jobs number, yesterday's history jobs
> will be deleted. So what about adding a configuration option that specifies a
> retention time, i.e. the maximum time a history job is retained?
> Also, the history server can't clean job history files that are no longer in
> HDFS but are still cached in the local filesystem. These files are stored
> forever unless users delete them manually. Maybe we can provide an option and
> perform this cleanup when the option is set to true.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)