[ 
https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-24122:
--------------------------------
    Fix Version/s: 1.20.0

> Add support to do clean in history server
> -----------------------------------------
>
>                 Key: FLINK-24122
>                 URL: https://issues.apache.org/jira/browse/FLINK-24122
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / REST
>            Reporter: zlzhang0122
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.19.0, 1.20.0
>
>
> Currently, the history server can clean up history jobs in two ways:
>  # If users have configured 
> {code:java}
> historyserver.archive.clean-expired-jobs: true{code}
> , the history server compares the files in HDFS across two consecutive clean 
> intervals, finds the ones that were deleted, and removes the corresponding 
> local cache files.
>  # If users have configured 
> {code:java}
> historyserver.archive.retained-jobs:{code}
> with a positive number, the oldest files are removed from both HDFS and the 
> local cache.
> However, a suitable retained-jobs number is difficult to determine.
> For example, users may want to check yesterday's history jobs, but if many 
> jobs fail today and exceed the retained-jobs number, yesterday's history jobs 
> will be deleted. What about adding a configuration option that specifies a 
> retention time, i.e. the maximum time a history job is retained?
> Also, the history server can't clean up job history files that are no longer 
> in HDFS but are still cached in the local filesystem. These files are kept 
> forever unless users remove them manually. Maybe we can add an option and 
> perform this cleanup when the option is set to true.
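
The two proposals above (a time-based retention limit and a cleanup of local cache entries that no longer exist remotely) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `RetentionSketch`, its method names, and the idea of passing job IDs with archive timestamps are hypothetical and are not part of Flink's history server code or of the linked pull request.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; none of these names exist in Flink. */
final class RetentionSketch {

    /**
     * Returns the IDs of jobs whose archive timestamp is older than
     * {@code retainedTime} relative to {@code now}.
     */
    static Set<String> expiredJobs(
            Map<String, Instant> archiveTimes, Duration retainedTime, Instant now) {
        Instant cutoff = now.minus(retainedTime);
        Set<String> expired = new TreeSet<>();
        for (Map.Entry<String, Instant> entry : archiveTimes.entrySet()) {
            // Strictly older than the cutoff counts as expired.
            if (entry.getValue().isBefore(cutoff)) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }

    /**
     * Returns the IDs of jobs that are still cached locally but are no longer
     * present in the remote archive directory.
     */
    static Set<String> orphanedLocalJobs(Set<String> remoteJobIds, Set<String> localJobIds) {
        Set<String> orphaned = new TreeSet<>(localJobIds);
        orphaned.removeAll(remoteJobIds);
        return orphaned;
    }
}
```

With a time-based limit, a burst of failed jobs today would not push out yesterday's archives, since expiry depends only on each archive's age and not on how many newer archives exist.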



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
