> Personally, I think the best solution is to create a new command line
> sub-command responsible for log and/or database cleaning. Users could then
> come up with their own mechanism on how to run it (trigger it when disk or
> storage percentage reaches high value or simply periodically using cron) on
> every node they use.

+1. Fully agree.

We already have a nice (and fast in 2.0) command line for Airflow that we use for different purposes, and having more of the maintenance commands there is a great idea. I think we just need to make sure that they are generic enough to be applicable in a wide range of scenarios, but looking at the example you provided, I think that is what you actually proposed.

Also, I think that if we implement the maintenance actions well, they could still be used in "maintenance DAGs" via a Python operator. This leaves users free to choose whether some maintenance is run via a CRON "airflow NNN" command or via DAGs they write to run it on workers (both cases might be justified).

BTW, isn't it funny that Airflow's aim is to replace CRON, yet we want to use CRON for its maintenance ;). I think it is fully justified in this case though: using Airflow to run such maintenance tasks might be overkill.

--
Jarek Potiuk
Polidea | Principal Software Engineer
M: +48 660 796 129
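
To make the "generic enough" point above concrete, here is a minimal sketch of what one reusable maintenance action might look like. It is not an existing Airflow command or API: the function name `clean_old_logs` and its parameters (`log_dir`, `max_age_days`, `dry_run`, `now`) are hypothetical and use only the Python standard library.

```python
import os
import time
from pathlib import Path


def clean_old_logs(log_dir, max_age_days, dry_run=False, now=None):
    """Remove regular files under log_dir not modified for max_age_days.

    Hypothetical helper for illustration only; not part of Airflow.
    Returns the paths that were removed (or would be, when dry_run is set).
    """
    # Allow the caller to inject "now" so the cutoff is easy to test.
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day

    removed = []
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(Path(log_dir).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Because it is a plain function with no scheduler dependency, the same code could in principle sit behind a command line sub-command run from cron, or be passed as the callable of a Python operator in a maintenance DAG, which is the freedom of choice described above.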
