>
> Personally, I think the best solution is to create a new command line
> sub-command responsible for log and/or database cleaning. Users could then
> come up with their own mechanism on how to run it (trigger it when disk or
> storage percentage reaches high value or simply periodically using cron) on
> every node they use.


+1. Fully agree. We already have a nice (and fast in 2.0) command line
for Airflow that we are using for different purposes, and having more of
the maintenance commands there is a great idea. I think we just need to
make sure that they are generic enough to be applicable in a wide range
of scenarios, but judging by the example you provided, I think that is
what you actually proposed.

Also, I think if we implement the maintenance actions well, they might
still be used in "maintenance DAGs" via the Python operator. This way we
leave users free to choose whether a given maintenance action is run via
a CRON "airflow NNN" command or whether they write DAGs to run it on
workers (both cases might be justified).
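
To make this a bit more concrete, here is a rough sketch of what I mean
by "implemented well": the action is a plain Python function, and the
command line is only a thin wrapper around it. Everything here is made
up for illustration (the names clean_old_logs and main, the
--max-age-days and --dry-run flags); it is not an existing Airflow
command, and a real one would also have to cover database cleaning.

```python
import argparse
import os
import time


def clean_old_logs(log_dir, max_age_days, dry_run=False):
    """Delete files under log_dir older than max_age_days.

    Hypothetical helper, not part of Airflow. Returns the sorted list of
    affected paths so that callers (CLI or a DAG task) can report them.
    """
    cutoff = time.time() - max_age_days * 86400
    affected = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            # Compare the last-modified time against the cutoff.
            if os.path.getmtime(path) < cutoff:
                if not dry_run:
                    os.remove(path)
                affected.append(path)
    return sorted(affected)


def main(argv=None):
    """Thin command line wrapper (hypothetical flags) around the helper."""
    parser = argparse.ArgumentParser(description="Clean old log files")
    parser.add_argument("log_dir")
    parser.add_argument("--max-age-days", type=float, default=30)
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)
    for path in clean_old_logs(args.log_dir, args.max_age_days, args.dry_run):
        print(path)
    return 0
```

With this split, a cron entry would call the wrapper on each node, while
a maintenance DAG could pass the same function to PythonOperator as
python_callable, with log_dir and max_age_days supplied via op_kwargs.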

BTW, isn't it funny that Airflow's aim is to replace CRON, yet we want
to use CRON for its maintenance ;). I think it is fully justified in
this case though: using Airflow to run such maintenance tasks might be
overkill.


-- 

Jarek Potiuk
Polidea | Principal Software Engineer

M: +48 660 796 129
