I like the idea of supporting start_date=None, but that absolutely should
not mean that we interpret start_date as “now”. start_date=now is one of
the most common ways to shoot yourself in the foot writing DAGs. I think
interpreting start_date=None as “don’t do any sort of catchup and run the
next time you’re able” makes some amount of sense, but I like Philippe’s
idea a little more. Specifically, it seems like bool is simply not a
correct type for catchup, as we can describe at least 3 behaviors that make
sense. What if we change the type to string, and keep accepting bool as a
legacy form at least until 3.0?

catchup="all" (or True): run all past intervals. Make "all" the default.
catchup="none": do not run any past interval.
catchup="last" (or False): run only the most recent interval.
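
To make the mapping concrete, here is a rough sketch of how the legacy bool
values could be normalized onto the three string modes. The names Catchup,
normalize_catchup and intervals_to_run are made up for illustration and are
not existing Airflow API; intervals are plain strings to keep it short.

```python
from enum import Enum
from typing import List, Sequence, Union


class Catchup(str, Enum):
    """Hypothetical three-way catchup mode (not an existing Airflow class)."""

    ALL = "all"    # run every past interval
    NONE = "none"  # run no past interval
    LAST = "last"  # run only the most recent past interval


def normalize_catchup(value: Union[bool, str] = "all") -> Catchup:
    """Map a legacy bool or a string onto a Catchup mode."""
    # Legacy bools: True behaves like "all", False like "last".
    if isinstance(value, bool):
        return Catchup.ALL if value else Catchup.LAST
    # Raises ValueError for anything other than "all", "none" or "last".
    return Catchup(value.lower())


def intervals_to_run(past: Sequence[str], mode: Catchup) -> List[str]:
    """Pick which already-elapsed intervals should get a run."""
    if mode is Catchup.ALL:
        return list(past)
    if mode is Catchup.LAST:
        return list(past[-1:])  # empty if there is no past interval
    return []
```

With this, catchup=False and catchup="last" land on the same code path, so
existing DAGs would keep their behavior while "none" becomes expressible.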

On Tue, Mar 22, 2022 at 1:15 PM Daniel Standish wrote:

> There's some wiggliness here because of Airflow's behavior of actually
> *running* the dag at the end of the interval rather than the start.  So
> if we have start_date=None, then we default the start date to *now,* then
> maybe to be consistent, the first run needs to be not 00:00 tomorrow but
> 00:00 the next day.  The oddness is amplified when you consider a monthly
> dag, where if you deploy now, start date is now, first schedulable run is
> next month, therefore first run _more_ than a month away.  To fix this I
> think we need to add support in our timetables for running at the start of
> the interval instead of the end -- and I think this is something that
> timetables were introduced to support anyway.