Happy for this feature to be merged.

On Fri, Aug 23, 2019, 11:49 Ash Berlin-Taylor <a...@apache.org> wrote:
> This has come up a few times before, someone has now opened a PR that
> makes this a global+per-dag setting:
> https://github.com/apache/airflow/pull/5787 and it also includes docs
> that I think does a good job of illustrating the two modes.
>
> Does anyone object to this being merged? If no one says anything by midday
> on Tuesday I will take that as assent and will merge it.
>
> The docs from the PR included below.
>
> Thanks,
> Ash
>
> Scheduled Time vs Execution Time
> ''''''''''''''''''''''''''''''''
>
> A DAG with a ``schedule_interval`` will execute once per interval. By
> default, the execution of a DAG will occur at the **end** of the
> schedule interval.
>
> A few examples:
>
> - A DAG with ``schedule_interval='@hourly'``: The DAG run that processes
>   2019-08-16 17:00 will start running just after 2019-08-16 17:59:59,
>   i.e. once that hour is over.
> - A DAG with ``schedule_interval='@daily'``: The DAG run that processes
>   2019-08-16 will start running shortly after 2019-08-17 00:00.
>
> The reasoning behind this execution vs scheduling behaviour is that
> data for the interval to be processed won't be fully available until
> the interval has elapsed.
>
> In cases where you wish the DAG to be executed at the **start** of the
> interval, specify ``schedule_at_interval_end=False``, either in
> ``airflow.cfg``, or on a per-DAG basis.
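
For anyone skimming the thread, the two modes described in the quoted docs can be sketched with plain datetime arithmetic. This is only an illustration of the documented behaviour, not code from the PR: `run_start_time` and its parameters are hypothetical names, and a fixed `timedelta` stands in for the schedule interval.

```python
from datetime import datetime, timedelta


def run_start_time(execution_date, interval, schedule_at_interval_end=True):
    """Earliest time the run covering `execution_date` may start.

    Hypothetical helper: it mirrors the behaviour in the quoted docs,
    it is not an Airflow API.
    """
    if schedule_at_interval_end:
        # Default mode: wait until the whole interval has elapsed.
        return execution_date + interval
    # Alternative mode: start as soon as the interval begins.
    return execution_date


hourly = run_start_time(datetime(2019, 8, 16, 17, 0), timedelta(hours=1))
daily = run_start_time(datetime(2019, 8, 16), timedelta(days=1))
at_start = run_start_time(
    datetime(2019, 8, 16), timedelta(days=1), schedule_at_interval_end=False
)

print(hourly)    # 2019-08-16 18:00:00
print(daily)     # 2019-08-17 00:00:00
print(at_start)  # 2019-08-16 00:00:00
```

The first two results match the hourly and daily examples in the docs; the third shows what setting the flag to False would change for the daily case.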