Changing the setting mid-flight is already a massive edge case for many parts
of the scheduler. Can we easily test this sort of behaviour in unit tests?

I don't think DST needs extra tests as it uses the existing functions that are 
already well tested, no?
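
For illustration, a minimal sketch of the kind of unit test that could pin
down the two modes. `run_start_time` is a hypothetical helper written for this
example, not a function from the PR or from Airflow; only the flag name
`schedule_at_interval_end` and the dates come from the quoted docs.

```python
from datetime import datetime, timedelta


def run_start_time(period_start, interval, schedule_at_interval_end=True):
    """Hypothetical helper (not Airflow's API).

    Returns the moment at which the run covering the interval that begins
    at ``period_start`` becomes eligible to start.
    """
    if schedule_at_interval_end:
        # Default mode: wait until the whole interval has elapsed.
        return period_start + interval
    # Alternative mode: start as soon as the interval begins.
    return period_start


def test_hourly_run_waits_for_end_of_interval():
    start = datetime(2019, 8, 16, 17, 0)
    expected = datetime(2019, 8, 16, 18, 0)
    assert run_start_time(start, timedelta(hours=1)) == expected


def test_daily_run_starts_at_beginning_when_flag_is_false():
    start = datetime(2019, 8, 16)
    result = run_start_time(
        start, timedelta(days=1), schedule_at_interval_end=False
    )
    assert result == start


def test_flipping_the_flag_shifts_start_by_one_interval():
    # Rough stand-in for the "changed mid-flight" case: the same interval
    # evaluated under both settings differs by exactly one interval.
    start = datetime(2019, 8, 16)
    interval = timedelta(days=1)
    at_end = run_start_time(start, interval, schedule_at_interval_end=True)
    at_start = run_start_time(start, interval, schedule_at_interval_end=False)
    assert at_end - at_start == interval
```

This only covers the arithmetic with naive datetimes; a real test would have
to go through the scheduler and use timezone-aware dates.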

-a

> On 23 Aug 2019, at 13:24, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> 
> Happy for it as well. There are a number of cases where scheduling at start
> makes more sense and as we see Airflow is used now in multiple cases where
> there is no need to process data from an interval and wait until that data
> is ready.
> But indeed some more tests would be great - especially for edge cases.
> Changing mid-air is one, but I think there should also be a test for
> Daylight Saving Time changes.
> There are some tests for DST so they just need to be extended to cover
> those two different cases.
> 
> 
> J.
> 
> On Fri, Aug 23, 2019 at 7:37 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
> 
>> Happy for this feature to be merged
>> 
>> On Fri, Aug 23, 2019, 11:49 Ash Berlin-Taylor <a...@apache.org> wrote:
>> 
>>> This has come up a few times before, someone has now opened a PR that
>>> makes this a global+per-dag setting:
>>> https://github.com/apache/airflow/pull/5787 and it also includes docs
>>> that I think do a good job of illustrating the two modes.
>>> 
>>> Does anyone object to this being merged? If no one says anything by
>>> midday on Tuesday I will take that as assent and will merge it.
>>> 
>>> The docs from the PR included below.
>>> 
>>> Thanks,
>>> Ash
>>> 
>>> Scheduled Time vs Execution Time
>>> ''''''''''''''''''''''''''''''''
>>> 
>>> A DAG with a ``schedule_interval`` will execute once per interval. By
>>> default, the execution of a DAG will occur at the **end** of the
>>> schedule interval.
>>> 
>>> A few examples:
>>> 
>>> - A DAG with ``schedule_interval='@hourly'``: The DAG run that processes
>>> 2019-08-16 17:00 will start running just after 2019-08-16 17:59:59,
>>> i.e. once that hour is over.
>>> - A DAG with ``schedule_interval='@daily'``: The DAG run that processes
>>> 2019-08-16 will start running shortly after 2019-08-17 00:00.
>>> 
>>> The reasoning behind this execution vs scheduling behaviour is that
>>> data for the interval to be processed won't be fully available until
>>> the interval has elapsed.
>>> 
>>> In cases where you wish the DAG to be executed at the **start** of the
>>> interval, specify ``schedule_at_interval_end=False``, either in
>>> ``airflow.cfg``, or on a per-DAG basis.
>> 
> 
> 
> -- 
> 
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
> 
> M: +48 660 796 129 <+48660796129>
