I think we should use the UTC date for cron instead of the naive local date 
time. I will check of croniter implements this so we can rely on that.

B.

> On 28 Oct 2018, at 02:09, Bolke de Bruin <bdbr...@gmail.com> wrote:
> 
> I wonder how to treat this:
> 
> This is what I think happens (need to verify more, but I am pretty sure) the 
> specified DAG should run every 5 minutes. At DST change (3AM -> 2AM) we 
> basically hit a schedule that we have already seen. 2AM -> 3AM has already 
> happened. Obviously the intention is to run every 5 minutes. But what do we 
> do with the execution_date? Is this still idempotent? Should we indeed 
> reschedule? 
> 
> B.
> 
>> On 30 Oct 2018, at 19:01, Ash Berlin-Taylor <a...@apache.org> wrote:
>> 
>> I've done a bit more digging - the issue is of our tz-aware handling inside 
>> following_schedule (and previous schedule) - causing it to loop.
>> 
>> This section of the croniter docs seems relevant 
>> https://github.com/kiorky/croniter#about-dst
>> 
>>   Be sure to init your croniter instance with a TZ aware datetime for this 
>> to work !:
>>>>> local_date = tz.localize(datetime(2017, 3, 26))
>>>>> val = croniter('0 0 * * *', local_date).get_next(datetime)
>> 
>> I think the problem is that we are _not_ passing a TZ aware dag in and we 
>> should be.
>> 
>>> On 30 Oct 2018, at 17:35, Bolke de Bruin <bdbr...@gmail.com> wrote:
>>> 
>>> Oh that’s a great environment to start digging. Thanks. I’ll have a look.
>>> 
>>> B.
>>> 
>>> Verstuurd vanaf mijn iPad
>>> 
>>>> Op 30 okt. 2018 om 18:25 heeft Ash Berlin-Taylor <a...@apache.org> het 
>>>> volgende geschreven:
>>>> 
>>>> This line in airflow.jobs (line 874 in my checkout) is causing the loop:
>>>> 
>>>>         last_run = dag.get_last_dagrun(session=session)
>>>>         if last_run and next_run_date:
>>>>             while next_run_date <= last_run.execution_date:
>>>>                 next_run_date = dag.following_schedule(next_run_date)
>>>> 
>>>> 
>>>> 
>>>>> On 30 Oct 2018, at 17:20, Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>> 
>>>>> Hi, kaczors on gitter has produced a minmal reproduction case: 
>>>>> https://github.com/kaczors/airflow_1_10_tz_bug
>>>>> 
>>>>> Rough repro steps: In a VM, with time syncing disabled, and configured 
>>>>> with system timezone of Europe/Zurich (or any other CEST one) run 
>>>>> 
>>>>> - `date 10280250.00`
>>>>> - initdb, start scheduler, webserver, enable dag etc.
>>>>> - `date 10280259.00`
>>>>> - wait 5-10 mins for scheduler to catch up
>>>>> - After the on-the-hour task run the scheduler will spin up another 
>>>>> process to parse the dag... and it never returns.
>>>>> 
>>>>> I've only just managed to reproduce it, so haven't dug in to why yet. A 
>>>>> quick hacky debug print shows something is stuck in an infinite loop.
>>>>> 
>>>>> -ash
>>>>> 
>>>>>> On 29 Oct 2018, at 17:59, Bolke de Bruin <bdbr...@gmail.com> wrote:
>>>>>> 
>>>>>> Can this be confirmed? Then I can have a look at it. Preferably with dag 
>>>>>> definition code.
>>>>>> 
>>>>>> On the licensing requirements:
>>>>>> 
>>>>>> 1. Indeed licensing header for markdown documents. It was suggested to 
>>>>>> use html comments. I’m not sure how that renders with others like PDF 
>>>>>> though.
>>>>>> 2. The licensing notifications need to be tied to a specific version as 
>>>>>> licenses might change with versions.
>>>>>> 
>>>>>> Cheers
>>>>>> Bolke
>>>>>> 
>>>>>> Verstuurd vanaf mijn iPad
>>>>>> 
>>>>>>> Op 29 okt. 2018 om 12:39 heeft Ash Berlin-Taylor <a...@apache.org> het 
>>>>>>> volgende geschreven:
>>>>>>> 
>>>>>>> I was going to make a start on the release, but two people have 
>>>>>>> reported that there might be an issue around non-UTC dags and the 
>>>>>>> scheduler changing over from Summer time.
>>>>>>> 
>>>>>>>> 08:45 Emmanuel> Hi there, we are currently experiencing a very strange 
>>>>>>>> issue : we have hourly DAGs with a start_date in a local timezone (not 
>>>>>>>> UTC) and since (Sunday) the last winter time change they don’t run 
>>>>>>>> anymore. Any idea ?
>>>>>>>> 09:41 <Emmanuel> it impacted all our DAG that had a run at 3am 
>>>>>>>> (Europe/Paris), the exact time of winter time change :(
>>>>>>> 
>>>>>>> I am going to take a look at this today and see if I can get to the 
>>>>>>> bottom of it.
>>>>>>> 
>>>>>>> Bolke: are there any outstanding tasks/issues that you know of that 
>>>>>>> might slow down the vote for a 1.10.1? (i.e. did we sort of out all the 
>>>>>>> licensing issues that were asked of us? I thought I read something 
>>>>>>> about license declarations in markdown files?)
>>>>>>> 
>>>>>>> -ash
>>>>>>> 
>>>>>>>> On 28 Oct 2018, at 14:46, Bolke de Bruin <bdbr...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> I agree with that, but I would favor time based releases instead. We 
>>>>>>>> are again at the point that a release takes so much time that the gap 
>>>>>>>> is getting really big again. @ash why not start releasing now and move 
>>>>>>>> the remainder to 1.10.2? I dont think there are real blockers 
>>>>>>>> (although we might find them).
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 28 Oct 2018, at 15:35, airflowuser 
>>>>>>>>> <airflowu...@protonmail.com.INVALID> wrote:
>>>>>>>>> 
>>>>>>>>> I was really hoping that 
>>>>>>>>> https://github.com/apache/incubator-airflow/pull/4069 will be merged 
>>>>>>>>> into 1.10.1
>>>>>>>>> Deleting dags was a highly requested feature for 1.10 - this can fix 
>>>>>>>>> the problem with it.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>>>>>>>>>> On Friday, October 26, 2018 6:12 PM, Bolke de Bruin 
>>>>>>>>>>> <bdbr...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hey Ash,
>>>>>>>>>> 
>>>>>>>>>> I was wondering if you are picking up the 1.10.1 release? Master is 
>>>>>>>>>> speeding ahead and you were tracking fixes for 1.10.1 right?
>>>>>>>>>> 
>>>>>>>>>> B.
>>>>>>> 
>>>>> 
>>>> 
>> 
> 

Reply via email to