This is the issue:

[2018-08-05 22:08:21,952] {jobs.py:906} INFO - NEXT RUN DATE: 2018-08-03 
00:00:00+00:00 tzinfo: <Timezone [UTC]>
[2018-08-05 22:08:22,007] {jobs.py:1425} INFO - Created <DagRun 
example_http_operator @ 2018-08-03 02:00:00+02:00: 
scheduled__2018-08-03T00:00:00+00:00, externally triggered: False>

[2018-08-05 22:08:24,651] {jobs.py:906} INFO - NEXT RUN DATE: 2018-08-04 
02:00:00+02:00 tzinfo: psycopg2.tz.FixedOffsetTimezone(offset=120, name=None)
[2018-08-05 22:08:24,696] {jobs.py:1425} INFO - Created <DagRun 
example_http_operator @ 2018-08-04 02:00:00+02:00: 
scheduled__2018-08-04T02:00:00+02:00, externally triggered: False>

Notice in the first two log lines that the next run date is correctly in UTC, 
but the value coming back from the DB gets a +02 offset. In the next two lines 
we get a psycopg2.tz.FixedOffsetTimezone, which according to the specs of 
https://github.com/spoqa/sqlalchemy-utc should be set to UTC, but it isn’t.

So changing the DB’s timezone setting to UTC fixes the symptom but not the cause.
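The mechanics can be reproduced with the stdlib alone: a tz-aware datetime carrying a fixed +02:00 offset is the very same instant as its UTC counterpart, but isoformat() (which is what ends up in the log path) renders it differently. A sketch only; datetime.timezone stands in for psycopg2.tz.FixedOffsetTimezone here:

```python
from datetime import datetime, timezone, timedelta

utc = datetime(2018, 8, 4, 0, 0, tzinfo=timezone.utc)
# What the driver hands back when the session TZ is e.g. Europe/Amsterdam:
# the same instant, but carrying a fixed +02:00 offset.
local = utc.astimezone(timezone(timedelta(minutes=120)))

print(utc == local)     # True - same moment in time
print(utc.isoformat())  # 2018-08-04T00:00:00+00:00
print(local.isoformat())  # 2018-08-04T02:00:00+02:00
```

Equal instants, different renderings; that is exactly how a scheduled run keyed on the isoformat string ends up under a different path.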

B.

> On 5 Aug 2018, at 22:03, Ash Berlin-Taylor <ash_airflowl...@firemirror.com> 
> wrote:
> 
> Sorry for being terse before.
> 
> So the issue is that the ts loaded from the DB is not in UTC; it's in GB/+01 
> (the default of the DB server).
> 
> For me, on a currently running 1.9 (no TZ) db:
> 
> airflow=# select * from task_instance;
> get_op            | example_http_operator | 2018-07-23 00:00:00
> 
> This date time appears in the log url, and the path it looks at on S3 is 
> 
> .../example_http_operator/2018-07-23T00:00:00/1.log
> 
> If my Postgres server has a default timezone of GB (which the one running on 
> my laptop does) and I then apply the migration, the column is converted to 
> that local time.
> 
> airflow=# select * from task_instance;
> get_op            | example_http_operator | 2018-07-23 01:00:00+01
> 
> airflow=# set timezone=UTC;
> airflow=# select * from task_instance;
> get_op            | example_http_operator | 2018-07-23 00:00:00+00
> 
> 
> This is all okay so far. The migration has kept the column at the same moment 
> in time.
> 
> The issue comes when the UI tries to display logs for this old task: because 
> the timezone of the connection is not UTC, PG returns a date with a +01 TZ. 
> Thus after the migration this old task tries to look for a log file of
> 
> .../example_http_operator/2018-07-23T01:00:00/1.log
> 
> which doesn't exist - it's changed the time it has rendered from midnight (in 
> v1.9) to 1am (in v1.10).
> 
> (This is with my change to log_filename_template from UPDATING.md in my other 
> branch)
> 
> Setting the timezone to UTC per connection means the behaviour of Airflow 
> doesn't change depending on how the server is configured.
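One way to pin the session timezone per connection in SQLAlchemy is a "connect" event listener on the engine. A sketch only: the engine URL is a stand-in (an in-memory SQLite DB so it runs anywhere), and the actual Postgres statement is shown in a comment.

```python
from sqlalchemy import create_engine, event

# Stand-in engine; the real Airflow engine would point at Postgres.
engine = create_engine("sqlite://")

statements = []  # record what the listener would run, for illustration

@event.listens_for(engine, "connect")
def set_session_timezone(dbapi_connection, connection_record):
    """Runs once for every new DBAPI connection in the pool."""
    cursor = dbapi_connection.cursor()
    # On Postgres this would be: cursor.execute("SET TIME ZONE 'UTC'")
    # SQLite has no session timezone, so run a harmless stand-in.
    cursor.execute("SELECT 1")
    statements.append("SET TIME ZONE 'UTC'")
    cursor.close()

conn = engine.connect()  # listener fires when the pool creates the connection
conn.close()
```

Because the listener runs on every fresh DBAPI connection, the behaviour no longer depends on the server's default.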
> 
> -ash
> 
>> On 5 Aug 2018, at 20:58, Bolke de Bruin <bdbr...@gmail.com> wrote:
>> 
>> Digging in a bit further. 
>> 
>> {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
>> 
>> is the format
>> 
>> ts = execution_date.isoformat() and should be in UTC afaik.
>> 
>> something is weird tbh.
>> 
>> B.
>> 
>> 
>>> On 5 Aug 2018, at 21:32, Bolke de Bruin <bdbr...@gmail.com> wrote:
>>> 
>>> Ash,
>>> 
>>> Reading your proposed changes on your “set-timezone-to-utc” branch and the 
>>> analysis below, I am not sure what you perceive as the issue.
>>> 
>>> For conversion we assume everything is stored in UTC and in a naive format. 
>>> Conversion then adds the timezone information. This results in the following:
>>> 
>>> postgres timezone = “Europe/Amsterdam”
>>> 
>>> 
>>> airflow=# select * from task_instance;
>>> get_op            | example_http_operator | 2018-07-27 02:00:00+02
>>> 
>>> airflow=# set timezone=UTC;
>>> airflow=# select * from task_instance;
>>> get_op            | example_http_operator | 2018-07-27 00:00:00+00
>>> 
>>> If we don’t set the timezone on the connection, Postgres assumes the server 
>>> timezone (in my case “Europe/Amsterdam”), so every datetime Airflow 
>>> receives will be in “Europe/Amsterdam”. However, as we defined the model to 
>>> use UTCDateTime, it will always convert the returned DateTime to UTC.
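The conversion described above can be sketched as a SQLAlchemy TypeDecorator. This is a simplified illustration of the idea, not sqlalchemy-utc's actual implementation:

```python
from datetime import datetime, timedelta, timezone
from sqlalchemy.types import DateTime, TypeDecorator

class UTCDateTime(TypeDecorator):
    """Sketch: whatever tzinfo the driver attached, hand back UTC."""
    impl = DateTime(timezone=True)
    cache_ok = True

    def process_result_value(self, value, dialect):
        # Normalize any tz-aware value coming out of the DB to UTC.
        if value is not None and value.tzinfo is not None:
            value = value.astimezone(timezone.utc)
        return value

# A row read back through a Europe/Amsterdam session carries +02:00 ...
raw = datetime(2018, 7, 27, 2, 0, tzinfo=timezone(timedelta(hours=2)))
# ... but the model hands Airflow the UTC equivalent.
normalized = UTCDateTime().process_result_value(raw, dialect=None)
print(normalized.isoformat())  # 2018-07-27T00:00:00+00:00
```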
>>> 
>>> If we have configured Airflow to use something other than UTC as the 
>>> default timezone, or a DAG has an associated timezone, we only convert to 
>>> that timezone when calculating the next runtime (not for cron, btw). 
>>> Nowhere else, and thus we are UTC everywhere.
>>> 
>>> What do you think is inconsistent?
>>> 
>>> Bolke
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On 5 Aug 2018, at 18:13, Ash Berlin-Taylor 
>>>> <ash_airflowl...@firemirror.com> wrote:
>>>> 
>>>> Relating to 2): I'm not sure that the upgrade from timezoneless to 
>>>> timezone-aware columns in the task instance is right, or at least it's not 
>>>> what I expected.
>>>> 
>>>> Before, weren't all TZs from schedule dates etc. in UTC? For the same task 
>>>> instance (these outputs are from psql directly):
>>>> 
>>>> before: execution_date=2017-09-04 00:00:00
>>>> after: execution_date=2017-09-04 01:00:00+01
>>>> 
>>>> **Okay, the migration is fine.** It has done the right thing, but the 
>>>> local DB I'm testing with has a timezone of GB set, so Postgres converts 
>>>> to that TZ when returning the value.
>>>> 
>>>> 3) Do we need to set the TZ of the connection to UTC in SQLAlchemy to get 
>>>> consistent behaviour? Is this possible somehow? I don't know SQLAlchemy 
>>>> that well.
>>>> 
>>>> 
>>>> -ash
>>>> 
>>>>> On 5 Aug 2018, at 16:01, Ash Berlin-Taylor 
>>>>> <ash_airflowl...@firemirror.com> wrote:
>>>>> 
>>>>> 1) Missing UPDATING note about the change of task_log_reader to now 
>>>>> always being "task" (it was "s3.task" before). Logging config is much 
>>>>> simpler now, though. This may be particular to my logging config, but 
>>>>> given how much of a pain it was to set up S3 logging in 1.9, I have 
>>>>> shared my config with some people in the Gitter chat, so it's not just me.
>>>>> 
>>>>> 2) The path that log files are written to in S3 has changed (again; this 
>>>>> also happened from 1.8 to 1.9). I'd like to avoid having to move all of 
>>>>> my log files again to continue viewing them. The change is that the path 
>>>>> now (in 1.10) has a timezone in it, and the date is in local time; before 
>>>>> it was UTC:
>>>>> 
>>>>> before: 2018-07-23T00:00:00/1.log
>>>>> after: 2018-07-23T01:00:00+01:00/1.log
>>>>> 
>>>>> We can possibly get away with an updating note about this to set a custom 
>>>>> log_filename_template. Testing this now.
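A hedged sketch of what such an airflow.cfg override might look like; the exact template value is an untested guess at pinning the path back to a naive UTC timestamp, not wording taken from UPDATING.md:

```ini
[core]
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ execution_date.strftime("%Y-%m-%dT%H:%M:%S") }}/{{ try_number }}.log
```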
>>>>> 
>>>>>> On 5 Aug 2018, at 15:00, Ash Berlin-Taylor <a...@firemirror.com> wrote:
>>>>>> 
>>>>>> -1(binding) from me.
>>>>>> 
>>>>>> Installed with:
>>>>>> 
>>>>>> AIRFLOW_GPL_UNIDECODE=yes pip install 
>>>>>> 'https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz#egg=apache-airflow[emr, s3, crypto]>=1.10'
>>>>>> 
>>>>>> Install went fine.
>>>>>> 
>>>>>> Our DAGs that use SparkSubmitOperator are now failing as there is now a 
>>>>>> hard dependency on the Kubernetes client libs, but the `emr` group 
>>>>>> doesn't mention this.
>>>>>> 
>>>>>> Introduced in https://github.com/apache/incubator-airflow/pull/3112
>>>>>> 
>>>>>> I see two options for this: either conditionally enable k8s:// support 
>>>>>> if the import works, or (less preferred) add the Kubernetes client to 
>>>>>> the emr deps.
>>>>>> 
>>>>>> Sorry - this is the first time I've been able to test it.
>>>>>> 
>>>>>> I will install this dep manually and continue testing.
>>>>>> 
>>>>>> -ash
>>>>>> 
>>>>>> (Normally no time at home due to new baby, but I got a standing desk, 
>>>>>> and a carrier meaning she can sleep on me and I can use my laptop. Win!)
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 4 Aug 2018, at 22:32, Bolke de Bruin <bdbr...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Bump. 
>>>>>>> 
>>>>>>> Committers please cast your vote. 
>>>>>>> 
>>>>>>> B.
>>>>>>> 
>>>>>>> Sent from my iPhone
>>>>>>> 
>>>>>>>> On 3 Aug 2018, at 13:23, Driesprong, Fokko <fo...@driesprong.frl> wrote:
>>>>>>>> 
>>>>>>>> +1 Binding
>>>>>>>> 
>>>>>>>> Installed it using: SLUGIFY_USES_TEXT_UNIDECODE=yes pip install
>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz
>>>>>>>> 
>>>>>>>> Cheers, Fokko
>>>>>>>> 
>>>>>>>> 2018-08-03 9:47 GMT+02:00 Bolke de Bruin <bdbr...@gmail.com>:
>>>>>>>> 
>>>>>>>>> Hey all,
>>>>>>>>> 
>>>>>>>>> I have cut Airflow 1.10.0 RC3. This email is calling a vote on the 
>>>>>>>>> release,
>>>>>>>>> which will last for 72 hours. Consider this my (binding) +1.
>>>>>>>>> 
>>>>>>>>> Airflow 1.10.0 RC 3 is available at:
>>>>>>>>> 
>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/
>>>>>>>>> 
>>>>>>>>> apache-airflow-1.10.0rc3+incubating-source.tar.gz is a source release 
>>>>>>>>> that
>>>>>>>>> comes with INSTALL instructions.
>>>>>>>>> apache-airflow-1.10.0rc3+incubating-bin.tar.gz is the binary Python
>>>>>>>>> "sdist"
>>>>>>>>> release.
>>>>>>>>> 
>>>>>>>>> Public keys are available at:
>>>>>>>>> 
>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/
>>>>>>>>> 
>>>>>>>>> The number of JIRAs fixed is over 700. Please have a look at the
>>>>>>>>> changelog. Since RC2 the following has been fixed:
>>>>>>>>> 
>>>>>>>>> * [AIRFLOW-2817] Force explicit choice on GPL dependency
>>>>>>>>> * [AIRFLOW-2716] Replace async and await py3.7 keywords
>>>>>>>>> * [AIRFLOW-2810] Fix typo in Xcom model timestamp
>>>>>>>>> 
>>>>>>>>> Please note that the version number excludes the `rcX` string as well
>>>>>>>>> as the "+incubating" string, so it's now simply 1.10.0. This will 
>>>>>>>>> allow us
>>>>>>>>> to rename the artifact without modifying the artifact checksums when 
>>>>>>>>> we
>>>>>>>>> actually release.
>>>>>>>>> 
>>>>>>>>> WARNING: Due to licensing requirements you will need to set
>>>>>>>>> SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when
>>>>>>>>> installing or upgrading. We will try to remove this requirement for 
>>>>>>>>> the
>>>>>>>>> next release.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Bolke
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
