Rc4 will, but note 3.7 is not officially supported. Sent from my iPhone
> On 8 Aug 2018, at 07:09, Yongjun Park <thehp.key...@gmail.com> wrote: > > I tried to install this version of airflow with python 3.7. > There was still `async` keyword in setup.py. > > I think it should include AIRFLOW-2713 > <https://issues.apache.org/jira/browse/AIRFLOW-2713> for supporting python > 3.7 > > Regards, > Yongjun > > 2018년 8월 8일 (수) 오전 8:08, George Leslie-Waksman <waks...@gmail.com>님이 작성: > >> We just tried to upgrade a 1.8.1 install to 1.10rc3 and ran into a critical >> error on alembic migration execution. I have captured the issue in JIRA: >> https://issues.apache.org/jira/browse/AIRFLOW-2870 >> >> I would consider this a critical blocker for release because it hard blocks >> upgrading. >> >> George >> >>> On Tue, Aug 7, 2018 at 7:58 AM Bolke de Bruin <bdbr...@gmail.com> wrote: >>> >>> Done. When I roll rc4 it will be part of it. >>> >>> >>>> On 7 Aug 2018, at 16:26, Naik Kaxil <k.n...@reply.com> wrote: >>>> >>>> @bolke Can we also include the following commit to 1.10 release as we >>> would need this commit to generate docs at ReadTheDocs? >>>> >>>> - >>> >> https://github.com/apache/incubator-airflow/commit/8af0aa96bfe3caa51d67ab393db069d37b0c4169 >>>> >>>> Regards, >>>> Kaxil >>>> >>>> On 06/08/2018, 14:59, "James Meickle" <jmeic...@quantopian.com >> .INVALID> >>> wrote: >>>> >>>> Not a vote, but a comment: it might be worth noting that the new >>>> environment variable is also required if you have any Airflow plugin >>> test >>>> suites that install Airflow as part of their dependencies. In my >>> case, I >>>> had to set the new env var outsidfe of tox and add this: >>>> >>>> ``` >>>> [testenv] >>>> passenv = SLUGIFY_USES_TEXT_UNIDECODE >>>> ``` >>>> >>>> (`setenv` did not work as that provides env vars at runtime but not >>>> installtime, as far as I can tell.) >>>> >>>> >>>> On Sun, Aug 5, 2018 at 5:20 PM Bolke de Bruin <bdbr...@gmail.com> >>> wrote: >>>> >>>>> +1 :-) >>>>> >>>>> Sent from my iPhone >>>>> >>>>>> On 5 Aug 2018, at 23:08, Ash Berlin-Taylor < >>>>> ash_airflowl...@firemirror.com> wrote: >>>>>> >>>>>> Yup, just worked out the same thing. >>>>>> >>>>>> I think as "punishment" for me finding bugs so late in two RCs (this, >>>>> and 1.9) I should run the release for the next release. >>>>>> >>>>>> -ash >>>>>> >>>>>>> On 5 Aug 2018, at 22:05, Bolke de Bruin <bdbr...@gmail.com> wrote: >>>>>>> >>>>>>> Yeah I figured it out. Originally i was using a different >>>>> implementation of UTCDateTime, but that was unmaintained. I switched, >>> but >>>>> this version changed or has a different contract. While it transforms >> on >>>>> storing to UTC it does not so when it receives timezone aware fields >>> from >>>>> the db. Hence the issue. >>>>>>> >>>>>>> I will prepare a PR that removes the dependency and implements our >> own >>>>> extension of DateTime. Probably tomorrow. >>>>>>> >>>>>>> Good catch! Just in time :-(. >>>>>>> >>>>>>> B. >>>>>>> >>>>>>>> On 5 Aug 2018, at 22:43, Ash Berlin-Taylor < >>>>> ash_airflowl...@firemirror.com> wrote: >>>>>>>> >>>>>>>> Entirely possible, though I wasn't even dealing with the scheduler >> - >>>>> the issue I was addressing was entirely in the webserver for a >>> pre-existing >>>>> Task Instance. >>>>>>>> >>>>>>>> Ah, I hadn't noticed/twigged we are using sqlalchemy-utc. It >> appears >>>>> that isn't working right/ as expected. This line: >>>>> >>> >> https://github.com/spoqa/sqlalchemy-utc/blob/master/sqlalchemy_utc/sqltypes.py#L34 >>>>> doens't look right for us - as you mentioned the TZ is set to >> something >>>>> (rather than having no TZ value). >>>>>>>> >>>>>>>> Some background on how Pq handles TZs. It always returns DTs in the >>> TZ >>>>> of the connection. I'm not sure if this is unique to postgres or if >>> other >>>>> DBs behave the same. >>>>>>>> >>>>>>>> postgres=# select '2018-08-03 00:00:00+00:00'::timestamp with time >>>>> zone; >>>>>>>> timestamptz >>>>>>>> ------------------------ >>>>>>>> 2018-08-03 01:00:00+01 >>>>>>>> >>>>>>>> postgres=# select '2018-08-03 02:00:00+02'::timestamp with time >> zone; >>>>>>>> timestamptz >>>>>>>> ------------------------ >>>>>>>> 2018-08-03 01:00:00+01 >>>>>>>> >>>>>>>> The server will always return TZs in the connection timezone. >>>>>>>> >>>>>>>> postgres=# set timezone=utc; >>>>>>>> SET >>>>>>>> postgres=# select '2018-08-03 02:00:00+02'::timestamp with time >> zone; >>>>>>>> timestamptz >>>>>>>> ------------------------ >>>>>>>> 2018-08-03 00:00:00+00 >>>>>>>> (1 row) >>>>>>>> >>>>>>>> postgres=# select '2018-08-03 01:00:00+01'::timestamp with time >> zone; >>>>>>>> timestamptz >>>>>>>> ------------------------ >>>>>>>> 2018-08-03 00:00:00+00 >>>>>>>> (1 row) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -ash >>>>>>>> >>>>>>>>> On 5 Aug 2018, at 21:28, Bolke de Bruin <bdbr...@gmail.com> >> wrote: >>>>>>>>> >>>>>>>>> This is the issue: >>>>>>>>> >>>>>>>>> [2018-08-05 22:08:21,952] {jobs.py:906} INFO - NEXT RUN DATE: >>>>> 2018-08-03 00:00:00+00:00 tzinfo: <Timezone [UTC]> >>>>>>>>> [2018-08-05 22:08:22,007] {jobs.py:1425} INFO - Created <DagRun >>>>> example_http_operator @ 2018-08-03 02:00:00+02:00: >>>>> scheduled__2018-08-03T00:00:00+00:00, externally triggered: False> >>>>>>>>> >>>>>>>>> [2018-08-05 22:08:24,651] {jobs.py:906} INFO - NEXT RUN DATE: >>>>> 2018-08-04 02:00:00+02:00 tzinfo: psycopg2.tz >>> .FixedOffsetTimezone(offset=120, >>>>> name=None) >>>>>>>>> [2018-08-05 22:08:24,696] {jobs.py:1425} INFO - Created <DagRun >>>>> example_http_operator @ 2018-08-04 02:00:00+02:00: >>>>> scheduled__2018-08-04T02:00:00+02:00, externally triggered: False> >>>>>>>>> >>>>>>>>> Notice at line 1+2: that the next run date is correctly in UTC but >>>>> from the DB it gets a +2. At the next bit (3+4) we get a psycopg2.tz >>> .FixedOffsetTimezone >>>>> which should be set to UTC according to the specs of >>>>> https://github.com/spoqa/sqlalchemy-utc < >>>>> https://github.com/spoqa/sqlalchemy-utc> , but it isn’t. >>>>>>>>> >>>>>>>>> So changing your setting of the DB to UTC fixes the symptom but >> not >>>>> the cause. >>>>>>>>> >>>>>>>>> B. >>>>>>>>> >>>>>>>>>> On 5 Aug 2018, at 22:03, Ash Berlin-Taylor < >>>>> ash_airflowl...@firemirror.com> wrote: >>>>>>>>>> >>>>>>>>>> Sorry for being terse before. >>>>>>>>>> >>>>>>>>>> So the issue is that the ts loaded from the DB is not in UTC, >> it's >>>>> in GB/+01 (the default of the DB server) >>>>>>>>>> >>>>>>>>>> For me, on a currently running 1.9 (no TZ) db: >>>>>>>>>> >>>>>>>>>> airflow=# select * from task_instance; >>>>>>>>>> get_op | example_http_operator | 2018-07-23 00:00:00 >>>>>>>>>> >>>>>>>>>> This date time appears in the log url, and the path it looks at >> on >>>>> S3 is >>>>>>>>>> >>>>>>>>>> .../example_http_operator/2018-07-23T00:00:00/1.log >>>>>>>>>> >>>>>>>>>> If my postgres server has a default timezone of GB (which the one >>>>> running on my laptop does), and I then apply the migration then it is >>>>> converted to that local time. >>>>>>>>>> >>>>>>>>>> airflow=# select * from task_instance; >>>>>>>>>> get_op | example_http_operator | 2018-07-23 >> 01:00:00+01 >>>>>>>>>> >>>>>>>>>> airflow=# set timezone=UTC; >>>>>>>>>> airflow=# select * from task_instance; >>>>>>>>>> get_op | example_http_operator | 2018-07-23 >> 00:00:00+00 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This is all okay so far. The migration has kept the column at the >>>>> same moment in time. >>>>>>>>>> >>>>>>>>>> The issue come when the UI tries to display logs for this old >> task: >>>>> because the timezone of the connection is not UTC, PG returns a date >>> with a >>>>> +01 TZ. Thus after the migration this old task tries to look for a log >>> file >>>>> of >>>>>>>>>> >>>>>>>>>> .../example_http_operator/2018-07-23T01:00:00/1.log >>>>>>>>>> >>>>>>>>>> which doesn't exist - it's changed the time it has rendered from >>>>> midnight (in v1.9) to 1am (in v1.10). >>>>>>>>>> >>>>>>>>>> (This is with my change to log_filename_template from UPDATING.md >>> in >>>>> my other branch) >>>>>>>>>> >>>>>>>>>> Setting the timezone to UTC per connection means the behaviour of >>>>> Airflow doesn't change depending on how the server is configured. >>>>>>>>>> >>>>>>>>>> -ash >>>>>>>>>> >>>>>>>>>>> On 5 Aug 2018, at 20:58, Bolke de Bruin <bdbr...@gmail.com> >>> wrote: >>>>>>>>>>> >>>>>>>>>>> Digging in a bit further. >>>>>>>>>>> >>>>>>>>>>> {{{{ ti.dag_id }}}}/{{{{ ti.task_id }}}}/{{{{ ts }}}}/{{{{ >>>>> try_number }}}}.log >>>>>>>>>>> >>>>>>>>>>> is the format >>>>>>>>>>> >>>>>>>>>>> ts = execution_date.isoformat and should be in UTC afaik. >>>>>>>>>>> >>>>>>>>>>> something is weird tbh. >>>>>>>>>>> >>>>>>>>>>> B. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On 5 Aug 2018, at 21:32, Bolke de Bruin <bdbr...@gmail.com> >>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Ash, >>>>>>>>>>>> >>>>>>>>>>>> Reading your proposed changes on your “set-timezone-to-utc” >>> branch >>>>> and below analysis, I am not sure what you are perceiving as an issue. >>>>>>>>>>>> >>>>>>>>>>>> For conversion we assume everything is stored in UTC and in a >>>>> naive format. Conversion then adds the timezone information. This >>> results >>>>> in the following >>>>>>>>>>>> >>>>>>>>>>>> postgres timezone = “Europe/Amsterdam” >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> airflow=# select * from task_instance; >>>>>>>>>>>> get_op | example_http_operator | 2018-07-27 >>> 02:00:00+02 >>>>>>>>>>>> >>>>>>>>>>>> airflow=# set timezone=UTC; >>>>>>>>>>>> airflow=# select * from task_instance; >>>>>>>>>>>> get_op | example_http_operator | 2018-07-27 >>> 00:00:00+00 >>>>>>>>>>>> >>>>>>>>>>>> If we don’t set the timezone in the connection postgres assumes >>>>> server timezone (in my case “Europe/Amsterdam”). So every datetime >>> Airflow >>>>> receives will be in “Europe/Amsterdam” format. However as we defined >> the >>>>> model to use UTCDateTime it will always convert the returned DateTime >> to >>>>> UTC. >>>>>>>>>>>> >>>>>>>>>>>> If we have configured Airflow to support something else as UTC >> as >>>>> the default timezone or a DAG has a associated timezone we only >> convert >>> to >>>>> that timezone when calculating the next runtime (not for cron btw). >>> Nowhere >>>>> else and thus we are UTC everywhere. >>>>>>>>>>>> >>>>>>>>>>>> What do you think is inconsistent? >>>>>>>>>>>> >>>>>>>>>>>> Bolke >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On 5 Aug 2018, at 18:13, Ash Berlin-Taylor < >>>>> ash_airflowl...@firemirror.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Relating to 2): I'm not sure that the upgrade from >> timezoneless >>>>> to timezone aware colums in the task instance is right, or at least >> it's >>>>> not what I expected. >>>>>>>>>>>>> >>>>>>>>>>>>> Before weren't all TZs from schedule dates etc in UTC? For the >>>>> same task instance (these outputs from psql directly): >>>>>>>>>>>>> >>>>>>>>>>>>> before: execution_date=2017-09-04 00:00:00 >>>>>>>>>>>>> after: execution_date=2017-09-04 01:00:00+01 >>>>>>>>>>>>> >>>>>>>>>>>>> **Okay the migration is fine**. It appears that the migration >>> has >>>>> done the right thing, but my local DB I'm testing with has a Timezone >>> of GB >>>>> set, so Postgres converts it to that TZ on returning an object. >>>>>>>>>>>>> >>>>>>>>>>>>> 3) Do we need to set the TZ of the connection to UTC in >>>>> SQLAlchemy to have consistent behaviour? Is this possible some how? I >>> don't >>>>> know SQLAlchemy that well. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -ash >>>>>>>>>>>>> >>>>>>>>>>>>>> On 5 Aug 2018, at 16:01, Ash Berlin-Taylor < >>>>> ash_airflowl...@firemirror.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1.) Missing UPDATING note about change of task_log_reader to >>> now >>>>> always being "task" (was "s3.task" before.). Logging config is much >>> simpler >>>>> now though. This may be particular to my logging config, but given how >>> much >>>>> of a pain it was to set up S3 logging in 1.9 I have shared my config >>> with >>>>> some people in the Gitter chat so It's not just me. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2) The path that log-files are written to in S3 has changed >>>>> (again - this happened from 1.8 to 1.9). I'd like to avoid having to >>> move >>>>> all of my log files again to continue viewing them. The change is that >>> the >>>>> path now (in 1.10) has a timezone in it, and the date is in local >> time, >>>>> before it was UTC: >>>>>>>>>>>>>> >>>>>>>>>>>>>> before: 2018-07-23T00:00:00/1.log >>>>>>>>>>>>>> after: 2018-07-23T01:00:00+01:00/1.log >>>>>>>>>>>>>> >>>>>>>>>>>>>> We can possibly get away with an updating note about this to >>> set >>>>> a custom log_filename_template. Testing this now. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 5 Aug 2018, at 15:00, Ash Berlin-Taylor < >>> a...@firemirror.com> >>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -1(binding) from me. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Installed with: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> AIRFLOW_GPL_UNIDECODE=yes pip install ' >>>>> >>> >> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz#egg=apache-airflow[emr >>>>> < >>>>> >>> >> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz#egg=apache-airflow[emr >>>> , >>>>> s3, crypto]>=1.10' >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Install went fine. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Our DAGs that use SparkSubmitOperator are now failing as >> there >>>>> is now a hard dependency on the Kubernetes client libs, but the `emr` >>> group >>>>> doesn't mention this. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Introduced in >>>>> https://github.com/apache/incubator-airflow/pull/3112 < >>>>> https://github.com/apache/incubator-airflow/pull/3112> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I see two options for this - either conditionally enable >>> k8s:// >>>>> support if the import works, or (less preferred) add kube-client to >> the >>> emr >>>>> deps (which I like less) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry - this is the first time I've been able to test it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I will install this dep manually and continue testing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -ash >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> (Normally no time at home due to new baby, but I got a >>> standing >>>>> desk, and a carrier meaning she can sleep on me and I can use my >> laptop. >>>>> Win!) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 4 Aug 2018, at 22:32, Bolke de Bruin >>> <https://maps.google.com/?q=ruin++%3E+&entry=gmail&source=g>< >>> bdbr...@gmail.com >>>>> <mailto:bdbr...@gmail.com>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Bump. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Committers please cast your vote. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> B. >>>>>>>>>>>>>>>> < >> https://maps.google.com/?q=ruin++%3E+&entry=gmail&source=g> >>>>>>>>>>>>>>>> Sent from my iPhone >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 3 Aug 2018, at 13:23, Driesprong, Fokko >>>>> <fo...@driesprong.frl <mailto:fo...@driesprong.frl>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> +1 Binding >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Installed it using: SLUGIFY_USES_TEXT_UNIDECODE=yes pip >>>>> install >>>>>>>>>>>>>>>>> >>>>> >>> >> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz >>>>> < >>>>> >>> >> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/apache-airflow-1.10.0rc3+incubating-bin.tar.gz >>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers, Fokko >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2018-08-03 9:47 GMT+02:00 Bolke de Bruin < >> bdbr...@gmail.com >>>> : >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hey all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have cut Airflow 1.10.0 RC3. This email is calling a >> vote >>>>> on the release, >>>>>>>>>>>>>>>>>> which will last for 72 hours. Consider this my (binding) >>> +1. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Airflow 1.10.0 RC 3 is available at: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/ < >>>>>>>>>>>>>>>>>> >>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/1.10.0rc3/> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> apache-airflow-1.10.0rc3+incubating-source.tar.gz is a >>>>> source release that >>>>>>>>>>>>>>>>>> comes with INSTALL instructions. >>>>>>>>>>>>>>>>>> apache-airflow-1.10.0rc3+incubating-bin.tar.gz is the >>> binary >>>>> Python >>>>>>>>>>>>>>>>>> "sdist" >>>>>>>>>>>>>>>>>> release. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Public keys are available at: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/ < >>>>>>>>>>>>>>>>>> >>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The amount of JIRAs fixed is over 700. Please have a look >>> at >>>>> the >>>>>>>>>>>>>>>>>> changelog. >>>>>>>>>>>>>>>>>> Since RC2 the following has been fixed: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> * [AIRFLOW-2817] Force explicit choice on GPL dependency >>>>>>>>>>>>>>>>>> * [AIRFLOW-2716] Replace async and await py3.7 keywords >>>>>>>>>>>>>>>>>> * [AIRFLOW-2810] Fix typo in Xcom model timestamp >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Please note that the version number excludes the `rcX` >>>>> string as well >>>>>>>>>>>>>>>>>> as the "+incubating" string, so it's now simply 1.10.0. >>> This >>>>> will allow us >>>>>>>>>>>>>>>>>> to rename the artifact without modifying the artifact >>>>> checksums when we >>>>>>>>>>>>>>>>>> actually release. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> WARNING: Due to licensing requirements you will need to >> set >>>>>>>>>>>>>>>>>> SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when >>>>>>>>>>>>>>>>>> installing or upgrading. We will try to remove this >>>>> requirement for the >>>>>>>>>>>>>>>>>> next release. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>> Bolke >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> Kaxil Naik >>>> >>>> Data Reply >>>> 2nd Floor, Nova South >>>> 160 Victoria Street, Westminster >>>> London SW1E 5LB - UK >>>> phone: +44 (0)20 7730 6000 <+44%2020%207730%206000> >>>> k.n...@reply.com >>>> www.reply.com >>> >>> >>