Re: [AIP-19] Making the webserver stateless

2019-04-14 Thread Kevin Yang
Thank you Julian, nice work. Good idea trying to put context info into TaskInstance. Overall I strongly prefer option 2a: it does not upset owners of big DAGs and it is more tractable. We can keep in mind that we may do 2b later when implementing 2a. Cheers, Kevin Y On Sat, Apr 13, 2019 at

Re: Database referral integrity

2019-04-14 Thread Kevin Yang
Ya, I've benchmarked query performance and ended up with this PR to reduce the DB load. About the DB load, I do see the DB being the bottleneck for us to scale further (currently at ~3000 DAGs and ~20k TI at peak hours, we have 5 mins SLA so

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-14 Thread Felix Uellendall
That sounds good. Start with the diffs and see what error-codes we want to enable. -feluelle On 14/04/2019 at 23:02, Driesprong, Fokko wrote: We could also consider first ignoring all the error-codes globally, and then enable them one by one, based on which violations we find important.

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-14 Thread Driesprong, Fokko
We could also consider first ignoring all the error-codes globally, and then enable them one by one, based on which violations we find important. Cheers, Fokko On Sun, 14 Apr 2019 at 23:00, Bas Harenslak < basharens...@godatadriven.com> wrote: > Applying it on the diff first sounds good. At

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-14 Thread Bas Harenslak
Applying it on the diff first sounds good. At some point in time we’ll need to do a big bang to make the lesser-touched parts of Airflow compatible with Pylint. I’ll check how to apply it on the diff when I find time. Bas On 14 Apr 2019, at 22:37, Driesprong, Fokko

Re: Database referral integrity

2019-04-14 Thread Driesprong, Fokko
Thanks all for the input. After some testing, I found out that the possibilities of having FK's are also quite limited, for SQLite at least. For example, xcom.task_id = task_instance.task_id isn't possible because task_id on the task_instance isn't a key right now. This would involve adding
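
The SQLite limitation described above can be reproduced with a minimal sketch. The table layouts below are simplified assumptions for illustration, not the actual Airflow schema: task_instance has a composite primary key, so task_id alone is neither a primary key nor unique, and SQLite rejects writes to a child table that references it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Simplified, assumed schema: task_id is only one part of a composite key.
conn.execute(
    "CREATE TABLE task_instance ("
    " task_id TEXT, dag_id TEXT, execution_date TEXT,"
    " PRIMARY KEY (task_id, dag_id, execution_date))"
)
# The child table references task_id alone, which is not a key by itself.
conn.execute(
    "CREATE TABLE xcom ("
    " id INTEGER PRIMARY KEY,"
    " task_id TEXT REFERENCES task_instance (task_id))"
)
conn.execute("INSERT INTO task_instance VALUES ('t1', 'd1', '2019-04-14')")

try:
    conn.execute("INSERT INTO xcom (task_id) VALUES ('t1')")
    error = None
except sqlite3.OperationalError as exc:
    # The schema is accepted at CREATE time; the mismatch only shows up on write.
    error = str(exc)

print(error)
```

Note that both CREATE TABLE statements succeed; the problem only surfaces when rows are written, which is why it is easy to miss until testing.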

Re: [2.0 spring cleaning] Deprecate contrib folder?

2019-04-14 Thread Austin Bennett
It seems naming of contrib is used in practice with what I thought it would be (as a novice), but incubator not bad either -- yes, it gets messy if anything can get contributed. If using for that purpose, maybe there is some sort of timeframe for graduation or it winds up being removed? Indeed,

Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?

2019-04-14 Thread Driesprong, Fokko
I actually never used the plugin framework, and used standard Python imports. I'm all in for removing it since it is very confusing for newcomers. Cheers, Fokko On Sun, 14 Apr 2019 at 10:26, Ash Berlin-Taylor wrote: > There is something to be said for the ease of putting plugins in the >

Re: [2.0 spring cleaning] Deprecate contrib folder?

2019-04-14 Thread Driesprong, Fokko
I'm in favor of removing the Contrib folder. It doesn't really add value in my opinion, and moving the hooks/operators will break the import. While DAG'ing I always have to look up if the operator is in contrib or not. Also, I think we should keep the operators and hooks part of the Airflow

Re: [DISCUSS]: Remove Mesos Executor from Airflow 2.0.0?

2019-04-14 Thread Driesprong, Fokko
https://jira.apache.org/jira/browse/AIRFLOW-4313 :-) On Sat, 13 Apr 2019 at 12:28, Felix Uellendall < felix.uellend...@gmx.de> wrote: > +1 (non-binding) > > On 12/04/2019 at 23:42, Tao Feng wrote: > > +1 on removing mesos executor. > > > > On Fri, Apr 12, 2019 at 2:25 PM Daniel Imberman < >

Re: Airflow end-to-end testing

2019-04-14 Thread Driesprong, Fokko
Hi Chris, Having some UI testing would be a great addition to Airflow's testing framework. However, I have had some bad experiences with Selenium. In projects where I've worked with Selenium, it was always very flaky and hard to debug. Since we haven't written any tests, my first step would be to

Re: Is `airflow backfill` disfunctional?

2019-04-14 Thread Driesprong, Fokko
Good points James, Personally, I never use the CLI backfilling, and also recommend colleagues not to use it because of the points that you mention. I also resort to the poor man's backfilling (clearing the future and past in the UI). I'd rather get rid of the CLI, and would like to see the

Re: Changed behaviour of Timezone DST in Python 3.6 vs. 3.5

2019-04-14 Thread Jarek Potiuk
Ok. I think I fixed it and PR is running in Travis. It was quite an interesting one. Yet another tiny python3.5 vs. python 3.6 incompatibility :). This time because we are mixing standard datetime and pendulum - we are using croniter and it does not

Re: Changed behaviour of Timezone DST in Python 3.6 vs. 3.5

2019-04-14 Thread Jarek Potiuk
As per update - It looks like it's the changed fold behaviour, rather than TZ. I will try to dig deeper and see what's going on. J. On Sun, Apr 14, 2019 at 10:38 AM Ash Berlin-Taylor wrote: > Does anything in Airflow depend upon the TZ environment variable, or even > the stock tz behaviour? I

Re: Changed behaviour of Timezone DST in Python 3.6 vs. 3.5

2019-04-14 Thread Ash Berlin-Taylor
Does anything in Airflow depend upon the TZ environment variable, or even the stock tz behaviour? I thought we used the pendulum library as it was more predictable. If it's just that one test that fails it should be fixed, and perhaps added to our docs that TZ isn't respected (i.e. use the

Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?

2019-04-14 Thread Ash Berlin-Taylor
There is something to be said for the ease of putting plugins in the airflow home folder too that I would still like to keep, as it makes deploying site specific plugins much easier. On 14 April 2019 04:13:52 BST, Jarek Potiuk wrote: >+1 for using more entrypoints as discovery mechanism. Maybe

Re: Changed behaviour of Timezone DST in Python 3.6 vs. 3.5

2019-04-14 Thread Jarek Potiuk
After a bit of diffing it seems it's a different reason: local time disambiguation changed in Python 3.6: https://docs.python.org/3/whatsnew/3.6.html#pep-495-local-time-disambiguation https://pendulum.eustace.io/docs/#using-the-timezone-library-directly - seems that pendulum which we use
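
For readers unfamiliar with the PEP 495 change linked above, here is a minimal sketch of the `fold` attribute that Python 3.6 added to `datetime`. The `ToyZone` class and its transition date are hypothetical and exist only to keep the example self-contained; this is not how pendulum or croniter handle the transition.

```python
from datetime import datetime, timedelta, tzinfo


class ToyZone(tzinfo):
    """Hypothetical zone: clocks go back from 03:00 to 02:00 on 2019-10-27."""

    AMBIGUOUS_START = datetime(2019, 10, 27, 2, 0)
    AMBIGUOUS_END = datetime(2019, 10, 27, 3, 0)

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self.AMBIGUOUS_START:
            return timedelta(hours=2)  # summer time
        if naive < self.AMBIGUOUS_END:
            # The wall-clock hour occurs twice; fold picks which occurrence.
            return timedelta(hours=1) if dt.fold else timedelta(hours=2)
        return timedelta(hours=1)  # winter time

    def dst(self, dt):
        return self.utcoffset(dt) - timedelta(hours=1)

    def tzname(self, dt):
        return "SUMMER" if self.dst(dt) else "WINTER"


zone = ToyZone()
first = datetime(2019, 10, 27, 2, 30, tzinfo=zone)  # fold=0: first 02:30
second = first.replace(fold=1)                      # fold=1: second 02:30

print(first.utcoffset(), second.utcoffset())
```

Before Python 3.6 there was no `fold` attribute, so code that mixes standard `datetime` objects with a fold-aware library can see different results for ambiguous wall-clock times on 3.5 and 3.6.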

Changed behaviour of Timezone DST in Python 3.6 vs. 3.5

2019-04-14 Thread Jarek Potiuk
Hello Airflowers. While testing the multi-staging docker I've found an incompatibility problem between Python 3.5 and 3.6 with regard to Timezone DST. Seems related to a known TZ behaviour change in Python 3.6 https://bugs.python.org/issue30062 I created a JIRA issue for it:

Re: [2.0 spring cleaning] Require unique conn_id

2019-04-14 Thread Kevin Yang
Yup, unfortunately we at Airbnb are relying on the "feature" for some load balancing and also something like sensing partitions from 2 clusters at the same time (yup, it is ugly). And at the same time we got bitten by having duplicate connections while one has outdated info. I think it does make

Re: [2.0 spring cleaning] Require unique conn_id

2019-04-14 Thread airflowuser
It can get more confusing because Airflow allows creating two connections with the same conn_id but different conn_type https://issues.apache.org/jira/browse/AIRFLOW-2784 ‐‐‐ Original Message ‐‐‐ On Sunday, April 14, 2019 12:22 AM, Maxime Beauchemin