Re: Proposed roadmap for Airflow 2.0

2019-11-14 Thread Kevin Yang
Yes Fokko that is true, the overall aggregated saving from removing the overhead is actually gonna be esp. large for us as we start tens of millions of tasks every day. Looking forward to including that change in our code base. Hi Jarek, automated performance testing sounds extremely tasty and we Air

Re: Proposed roadmap for Airflow 2.0

2019-10-23 Thread Jarek Potiuk
Just to let everyone know - in the context - we are planning @Polidea to work on automated performance testing for Airflow. Since Performance and Reliability is super important, we think about defining a set of consistent performance tests that we will be able to run automatically ionusing differen

Re: Proposed roadmap for Airflow 2.0

2019-10-22 Thread Driesprong, Fokko
Removing overhead for starting the processes would not only benefit the k8s executor, but also the workers that spawn subprocesses. I would definitely be interested to see some numbers on the improvement of AIP-17 in practice. Maybe we should build some benchmark to see if we introduced performance reg

Re: Proposed roadmap for Airflow 2.0

2019-10-21 Thread Kevin Yang
For sure Fokko! I'll go through the PRs after finishing reading the one for AIP-24. AIP-17 does need quite some rewrites but I think we're pretty close. We plan to roll it out in our production cluster and then open source it after we believe it is stable. At the moment we're doing it by reusing t

Re: Proposed roadmap for Airflow 2.0

2019-10-21 Thread James Meickle
I would feel better about a faster 2.0 release if we had a better plan for how often we'll do future major version increments. As-is this might be the first change to break backwards compat meaningfully in a while. On Mon, Oct 21, 2019 at 3:03 AM Driesprong, Fokko wrote: > Thanks Kevin, > > Kevi

Re: Proposed roadmap for Airflow 2.0

2019-10-21 Thread Driesprong, Fokko
Thanks Kevin, Kevin would love to have your input on this PR. This one tries to implement an async implementation of the operator, based on the sensor by Seelman. And also this one, which is required to mak

Re: Proposed roadmap for Airflow 2.0

2019-10-20 Thread Kevin Yang
Thanks Ash for putting together the doc, somehow I cannot do anything on confluence so I'll put my comments here. +1 for using this opportunity to define how we want to do releases, e.g. frequency, compatibility rules, etc. If the DAG isolation is being worked on I would love to see it in 2.0.

Re: Proposed roadmap for Airflow 2.0

2019-10-09 Thread Chao-Han Tsai
Although Airflow has the concept of task priority like Ash mentioned, it does not pre-empt running tasks. On Wed, Oct 9, 2019 at 12:42 AM Ash Berlin-Taylor wrote: > There's already a concept called priority_weight on tasks > http://airflow.apache.org/concepts.html?highlight=priority_weight#pools

Re: Proposed roadmap for Airflow 2.0

2019-10-09 Thread Ash Berlin-Taylor
There's already a concept called priority_weight on tasks http://airflow.apache.org/concepts.html?highlight=priority_weight#pools (the doc about it is in relation to pools, but everything is run in a pool of "default_pool" if not specified.) Is that what you want? On 9 October 2019 07:38:38 BS

Re: Proposed roadmap for Airflow 2.0

2019-10-08 Thread bharath palaksha
Hi, Is there any discussion thread on adding priority to tasks and cost-based optimization? priority and pre-emption as an option to the user. If priority is specified, scheduler has to schedule high priority tasks and if pre-emption is true, it can pre-empt current running task which is of lower

Re: Proposed roadmap for Airflow 2.0

2019-09-30 Thread James Meickle
For what I'm looking for out of a 2.0, as an operator/cluster admin (separate from what I'd like to see as a DAG developer), I'd love to see: - Combine breaking changes into 2.0, and do as few as possible after - A semver policy for 2.0 and onwards. (For instance we got bit hard by a breaking API

Re: Proposed roadmap for Airflow 2.0

2019-09-30 Thread Jarek Potiuk
All those are very important and we are going to work on some of them as well. I think if there are breaking changes, we should rather try to fit them in the 2.0 release - at least to the point that they can be a base for extending it in later versions in a backwards-compatible way (maybe then we should a

Re: Proposed roadmap for Airflow 2.0

2019-09-24 Thread James Meickle
My question with that is, how often do we want to do major version increments? There's a few API breaking changes I'd love to see, but whether to propose them for 2.0 depends on what the wait until 3.0 looks like (or whether we'll allow more minor version breakages in the future) On Tue, Sep 24,

Re: Proposed roadmap for Airflow 2.0

2019-09-24 Thread Dan Davydov
I think along with "Improve Webserver Performance" we should solve the serialization and task execution isolation problems a little bit more completely since I imagine there could be backwards compatibility issues. e.g. mapping each task JSON to a Docker image or other serialized representation tha

Re: Proposed roadmap for Airflow 2.0

2019-09-24 Thread Ash Berlin-Taylor
I'm also in favour of py-test (and it's what I use for most of my development) which is why I created https://issues.apache.org/jira/browse/AIRFLOW-4863, but I don't think non-user-facing/impacting changes need to go on the road map. -ash > On 24 Sep 2019, at 13:53, Tomasz Urbaszek wrote: > >

Re: Proposed roadmap for Airflow 2.0

2019-09-24 Thread Tomasz Urbaszek
I am thinking about proposing a migration from nosetest to pytest because it's "more up to date". I even have a POC, but a lot of tests fail, probably due to side effects. Best, Tomek On Tue, Sep 24, 2019 at 2:38 PM Ash Berlin-Taylor wrote: > That formatted very badly in plain text. The list was: >

Re: Proposed roadmap for Airflow 2.0

2019-09-24 Thread Ash Berlin-Taylor
That formatted very badly in plain text. The list was: • Knative Executor (AIP-25, currently draft. Being worked on by Daniel Imberman) • Improve Webserver performance (AIP-24, currently draft. Being worked on by myself, Kaxil Naik and Zhou Fang) • Enhanced real-time UI

Proposed roadmap for Airflow 2.0

2019-09-24 Thread Ash Berlin-Taylor
Hi everyone, I'd like to start working on a concrete plan to get Airflow 2.0 out, and as a result I've started updating https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0 In addition to all the tidy up work ("spring cleaning", finish tidy up after dropping Py2 etc) I'd propose the