Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-27 Thread Kamil Breguła
Yes. Here is the PR: https://github.com/apache/airflow/pull/6697 It made this work easier. On Thu, Feb 27, 2020 at 12:18 PM Ash Berlin-Taylor wrote: > We're just finalizing the AIP document around scheduler HA and hope to > have it published by the end of this week. > > DagFileProcessor is a new cl

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-27 Thread Ash Berlin-Taylor
We're just finalizing the AIP document around scheduler HA and hope to have it published by the end of this week. DagFileProcessor is a new class, but the code in it is not new -- it's the same, gradually evolving scheduler and parsing code, but refactored/moved to live elsewhere, right? -ash

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-26 Thread Kamil Breguła
Hello, In my opinion, it will be easier because the database will be less loaded. More importantly, I added wait_for_update in one method, which means that another scheduler will not be able to damage the state of the database. Most of these changes even streamline HA, because more things are

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-26 Thread Maxime Beauchemin
Hey, I wanted to echo the awesomeness once more, but also bring up the question as to whether any of this work may make it harder to distribute / HA the scheduler down the line (?) I almost started analyzing the code and thought it'd just be easier to ask the authors. Max On Wed, Feb 26, 2020 at

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-26 Thread Felix Uellendall
Really awesome indeed! I haven't had enough time to look into all of it yet, but I definitely will. Thanks Polidea and Databand for the effort you put into this! Kamil especially! -Felix Sent from ProtonMail Mobile On Wed, Feb 26, 2020 at 08:54, Sumit Maheshwari wrote: > Awesome work guys!! Ku

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-25 Thread Sumit Maheshwari
Awesome work guys!! Kudos to all of you 👏 On Wed, Feb 26, 2020 at 6:59 AM Jiajie Zhong wrote: > Good work! Thanks Kamil > > Best Wish > — Jiajie

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-25 Thread Jiajie Zhong
Good work! Thanks Kamil Best Wish — Jiajie

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-25 Thread Maxime Beauchemin
Nice! On Tue, Feb 25, 2020 at 12:11 AM Robin Edwards wrote: > This is brilliant work, thank you! Looking forward to watching my RDS > metrics when this gets deployed :-) > > On Tue, 25 Feb 2020, 07:08 Driesprong, Fokko, > wrote: > > > Sweet work Kamil and others! I'll try to go through them tod

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-25 Thread Robin Edwards
This is brilliant work, thank you! Looking forward to watching my RDS metrics when this gets deployed :-) On Tue, 25 Feb 2020, 07:08 Driesprong, Fokko, wrote: > Sweet work Kamil and others! I'll try to go through them today! > > Cheers, Fokko > > Op ma 24 feb. 2020 om 22:37 schreef Tao Feng : >

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Driesprong, Fokko
Sweet work Kamil and others! I'll try to go through them today! Cheers, Fokko Op ma 24 feb. 2020 om 22:37 schreef Tao Feng : > Great work Kamil! Let us know once it is landed in one of the future > releases. Would love to try it out :) > > Best, > -Tao > > On Mon, Feb 24, 2020 at 12:54 PM Qingpi

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Tao Feng
Great work Kamil! Let us know once it is landed in one of the future releases. Would love to try it out :) Best, -Tao On Mon, Feb 24, 2020 at 12:54 PM Qingping Hou wrote: > Awesome work Kamil! Great to see us embracing query batching in the > code base. I can't wait to deploy those optimization

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Qingping Hou
Awesome work Kamil! Great to see us embracing query batching in the code base. I can't wait to deploy those optimizations into our production environment. Thanks, QP Hou On Mon, Feb 24, 2020 at 8:35 AM Kamil Breguła wrote: > > Hello, > > Polidea [1] together with Databand [2] has taken steps to

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Evgeny Shulman
This is a really great improvement! Great job by everybody, we are really excited about this contribution! These changes make it easier for Airflow to support much more complex/large scale use cases in the future. Looking forward to more improvements like this one! * Huge thanks to friends from Pol

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Jarek Potiuk
Those are all great improvements, Kamil! It would be great to have them reviewed, tested and merged for 2.0! J. On Mon, Feb 24, 2020 at 5:35 PM Kamil Breguła wrote: > Hello, > > Polidea [1] together with Databand [2] has taken steps to optimize > scheduler performance. > I made many changes l

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Tomasz Urbaszek
Thanks Kamil for the work! I've reviewed your PRs and everything looks good so I keep my fingers crossed for this optimization to be true ;) T. On Mon, Feb 24, 2020 at 5:35 PM Kamil Breguła wrote: > > Hello, > > Polidea [1] together with Databand [2] has taken steps to optimize > scheduler per

Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-24 Thread Kamil Breguła
Hello, Polidea [1] together with Databand [2] has taken steps to optimize scheduler performance. I made many changes last weekend: 1. [AIRFLOW-6856] Bulk fetch paused_dag_ids https://github.com/apache/airflow/pull/7476 2. [AIRFLOW-6857] Bulk sync DAGs https://github.com/apache/airflow/pull/7477 3
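
For readers unfamiliar with the "bulk fetch" idea named in the first PR title above, here is a minimal sketch of the general technique. It is not the code from the linked PRs. The `dag` table, its `dag_id` and `is_paused` columns, and the function names below are all hypothetical and exist only for illustration. The sketch uses the standard-library `sqlite3` module and compares one query per DAG against a single `IN (...)` query.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO dag (dag_id, is_paused) VALUES (?, ?)",
        [("a", 0), ("b", 1), ("c", 0), ("d", 1)],
    )
    return conn


def paused_one_by_one(conn, dag_ids):
    """Naive approach: issue one SELECT per DAG id.

    Returns (set of paused ids, number of queries issued).
    """
    paused = set()
    queries = 0
    for dag_id in dag_ids:
        row = conn.execute(
            "SELECT is_paused FROM dag WHERE dag_id = ?", (dag_id,)
        ).fetchone()
        queries += 1
        if row is not None and row[0]:
            paused.add(dag_id)
    return paused, queries


def paused_bulk(conn, dag_ids):
    """Bulk approach: fetch all paused ids among dag_ids with a single query.

    Returns (set of paused ids, number of queries issued).
    """
    dag_ids = list(dag_ids)
    if not dag_ids:
        # Nothing to look up, so no query is needed.
        return set(), 0
    # One "?" placeholder per id keeps the query parameterized.
    placeholders = ",".join("?" for _ in dag_ids)
    rows = conn.execute(
        f"SELECT dag_id FROM dag WHERE is_paused = 1 AND dag_id IN ({placeholders})",
        dag_ids,
    ).fetchall()
    return {row[0] for row in rows}, 1
```

Both functions return the same set of paused ids, but the first issues one query per DAG while the second issues one query in total, which is the kind of saving the thread title refers to. For very long id lists, a real implementation would probably need to chunk the `IN (...)` list, since databases limit the number of bound parameters.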