Those are all great improvements, Kamil! It would be great to have them reviewed, tested, and merged for 2.0!
J.

On Mon, Feb 24, 2020 at 5:35 PM Kamil Breguła <kamil.breg...@polidea.com> wrote:

> Hello,
>
> Polidea [1] together with Databand [2] has taken steps to optimize
> scheduler performance.
> I made many changes last weekend:
> 1. [AIRFLOW-6856] Bulk fetch paused_dag_ids
> https://github.com/apache/airflow/pull/7476
> 2. [AIRFLOW-6857] Bulk sync DAGs
> https://github.com/apache/airflow/pull/7477
> 3. [AIRFLOW-6862] Do not check the freshness of fresh DAG
> https://github.com/apache/airflow/pull/7481
> 4. [AIRFLOW-6869] Bulk fetch DAGRuns for _process_task_instances
> https://github.com/apache/airflow/pull/7489
> 5. [AIRFLOW-6881] Bulk fetch DAGRun for create_dag_run
> https://github.com/apache/airflow/pull/7502
> 6. [AIRFLOW-6887] Do not check the state of fresh DAGRun
> https://github.com/apache/airflow/pull/7510
> These changes have not yet been merged to allow review by wider
> audiences. Any feedback is very helpful. The result of the performance
> benchmark is available in the description of each change.
>
> When it comes to the overall changes, it looks as follows.
>
> Before:
> Average time: 8080.246 ms
> Queries count: 2692
> After:
> Average time: 628.801 ms
> Queries count: 5
> Diff:
> Average time: -7452 ms (-92%)
> Queries count: -2687 (-99%)
>
> My changes focused only on DagFileProcessor, but this generates the
> most database queries and takes a significant amount of the
> scheduler's time.
>
> Tomek Urbaszek's change has also been merged in the past to improve
> performance.
> 7. [AIRFLOW-6590] Use batch db operations in jobs
> https://github.com/apache/airflow/pull/7370
>
> This is not the last performance improvement. We are still working,
> and more changes will appear in the future.
>
> Many thanks to friends from Databand [https://databand.ai/] for support.
>
> Best regards,
> Kamil Breguła
>
> [1] https://www.polidea.com/services/
> [2] https://databand.ai/about/
>

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129