Those are all great improvements, Kamil! It would be great to have them
reviewed, tested and merged for 2.0!

J.


On Mon, Feb 24, 2020 at 5:35 PM Kamil Breguła <kamil.breg...@polidea.com>
wrote:

> Hello,
>
> Polidea [1], together with Databand [2], has taken steps to optimize
> scheduler performance.
> I made a number of changes last weekend:
> 1. [AIRFLOW-6856] Bulk fetch paused_dag_ids
> https://github.com/apache/airflow/pull/7476
> 2. [AIRFLOW-6857] Bulk sync DAGs
> https://github.com/apache/airflow/pull/7477
> 3. [AIRFLOW-6862] Do not check the freshness of fresh DAG
> https://github.com/apache/airflow/pull/7481
> 4. [AIRFLOW-6869] Bulk fetch DAGRuns for _process_task_instances
> https://github.com/apache/airflow/pull/7489
> 5. [AIRFLOW-6881] Bulk fetch DAGRun for create_dag_run
> https://github.com/apache/airflow/pull/7502
> 6. [AIRFLOW-6887] Do not check the state of fresh DAGRun
> https://github.com/apache/airflow/pull/7510
> These changes have not been merged yet, so that a wider audience can
> review them. Any feedback is very welcome. The performance benchmark
> results are available in the description of each change.
>
> Taken together, the changes give the following results.
>
> Before:
> Average time: 8080.246 ms
> Queries count: 2692
> After:
> Average time: 628.801 ms
> Queries count: 5
> Diff:
> Average time: -7452 ms (-92%)
> Queries count: -2687 (-99%)
>
> My changes focus only on DagFileProcessor, but this component generates
> the most database queries and takes up a significant share of the
> scheduler's time.
>
> A change by Tomek Urbaszek that improves performance was also merged
> earlier:
> 7. [AIRFLOW-6590] Use batch db operations in jobs
> https://github.com/apache/airflow/pull/7370
>
> This is not the last performance improvement. We are still working on
> this, and more changes will follow.
>
> Many thanks to our friends at Databand [https://databand.ai/] for their support.
>
> Best regards,
> Kamil Breguła
>
> [1] https://www.polidea.com/services/
> [2] https://databand.ai/about/
>
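
For anyone who has not opened the PRs yet, the "bulk fetch" idea behind
items 1, 4 and 5 in the list above can be sketched roughly as follows. This
is only an illustration of the general pattern (one query per item versus
one query for the whole batch), not the code from the PRs. It uses plain
sqlite3, and the `dag` table, the `CountingConnection` wrapper and the
function names are all hypothetical.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts how many queries are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def make_db(dag_ids, paused):
    """Create an in-memory table of DAGs; the schema is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER)")
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?)",
        [(dag_id, int(dag_id in paused)) for dag_id in dag_ids],
    )
    conn.commit()
    return conn


def paused_one_by_one(db, dag_ids):
    """Naive approach: one SELECT per DAG, so N round trips for N DAGs."""
    result = set()
    for dag_id in dag_ids:
        row = db.execute(
            "SELECT is_paused FROM dag WHERE dag_id = ?", (dag_id,)
        ).fetchone()
        if row and row[0]:
            result.add(dag_id)
    return result


def paused_bulk(db, dag_ids):
    """Bulk approach: a single SELECT ... IN (...) for the whole batch."""
    dag_ids = list(dag_ids)
    if not dag_ids:
        return set()
    placeholders = ", ".join("?" for _ in dag_ids)
    rows = db.execute(
        f"SELECT dag_id FROM dag WHERE is_paused = 1 AND dag_id IN ({placeholders})",
        dag_ids,
    )
    return {row[0] for row in rows}


if __name__ == "__main__":
    ids = [f"dag_{i}" for i in range(100)]
    db = CountingConnection(make_db(ids, paused={"dag_3", "dag_42"}))

    paused_one_by_one(db, ids)
    print("one by one:", db.queries, "queries")

    db.queries = 0
    paused_bulk(db, ids)
    print("bulk:", db.queries, "queries")
```

Both functions return the same set of paused DAG ids; only the number of
round trips to the database differs. A real database may limit how many
parameters fit into one IN clause, so a very large batch may need to be
split into chunks.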


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
