*TL;DR:* We are gearing up @ Polidea to work on Apache Airflow performance, and I wanted to start a discussion that might lead to creating a new AIP and implementing it :). Here is a high-level summary of the discussions we have had so far; it might be a starting point to get the details worked out and end up with a polished AIP.
*Motivation*

Airflow has a number of areas that require performance testing. Currently, when releasing a new version of Airflow, we are not able to reason about potential performance impacts or discuss performance degradation problems, because we have only anecdotal evidence of Airflow's performance characteristics. There is this fantastic post <https://www.astronomer.io/blog/profiling-the-airflow-scheduler/> from Ash about profiling the scheduler, but that covers only one part of Airflow, and it would be great to turn our performance work into a regular, managed effort.

*Areas*

We are thinking about two streams of performance work:
- Instrumentation of the code
- Synthetic, CI-runnable E2E tests

These can be implemented independently, but instrumentation might be of great help when the synthetic tests are run.

*Instrumentation*

Instrumentation is targeted towards Airflow users and DAG developers. It is mostly about gathering more performance characteristics: numbers related to database queries and performance, latency of DAG scheduling, parsing time, etc. All of this can be exported using the current statsd interface and visualised with one of the standard metric tools (Grafana, Prometheus, etc.), and some of it can be surfaced in the existing UI, where you can see it in the context of actual DAGs (mostly the latency numbers). Most of it should be back-portable to the 1.10.* series, so it can be used to track performance numbers with real in-production DAGs (see the P.S. for a rough sketch of emitting metrics through the existing statsd hook).

Part of the effort should be instructions on how to set up monitoring and dashboards for Airflow, documentation of what each metric means, and making sure the data is actionable - it should be clear to developers how to turn their observations into actions. An important part of this will also be providing a way to export such performance information easily and share it with the community or with service providers, so that they can reason in a more educated way about the performance problems their users experience.

*Synthetic CI-runnable E2E tests*

Synthetic tests are targeted towards Airflow committers/contributors. These are CI-runnable tests that produce generic performance numbers for the core of Airflow. We can prepare synthetic data and run Airflow with NullExecutors, empty tasks, various executors, and different deployments, to focus only on the performance of core Airflow itself (a sketch of a synthetic DAG file is in the P.S. as well). We can also run those performance tests against already released Airflow versions to compare the numbers; this has the added benefit that we can visualise trends, compare versions, and have automated performance testing of new releases. Some of the instrumentation changes described above (especially the parts that are easily back-portable to the 1.10.* series) might also help in gathering these performance characteristics.

*What's my ask?*

I'd love to start a community discussion on this. Please feel free to add your comments and let's start working on it soon. I would love to gather interested people and organise a SIG-performance group soon (@Kevin - I think you mentioned you would be interested, but anyone else is welcome to join as well).

J.

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
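
P.S. To make the two streams a bit more concrete, here are two rough sketches. Both are Python and illustrative only - the metric names and knobs are my assumptions, not a proposed design.

For the instrumentation stream, this is a minimal sketch of emitting custom performance numbers through the statsd facade Airflow already ships. In the 1.10.* series the facade is airflow.settings.Stats, and statsd is enabled via [scheduler] statsd_on = True in airflow.cfg:

    # Sketch: emit custom performance metrics through Airflow's existing
    # statsd facade. Assumes statsd_on = True in airflow.cfg and a statsd
    # server (e.g. statsd-exporter feeding Prometheus/Grafana) listening
    # on the configured statsd_host/statsd_port.
    from airflow.settings import Stats  # facade location may move in 2.0

    # Hypothetical metric names - the real set would be agreed in the AIP.
    Stats.incr("scheduler.dag_file.parses")            # count parse passes
    Stats.timing("scheduler.dag_file.parse_time", 42)  # duration in ms
    Stats.gauge("scheduler.tasks.queued", 17)          # current queue depth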

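For the synthetic E2E stream, a benchmark scenario could be as simple as a generated DAG file dropped into the dags folder of the deployment under test. DAG_COUNT and TASKS_PER_DAG below are made-up knobs for illustration:

    # Sketch: a synthetic DAG file for scheduler/executor benchmarks.
    # Generates DAG_COUNT DAGs of TASKS_PER_DAG chained no-op tasks, so
    # parsing, scheduling and queueing can be measured without any real
    # workload behind the tasks.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator

    DAG_COUNT = 10       # illustrative values - each benchmark scenario
    TASKS_PER_DAG = 50   # would pin its own DAG shape

    for i in range(DAG_COUNT):
        dag = DAG(
            dag_id="perf_synthetic_{}".format(i),
            start_date=datetime(2019, 1, 1),
            schedule_interval="@daily",
        )
        prev = None
        for j in range(TASKS_PER_DAG):
            task = DummyOperator(task_id="task_{}".format(j), dag=dag)
            if prev is not None:
                prev >> task
            prev = task
        # Airflow picks up DAG objects from module-level globals
        globals()[dag.dag_id] = dag

Running the same file against several released versions (with a NullExecutor or the LocalExecutor) would give us the trend lines mentioned above.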