> As I mentioned in that article I think it's critical both for performance
> testing of changes, and "regression" testing of releases, that we have a
> repeatable performance test framework. I somewhat simplified the script I
> used in the article, so I need to tidy it up a bit more and work out where
> to put it.
Perfect. I think that as soon as I finish the prod image (at least to the
stage where people will be able to start testing it) I will switch to this:
I will prepare a short overview of the "framework", put up a proposal for an
AIP describing the architecture, and will be happy to work on fully
automating such performance tests. My proposal is to add a
"*performance-tests*" top-level directory in our repo where we can put all
such performance tests and grow them into an automated framework. I would
love to get it to the stage where we:

1) choose the execution platform (say: Composer, Astronomer, own Kubernetes
cluster)
2) pick the sizing of the cluster to test
3) pick the version of Airflow (master and the latest 1.10.x to start with)
4) pick the set of performance tests we would like to run

Providing we have the right credentials, we should be able to create an
Airflow instance, prepare synthetic data, set up monitoring, submit the
tests for execution, wait for completion, download the monitoring data (or
just generate links to the performance data), wind down the instance and
produce a simple report summarising the most important information plus
links to the necessary dashboards (a rough sketch of what such a driver
could look like is below). That might be a great start for adding more
tests and more metrics. I think it's crucial to make it easy to run by
anyone, so that people can start adding more metrics/tests as needed. It's
usually time-consuming/boring to have to set it all up manually, and having
a place that you can "extend" makes it much easier.

> Better metrics is definitely useful for the "operability" of an Airflow
> cluster, but it usually lags behind in having enough metrics, as the way
> it usually works is metrics get added _after_ someone goes "gah, something
> funny is going on here, if only we had a metric X I would have seen it" :)
> So some concerted work to add more of them up front would be great.

Agree. I will describe some of it in the framework doc I will propose.

> I would add a third class of performance work which is benchmarking
> specific bits of code -- it's not quite instrumentation as it's not left
> in the run-time code, but it is useful to be able to benchmark specific
> parts of the code under repeatable circumstances (particularly relevant
> for the scheduler, where the structure of the dag, the state of other
> dags, and the number of active runs all change the performance of the
> code), so there is a lot of "boiler-plate"/setup code to get the DB in the
> right state.

Indeed. Like the "Scheduler" performance or the "Celery Executor" one.

> One tool I've found very useful for capturing and visualising statsd
> metrics locally is called "netdata" - it's like prometheus, statsd and
> grafana all rolled into one, and is amazing for being a single binary
> that will capture metrics and display dashboards. I'm working on porting
> our Airflow Grafana dashboards into the netdata format for local
> development. For example here is one of their live demos:
> https://london.my-netdata.io/default.html

> Kaxil and I also have a large DAG (something like 1000 tasks overall with
> many subdags) that we were using for testing of the DAG serialization
> workflow.

Yep. Would be great to share it.

> I don't think we need an AIP for this though, none of the changes are
> contentious, or likely to require huge changes to the existing code base,
> so I think we can just get started on them, and ask for review from a
> wider dev audience if we think a change warrants it.
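To make the above a bit more concrete, here is the rough sketch I had in
mind for the driver. Everything in it is hypothetical - none of these
classes or helpers exist yet, the names are placeholders, and a real
implementation would obviously do much more than print:

```python
from dataclasses import dataclass


@dataclass
class PerfTestConfig:
    platform: str          # e.g. "composer", "astronomer" or "k8s"
    cluster_size: str      # e.g. "small", "medium", "large"
    airflow_version: str   # e.g. "master" or "1.10.7"
    test_suite: str        # e.g. "scheduler-latency"


class PerfTestRun:
    """Drives a single end-to-end performance test run (all steps are stubs)."""

    def __init__(self, config: PerfTestConfig):
        self.config = config

    # Each of the methods below would be backed by a platform-specific
    # implementation (Composer API, helm/kubectl, Astronomer API, ...).
    def create_instance(self):
        print(f"provisioning Airflow {self.config.airflow_version} "
              f"on {self.config.platform} ({self.config.cluster_size})")

    def prepare_synthetic_data(self):
        print("uploading synthetic DAGs / seeding the metadata DB")

    def setup_monitoring(self):
        print("enabling statsd export and provisioning dashboards")

    def execute_and_wait(self):
        print(f"running suite '{self.config.test_suite}' and waiting for it")

    def collect_results(self):
        print("downloading monitoring data / generating links to dashboards")

    def produce_report(self):
        print("writing a short summary report")

    def wind_down(self):
        print("tearing the instance down")

    def run(self):
        self.create_instance()
        try:
            self.prepare_synthetic_data()
            self.setup_monitoring()
            self.execute_and_wait()
            self.collect_results()
            self.produce_report()
        finally:
            self.wind_down()  # always clean up, even if a step failed


if __name__ == "__main__":
    PerfTestRun(
        PerfTestConfig(
            platform="k8s",
            cluster_size="small",
            airflow_version="master",
            test_suite="scheduler-latency",
        )
    ).run()
```

The idea is that each stub maps to one of the points above, so supporting a
new execution platform should only mean providing another implementation of
the provisioning/monitoring/teardown steps.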
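Regarding netdata - that sounds great for local development. For anyone who
wants to try something similar before we have anything in the repo, below is
a minimal sketch of pushing custom metrics to a local statsd sink. My
assumptions (please correct me if wrong): netdata's built-in statsd
collector listening on the default UDP port 8125, and the plain `statsd`
Python client (the same library Airflow's statsd integration uses)
installed. Airflow's own metrics would just need the `statsd_*` options in
`airflow.cfg` enabled and pointed at the same host/port.

```python
# Hypothetical local-instrumentation sketch: push a few custom metrics to a
# statsd sink (e.g. netdata's built-in statsd collector -- adjust host/port
# if your setup differs).
import time

from statsd import StatsClient

statsd = StatsClient(host="localhost", port=8125, prefix="airflow_perf")

# Count an event, e.g. one pass of a DAG-parsing benchmark.
statsd.incr("dag_parsing.runs")

# Time an arbitrary block of code and report it as a timer metric (in ms).
start = time.monotonic()
_ = sum(i * i for i in range(1_000_000))  # stand-in for the real workload
statsd.timing("dag_parsing.duration_ms", (time.monotonic() - start) * 1000)

# Report a point-in-time value, e.g. the number of queued tasks observed.
statsd.gauge("scheduler.queued_tasks", 42)
```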
I like the idea of AIPs being "design docs" for what we want to achieve,
and of being able to get people to contribute more of their ideas (or build
on top of them and extend/add). I see an AIP more as a document that I can
always point anyone to. I actually found that Google Docs are far better
for this kind of discussion than the Wiki (and far better than discussion
lists). So I am happy to have a Google Doc around it (and keep it open for
discussion/update while implementing it - and maybe later move it to CWiki
for archiving purposes). Maybe we should have another "term" for it
(especially for non-functional changes). How about "Airflow Draft Design
Docs"? WDYT?

> -ash
>
> On 11 Dec 2019, at 10:05, Jarek Potiuk <[email protected]> wrote:
>
> > *TL;DR;* We are gearing up @ Polidea to work on Apache Airflow
> > performance and I wanted to start a discussion that might lead to
> > creating a new AIP and implementing it :). Here is a high-level summary
> > of the discussions we had so far, so this might be a starting point to
> > get some details worked out and end up with a polished AIP.
> >
> > *Motivation*
> >
> > Airflow has a number of areas that require performance testing, and
> > currently, when releasing a new version of Airflow, we are not able to
> > reason about potential performance impacts or discuss performance
> > degradation problems, because we have only anecdotal evidence of the
> > performance characteristics of Airflow. There is this fantastic post
> > <https://www.astronomer.io/blog/profiling-the-airflow-scheduler/> from
> > Ash about profiling the scheduler, but this is only one part of Airflow
> > and it would be great to turn our performance work into a regular and
> > managed approach.
> >
> > *Areas*
> >
> > We think about two types of performance streams:
> >
> > - Instrumentation of the code
> > - Synthetic CI-runnable E2E tests
> >
> > Those can be implemented independently, but instrumentation might be of
> > great help when the synthetic tests are run.
> >
> > *Instrumentation*
> >
> > Instrumentation is targeted towards Airflow users and DAG developers.
> >
> > Instrumentation is mostly about gathering more performance
> > characteristics, including numbers related to database queries and
> > performance, latency of DAG scheduling, parsing time, etc. This can all
> > be exported using the current statsd interface and visualised using one
> > of the standard metric tools (Grafana, Prometheus, etc.), and some of
> > it can be surfaced in the existing UI where you can see it in the
> > context of actual DAGs (mostly latency numbers). Most of it should be
> > back-portable to 1.10.*. It can be used to track down performance
> > numbers with some real in-production DAGs.
> >
> > Part of the effort should be instructions on how to set up monitoring
> > and dashboards for Airflow, documenting what each of the metrics means,
> > and making sure the data is actionable - it should be rather clear to
> > developers how to turn their observations into actions.
> >
> > An important part of this will also be to provide a way to export such
> > performance information easily and share it with the community or with
> > service providers, so that they can reason in a more educated way about
> > the performance problems their users experience.
> >
> > *Synthetic CI-runnable E2E tests*
> >
> > Synthetic tests are targeted towards Airflow committers/contributors.
> >
> > The synthetic CI-runnable tests will be able to produce generic
> > performance numbers for the core of Airflow.
> > We can prepare synthetic data and run Airflow with the NullExecutor,
> > empty tasks, various executors and different deployments, to focus only
> > on the performance of the Airflow core itself. We can also run those
> > performance tests on already released Airflow versions to compare the
> > numbers, which has the added benefit that we can visualise trends,
> > compare versions, and have automated performance testing of new
> > releases. Some of the instrumentation changes described above
> > (especially the part that will be easily back-portable to the 1.10.*
> > series) might also be helpful in getting some performance
> > characteristics.
> >
> > *What's my ask?*
> >
> > I'd love to start a community discussion on this. Please feel free to
> > add your comments and let's start working on it soon. I would love to
> > get some interested people together and organise a SIG-performance
> > group soon (@Kevin - I think you mentioned you would be interested, but
> > anyone else is welcome to join as well).
> >
> > J.
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129

--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129
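PS. To illustrate the kind of synthetic data I mean for the E2E tests above:
a generated DAG file along these lines (purely a sketch - the dag_id,
fan-out and sizes are made up, and the import path is the 1.10-style one)
gives us an arbitrarily large DAG of no-op tasks, so that parsing and
scheduling overhead is the only thing being measured:

```python
# Sketch of a generated "synthetic" DAG: LAYERS x TASKS_PER_LAYER no-op
# tasks, so that only Airflow's own overhead (parsing, scheduling, executor
# hand-off) shows up in the measurements.
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

TASKS_PER_LAYER = 10  # arbitrary; these would become parameters of the test
LAYERS = 10           # 10 x 10 = 100 no-op tasks in this example

with DAG(
    dag_id="perf_synthetic_layered",
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,  # triggered explicitly by the test harness
    catchup=False,
) as dag:
    previous_layer = []
    for layer in range(LAYERS):
        current_layer = [
            DummyOperator(task_id=f"layer_{layer}_task_{i}")
            for i in range(TASKS_PER_LAYER)
        ]
        # Fully connect consecutive layers to exercise dependency handling.
        for upstream in previous_layer:
            for downstream in current_layer:
                upstream >> downstream
        previous_layer = current_layer
```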
