Yeah, I also think it is a good idea to expose it in the Airflow UI. However, at the moment we do not have an entity such as a DAG file in the Airflow database (and this metric is per DAG file), so we would need to design that a little bit. Attaching the metric to the DAG model would not be correct.
But I totally agree, it would be good to have it in the Airflow UI as well, so that "operations users" have access to this information.

On Fri, Jun 14, 2024 at 11:22 AM Jarek Potiuk wrote:

> Good idea. It would also be good if we could have access to the information
> exposed in the UI, so that "operations users" can see it and maybe even
> act on it, plus an API/CLI to check it. I think in the future of Airflow 3,
> where we will have task isolation, having `0` for all the DAGs will be a
> prerequisite for switching to "task isolation" mode, and this could
> actually be verified in a migration tool.
>
> On Fri, Jun 14, 2024 at 10:59 AM Eugen Kosteev wrote:
>
> > Hi.
> >
> > I would like to discuss the proposal of adding a new column to the "DAG
> > File Processing Stats" of the DAG processor logs.
> >
> > Currently, the DAG processor logs contain the following data
> > (screenshot below), which includes # of DAGs, runtime, etc. per DAG file.
> > [image: image.png]
> >
> > It seems that it would be beneficial to also have data there about the
> > number of queries made to the Airflow database during parsing of each
> > file.
> > It may be convenient to have when debugging issues related to high
> > load on the Airflow database. A typical scenario is when DAG file(s) run
> > a lot of database queries at the top level of the code, and those are
> > executed each time these DAG files are parsed.
> > One common example is excessive usage of "Variable.get" in top-level
> > statements in DAG files.
> >
> > Having information about the "number of queries to Airflow database" per
> > DAG file may help a lot when debugging issues related to high load on the
> > database or to long parsing times of DAG files.
> >
> > One caveat is that, e.g. due to caching being enabled for Variables or
> > for other reasons (dynamic DAGs), the number of queries may be very
> > different for each parsing of the DAG file,
> > but at least we can have it as "Last Run Number of Queries". That would
> > already give some idea, and an engineer can also review the logs
> > historically to see the values from the past.
> >
> > What are your thoughts?
> >
> > --
> > Eugene

--
Eugene
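
To make the idea of a per-file "number of queries" concrete, here is a minimal sketch of counting statements while a file's top-level code runs. It is only an illustration under stated assumptions: the standard-library `sqlite3` module stands in for the Airflow metadata database, and `count_queries`, `get_variable`, `parse_dag_file` and the `variable` table are hypothetical names made up for this example. It does not show how the DAG processor actually collects or reports this number.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def count_queries(conn):
    """Count SQL statements executed on `conn` inside the with-block."""
    stats = {"queries": 0}

    def _on_statement(statement):
        # sqlite3 calls this once for every statement it runs.
        stats["queries"] += 1

    conn.set_trace_callback(_on_statement)
    try:
        yield stats
    finally:
        # Stop counting once the "parse" is over.
        conn.set_trace_callback(None)


# Hypothetical stand-in for the metadata database (assumption, not Airflow).
# isolation_level=None keeps the module from issuing implicit BEGINs.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE variable (key TEXT PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO variable VALUES ('env', 'prod')")


def get_variable(key):
    """Hypothetical lookup that costs one query per call (no caching)."""
    row = conn.execute(
        "SELECT val FROM variable WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


def parse_dag_file():
    """Simulates a DAG file whose top-level code does three lookups."""
    env = get_variable("env")
    region = get_variable("region")
    owner = get_variable("owner")
    return env, region, owner


with count_queries(conn) as stats:
    parsed = parse_dag_file()

print(f"Last Run Number of Queries: {stats['queries']}")
```

Because every uncached lookup is one statement, the counter grows with each top-level call, which is the pattern the proposed column is meant to surface. With caching enabled the count could differ between runs, as noted in the caveat above.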
