url54 commented on code in PR #34898: URL: https://github.com/apache/airflow/pull/34898#discussion_r1357451634
########## docs/apache-airflow/core-concepts/dags.rst: ########## @@ -514,6 +514,12 @@ For example, here is a DAG that uses a ``for`` loop to define some tasks: In general, we advise you to try and keep the *topology* (the layout) of your DAG tasks relatively stable; dynamic DAGs are usually better used for dynamically loading configuration options or changing operator options. +.. warning:: + + When utilizing Dynamic DAGs, please ensure you monitor key metrics like DagBagSize, to ensure your DAG script is not producing unexpected results. Performing iterations or looping, may cause unexpected results if your DAG + code does not properly handle all cases. Dynamic DAG creation may also adversely affect your environments overall health and performance. + + Review Comment: > I agree with Elad on the location of the document, but I'm also not sure about the warning itself. The Airflow DAG file is a normal python script, so if the user expects a certain number of dags to be created, they can add a check in the script itself (input validation, size limit from the list, even sending a notification or failing the dags creation). If we have multiple dynamic DAGs files, monitoring the DagBagSize metric could generate a lot of false alerts (the sum of multiple small changes could be big). The warning itself is meant to illustrate that there is always programmatic consequences for every action. I cannot tell you the number of times I have viewed a "perfectly" tested implementation that had a far edge case or "hole" in its implementation that caused performance degradation and odd behavior. I feel strongly that not providing some kind of informative disclaimer leads users to "assume" that any form of dynamic DAG creation is perfectly fine and has no consequences or performance hits on the environment overall. When in reality it takes a very carefully planned implementation to ensure its done correctly and consistently. I simply wanted to provide a little "caution: HOT" disclaimer to the Dynamic DAG section. To ensure users, whether they be newer or more seasoned, understand to take the idea more seriously. Because its more than likely going to be a newer user that falls into the trap of not handling all edge-cases and when performance degrades, believes that its the service that is performing poorly. It is crucial to point it out to the newer users, since they are the users who have not yet decided if Airflow is the appropriate platform for them. They may be dissuaded from using the service from false information. But as I mentioned to @eladkal, it could be in a better spot and covered in more detail. Which I will work on =). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org