url54 commented on code in PR #34898:
URL: https://github.com/apache/airflow/pull/34898#discussion_r1357451634


##########
docs/apache-airflow/core-concepts/dags.rst:
##########
@@ -514,6 +514,12 @@ For example, here is a DAG that uses a ``for`` loop to 
define some tasks:
 
 In general, we advise you to try and keep the *topology* (the layout) of your 
DAG tasks relatively stable; dynamic DAGs are usually better used for 
dynamically loading configuration options or changing operator options.
 
+.. warning::
+
+  When utilizing Dynamic DAGs, please ensure you monitor key metrics like 
DagBagSize, to ensure your DAG script is not producing unexpected results. 
Performing iterations or looping, may cause unexpected results if your DAG 
+  code does not properly handle all cases. Dynamic DAG creation may also 
adversely  affect your environments overall health and performance.
+
+

Review Comment:
   > I agree with Elad on the location of the document, but I'm also not sure 
about the warning itself. The Airflow DAG file is a normal python script, so if 
the user expects a certain number of dags to be created, they can add a check 
in the script itself (input validation, size limit from the list, even sending 
a notification or failing the dags creation). If we have multiple dynamic DAGs 
files, monitoring the DagBagSize metric could generate a lot of false alerts 
(the sum of multiple small changes could be big).
   
   The warning itself is meant to illustrate that there is always programmatic 
consequences for every action. I cannot tell you the number of times I have 
viewed a "perfectly" tested implementation that had a far edge case or "hole" 
in its implementation that caused performance degradation and odd behavior.
   
   I feel strongly that not providing some kind of informative disclaimer leads 
users to "assume" that any form of dynamic DAG creation is perfectly fine and 
has no consequences or performance hits on the environment overall. When in 
reality it takes a very carefully planned implementation to ensure its done 
correctly and consistently. 
   
   I simply wanted to provide a little "caution: HOT" disclaimer to the Dynamic 
DAG section. To ensure users, whether they be newer or more seasoned, 
understand to take the idea more seriously. Because its more than likely going 
to be a newer user that falls into the trap of not handling all edge-cases and 
when performance degrades, believes that its the service that is performing 
poorly. It is crucial to point it out to the newer users, since they are the 
users who have not yet decided if Airflow is the appropriate platform for them. 
They may be dissuaded from using the service from false information.
   
   But as I mentioned to @eladkal, it could be in a better spot and covered in 
more detail. Which I will work on =).   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to