uranusjr commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r689285513



##########
File path: docs/apache-airflow/dag-run.rst
##########
@@ -54,17 +54,31 @@ Cron Presets
 Your DAG will be instantiated for each schedule along with a corresponding
 DAG Run entry in the database backend.
 
-.. note::
+Data Interval
+-------------
+
+Each DAG run in Airflow has an assigned "data interval" that represents the time
+range it operates in. For a DAG scheduled with ``@daily``, for example, each of
+its data interval would start at midnight of each day, and end at midnight of
+the next day.
+
+A DAG run happens *after* its associated data interval has ended, to ensure the
+run is able to collect all the actual data within the time period. Therefore, a
+run covering the data period of 2020-01-01 will not start to run until
+2020-01-01 has ended, i.e. 2020-01-02 onwards.
+
+All dates in Airflow are tied to the data interval concept in some way. The
+"logical date" (also called ``execution_date`` from previous Airflow version)
+of a DAG run, for example, usually denotes the start of the data interval, not
+when the DAG is actually executed. Similarly, since the ``start_date`` argument
+for the DAG and its tasks points to the same logical date, a run will only
+be created after that data interval ends. So a DAG with ``@daily`` schedule and
+``start_date`` of 2020-01-01, for example, will not be created until 2020-01-02.

Review comment:
The previous paragraph tries to establish the idea of data intervals in
general, and this paragraph explains that the concept underlies almost all of
Airflow’s date-related configurations, including (the most commonly
misunderstood) `start_date`. The intuitive interpretation of the argument
name is that my DAG should *start running* after this date, while in fact only
the DAG’s *first data interval* starts after it. I’ve reworded both paragraphs
a bit to make them read more distinctly.
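
To make the `start_date` point concrete, here is a minimal sketch in plain Python. It does not use Airflow's API; the helper names `daily_data_interval` and `earliest_run_time` are hypothetical, and it assumes an `@daily` schedule where the logical date marks the start of the data interval, as the doc text above describes.

```python
from datetime import datetime, timedelta


def daily_data_interval(logical_date):
    """Return (start, end) of the data interval for an assumed @daily schedule.

    Hypothetical helper: the interval starts at the logical date and
    ends one day later.
    """
    start = logical_date
    end = start + timedelta(days=1)
    return start, end


def earliest_run_time(logical_date):
    """A run can only be created once its data interval has ended."""
    _, end = daily_data_interval(logical_date)
    return end


# A DAG with start_date 2020-01-01: the first data interval starts then,
# but the first run is not created before that interval ends.
start_date = datetime(2020, 1, 1)
interval_start, interval_end = daily_data_interval(start_date)
first_run = earliest_run_time(start_date)

print("first data interval:", interval_start, "to", interval_end)
print("first run created no earlier than:", first_run)
```

So `start_date` marks where the first data interval begins, not when the first run happens.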



