kaxil commented on a change in pull request #17552:
URL: https://github.com/apache/airflow/pull/17552#discussion_r708733553



##########
File path: docs/apache-airflow/dag-run.rst
##########
@@ -54,17 +54,36 @@ Cron Presets
 Your DAG will be instantiated for each schedule along with a corresponding
 DAG Run entry in the database backend.
 
-.. note::
 
-    If you run a DAG on a schedule_interval of one day, the run stamped 2020-01-01
-    will be triggered soon after 2020-01-01T23:59. In other words, the job instance is
-    started once the period it covers has ended.  The ``execution_date`` available in the context
-    will also be 2020-01-01.
+.. _data-interval:
 
-    The first DAG Run is created based on the minimum ``start_date`` for the tasks in your DAG.
-    Subsequent DAG Runs are created by the scheduler process, based on your DAG’s ``schedule_interval``,
-    sequentially. If your start_date is 2020-01-01 and schedule_interval is @daily, the first run
-    will be created on 2020-01-02 i.e., after your start date has passed.
+Data Interval
+-------------
+
+Each DAG run in Airflow has an assigned "data interval" that represents the time
+range it operates in. For a DAG scheduled with ``@daily``, for example, each of
+its data interval would start at midnight of each day and end at midnight of the
+next day.
+
+A DAG run is usually scheduled *after* its associated data interval has ended,
+to ensure the run is able to collect all the data within the time period. In
+other words, a run covering the data period of 2020-01-01 generally does not
+start to run until 2020-01-01 has ended, i.e. after 2020-01-02 00:00:00.
+
+All dates in Airflow are tied to the data interval concept in some way. The
+"logical date" (also called ``execution_date`` in Airflow versions prior to 
2.2)
+of a DAG run, for example, denotes the start of the data interval, not when the
+DAG is actually executed.
+
+Similarly, since the ``start_date`` argument for the DAG and its tasks points to
+the same logical date, it marks the start of *the DAG's fist data interval*, not
+when tasks in the DAG will start running. In other words, a DAG run will only be
+scheduled one interval after ``start_date``.
+
+.. tip::
+
+    If ``schedule_interval`` is not enough to express your DAG's schedule,
+    logical date, or data interval, see :doc:`Customizing imetables </howto/timetable>`.
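
To make the scheduling semantics documented in this new section concrete, here is a minimal sketch (not part of the diff) of a DAG definition they apply to. The DAG id and task are hypothetical, and it assumes Airflow 2.x with ``pendulum`` available:

```python
# Minimal sketch (not part of the PR) illustrating the data interval semantics
# described in the documentation above. The dag_id and task are hypothetical;
# assumes Airflow 2.x.
import pendulum

from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG(
    dag_id="data_interval_example",
    start_date=pendulum.datetime(2020, 1, 1, tz="UTC"),
    schedule_interval="@daily",
) as dag:
    DummyOperator(task_id="noop")

# With these settings:
# * The first data interval covers 2020-01-01 00:00 to 2020-01-02 00:00 UTC.
# * The first DAG run has logical date 2020-01-01, but it is only scheduled
#   once that interval has ended, i.e. shortly after 2020-01-02 00:00 UTC.
```

In Airflow 2.2+, the interval boundaries are also exposed to tasks as ``data_interval_start`` and ``data_interval_end`` in the template context.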

Review comment:
```suggestion
    logical date, or data interval, see :doc:`Customizing timetables </howto/timetable>`.
```



