Acehaidrey commented on a change in pull request #9544:
URL: https://github.com/apache/airflow/pull/9544#discussion_r447208601



##########
File path: airflow/models/dagrun.py
##########
@@ -411,6 +412,44 @@ def _are_premature_tis(
                 return True
         return False
 
+    @provide_session
+    def _emit_true_scheduling_delay_stats_for_finished_state(self, session=None):
+        """
+        This is a helper method to emit the true scheduling delay stats, which is defined as
+        the time when the first task in the DAG starts minus the expected DAG run datetime.
+        This method will be used in the update_state method when the state of the DagRun
+        is updated to a completed status (either success or failure). The method will find
+        the first started task within the DAG and calculate the expected DagRun start time
+        (based on dag.execution_date & dag.schedule_interval), then subtract the expected
+        start time from the first task's start time to get the delay.
+
+        The emitted data may contain outliers (e.g. when the first task was cleared, so
+        the second task's start_date will be used), but we can get rid of the outliers
+        on the stats side through the dashboards.
+
+        Note, the stat will only be emitted if the DagRun is a scheduler-triggered one
+        (i.e. external_trigger is False).
+        """
+        if self.state == State.RUNNING:
+            return
+
+        try:
+            if self.external_trigger:
+                return
+            # Get the task that has the earliest start_date
+            qry = session.query(TI).filter(
+                TI.dag_id == self.dag_id,
+                TI.execution_date == self.execution_date,
+                TI.start_date.isnot(None),
+            ).order_by(TI.start_date.asc())
+            ti = qry.first()
+            dag = self.get_dag()
+            if ti and dag:

Review comment:
       If there is a way to get that information without running the query, I am all ears and happy to make that change!
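
       One possible direction, offered as a minimal sketch rather than as what this PR does: if the caller already holds the run's task instances in memory, the earliest start date can be picked from that list instead of issuing a new query. The function name `first_task_scheduling_delay` and its parameters are hypothetical, and the sketch assumes the schedule interval is a fixed `timedelta`; a cron-style schedule would need the DAG's own scheduling logic to find the expected start.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def first_task_scheduling_delay(
    start_dates: Iterable[Optional[datetime]],
    execution_date: datetime,
    schedule_interval: timedelta,
) -> Optional[timedelta]:
    """Return the earliest task start minus the expected run start.

    Hypothetical helper: `start_dates` stands in for the start_date of each
    task instance that is already loaded. Returns None if no task has started.
    """
    # Ignore task instances that never started (start_date is None).
    started = [d for d in start_dates if d is not None]
    if not started:
        return None
    # Assumption: the run is expected to start one interval after execution_date.
    expected_start = execution_date + schedule_interval
    return min(started) - expected_start


# Example: a daily run whose first task started 42 seconds late.
execution_date = datetime(2020, 6, 29, 0, 0)
starts = [None, datetime(2020, 6, 30, 0, 0, 42), datetime(2020, 6, 30, 0, 5)]
delay = first_task_scheduling_delay(starts, execution_date, timedelta(days=1))
print(delay.total_seconds())  # 42.0
```

       Whether this avoids the query in practice depends on whether the task instances are already available at the call site, which I have not verified here.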





