Copilot commented on code in PR #63247:
URL: https://github.com/apache/airflow/pull/63247#discussion_r2915920397


##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -281,7 +281,7 @@ def ti_run(
         if ti.next_method:
             context.next_method = ti.next_method
             context.next_kwargs = ti.next_kwargs
-
+            context.start_date = ti.start_date

Review Comment:
   There are existing unit tests around resume/start_date behavior 
(`test_next_kwargs_determines_start_date_update`), but nothing asserts the new 
`start_date` field is returned on deferral resume or covers the empty-dict 
`next_kwargs` case. Please add a regression test that sets `ti.next_method` 
with `ti.next_kwargs = {}` and verifies (1) the DB `ti.start_date` is not 
overwritten on resume and (2) the `ti_run` response includes the original 
`start_date` for the supervisor to use.



##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -281,7 +281,7 @@ def ti_run(
         if ti.next_method:
             context.next_method = ti.next_method
             context.next_kwargs = ti.next_kwargs
-
+            context.start_date = ti.start_date

Review Comment:
   `ti` is a SQLAlchemy row built from the `old = select(...)` statement, but 
that select does not include `TI.start_date`. Referencing `ti.start_date` here 
will raise an AttributeError when `ti.next_method` is set (i.e., on deferral 
resume). Include `TI.start_date` in the initial select (or fetch it another 
way) before using it to populate `context.start_date`.
   ```suggestion
               context.start_date = session.scalar(
                   select(TI.start_date).where(TI.id == task_instance_id)
               )
   ```



##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -281,7 +281,7 @@ def ti_run(
         if ti.next_method:
             context.next_method = ti.next_method
             context.next_kwargs = ti.next_kwargs
-
+            context.start_date = ti.start_date

Review Comment:
   `ti_run()` currently decides whether to preserve the original TI start_date 
based on `ti.next_kwargs` truthiness (see earlier `if ti.next_kwargs: 
data.pop("start_date")`). Since deferred payloads default `next_kwargs` to `{}` 
(empty dict), `next_kwargs` can be present-but-falsy, causing the API to 
overwrite `ti.start_date` on resume and then return the wrong 
`context.start_date` here. Consider switching the resume check to 
`ti.next_method` (or `ti.next_kwargs is not None`) so empty dicts are treated 
as “resuming from deferral”.



##########
airflow-core/src/airflow/api_fastapi/execution_api/datamodels/taskinstance.py:
##########
@@ -374,6 +374,15 @@ class TIRunContext(BaseModel):
     should_retry: bool = False
     """If the ti encounters an error, whether it should enter retry or failed 
state."""
 
+    start_date: UtcDateTime | None = None
+    """
+    The original start date of the task instance.
+
+    When resuming from deferral, this is set to the task's original 
``start_date`` so the
+    supervisor uses it instead of ``datetime.now()``.  This ensures 
``context["ti"].start_date``
+    always reflects when the task *first* started, not when it was 
rescheduled/resumed.
+    """

Review Comment:
   This introduces new behavior/contract for the Execution API response 
(`TIRunContext.start_date`). Please add/adjust a Cadwyn version migration so 
older `Airflow-API-Version` values don’t unexpectedly see a new response field, 
and bump/add the appropriate CalVer version entry if required by the project’s 
versioning process.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to