Copilot commented on code in PR #63247:
URL: https://github.com/apache/airflow/pull/63247#discussion_r2915920397
##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -281,7 +281,7 @@ def ti_run(
if ti.next_method:
context.next_method = ti.next_method
context.next_kwargs = ti.next_kwargs
-
+ context.start_date = ti.start_date
Review Comment:
There are existing unit tests around resume/start_date behavior
(`test_next_kwargs_determines_start_date_update`), but nothing asserts the new
`start_date` field is returned on deferral resume or covers the empty-dict
`next_kwargs` case. Please add a regression test that sets `ti.next_method`
with `ti.next_kwargs = {}` and verifies (1) the DB `ti.start_date` is not
overwritten on resume and (2) the `ti_run` response includes the original
`start_date` for the supervisor to use.
##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -281,7 +281,7 @@ def ti_run(
if ti.next_method:
context.next_method = ti.next_method
context.next_kwargs = ti.next_kwargs
-
+ context.start_date = ti.start_date
Review Comment:
`ti` is a SQLAlchemy row built from the `old = select(...)` statement, but
that select does not include `TI.start_date`. Referencing `ti.start_date` here
will raise an AttributeError when `ti.next_method` is set (i.e., on deferral
resume). Include `TI.start_date` in the initial select (or fetch it another
way) before using it to populate `context.start_date`.
```suggestion
context.start_date = session.scalar(
select(TI.start_date).where(TI.id == task_instance_id)
)
```
##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -281,7 +281,7 @@ def ti_run(
if ti.next_method:
context.next_method = ti.next_method
context.next_kwargs = ti.next_kwargs
-
+ context.start_date = ti.start_date
Review Comment:
`ti_run()` currently decides whether to preserve the original TI start_date
based on `ti.next_kwargs` truthiness (see earlier `if ti.next_kwargs:
data.pop("start_date")`). Since deferred payloads default `next_kwargs` to `{}`
(empty dict), `next_kwargs` can be present-but-falsy, causing the API to
overwrite `ti.start_date` on resume and then return the wrong
`context.start_date` here. Consider switching the resume check to
`ti.next_method` (or `ti.next_kwargs is not None`) so empty dicts are treated
as “resuming from deferral”.
##########
airflow-core/src/airflow/api_fastapi/execution_api/datamodels/taskinstance.py:
##########
@@ -374,6 +374,15 @@ class TIRunContext(BaseModel):
should_retry: bool = False
"""If the ti encounters an error, whether it should enter retry or failed
state."""
+ start_date: UtcDateTime | None = None
+ """
+ The original start date of the task instance.
+
+ When resuming from deferral, this is set to the task's original
``start_date`` so the
+ supervisor uses it instead of ``datetime.now()``. This ensures
``context["ti"].start_date``
+ always reflects when the task *first* started, not when it was
rescheduled/resumed.
+ """
Review Comment:
This introduces new behavior/contract for the Execution API response
(`TIRunContext.start_date`). Please add/adjust a Cadwyn version migration so
older `Airflow-API-Version` values don’t unexpectedly see a new response field,
and bump/add the appropriate CalVer version entry if required by the project’s
versioning process.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]