hkc-8010 opened a new issue, #66905:
URL: https://github.com/apache/airflow/issues/66905

   ### Apache Airflow version
   
   main branch
   
   ### What happened?
   
   `TriggerDagRunOperator` can fail with `DagRunAlreadyExists` even though the 
child Dag run was created successfully.
   
   In the Airflow 3 task-sdk path, `DagRunOperations.trigger()` sends `POST 
/execution/dag-runs/{dag_id}/{run_id}` through the generic execution API retry 
layer. If the server creates the Dag run but the client sees an ambiguous 
transport or request error, the retry can POST the same run ID again and 
receive `409 Conflict`.
   
   The task runner then treats that as a real pre-existing run, marks the 
parent task failed, and does not write the `trigger_run_id` XCom.
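
The sequence can be sketched without Airflow. This is a minimal model, not the task-sdk code: `FakeServer`, `TransportError`, `Conflict`, and `trigger_with_blind_retry` are hypothetical stand-ins for the execution API server, `httpx.RequestError`, the `409 Conflict` response, and the generic retry layer.

```python
class TransportError(Exception):
    """Stand-in for an ambiguous client-side error such as httpx.RequestError."""


class Conflict(Exception):
    """Stand-in for a 409 Conflict response."""


class FakeServer:
    """Hypothetical in-memory server that creates the run, then loses the first response."""

    def __init__(self):
        self.runs = set()
        self.drop_next_response = True

    def post_dag_run(self, dag_id, run_id):
        key = (dag_id, run_id)
        if key in self.runs:
            raise Conflict(f"{dag_id}/{run_id} already exists")
        self.runs.add(key)  # the run is created server-side
        if self.drop_next_response:
            self.drop_next_response = False
            # The response never reaches the client, so the outcome is ambiguous.
            raise TransportError("connection reset before response")
        return {"dag_id": dag_id, "run_id": run_id}


def trigger_with_blind_retry(server, dag_id, run_id, attempts=3):
    """Retry the POST on transport errors without checking server state."""
    for attempt in range(attempts):
        try:
            return server.post_dag_run(dag_id, run_id)
        except TransportError:
            if attempt == attempts - 1:
                raise


server = FakeServer()
try:
    trigger_with_blind_retry(server, "child", "manual__1")
    outcome = "success"
except Conflict:
    outcome = "conflict"

print(outcome, ("child", "manual__1") in server.runs)
```

In this model the first POST creates the run and the retry is the only request that ever gets a response, so the caller sees a conflict for a run it created itself.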
   
   ### What you think should happen instead?
   
   A transport-level ambiguity after a trigger POST should not be converted 
into a duplicate-run failure when the requested Dag run now exists.
   
   ### How to reproduce
   
   1. Mock `POST /dag-runs/{dag_id}/{run_id}` so the server-side run is created 
but the client sees `httpx.RequestError`.
   2. Return an existing Dag run from `GET /dag-runs/{dag_id}/{run_id}`.
   3. Expected result: the trigger operation treats this as success for that 
run ID instead of surfacing `DAGRUN_ALREADY_EXISTS`.
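
One possible shape for the expected behaviour, as a sketch under assumptions rather than a proposed patch: `FakeApi`, `TransportError`, `Conflict`, and `trigger_reconciling` are hypothetical names, and the real client in `task-sdk/src/airflow/sdk/api/client.py` may be structured differently. The idea is to follow an ambiguous error with a GET for the same run ID before deciding the outcome.

```python
class TransportError(Exception):
    """Stand-in for httpx.RequestError: the request may or may not have been applied."""


class Conflict(Exception):
    """Stand-in for a 409 Conflict / DAGRUN_ALREADY_EXISTS response."""


class FakeApi:
    """Hypothetical in-memory API exposing the POST and GET endpoints from the steps above."""

    def __init__(self, fail_first_post=True):
        self.runs = {}
        self.fail_first_post = fail_first_post

    def post_dag_run(self, dag_id, run_id):
        key = (dag_id, run_id)
        if key in self.runs:
            raise Conflict(run_id)
        self.runs[key] = {"dag_id": dag_id, "run_id": run_id}
        if self.fail_first_post:
            self.fail_first_post = False
            raise TransportError("response lost")  # run exists, client does not know
        return self.runs[key]

    def get_dag_run(self, dag_id, run_id):
        return self.runs.get((dag_id, run_id))  # None when the run is absent


def trigger_reconciling(api, dag_id, run_id):
    """POST once; on an ambiguous error, check whether the requested run now exists."""
    try:
        return api.post_dag_run(dag_id, run_id)
    except TransportError:
        existing = api.get_dag_run(dag_id, run_id)
        if existing is not None:
            return existing  # treat as success for this run ID
        raise  # the run was not created, so the error is real


# Ambiguous error after a successful create: reported as success.
api = FakeApi()
result = trigger_reconciling(api, "child", "manual__1")

# A run that existed before any request was sent still surfaces the conflict.
preexisting = FakeApi(fail_first_post=False)
preexisting.post_dag_run("child", "manual__1")
try:
    trigger_reconciling(preexisting, "child", "manual__1")
    duplicate_outcome = "success"
except Conflict:
    duplicate_outcome = "conflict"
```

The sketch only reconciles after a transport-level error, so a `409` returned to the first POST is still reported as a duplicate. It cannot tell whether the run found by the GET came from this request or from another caller using the same run ID; whether that distinction matters is left open here.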
   
   ### Code pointers
   
   `task-sdk/src/airflow/sdk/api/client.py`
   `task-sdk/src/airflow/sdk/execution_time/task_runner.py`
   
   ### Are you willing to submit PR?
   
   Yes.

