SameerMesiah97 opened a new issue, #64048:
URL: https://github.com/apache/airflow/issues/64048

   ### Apache Airflow Provider(s)
   
   airbyte
   
   ### Versions of Apache Airflow Providers
   
   `apache-airflow-providers-airbyte==5.3.3`
   
   ### Apache Airflow version
   
   main
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   When using `AirbyteTriggerSyncOperator` in deferrable mode with 
`execution_timeout` set, the Airflow task times out but the underlying Airbyte 
job continues running.
   
   In non-deferrable mode, exceeding `execution_timeout` causes both the task 
to fail and the Airbyte job to be cancelled via `on_kill()`. In deferrable 
mode, the task fails due to timeout, but the external Airbyte job is not 
stopped. This can result in long-running or orphaned Airbyte jobs.
   
   This creates inconsistent behavior between deferrable and non-deferrable 
execution modes.
   
   ### What you think should happen instead
   
   `execution_timeout` should be enforced consistently regardless of execution 
mode.
   
   When a deferrable `AirbyteTriggerSyncOperator` exceeds `execution_timeout`:
   
   * The Airflow task should fail due to execution timeout
   * The associated Airbyte job should be cancelled
   
   This ensures predictable timeout behavior and prevents orphaned Airbyte jobs.
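
To make the expected behavior concrete, here is a minimal, self-contained sketch using only the standard library. It does not use the Airbyte provider or reflect how Airflow actually enforces timeouts. `FakeAirbyteJob` and `run_with_timeout` are hypothetical names invented for illustration; the only point is the ordering: when the wait times out, the external job is cancelled before the task is reported as failed.

```python
import asyncio


class FakeAirbyteJob:
    """Hypothetical stand-in for a remote Airbyte job (not a provider class)."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        self.status = "running"

    async def wait_until_done(self) -> None:
        # Simulates waiting on the remote job until it finishes.
        await asyncio.sleep(self.duration)
        self.status = "succeeded"

    def cancel(self) -> None:
        # Simulates the cancellation request that would be sent to Airbyte.
        if self.status == "running":
            self.status = "cancelled"


async def run_with_timeout(job: FakeAirbyteJob, timeout: float) -> str:
    """Wait for the job; on timeout, cancel it before failing the task."""
    try:
        await asyncio.wait_for(job.wait_until_done(), timeout=timeout)
    except asyncio.TimeoutError:
        # This is the step the issue reports as missing in deferrable mode.
        job.cancel()
        return "task failed: execution timeout"
    return "task succeeded"


# A job that outlives the timeout: the task fails and the job is cancelled.
slow_job = FakeAirbyteJob(duration=0.5)
slow_result = asyncio.run(run_with_timeout(slow_job, timeout=0.05))

# A job that finishes in time is left untouched.
fast_job = FakeAirbyteJob(duration=0.01)
fast_result = asyncio.run(run_with_timeout(fast_job, timeout=0.5))
```

In this sketch the cancellation lives next to the timeout handling, so both outcomes listed above (task fails, job cancelled) happen together regardless of which component detects the timeout.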
   
   ### How to reproduce
   
   1. Configure an Airbyte connection in Airflow (`airbyte_default`).
   
   2. Create an Airbyte connection (source → destination) whose sync runs 
longer than ~30 seconds
      (for example, a slow-running sync between two datasets).
   
   3. Use the following DAG (replace `<CONNECTION_ID>` with your Airbyte 
connection ID):
   
   ```python
   from airflow import DAG
   from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator
   from datetime import datetime, timedelta
   
   with DAG(
       dag_id="airbyte_deferrable_execution_timeout_repro",
       start_date=datetime(2024, 1, 1),
       schedule=None,
       catchup=False,
   ) as dag:
   
       run_airbyte = AirbyteTriggerSyncOperator(
           task_id="run_airbyte",
           airbyte_conn_id="airbyte_default",
           connection_id="<CONNECTION_ID>",
           asynchronous=False,
           deferrable=True,
           execution_timeout=timedelta(seconds=30),
       )
   ```
   
   4. Trigger the DAG and wait for the task to exceed `execution_timeout`.
   
   **Observed Behavior**
   
   * The Airflow task fails due to execution timeout
   * The Airbyte job continues running and is not cancelled
   
   
   ### Anything else
   
   This inconsistency makes `execution_timeout` unreliable for deferrable 
Airbyte jobs and can lead to unintended resource usage.
   
   While deferrable operators execute via the triggerer rather than a worker 
process, this does not preclude enforcing `execution_timeout` semantics. The 
expectation is that task-level timeout behavior remains consistent across 
execution modes, even if the underlying enforcement mechanism differs.
   
   This issue is similar to #61467, which reports the same class of bug for 
`DbtCloudRunJobOperator`.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
