SameerMesiah97 opened a new issue, #64048:
URL: https://github.com/apache/airflow/issues/64048
### Apache Airflow Provider(s)
airbyte
### Versions of Apache Airflow Providers
`apache-airflow-providers-airbyte==5.3.3`
### Apache Airflow version
main
### Operating System
Debian GNU/Linux 12 (bookworm)
### Deployment
Other
### Deployment details
_No response_
### What happened
When using `AirbyteTriggerSyncOperator` in deferrable mode with
`execution_timeout` set, the Airflow task times out but the underlying Airbyte
job continues running.
In non-deferrable mode, exceeding `execution_timeout` causes both the task
to fail and the Airbyte job to be cancelled via `on_kill()`. In deferrable
mode, the task fails due to timeout, but the external Airbyte job is not
stopped. This can result in long-running or orphaned Airbyte jobs.
This creates inconsistent behavior between deferrable and non-deferrable
execution modes.
### What you think should happen instead
`execution_timeout` should be enforced consistently regardless of execution
mode.
When a deferrable `AirbyteTriggerSyncOperator` exceeds `execution_timeout`:
* The Airflow task should fail due to execution timeout
* The associated Airbyte job should be cancelled
This ensures predictable timeout behavior and prevents orphaned Airbyte jobs.
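To illustrate the expected semantics, here is a minimal, stdlib-only sketch of the "cancel on timeout" pattern. It is not the provider's implementation and does not use Airflow or Airbyte APIs: `FakeAirbyteClient`, `wait_for_job`, `run_with_timeout`, and `demo` are hypothetical names invented for this example, and the in-memory status dictionary merely stands in for a real Airbyte job.

```python
import asyncio


class FakeAirbyteClient:
    """Hypothetical in-memory stand-in for an Airbyte client (not the real hook)."""

    def __init__(self) -> None:
        self.status: dict[int, str] = {}

    def start_job(self, job_id: int) -> None:
        self.status[job_id] = "running"

    def cancel_job(self, job_id: int) -> None:
        self.status[job_id] = "cancelled"


async def wait_for_job(client: FakeAirbyteClient, job_id: int, poll_interval: float) -> str:
    """Poll until the job leaves the 'running' state, then return its status."""
    while client.status[job_id] == "running":
        await asyncio.sleep(poll_interval)
    return client.status[job_id]


async def run_with_timeout(
    client: FakeAirbyteClient,
    job_id: int,
    timeout: float,
    poll_interval: float = 0.01,
) -> str:
    """Wait for the job; on timeout, cancel the external job and re-raise."""
    try:
        return await asyncio.wait_for(
            wait_for_job(client, job_id, poll_interval), timeout
        )
    except asyncio.TimeoutError:
        # The task still fails, but the external job is no longer orphaned.
        client.cancel_job(job_id)
        raise


def demo() -> tuple[str, str]:
    """Start a job that never finishes and report what happened to it."""
    client = FakeAirbyteClient()
    client.start_job(1)
    try:
        asyncio.run(run_with_timeout(client, 1, timeout=0.05))
    except asyncio.TimeoutError:
        return "timed out", client.status[1]
    return "finished", client.status[1]
```

In this sketch the job never completes on its own, so `demo()` reports a timeout and a cancelled job. How the real operator and trigger would hook into `execution_timeout` is a separate design question that this example does not answer.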
### How to reproduce
1. Configure an Airbyte connection in Airflow (`airbyte_default`).
2. Create an Airbyte connection (source → destination) whose sync runs longer
than ~30 seconds (for example, a slow-running sync between two datasets).
3. Use the following DAG (replace `<CONNECTION_ID>` with your Airbyte
connection ID):
```python
from airflow import DAG
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator
from datetime import datetime, timedelta

with DAG(
    dag_id="airbyte_deferrable_execution_timeout_repro",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    run_airbyte = AirbyteTriggerSyncOperator(
        task_id="run_airbyte",
        airbyte_conn_id="airbyte_default",
        connection_id="<CONNECTION_ID>",
        asynchronous=False,
        deferrable=True,  # wait for the sync via the triggerer
        execution_timeout=timedelta(seconds=30),  # shorter than the sync duration
    )
```
4. Trigger the DAG and wait for the task to exceed `execution_timeout`.
**Observed Behavior**
* The Airflow task fails due to execution timeout
* The Airbyte job continues running and is not cancelled
### Anything else
This inconsistency makes `execution_timeout` unreliable for deferrable
Airbyte jobs and can lead to unintended resource usage.
While deferrable operators execute via the triggerer rather than a worker
process, this does not preclude enforcing `execution_timeout` semantics. The
expectation is that task-level timeout behavior remains consistent across
execution modes, even if the underlying enforcement mechanism differs.
This issue is similar to #61467, which reports the same class of bug for
`DbtCloudRunJobOperator`.
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)