SameerMesiah97 opened a new pull request, #66815:
URL: https://github.com/apache/airflow/pull/66815

   **Description**
   
   This change introduces deferrable execution support for 
`AzureBatchOperator`. Previously, the operator only supported synchronous 
execution, where a worker process continuously polled Azure Batch until all 
tasks in the job reached a terminal state. This update allows the operator to 
defer while waiting for Batch job completion.
   
   When `deferrable=True`, the operator creates the pool, waits for nodes to 
become ready, creates the job and task, and then defers execution to a new 
`AzureBatchTrigger`. The trigger polls the Batch job state and emits a 
completion event when all tasks reach a terminal state or when the configured 
timeout is exceeded. The operator resumes in `execute_complete()` to log 
success or raise an exception if the Batch job failed or timed out.
   
   The implementation preserves the existing synchronous pool provisioning 
behavior while offloading long-running Batch job monitoring to the triggerer 
process.
   
   **Rationale**
   
   Azure Batch jobs can run for extended periods depending on workload size, VM 
provisioning time, and task execution duration. In synchronous mode the 
operator continuously polls Batch job state from a worker process, which can 
block worker slots for the duration of the run. Deferrable execution offloads 
this polling to the triggerer process, allowing the task to suspend while 
waiting for completion.
   
   **Pool provisioning and node readiness checks remain synchronous. While node 
provisioning may take several minutes, it is typically short relative to 
overall Batch job execution time. Fully deferring pool provisioning would 
require additional trigger complexity for relatively limited worker-slot 
savings.**
   
   This change also maintains consistency with other long-running operators in 
the Azure provider which support deferrable execution semantics. Adding 
deferrable support to `AzureBatchOperator` ensures that long-running Azure 
Batch workloads can be monitored efficiently without unnecessarily consuming 
worker resources.
   
   **Tests**
   
   Added operator tests verifying that:
   
   * The operator defers execution when `deferrable=True`.
   * `execute_complete()` logs success when the trigger reports successful 
completion.
   * `execute_complete()` raises `RuntimeError` when the trigger reports 
timeout or failure conditions.
   * `execute_complete()` raises `RuntimeError` when no event or an unexpected 
event payload is received.
   
   Added trigger tests verifying that:
   
   * The trigger serializes correctly with the expected classpath and arguments.
   * `_build_trigger_event()` returns the correct event for successful and 
failed Batch task states and `None` for non-terminal states.
   * The trigger emits success and failure events when terminal Batch task 
states are reached during polling.
   * Timeout conditions emit timeout events correctly.
   * Exceptions raised during polling result in an error event.
   
   **Example DAGs**
   
   An example DAG has been added demonstrating `AzureBatchOperator` running in 
deferrable mode.
   
   **Documentation**
   
   Docstrings for `AzureBatchOperator` and `AzureBatchTrigger` have been 
updated to document deferrable execution, trigger behavior, and the 
`deferrable` and `poll_interval` parameters.
   
   **Backwards Compatibility**
   
   Deferrable execution is opt-in and controlled by the `deferrable` parameter, 
which defaults to `False`. Existing DAGs continue to use synchronous execution 
unless `deferrable=True` is explicitly enabled.
   
   **Acknowledgements**
   
   This change builds on the earlier deferrable implementation work explored in 
PR #59798 by @Ankurdeewan.
   
   Closes: #59779


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to