seanghaeli opened a new pull request, #68922: URL: https://github.com/apache/airflow/pull/68922
## Why

A `paused` Redshift cluster cannot be deleted. `delete_cluster` raises:

```
InvalidClusterStateFault: There is an operation running on the Cluster. Please try to delete it at a later time.
```

`RedshiftDeleteClusterOperator` already retries this error, but the retry cannot recover a paused cluster: a paused cluster never leaves that state on its own, so every attempt hits the same fault. The retries are exhausted and the cluster is **left behind**, silently leaked until external cleanup reaps it.

This was observed in practice with the `example_redshift` system test: a cluster left in the `paused` phase (e.g. after an upstream task failed before `resume_cluster`) was never deleted by the `delete_cluster` teardown task, and accumulated as a stale resource.

## What

Add `_resume_if_paused()` and call it at the start of `execute()`: if the cluster is `paused`, resume it and wait for `available` before issuing the delete. Clusters in any other state are unaffected (early return), and the existing busy-retry loop for transient `InvalidClusterStateFault` during deletion is unchanged.

## Tests

- `test_delete_paused_cluster_resumes_first`: a paused cluster is resumed, waited on (`cluster_available`), then deleted.
- `test_delete_available_cluster_does_not_resume`: a non-paused cluster is deleted directly, with no spurious resume.
- Existing delete-operator tests (deferrable paths, busy-retry exhaustion) are unchanged and passing.

Verified locally in Breeze: all `TestDeleteClusterOperator` tests pass.

Generated-by: Claude Code (Opus via Claude Code) on behalf of Sean Ghaeli
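For reviewers, the resume-then-delete flow described under "What" can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the functions `resume_if_paused` and `delete_cluster_resuming_first` and the `FakeRedshiftClient` stub are hypothetical names, and the stub only mimics the shape of the boto3 Redshift client calls (`describe_clusters`, `resume_cluster`, `get_waiter("cluster_available")`, `delete_cluster`) so the sketch runs without AWS access. The operator's existing busy-retry loop is left out.

```python
from typing import Any


def resume_if_paused(client: Any, cluster_identifier: str) -> bool:
    """Resume a paused cluster and wait for it; return True if a resume was issued."""
    clusters = client.describe_clusters(ClusterIdentifier=cluster_identifier)["Clusters"]
    if clusters[0]["ClusterStatus"] != "paused":
        return False  # early return: any other state is left untouched
    client.resume_cluster(ClusterIdentifier=cluster_identifier)
    # Block until the cluster reports "available" before the caller deletes it.
    client.get_waiter("cluster_available").wait(ClusterIdentifier=cluster_identifier)
    return True


def delete_cluster_resuming_first(client: Any, cluster_identifier: str) -> None:
    """Hypothetical wrapper: resume if needed, then issue the delete."""
    resume_if_paused(client, cluster_identifier)
    client.delete_cluster(
        ClusterIdentifier=cluster_identifier,
        SkipFinalClusterSnapshot=True,
    )


class FakeRedshiftClient:
    """Stand-in for a boto3 Redshift client (assumption: only these calls matter)."""

    def __init__(self, status: str) -> None:
        self.status = status
        self.calls: list[str] = []

    def describe_clusters(self, ClusterIdentifier: str) -> dict:
        self.calls.append("describe_clusters")
        return {"Clusters": [{"ClusterIdentifier": ClusterIdentifier,
                              "ClusterStatus": self.status}]}

    def resume_cluster(self, ClusterIdentifier: str) -> None:
        self.calls.append("resume_cluster")
        self.status = "resuming"

    def get_waiter(self, name: str) -> "FakeRedshiftClient._Waiter":
        return FakeRedshiftClient._Waiter(self, name)

    def delete_cluster(self, ClusterIdentifier: str, SkipFinalClusterSnapshot: bool) -> None:
        if self.status == "paused":
            # Mimics the fault quoted above; the real client raises a ClientError.
            raise RuntimeError("InvalidClusterStateFault")
        self.calls.append("delete_cluster")

    class _Waiter:
        def __init__(self, client: "FakeRedshiftClient", name: str) -> None:
            self._client, self._name = client, name

        def wait(self, ClusterIdentifier: str) -> None:
            self._client.calls.append(f"wait:{self._name}")
            self._client.status = "available"


if __name__ == "__main__":
    paused = FakeRedshiftClient("paused")
    delete_cluster_resuming_first(paused, "example-cluster")
    print(paused.calls)
```

Against the stub, a paused cluster goes through describe, resume, wait, and delete in that order, while an `available` cluster skips straight from describe to delete, which mirrors the two new test cases.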
