This is an automated email from the ASF dual-hosted git repository.
vincbeck pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new fc6c9844d28 fix(tests): Add retries to EMR on EKS system test job submission (#68257)
fc6c9844d28 is described below
commit fc6c9844d2856858c0eedd89c76b35e8b1087834
Author: D. Ferruzzi <[email protected]>
AuthorDate: Tue Jun 9 08:00:46 2026 -0700
fix(tests): Add retries to EMR on EKS system test job submission (#68257)
Switch EmrContainerOperator to wait_for_completion=True with 2 task
retries and a 2-minute retry delay, replacing the previous
job_retry_max_attempts=5 setting. This handles the intermittent failure
where the EKS OIDC webhook hasn't propagated by the time the EMR Spark
driver pod is scheduled, causing the job to fail on the first attempt.
The sensor is retained for documentation purposes but will pass
immediately, since the operator now waits for completion.
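
As a rough illustration of the retry behaviour described above (not
Airflow's actual scheduling code), the sketch below re-runs a failing
submission after a fixed delay. The names run_with_retries and flaky_job
are hypothetical, and the sleep function is injectable so the example
finishes instantly.

```python
import time
from datetime import timedelta


def run_with_retries(submit_job, retries=2, retry_delay=timedelta(minutes=2), sleep=time.sleep):
    """Call submit_job until it succeeds or the retry budget is used up.

    retries=2 means up to three attempts in total: the first try plus two retries.
    """
    total_attempts = retries + 1
    for attempt in range(1, total_attempts + 1):
        try:
            return submit_job()
        except RuntimeError:
            if attempt == total_attempts:
                raise  # retry budget exhausted, surface the failure
            sleep(retry_delay.total_seconds())


# Hypothetical job that fails once (webhook not ready yet) and then succeeds.
calls = []


def flaky_job():
    calls.append(1)
    if len(calls) == 1:
        raise RuntimeError("OIDC webhook not ready")
    return "COMPLETED"


delays = []  # record requested delays instead of actually sleeping
result = run_with_retries(flaky_job, sleep=delays.append)
```

Here the first attempt fails, one 120-second delay is requested, and the
second attempt succeeds, which is the scenario the retries are meant to
cover.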
---
providers/amazon/tests/system/amazon/aws/example_emr_eks.py | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/providers/amazon/tests/system/amazon/aws/example_emr_eks.py b/providers/amazon/tests/system/amazon/aws/example_emr_eks.py
index 82f0a38a564..56af4e1f931 100644
--- a/providers/amazon/tests/system/amazon/aws/example_emr_eks.py
+++ b/providers/amazon/tests/system/amazon/aws/example_emr_eks.py
@@ -20,7 +20,7 @@ import json
import logging
import subprocess
import time
-from datetime import datetime
+from datetime import datetime, timedelta
import boto3
from tenacity import retry, retry_if_exception_type, stop_after_delay, wait_exponential
@@ -338,8 +338,9 @@ with DAG(
name="pi.py",
)
# [END howto_operator_emr_container]
- job_starter.wait_for_completion = False
- job_starter.job_retry_max_attempts = 5
+ job_starter.wait_for_completion = True
+ job_starter.retries = 2
+ job_starter.retry_delay = timedelta(minutes=2)
# [START howto_sensor_emr_container]
job_waiter = EmrContainerSensor(