nrobinson-intelycare opened a new issue, #45622:
URL: https://github.com/apache/airflow/issues/45622
### Apache Airflow Provider(s)
amazon
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==8.29.0
apache-airflow-providers-common-compat==1.3.0
apache-airflow-providers-common-io==1.5.0
apache-airflow-providers-common-sql==1.21.0
apache-airflow-providers-fab==1.5.2
apache-airflow-providers-ftp==3.12.0
apache-airflow-providers-http==5.0.0
apache-airflow-providers-imap==3.8.0
apache-airflow-providers-postgres==5.14.0
apache-airflow-providers-sendgrid==3.6.0
apache-airflow-providers-smtp==1.9.0
apache-airflow-providers-snowflake==5.8.1
apache-airflow-providers-sqlite==4.0.0
### Apache Airflow version
2.10.4
### Operating System
Amazon Linux 2023.6.20241212
### Deployment
Virtualenv installation
### Deployment details
Custom CDK stack with:
- EC2 instance running Airflow, managed by systemd
- IAM role granting permissions to AWS services
- RDS instance running Postgres
The Airflow virtualenv is managed by uv.
### What happened
When a DAG runs a deferrable BatchOperator without a connection ID, so that
the hook falls back on the boto3 credential strategy (`{base_aws.py:180} INFO -
No connection ID provided. Fallback on boto3 credential strategy
(region_name='us-east-1')`), the task's trigger can fail immediately after the
batch job is submitted.
Although the trigger fails immediately, the batch job has already launched
successfully and runs to a successful exit, unbeknownst to Airflow.
Given the DAG's schedule, the failed task's batch job has not yet overlapped
with a subsequent task run, but overlapping runs would be undesirable.
This error happens about once a week. I believe it has something to do with
amazon-ssm-agent not rotating the credentials quickly enough.
### What you think should happen instead
`async_wait()` should catch the `NoCredentialsError` and continue to the
next waiter attempt.
https://github.com/apache/airflow/blob/main/providers/src/airflow/providers/amazon/aws/utils/waiter_with_logging.py#L133
### How to reproduce
Hard to reproduce, but invalidating AWS credentials right before the trigger
initializes would likely produce a similar traceback.
### Anything else
Traceback from task log:
```
[2025-01-10, 20:00:19 EST] {baseoperator.py:1806} ERROR - Trigger failed:
Traceback (most recent call last):
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 558, in cleanup_finished_triggers
    result = details["task"].result()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 630, in run_trigger
    async for event in trigger.run():
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/providers/amazon/aws/triggers/base.py", line 143, in run
    await async_wait(
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/providers/amazon/aws/utils/waiter_with_logging.py", line 133, in async_wait
    await waiter.wait(**args, WaiterConfig={"MaxAttempts": 1})
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/waiter.py", line 49, in wait
    return await AIOWaiter.wait(self, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/waiter.py", line 95, in wait
    response = await self._operation_method(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/waiter.py", line 78, in __call__
    return await self._client_method(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/client.py", line 394, in _make_api_call
    http, parsed_response = await self._make_request(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/client.py", line 420, in _make_request
    return await self._endpoint.make_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/endpoint.py", line 96, in _send_request
    request = await self.create_request(request_dict, operation_model)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/endpoint.py", line 84, in create_request
    await self._event_emitter.emit(
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/hooks.py", line 68, in _emit
    response = await resolve_awaitable(handler(**kwargs))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/_helpers.py", line 6, in resolve_awaitable
    return await obj
           ^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/signers.py", line 24, in handler
    return await self.sign(operation_name, request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/signers.py", line 90, in sign
    auth.add_auth(request)
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/botocore/auth.py", line 423, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
[2025-01-10, 20:00:19 EST] {taskinstance.py:3311} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 767, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 733, in _execute_callable
    return ExecutionCallableRunner(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 1807, in resume_execution
    raise TaskDeferralError(next_kwargs.get("error", "Unknown"))
airflow.exceptions.TaskDeferralError: Trigger failure
```
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)