frobb opened a new issue, #50766:
URL: https://github.com/apache/airflow/issues/50766

   ### Apache Airflow Provider(s)
   
   amazon
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==9.6.0
   
   ### Apache Airflow version
   
   2.10.1
   
   ### Operating System
   
   MWAA
   
   ### Deployment
   
   Amazon (AWS) MWAA
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   In apache-airflow-providers-amazon==9.6.0, when using RDS operators such as 
RdsStartExportTaskOperator or RdsCreateDbSnapshotOperator, the aws_conn_id 
parameter provided during operator instantiation is not being honored. Instead, 
the operator attempts to use the aws_default connection.
   This behavior appears to be a regression from earlier versions of the 
provider (e.g., 8.x.x series) where the specified aws_conn_id was correctly 
used.
   The issue seems to stem from the RdsBaseOperator's __init__ method (from 
which the other RDS operators inherit). In version 9.6.0 of 
`airflow/providers/amazon/aws/operators/rds.py`, aws_conn_id is declared as an 
explicit keyword parameter in the RdsBaseOperator.__init__ signature:
   
   class RdsBaseOperator(AwsBaseOperator[RdsHook]):
       # ...
       def __init__(
           self,
           *args,
           aws_conn_id: str | None = "aws_conn_id",  # <--- This line
           region_name: str | None = None,
           **kwargs,
       ):
           self.aws_conn_id = aws_conn_id
           self.region_name = region_name
           super().__init__(*args, **kwargs)
           # ...
   
   Because aws_conn_id is an explicit keyword parameter here, any value the 
user passes is bound to that parameter and never enters kwargs. Consequently, 
when super().__init__(*args, **kwargs) is called to initialize the parent 
AwsBaseOperator, aws_conn_id is no longer present in kwargs, so 
AwsBaseOperator falls back to its own default for that parameter, which is 
"aws_default". The hook built by the operator then incorrectly uses this 
default connection.
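   The same mechanism can be reproduced with a minimal, self-contained sketch 
(Parent and Child below are illustrative stand-ins, assuming the parent also 
assigns the attribute from its own parameter; this is not the provider's 
actual code):
   
       class Parent:
           def __init__(self, *, aws_conn_id: str | None = "aws_default", **kwargs):
               # Only sees what is left in kwargs, so it falls back to its own
               # default once the child has consumed aws_conn_id.
               self.aws_conn_id = aws_conn_id
   
       class Child(Parent):
           def __init__(self, *args, aws_conn_id: str | None = "aws_default", **kwargs):
               self.aws_conn_id = aws_conn_id     # the custom value is stored here...
               super().__init__(*args, **kwargs)  # ...but aws_conn_id is absent from
                                                  # kwargs, so Parent resets it
   
       print(Child(aws_conn_id="my_custom_conn").aws_conn_id)  # prints "aws_default"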
   
   ### What you think should happen instead
   
   The RDS operator should use the specific AWS connection ID provided in its 
aws_conn_id parameter. If a user specifies aws_conn_id="my_custom_conn", the 
operator should use the Airflow connection named my_custom_conn, not 
aws_default.
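   One possible shape of a fix (a sketch only, not necessarily the change the 
maintainers would make; it assumes AwsBaseOperator accepts aws_conn_id and 
region_name as keyword arguments and uses "aws_default" as the conventional 
default) would be to forward the captured values to the parent instead of 
dropping them:
   
       def __init__(
           self,
           *args,
           aws_conn_id: str | None = "aws_default",
           region_name: str | None = None,
           **kwargs,
       ):
           # Forward the explicitly captured parameters so the parent
           # AwsBaseOperator resolves the connection the user asked for.
           super().__init__(
               *args,
               aws_conn_id=aws_conn_id,
               region_name=region_name,
               **kwargs,
           )
   
   Alternatively, the overriding __init__ could simply be removed so that 
AwsBaseOperator handles these parameters itself.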
   
   ### How to reproduce
   
   1. Prerequisites:
   Airflow version 2.10
   apache-airflow-providers-amazon==9.6.0 (or any subsequent version where this 
issue persists).
   
   2. Airflow Connections:
   In the Airflow UI, create an AWS connection named my_custom_rds_conn. The 
actual credentials can be placeholders for this reproduction; the existence of 
the named connection is key.
   
   Ensure that there is no Airflow connection named aws_default. 
(Alternatively, if aws_default must exist for other reasons, ensure its 
credentials would obviously fail or be different from my_custom_rds_conn for 
the RDS operation).
   
   3. Minimal DAG:
   Create and run the following DAG:
   
       from __future__ import annotations
   
       import pendulum
   
       from airflow.models.dag import DAG
       from airflow.providers.amazon.aws.operators.rds import RdsCreateDbSnapshotOperator  # or RdsStartExportTaskOperator
   
       with DAG(
           dag_id="rds_aws_conn_id_bug_repro",
           start_date=pendulum.datetime(2025, 5, 19, tz="UTC"),
           catchup=False,
           schedule=None,
           tags=["bug-repro"],
       ) as dag:
           create_snapshot_task = RdsCreateDbSnapshotOperator(
               task_id="create_db_snapshot_bug_test",
               aws_conn_id="my_custom_rds_conn",  # Intended connection
               db_type="instance",              # Placeholder
               db_identifier="my-dummy-db-id",  # Placeholder
               db_snapshot_identifier="my-dummy-snapshot-id" # Placeholder
           )
   
   4. Observe:
   Trigger the DAG.
   The create_db_snapshot_bug_test task is expected to fail (as the DB 
identifier is a dummy).
   Inspect the logs for the failed task instance.
   
   5. Expected vs. Actual Outcome:
   Expected (if the bug were not present): if my_custom_rds_conn were actually 
used, the error might relate to "DB instance my-dummy-db-id not found" or an 
authentication failure, using the (placeholder) credentials from 
my_custom_rds_conn.
   
   Actual (due to the bug): the task fails with an error indicating that the 
connection aws_default could not be found (e.g., 
airflow.exceptions.AirflowNotFoundException: The conn_id aws_default isn't 
defined), or an equivalent Boto3/AWS SDK error if it falls back to the default 
credential chain after failing to find aws_default. This demonstrates that the 
operator ignored aws_conn_id="my_custom_rds_conn" and attempted to use 
aws_default. A quicker check that does not require triggering the DAG is 
sketched below.
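   A sketch of this quicker check (it assumes the resolved value is exposed as 
the operator's aws_conn_id attribute, as in the __init__ shown above):
   
       from airflow.providers.amazon.aws.operators.rds import RdsCreateDbSnapshotOperator
   
       op = RdsCreateDbSnapshotOperator(
           task_id="inspect_conn_id",
           aws_conn_id="my_custom_rds_conn",
           db_type="instance",
           db_identifier="my-dummy-db-id",
           db_snapshot_identifier="my-dummy-snapshot-id",
       )
       # With the bug present this is expected to print "aws_default" rather
       # than "my_custom_rds_conn".
       print(op.aws_conn_id)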
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

