shahar1 opened a new issue, #36484: URL: https://github.com/apache/airflow/issues/36484
### Apache Airflow version

main (development)

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

### Introduction

PR #33786 aims to automatically prevent templated-field logic in the `__init__` method of operators by introducing a new pre-commit check that validates each operator (see the added limitations in the PR).
In short, for templated fields, there should be only an assignment operation in the constructor (i.e., `self.field = field`), and the name of the assigned field should be the same as the assigned parameter (i.e., the following is invalid: `self.field = something_else`).
Before merging this PR, we need to fix, in separate PRs, the existing operators that don't follow the new limitations.

The following list groups the currently invalid operators by file path (plus details of what should be fixed). It was manually scraped on Dec. 12, 2023 from the PR's output of currently invalid operators' constructors.
Please feel free to create PRs to fix any of the files.

### Tasks list

- [ ] `airflow/providers/apache/hive/transfers/hive_to_samba.py`
```
HiveToSambaOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['hql']
HiveToSambaOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.hql']
```
- [ ] `airflow/providers/google/cloud/operators/dataproc.py`
```
DataprocSubmitPigJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['impersonation_chain', 'dataproc_jars', 'cluster_name', 'dataproc_properties', 'region', 'job_name']
DataprocSubmitHiveJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['impersonation_chain', 'region', 'job_name', 'cluster_name', 'dataproc_jars', 'dataproc_properties']
DataprocSubmitSparkSqlJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['impersonation_chain', 'region', 'job_name', 'cluster_name', 'dataproc_jars', 'dataproc_properties']
DataprocSubmitSparkJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['impersonation_chain', 'region', 'job_name', 'cluster_name', 'dataproc_jars', 'dataproc_properties']
DataprocSubmitHadoopJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['impersonation_chain', 'region', 'job_name', 'cluster_name', 'dataproc_jars', 'dataproc_properties']
DataprocSubmitPySparkJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['impersonation_chain', 'region', 'job_name', 'cluster_name', 'dataproc_properties', 'dataproc_jars']
```
- [ ] `airflow/providers/apache/livy/operators/livy.py`
```
LivyOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['spark_params']
LivyOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.spark_params']
```
- [ ] `airflow/providers/google/cloud/operators/bigquery.py`
```
BigQueryGetDataOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['max_results']
BigQueryGetDataOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.max_results']
BigQueryCreateExternalTableOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['gcs_schema_bucket', 'table_resource', 'destination_project_dataset_table', 'source_objects', 'schema_object', 'bucket']
```
- [ ] `airflow/providers/google/cloud/transfers/bigquery_to_postgres.py`
```
BigQueryToPostgresOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['table_id', 'dataset_id']
```
- [ ] `airflow/providers/alibaba/cloud/operators/analyticdb_spark.py`
```
AnalyticDBSparkSQLOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['spark_params']
AnalyticDBSparkSQLOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.spark_params']
AnalyticDBSparkBatchOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['spark_params']
AnalyticDBSparkBatchOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.spark_params']
```
- [ ] `airflow/providers/microsoft/azure/operators/container_instances.py`
```
AzureContainerInstancesOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['name']
AzureContainerInstancesOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.name']
```
- [ ] `airflow/providers/databricks/operators/databricks.py`
```
DatabricksCreateJobsOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.json']
DatabricksRunNowOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.json']
```
- [ ] `airflow/providers/google/cloud/transfers/bigquery_to_mysql.py`
```
BigQueryToMySqlOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['dataset_id', 'table_id']
```
- [ ] `airflow/providers/amazon/aws/operators/emr.py`
```
EmrCreateJobFlowOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['waiter_max_attempts', 'waiter_delay']
EmrCreateJobFlowOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.waiter_max_attempts', 'self.waiter_delay']
```
- [ ] `airflow/providers/papermill/operators/papermill.py`
```
PapermillOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['input_nb', 'output_nb']
```
- [ ] `airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py`
```
CloudDataTransferServiceCreateJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['body']
CloudDataTransferServiceCreateJobOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.body']
```
- [ ] `airflow/providers/google/cloud/operators/vertex_ai/auto_ml.py`
```
CreateAutoMLForecastingTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['region', 'impersonation_chain', 'parent_model']
CreateAutoMLImageTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['region', 'impersonation_chain', 'parent_model']
CreateAutoMLTabularTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['region', 'impersonation_chain', 'parent_model']
CreateAutoMLVideoTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['region', 'impersonation_chain', 'parent_model']
```
- [ ] `airflow/providers/google/cloud/transfers/sftp_to_gcs.py`
```
SFTPToGCSOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['destination_path', 'destination_bucket']
SFTPToGCSOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.destination_path', 'self.destination_bucket']
```
- [ ] `airflow/providers/amazon/aws/transfers/base.py`
```
AwsToAwsBaseOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['source_aws_conn_id', 'dest_aws_conn_id']
AwsToAwsBaseOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.dest_aws_conn_id']
```
- [ ] `airflow/providers/amazon/aws/operators/datasync.py`
```
DataSyncOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['create_source_location_kwargs', 'create_destination_location_kwargs']
DataSyncOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.create_source_location_kwargs', 'self.create_destination_location_kwargs']
```
- [ ] `airflow/providers/weaviate/operators/weaviate.py`
```
WeaviateIngestOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['input_data']
```
- [ ] `airflow/providers/amazon/aws/transfers/gcs_to_s3.py`
```
GCSToS3Operator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['gcs_bucket']
```
- [ ] `airflow/providers/amazon/aws/transfers/redshift_to_s3.py`
```
RedshiftToS3Operator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['unload_options', 'select_query', 's3_key']
RedshiftToS3Operator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.s3_key', 'self.unload_options']
```
- [ ] `airflow/providers/google/cloud/operators/compute.py`
```
ComputeEngineInsertInstanceOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.resource_id', 'self.resource_id']
ComputeEngineInsertInstanceFromTemplateOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.resource_id', 'self.resource_id']
ComputeEngineInsertInstanceTemplateOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.resource_id', 'self.resource_id']
ComputeEngineInstanceGroupUpdateManagerTemplateOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.zone']
ComputeEngineInsertInstanceGroupManagerOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.resource_id', 'self.resource_id']
```
- [ ] `airflow/providers/cncf/kubernetes/operators/pod.py`
```
KubernetesPodOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['volume_mounts', 'env_vars', 'volumes']
KubernetesPodOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.env_vars', 'self.volume_mounts', 'self.volumes']
```
- [ ] `airflow/providers/amazon/aws/operators/eks.py`
```
EksCreateClusterOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['wait_for_completion']
EksCreateClusterOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.wait_for_completion']
EksCreateNodegroupOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['wait_for_completion']
EksCreateNodegroupOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.wait_for_completion']
EksCreateFargateProfileOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['wait_for_completion']
EksCreateFargateProfileOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.wait_for_completion']
EksDeleteClusterOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['wait_for_completion']
EksDeleteClusterOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.wait_for_completion']
```
- [ ] `airflow/providers/apache/hive/operators/hive_stats.py`
```
HiveStatsCollectionOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['dttm', 'ds']
HiveStatsCollectionOperator's constructor contains invalid assignments to the following instance members that should be corresponding to template fields (i.e., self.field_name = field_name): ['self.ds', 'self.dttm']
```
- [ ] `airflow/providers/google/cloud/operators/vertex_ai/custom_job.py`
```
CreateCustomContainerTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['dataset_id', 'region', 'parent_model', 'impersonation_chain']
CreateCustomPythonPackageTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['dataset_id', 'region', 'parent_model', 'impersonation_chain']
CreateCustomTrainingJobOperator's constructor lacks direct assignments for instance members corresponding to the following template fields (i.e., self.field_name = field_name or super.__init__(field_name=field_name, ...) ): ['dataset_id', 'region', 'parent_model', 'impersonation_chain']
```

### What you think should happen instead?

_No response_

### How to reproduce

N/A

### Operating System

N/A

### Versions of Apache Airflow Providers

_No response_

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
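
### Illustration of the rule (sketch)

To make the rule from the Introduction concrete, here is a minimal sketch of how such a check could be written with Python's `ast` module. It is not the pre-commit check from PR #33786, whose implementation is not shown in this issue. The class names `ValidOperator` and `InvalidOperator` and the helpers `check_class` and `check_source` are hypothetical, and the example classes are plain Python stand-ins rather than real Airflow operators.

```python
from __future__ import annotations

import ast

# Hypothetical operator-like classes, kept as source text so they can be parsed.
SOURCE = '''
class ValidOperator:
    template_fields = ("hql", "region")

    def __init__(self, hql, region, **kwargs):
        super().__init__(region=region, **kwargs)  # forwarded under the same name
        self.hql = hql                             # plain same-name assignment


class InvalidOperator:
    template_fields = ("hql",)

    def __init__(self, hql, **kwargs):
        super().__init__(**kwargs)
        self.hql = hql.strip()  # logic applied to a templated field
'''


def _template_fields(cls: ast.ClassDef) -> list[str]:
    """Read a literal `template_fields = (...)` assignment from the class body."""
    for node in cls.body:
        if isinstance(node, ast.Assign) and any(
            isinstance(t, ast.Name) and t.id == "template_fields" for t in node.targets
        ):
            return list(ast.literal_eval(node.value))
    return []


def check_class(cls: ast.ClassDef) -> dict[str, list[str]]:
    """Report template fields lacking a direct assignment, and invalid assignments."""
    fields = _template_fields(cls)
    init = next(
        (n for n in cls.body if isinstance(n, ast.FunctionDef) and n.name == "__init__"),
        None,
    )
    direct: set[str] = set()
    invalid: list[str] = []
    for node in ast.walk(init) if init is not None else []:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                is_self_field = (
                    isinstance(target, ast.Attribute)
                    and isinstance(target.value, ast.Name)
                    and target.value.id == "self"
                    and target.attr in fields
                )
                if not is_self_field:
                    continue
                # Valid only when the right-hand side is the bare, same-named parameter.
                if isinstance(node.value, ast.Name) and node.value.id == target.attr:
                    direct.add(target.attr)
                else:
                    invalid.append(f"self.{target.attr}")
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "__init__"
        ):
            # Any `<something>.__init__(field=field)` call counts as forwarding.
            for kw in node.keywords:
                if kw.arg in fields and isinstance(kw.value, ast.Name) and kw.value.id == kw.arg:
                    direct.add(kw.arg)
    missing = [f for f in fields if f not in direct]
    return {"missing": missing, "invalid": invalid}


def check_source(source: str) -> dict[str, dict[str, list[str]]]:
    """Run the check on every top-level class in the given source text."""
    tree = ast.parse(source)
    return {n.name: check_class(n) for n in tree.body if isinstance(n, ast.ClassDef)}


RESULTS = check_source(SOURCE)
```

In this sketch, `InvalidOperator` is reported both as missing a direct assignment for `hql` and as having an invalid assignment to `self.hql`, which mirrors the paired messages listed for operators such as `HiveToSambaOperator` above. A typical fix would keep `self.hql = hql` in the constructor and move the transformation into `execute`, after templates have been rendered.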