MaksYermak commented on code in PR #45610:
URL: https://github.com/apache/airflow/pull/45610#discussion_r1923378670


##########
providers/src/airflow/providers/google/cloud/operators/bigquery.py:
##########
@@ -1717,12 +1717,14 @@ def execute(self, context: Context) -> None:
                 "tableId": table_id,
             },
             "labels": self.labels,
-            "schema": {"fields": schema_fields},
             "externalDataConfiguration": external_data_configuration,
             "location": self.location,
             "encryptionConfiguration": self.encryption_configuration,
         }
 
+        if self.schema_fields:

Review Comment:
   > Hey @MaksYermak Indeed, it should not reference `self.schema_fields` but 
`schema_fields` instead. I'll be happy to add unit test for that, but passing 
anything except `table_resource` raises `AirflowProviderDeprecationWarning`. Do 
you know if there's a chance to overcome that in CI/CD for particular tests?
   > 
   > ```python
   >     
@mock.patch("airflow.providers.google.cloud.operators.bigquery.BigQueryHook")
   >     def test_execute_with_schema_fields_not_set(self, mock_hook: 
MagicMock):
   >         operator = BigQueryCreateExternalTableOperator(
   >             task_id=TASK_ID,
   >             bucket=TEST_GCS_BUCKET,
   >             source_objects=TEST_GCS_CSV_DATA,
   >             schema_object="gcs://dummy",
   >             autodetect=True,
   >             
destination_project_dataset_table=f"{TEST_GCP_PROJECT_ID}.{TEST_DATASET}.{TEST_TABLE_ID}",
   >         )
   > 
   >         mock_hook.return_value.split_tablename.return_value = (
   >             TEST_GCP_PROJECT_ID,
   >             TEST_DATASET,
   >             TEST_TABLE_ID,
   >         )
   > 
   >         with 
mock.patch("airflow.providers.google.cloud.operators.bigquery.GCSHook") as 
gcs_hook:
   >             gcs_hook.return_value.download.return_value = b'{"c1": "int"}'
   >             operator.execute(context=MagicMock())
   > 
   >         assert "schema" in 
mock_hook.return_value.create_empty_table.call_args.kwargs["table_resource"]
   > ```
   > 
   > ```shell
   > Launching pytest with arguments --disable-forbidden-warnings 
test_bigquery.py::TestBigQueryCreateExternalTableOperator::test_execute_with_schema_fields_not_set
 --no-header --no-summary -q in airflow/providers/tests/google/cloud/operators
   > 
   > ============================= test session starts 
==============================
   > collecting ... collected 1 item
   > 
   > 
test_bigquery.py::TestBigQueryCreateExternalTableOperator::test_execute_with_schema_fields_not_set
 
   > 
   > ========================= 1 passed, 1 warning in 1.53s 
=========================
   > ```
   
   Hello @EugeneYushin,
   
   I have checked the code for `BigQueryCreateExternalTableOperator` operator 
one more time and I noticed that the `autodetect` parameter is located in the 
deprecated code. And it's a reason why you couldn't add a new Unit test for 
this code. In my opinion it doesn't make sense to update this code, because in 
the near future it will be  removed.
   
   About your problem with unworked `autodetect`. I think it can work with 
non-deprecated configuration. I have updated the code from your example. Could 
you please check this code on your environment and share the result?
   ```python
       bq_op = BigQueryCreateExternalTableOperator(
           gcp_conn_id="gcp_pp",
           location="us-east4",
           task_id="bq_op",
           table_resource={
               "tableReference": {
                   "projectId": PROJECT_ID,
                   "datasetId": "user_eyushin",
                   "tableId": "airflow_test_orig",
               },
               "externalDataConfiguration": {
                   "sourceFormat": "CSV",
                   "autodetect": True,
                   "compression": "NONE",
                   "sourceUris": [f"gs://{GCS_BUCKET}/test.csv"],
                   "csvOptions": {
                       "skipLeadingRows": 1,
                   }
               },
               "location": "us-east4",
           },
       )
   ```
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to