MaksYermak commented on code in PR #45610:
URL: https://github.com/apache/airflow/pull/45610#discussion_r1923378670
##########
providers/src/airflow/providers/google/cloud/operators/bigquery.py:
##########
@@ -1717,12 +1717,14 @@ def execute(self, context: Context) -> None:
"tableId": table_id,
},
"labels": self.labels,
- "schema": {"fields": schema_fields},
"externalDataConfiguration": external_data_configuration,
"location": self.location,
"encryptionConfiguration": self.encryption_configuration,
}
+ if self.schema_fields:
Review Comment:
> Hey @MaksYermak Indeed, it should not reference `self.schema_fields` but
`schema_fields` instead. I'll be happy to add unit test for that, but passing
anything except `table_resource` raises `AirflowProviderDeprecationWarning`. Do
you know if there's a chance to overcome that in CI/CD for particular tests?
>
> ```python
>
@mock.patch("airflow.providers.google.cloud.operators.bigquery.BigQueryHook")
> def test_execute_with_schema_fields_not_set(self, mock_hook:
MagicMock):
> operator = BigQueryCreateExternalTableOperator(
> task_id=TASK_ID,
> bucket=TEST_GCS_BUCKET,
> source_objects=TEST_GCS_CSV_DATA,
> schema_object="gcs://dummy",
> autodetect=True,
>
destination_project_dataset_table=f"{TEST_GCP_PROJECT_ID}.{TEST_DATASET}.{TEST_TABLE_ID}",
> )
>
> mock_hook.return_value.split_tablename.return_value = (
> TEST_GCP_PROJECT_ID,
> TEST_DATASET,
> TEST_TABLE_ID,
> )
>
> with
mock.patch("airflow.providers.google.cloud.operators.bigquery.GCSHook") as
gcs_hook:
> gcs_hook.return_value.download.return_value = b'{"c1": "int"}'
> operator.execute(context=MagicMock())
>
> assert "schema" in
mock_hook.return_value.create_empty_table.call_args.kwargs["table_resource"]
> ```
>
> ```shell
> Launching pytest with arguments --disable-forbidden-warnings
test_bigquery.py::TestBigQueryCreateExternalTableOperator::test_execute_with_schema_fields_not_set
--no-header --no-summary -q in airflow/providers/tests/google/cloud/operators
>
> ============================= test session starts
==============================
> collecting ... collected 1 item
>
>
test_bigquery.py::TestBigQueryCreateExternalTableOperator::test_execute_with_schema_fields_not_set
>
> ========================= 1 passed, 1 warning in 1.53s
=========================
> ```
Hello @EugeneYushin,
I have checked the code for `BigQueryCreateExternalTableOperator` operator
one more time and I noticed that the `autodetect` parameter is located in the
deprecated code. And it's a reason why you couldn't add a new Unit test for
this code. In my opinion it doesn't make sense to update this code, because in
the near future it will be removed.
About your problem with unworked `autodetect`. I think it can work with
non-deprecated configuration. I have updated the code from your example. Could
you please check this code on your environment and share the result?
```python
bq_op = BigQueryCreateExternalTableOperator(
gcp_conn_id="gcp_pp",
location="us-east4",
task_id="bq_op",
table_resource={
"tableReference": {
"projectId": PROJECT_ID,
"datasetId": "user_eyushin",
"tableId": "airflow_test_orig",
},
"externalDataConfiguration": {
"sourceFormat": "CSV",
"autodetect": True,
"compression": "NONE",
"sourceUris": [f"gs://{GCS_BUCKET}/test.csv"],
"csvOptions": {
"skipLeadingRows": 1,
}
},
"location": "us-east4",
},
)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]