Kaxil Naik created AIRFLOW-6891:
-----------------------------------

             Summary: GCS to BQ operator fails when JSON is the source format
                 Key: AIRFLOW-6891
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6891
             Project: Apache Airflow
          Issue Type: Bug
          Components: gcp
    Affects Versions: 1.10.9
            Reporter: Kaxil Naik
            Assignee: Kaxil Naik
From https://stackoverflow.com/questions/60358764/airflow-gcs-to-bq-operator-fails-when-json-is-the-source-format

I have a GoogleCloudStorageToBigQueryOperator running in an Airflow DAG. It works perfectly with CSV files, but when I try to ingest a JSON file I get errors like:

*skipLeadingRows* is not a valid src_fmt_configs for type *NEWLINE_DELIMITED_JSON*

The strange thing is that I am not setting *skipLeadingRows* anywhere in my call:

{noformat}
load_Users_to_GBQ = GoogleCloudStorageToBigQueryOperator(
    task_id='Table1_GCS_to_GBQ',
    bucket='bucket1',
    source_objects=['table*.json'],
    source_format='NEWLINE_DELIMITED_JSON',
    destination_project_dataset_table='DB.table1',
    autodetect=False,
    schema_fields=[
        {'name': 'fieldid', 'type': 'integer', 'mode': 'NULLABLE'},
        {'name': 'filed2', 'type': 'integer', 'mode': 'NULLABLE'},
        {'name': 'field3', 'type': 'string', 'mode': 'NULLABLE'},
        {'name': 'field4', 'type': 'string', 'mode': 'NULLABLE'},
        {'name': 'field5', 'type': 'string', 'mode': 'NULLABLE'}
    ],
    write_disposition='WRITE_TRUNCATE',
    google_cloud_storage_conn_id='Conn1',
    bigquery_conn_id='Conn1',
    dag=dag)
{noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
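A minimal sketch of how this error can be triggered even though the DAG never passes skipLeadingRows: if the hook merges a default such as skipLeadingRows into src_fmt_configs before validating the keys against the allow-list for the chosen source format, a JSON load fails validation on a key the user never set. The names ALLOWED_CONFIGS and validate_src_fmt_configs below are illustrative, not Airflow's exact internals.

```python
# Allowed load options per source format (illustrative subset).
ALLOWED_CONFIGS = {
    'CSV': ['skipLeadingRows', 'fieldDelimiter', 'allowJaggedRows'],
    'NEWLINE_DELIMITED_JSON': ['autodetect', 'ignoreUnknownValues'],
}


def validate_src_fmt_configs(source_format, src_fmt_configs):
    """Raise ValueError if any key is not valid for the source format."""
    valid = ALLOWED_CONFIGS[source_format]
    for key in src_fmt_configs:
        if key not in valid:
            raise ValueError(
                '{} is not a valid src_fmt_configs for type {}.'.format(
                    key, source_format))
    return src_fmt_configs


# Buggy flow: defaults (including skipLeadingRows) are merged in
# unconditionally, so the JSON load fails validation.
defaults = {'skipLeadingRows': 0}
user_configs = {}  # the DAG above passes no src_fmt_configs at all
merged = {**defaults, **user_configs}

try:
    validate_src_fmt_configs('NEWLINE_DELIMITED_JSON', merged)
except ValueError as err:
    print(err)
```

Under this reading, the fix is to only merge a default into src_fmt_configs when it is actually valid for the requested source format (or when the user supplied it), rather than always.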