Kaxil Naik created AIRFLOW-6891:
-----------------------------------

             Summary: GCS to BQ operator fails when JSON is the source format
                 Key: AIRFLOW-6891
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6891
             Project: Apache Airflow
          Issue Type: Bug
          Components: gcp
    Affects Versions: 1.10.9
            Reporter: Kaxil Naik
            Assignee: Kaxil Naik


From https://stackoverflow.com/questions/60358764/airflow-gcs-to-bq-operator-fails-when-json-is-the-source-format

I have a GoogleCloudStorageToBigQueryOperator task running in an Airflow DAG.
It works perfectly with CSV files. I am now trying to ingest a JSON file, and
I am receiving an error like this:

*skipLeadingRows* is not a valid src_fmt_configs for type
*NEWLINE_DELIMITED_JSON*

The odd thing is that I am not passing *skipLeadingRows* anywhere in my call,
shown below:

 
{noformat}
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator

load_Users_to_GBQ = GoogleCloudStorageToBigQueryOperator(
    task_id='Table1_GCS_to_GBQ',
    bucket='bucket1',
    source_objects=['table*.json'],
    source_format='NEWLINE_DELIMITED_JSON',
    destination_project_dataset_table='DB.table1',
    autodetect=False,
    schema_fields=[
        {'name': 'fieldid', 'type': 'integer', 'mode': 'NULLABLE'},
        {'name': 'filed2', 'type': 'integer', 'mode': 'NULLABLE'},
        {'name': 'field3', 'type': 'string', 'mode': 'NULLABLE'},
        {'name': 'field4', 'type': 'string', 'mode': 'NULLABLE'},
        {'name': 'field5', 'type': 'string', 'mode': 'NULLABLE'}
    ],
    write_disposition='WRITE_TRUNCATE',
    google_cloud_storage_conn_id='Conn1',
    bigquery_conn_id='Conn1',
    dag=dag)
{noformat}
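
One way an error like this could arise without the caller ever passing *skipLeadingRows* is if CSV-oriented defaults are merged into src_fmt_configs before the options are validated against the source format. The sketch below illustrates that hypothesis only; it is not the Airflow code. The names `VALID_CONFIGS`, `CSV_DEFAULTS`, `validate_src_fmt_configs` and `validate_filtered`, and the option lists themselves, are assumptions made for the example.

```python
# Hypothetical sketch, NOT the Airflow implementation.
# Illustrative (assumed) subset of options accepted per source format.
VALID_CONFIGS = {
    "CSV": {"skipLeadingRows", "fieldDelimiter", "quote", "encoding"},
    "NEWLINE_DELIMITED_JSON": {"autodetect", "ignoreUnknownValues"},
}

# Assumed defaults that only make sense for CSV input.
CSV_DEFAULTS = {"skipLeadingRows": 0, "fieldDelimiter": ","}


def validate_src_fmt_configs(source_format, src_fmt_configs, defaults):
    """Merge defaults first, then validate: rejects JSON loads."""
    merged = dict(src_fmt_configs)
    for key, value in defaults.items():
        merged.setdefault(key, value)
    allowed = VALID_CONFIGS[source_format]
    for key in merged:
        if key not in allowed:
            raise ValueError(
                "{0} is not a valid src_fmt_configs for type {1}".format(
                    key, source_format
                )
            )
    return merged


def validate_filtered(source_format, src_fmt_configs, defaults):
    """Possible remedy: only merge defaults that the format accepts."""
    allowed = VALID_CONFIGS[source_format]
    usable = {k: v for k, v in defaults.items() if k in allowed}
    return validate_src_fmt_configs(source_format, src_fmt_configs, usable)


if __name__ == "__main__":
    try:
        validate_src_fmt_configs("NEWLINE_DELIMITED_JSON", {}, CSV_DEFAULTS)
    except ValueError as exc:
        print(exc)
    print(validate_filtered("NEWLINE_DELIMITED_JSON", {}, CSV_DEFAULTS))
```

Under these assumptions, the first call fails even though the caller supplied an empty src_fmt_configs, while the filtered variant accepts the same input.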




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
