[ 
https://issues.apache.org/jira/browse/AIRFLOW-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637345#comment-15637345
 ] 

Giovanni Briggs commented on AIRFLOW-611:
-----------------------------------------

Alright then.  Wanted to make sure I wasn't creating more work than it was 
worth!

> BigQuery Hooks and Operators "source_format" error
> --------------------------------------------------
>
>                 Key: AIRFLOW-611
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-611
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: gcp
>            Reporter: Giovanni Briggs
>            Priority: Minor
>
> Found an issue with the *source_format* parameter for the 
> GoogleCloudStorageToBigQueryOperator.
> I was trying to upload a JSON file from GCS to BQ and was using the value 
> *"JSON"* for *source_format*, assuming that this would work.  The upload 
> process started, but then came back with an error saying:
> {code:javascript}
> {'message': 'Error detected while parsing row starting at position: 0. Error: 
> Data between close double quote (") and field separator.', 'reason': 
> 'invalid'}
> {code}
> There is nothing wrong with the JSON format of the doc, so I went and looked 
> at the job description on BigQuery and saw that there was no "Source Format" 
> entry.  When I've successfully uploaded CSV files, the "Source Format" entry 
> is present and says "CSV".
> According to Google's docs for [source format 
> |https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.tableDefinitions.(key).sourceFormat],
>  acceptable values are: "CSV", "NEWLINE_DELIMITED_JSON", "AVRO" and 
> "GOOGLE_SHEETS".  However, BigQuery doesn't raise an error if you pass a 
> format not represented in that list (such as "JSON").  Instead, it looks like 
> BigQuery assumes you mean CSV and tries to parse the file as a CSV file which 
> results in a completely different error.
> Not sure what the appropriate fix is (or if there even is one).  At least 
> having some additional documentation for the BigQuery hook and operators that 
> points to the list of available values would be helpful.  Otherwise, 
> BigQuery's error leads you to believe that there is something wrong with the 
> format of your data, when the actual problem is in the setup of the API 
> call.
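
One way to surface the problem described above before the job is submitted would be a small client-side check on *source_format*. The following is only a sketch under stated assumptions: the function name validate_source_format and the constant VALID_SOURCE_FORMATS are hypothetical and not part of Airflow, and the accepted values are taken from the list in the issue description (with the spelling corrected), so they should be checked against Google's current documentation.

```python
# Hypothetical helper, not part of Airflow or the BigQuery client.
# Values come from the issue description; verify against Google's docs.
VALID_SOURCE_FORMATS = frozenset(
    ["CSV", "NEWLINE_DELIMITED_JSON", "AVRO", "GOOGLE_SHEETS"]
)


def validate_source_format(source_format):
    """Return the upper-cased source_format, or raise ValueError.

    Failing here avoids the silent fallback to CSV parsing that the
    issue describes for unrecognized values such as "JSON".
    """
    normalized = source_format.strip().upper()
    if normalized not in VALID_SOURCE_FORMATS:
        raise ValueError(
            "Unsupported source_format {!r}; expected one of: {}".format(
                source_format, ", ".join(sorted(VALID_SOURCE_FORMATS))
            )
        )
    return normalized
```

With a check like this, passing "JSON" would fail immediately with a message that lists the accepted values, instead of producing a CSV parsing error from BigQuery.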



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
