[ 
https://issues.apache.org/jira/browse/AIRFLOW-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801497#comment-16801497
 ] 

jack commented on AIRFLOW-3489:
-------------------------------

I don't think there is a single right answer for this.

There is a Stack Overflow question about it:

[https://stackoverflow.com/questions/37660579/bigquery-create-column-of-json-datatype]

BigQuery has no JSON column type, so the options are:

1. Save the JSON as a string.

2. Convert the JSON to a repeated field.

I doubt that an Airflow operator can handle the logic of (2). It can get very 
complicated if the JSON itself is nested, and the result might not be what the 
user expects.

I think it's better to save it as a string. If the user would like to change 
that, they can use the BigQuery operator to modify the table data manually.
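
As a rough illustration of option (1), here is a minimal Python sketch. It is not the operator's actual code. The helper name `stringify_json_fields` and the sample row are hypothetical, and it assumes the database driver hands back json/jsonb values as Python dicts or lists, as the issue describes. Each such value is serialized to a JSON string before the row is written as one line of newline-delimited JSON, so the data matches a schema that declares the field as a string.

```python
import json


def stringify_json_fields(row):
    """Return a copy of `row` with dict/list values serialized to JSON strings.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    return {
        key: json.dumps(value, sort_keys=True)
        if isinstance(value, (dict, list))
        else value
        for key, value in row.items()
    }


# Hypothetical row, shaped the way a driver might return a json/jsonb column.
row = {"id": 1, "payload": {"a": [1, 2], "b": {"c": None}}}

# One line of newline-delimited JSON; "payload" is now a plain string value.
line = json.dumps(stringify_json_fields(row), sort_keys=True)
print(line)
```

With this approach the nested structure survives as text, and a user who needs structured access can parse the string later inside BigQuery.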

 

You can raise a PR that converts this PostgreSQL type to String and see what 
comments the community has.

> PostgresToGoogleCloudStorageOperator doesn't handle PostgreSQL json properly 
> -----------------------------------------------------------------------------
>
>                 Key: AIRFLOW-3489
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3489
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Duan Shiqiang
>            Priority: Major
>
> PostgresToGoogleCloudStorageOperator saves JSON data (Postgres json or jsonb) 
> as native Python types (i.e. dictionaries) in the newline-separated JSON data 
> it writes to GCS. But it generates a BigQuery schema that gives that field the 
> data type string, which won't work if the user wants to import the GCS data 
> into BigQuery.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
