[jira] [Updated] (BEAM-8841) Add ability to perform BigQuery file loads using avro

2023-04-13 Thread Anonymous (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated BEAM-8841:

Status: Triage Needed  (was: Resolved)

> Add ability to perform BigQuery file loads using avro
> -
>
> Key: BEAM-8841
> URL: https://issues.apache.org/jira/browse/BEAM-8841
> Project: Beam
>  Issue Type: Improvement
>  Components: io-py-gcp
>Reporter: Chun Yang
>Assignee: Chun Yang
>Priority: P3
> Fix For: 2.21.0
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Currently, JSON format is used for file loads into BigQuery in the Python 
> SDK. JSON has some disadvantages including size of serialized data and 
> inability to represent NaN and infinity float values.
> BigQuery supports loading files in avro format, which can overcome these 
> disadvantages. The Java SDK already supports loading files using avro format 
> (BEAM-2879) so it makes sense to support it in the Python SDK as well.
> The change will be somewhere around 
> [{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (BEAM-8841) Add ability to perform BigQuery file loads using avro

2020-03-05 Thread Chun Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Yang updated BEAM-8841:

Fix Version/s: 2.21.0

> Add ability to perform BigQuery file loads using avro
> -
>
> Key: BEAM-8841
> URL: https://issues.apache.org/jira/browse/BEAM-8841
> Project: Beam
>  Issue Type: Improvement
>  Components: io-py-gcp
>Reporter: Chun Yang
>Assignee: Chun Yang
>Priority: Minor
> Fix For: 2.21.0
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Currently, JSON format is used for file loads into BigQuery in the Python 
> SDK. JSON has some disadvantages including size of serialized data and 
> inability to represent NaN and infinity float values.
> BigQuery supports loading files in avro format, which can overcome these 
> disadvantages. The Java SDK already supports loading files using avro format 
> (BEAM-2879) so it makes sense to support it in the Python SDK as well.
> The change will be somewhere around 
> [{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8841) Add ability to perform BigQuery file loads using avro

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8841:
---
Status: Open  (was: Triage Needed)

> Add ability to perform BigQuery file loads using avro
> -
>
> Key: BEAM-8841
> URL: https://issues.apache.org/jira/browse/BEAM-8841
> Project: Beam
>  Issue Type: Improvement
>  Components: io-py-gcp
>Reporter: Chun Yang
>Assignee: Chun Yang
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, JSON format is used for file loads into BigQuery in the Python 
> SDK. JSON has some disadvantages including size of serialized data and 
> inability to represent NaN and infinity float values.
> BigQuery supports loading files in avro format, which can overcome these 
> disadvantages. The Java SDK already supports loading files using avro format 
> (BEAM-2879) so it makes sense to support it in the Python SDK as well.
> The change will be somewhere around 
> [{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8841) Add ability to perform BigQuery file loads using avro

2019-11-27 Thread Chun Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Yang updated BEAM-8841:

Description: 
Currently, JSON format is used for file loads into BigQuery in the Python SDK. 
JSON has some disadvantages including size of serialized data and inability to 
represent NaN and infinity float values.

BigQuery supports loading files in avro format, which can overcome these 
disadvantages. The Java SDK already supports loading files using avro format 
(BEAM-2879) so it makes sense to support it in the Python SDK as well.

The change will be somewhere around 
[{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].

  was:
Currently, JSON format is used for file loads into BigQuery in the Python SDK. 
JSON has some disadvantages including size of serialized data and inability to 
represent NaN and infinity float values.

BigQuery supports loading files in avro format, which can overcome these 
disadvantages. The Java SDK already supports loading files using avro format 
(BEAM-2879) so it makes sense to support it in the Python SDK as well.


> Add ability to perform BigQuery file loads using avro
> -
>
> Key: BEAM-8841
> URL: https://issues.apache.org/jira/browse/BEAM-8841
> Project: Beam
>  Issue Type: Improvement
>  Components: io-py-gcp
>Reporter: Chun Yang
>Priority: Minor
>
> Currently, JSON format is used for file loads into BigQuery in the Python 
> SDK. JSON has some disadvantages including size of serialized data and 
> inability to represent NaN and infinity float values.
> BigQuery supports loading files in avro format, which can overcome these 
> disadvantages. The Java SDK already supports loading files using avro format 
> (BEAM-2879) so it makes sense to support it in the Python SDK as well.
> The change will be somewhere around 
> [{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)