This is an automated email from the ASF dual-hosted git repository.

amaliujia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
     new 92e92bc  Update SQL BigQuery doc
     new 11c60b8  Merge pull request #10260 from 11moon11/UpdateBigQueryDoc
92e92bc is described below

commit 92e92bc0b8fb01b9395e6480480a81832a86111f
Author: kirillkozlov <kirillkoz...@google.com>
AuthorDate: Mon Dec 2 16:11:16 2019 -0800

    Update SQL BigQuery doc
---
 .../dsls/sql/extensions/create-external-table.md | 23 ++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/website/src/documentation/dsls/sql/extensions/create-external-table.md b/website/src/documentation/dsls/sql/extensions/create-external-table.md
index 81d7dae..2489bb3 100644
--- a/website/src/documentation/dsls/sql/extensions/create-external-table.md
+++ b/website/src/documentation/dsls/sql/extensions/create-external-table.md
@@ -89,18 +89,33 @@ tableElement: columnName fieldType [ NOT NULL ]
 CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
 TYPE bigquery
 LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
+TBLPROPERTIES '{"method": "DEFAULT"}'
 ```
 
-* `LOCATION:`Location of the table in the BigQuery CLI format.
-  * `PROJECT_ID`: ID of the Google Cloud Project
-  * `DATASET`: BigQuery Dataset ID
-  * `TABLE`: BigQuery Table ID within the Dataset
+* `LOCATION`: Location of the table in the BigQuery CLI format.
+  * `PROJECT_ID`: ID of the Google Cloud Project.
+  * `DATASET`: BigQuery Dataset ID.
+  * `TABLE`: BigQuery Table ID within the Dataset.
+* `TBLPROPERTIES`:
+  * `method`: Optional. Read method to use. The following options are available:
+    * `DEFAULT`: Used when no method is specified. Currently uses `EXPORT`.
+    * `DIRECT_READ`: Use the BigQuery Storage API.
+    * `EXPORT`: Export data to Google Cloud Storage in Avro format and read data files from that location.
 
 ### Read Mode
 
 Beam SQL supports reading columns with simple types (`simpleType`) and arrays of simple types (`ARRAY<simpleType>`).
 
+When reading using the `EXPORT` method, the following pipeline options should be set:
+* `project`: ID of the Google Cloud Project.
+* `tempLocation`: Bucket to store intermediate data in. Ex: `gs://temp-storage/temp`.
+
+When reading using the `DIRECT_READ` method, an optimizer will attempt to perform
+project and predicate push-down, potentially reducing the time required to read the data from BigQuery.
+
+More information about the BigQuery Storage API can be found [here](https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api).
+
 ### Write Mode
 
 If the table does not exist, Beam creates the table specified in location when
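For context, a sketch of how the `TBLPROPERTIES` clause documented by this change would be used, following the syntax above; the table schema, project, and dataset names here are hypothetical:

```sql
-- Hypothetical table definition; selects the BigQuery Storage API
-- read path via the "method" property described in this change.
CREATE EXTERNAL TABLE users (
  id BIGINT NOT NULL,
  name VARCHAR
)
TYPE bigquery
LOCATION 'my-project:my_dataset.users'
TBLPROPERTIES '{"method": "DIRECT_READ"}'
```

Per the updated doc, choosing `EXPORT` (or leaving the property unset, which currently falls back to `EXPORT`) would additionally require the `project` and `tempLocation` pipeline options to be set when the pipeline runs.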