[ https://issues.apache.org/jira/browse/BEAM-6749?focusedWorklogId=204868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-204868 ]
ASF GitHub Bot logged work on BEAM-6749:
----------------------------------------

Author: ASF GitHub Bot
Created on: 27/Feb/19 00:01
Start Date: 27/Feb/19 00:01
Worklog Time Spent: 10m
Work Description: kmjung commented on pull request #7950: [BEAM-6749] Add BigQuery Storage API info to docs
URL: https://github.com/apache/beam/pull/7950#discussion_r260542965

##########
File path: website/src/documentation/io/built-in-google-bigquery.md
##########
@@ -262,13 +265,43 @@ in the following example:
 {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py tag:model_bigqueryio_read_query_std_sql %}```
+### Using the BigQuery Storage API {#storage-api}
+
+The [BigQuery Storage API](https://cloud.google.com/bigquery/docs/reference/storage/)
+allows you to directly access tables in BigQuery storage. As a result, your
+pipeline can read from BigQuery storage faster than previously possible.
+The Beam SDK for Java (version 2.11.0 and later) supports using the
+[BigQuery Storage API](#storage-api) when reading from a table. Using the
+BigQuery Storage API with a query string is not supported.
+
+***Note:*** The SDK for Python does not support the BigQuery Storage API.
+
+The BigQuery Storage API is distinct from the existing BigQuery API. You must
+[enable the BigQuery Storage API](https://cloud.google.com/bigquery/docs/reference/storage/#enabling_the_api)
+for your Google Cloud Platform project. Then, add
+`.withMethod(Method.DIRECT_READ)` to your pipeline code when you read from a
+BigQuery table. The following example uses `read(SerializableFunction)` and the
+BigQuery Storage API.
+
+```java
+PCollection<KV<String, Long>> output =
+    p.apply(BigQueryIO.read(new ParseKeyValue())
+        .from(tableSpec)
+        .withMethod(Method.DIRECT_READ);

Review comment:
This covers how to use the new API, but in order to use some new features -- specifically column selection and filter push-down -- callers must also specify a TableReadOptions proto [1] using the .withReadOptions() method.

[1] https://github.com/googleapis/google-cloud-java/blob/c011130c0114ccdcde9bda60202f334722df331d/google-api-grpc/proto-google-cloud-bigquerystorage-v1beta1/src/main/proto/google/cloud/bigquery/storage/v1beta1/read_options.proto

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 204868)
Time Spent: 20m (was: 10m)

> Add BigQuery Storage API info to docs
> -------------------------------------
>
> Key: BEAM-6749
> URL: https://issues.apache.org/jira/browse/BEAM-6749
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Melissa Pashniak
> Assignee: Melissa Pashniak
> Priority: Minor
> Time Spent: 20m
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)