This is an automated email from the ASF dual-hosted git repository.
damccorm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new 35a0b68c06b Fix small doc issues (#29578)
35a0b68c06b is described below
commit 35a0b68c06b5e446d17c7c7081d2a7f13c85372c
Author: liferoad <[email protected]>
AuthorDate: Fri Dec 1 09:08:21 2023 -0500
Fix small doc issues (#29578)
---
CHANGES.md | 2 +-
sdks/python/apache_beam/io/gcp/bigquery.py | 18 +++----
website/www/site/content/en/blog/beam-2.52.0.md | 69 ++++++++++++++++++++++++-
3 files changed, 78 insertions(+), 11 deletions(-)
diff --git a/CHANGES.md b/CHANGES.md
index 9318e85d477..34a653d75ce 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -107,7 +107,7 @@ should handle this.
([#25252](https://github.com/apache/beam/issues/25252)).
* Add `UseDataStreamForBatch` pipeline option to the Flink runner. When it is set to true, Flink runner will run batch jobs using the DataStream API. By default the option is set to false, so the batch jobs are still executed using the DataSet API.
-* `upload_graph` as one of the Experiments options for DataflowRunner is no longer required when the graph is larger than 10MB for Java SDK ([PR#28621](https://github.com/apache/beam/pull/28621).
+* `upload_graph` as one of the Experiments options for DataflowRunner is no longer required when the graph is larger than 10MB for Java SDK ([PR#28621](https://github.com/apache/beam/pull/28621)).
* State and side input caching has been enabled with a default of 100 MB. Use `--max_cache_memory_usage_mb=X` to provide cache size for the user state API and side inputs. (Python) ([#28770](https://github.com/apache/beam/issues/28770)).
* Beam YAML stable release. Beam pipelines can now be written using YAML and leverage the Beam YAML framework, which includes a preliminary set of IOs and turnkey transforms. More information can be found in the YAML root folder and in the [README](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/README.md).
diff --git a/sdks/python/apache_beam/io/gcp/bigquery.py b/sdks/python/apache_beam/io/gcp/bigquery.py
index 184138af752..ac06425e95a 100644
--- a/sdks/python/apache_beam/io/gcp/bigquery.py
+++ b/sdks/python/apache_beam/io/gcp/bigquery.py
@@ -72,7 +72,8 @@ When creating a BigQuery input transform, users should
provide either a query or a table. Pipeline construction will fail with a validation error if neither or both are specified.
-When reading via `ReadFromBigQuery`, bytes are returned decoded as bytes.
+When reading via `ReadFromBigQuery` using `EXPORT`,
+bytes are returned decoded as bytes.
This is due to the fact that ReadFromBigQuery uses Avro exports by default.
When reading from BigQuery using `apache_beam.io.BigQuerySource`, bytes are
returned as base64-encoded bytes. To get base64-encoded bytes using
@@ -2597,6 +2598,8 @@ class StorageWriteToBigQuery(PTransform):
class ReadFromBigQuery(PTransform):
+ # pylint: disable=line-too-long,W1401
+
"""Read data from BigQuery.
This PTransform uses a BigQuery export job to take a snapshot of the table
@@ -2653,8 +2656,7 @@ class ReadFromBigQuery(PTransform):
:data:`None`, then the temp_location parameter is used.
bigquery_job_labels (dict): A dictionary with string labels to be passed
to BigQuery export and query jobs created by this transform. See:
- https://cloud.google.com/bigquery/docs/reference/rest/v2/\
- Job#JobConfiguration
+ https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfiguration
use_json_exports (bool): By default, this transform works by exporting
BigQuery data into Avro files, and reading those files. With this
parameter, the transform will instead export to JSON files. JSON files
@@ -2666,11 +2668,10 @@ class ReadFromBigQuery(PTransform):
types (datetime.date, datetime.datetime, datetime.datetime,
and datetime.datetime respectively). Avro exports are recommended.
To learn more about BigQuery types, and Time-related type
- representations, see: https://cloud.google.com/bigquery/docs/reference/\
- standard-sql/data-types
+ representations,
+ see: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types
To learn more about type conversions between BigQuery and Avro, see:
- https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro\
- #avro_conversions
+ https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro\#avro_conversions
temp_dataset (``apache_beam.io.gcp.internal.clients.bigquery.\
DatasetReference``):
Temporary dataset reference to use when reading from BigQuery using a
@@ -2690,8 +2691,7 @@ class ReadFromBigQuery(PTransform):
(`PYTHON_DICT`). There is experimental support for producing a
PCollection with a schema and yielding Beam Rows via the option
`BEAM_ROW`. For more information on schemas, see
- https://beam.apache.org/documentation/programming-guide/\
- #what-is-a-schema)
+ https://beam.apache.org/documentation/programming-guide/#what-is-a-schema)
"""
class Method(object):
EXPORT = 'EXPORT' # This is currently the default.
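The docstring change above hinges on the difference between raw bytes (what `ReadFromBigQuery` yields under the default `EXPORT`/Avro method) and base64-encoded bytes (what the older `apache_beam.io.BigQuerySource` yields). A minimal standard-library sketch of that difference, using an illustrative value rather than real BigQuery output:

```python
import base64

# A BYTES value as it might live in a BigQuery table (illustrative only).
raw = b"\x00\x01hello\xff"

# ReadFromBigQuery with the default EXPORT method (Avro export) returns
# the value as raw bytes, so no extra decoding step is needed.
from_read_from_bigquery = raw

# The older apache_beam.io.BigQuerySource returns the same value
# base64-encoded, and callers must decode it themselves.
from_bigquery_source = base64.b64encode(raw)

assert from_read_from_bigquery == raw
assert base64.b64decode(from_bigquery_source) == raw
```

Forgetting the `b64decode` step on `BigQuerySource` output is exactly the confusion the clarified docstring is meant to prevent.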
diff --git a/website/www/site/content/en/blog/beam-2.52.0.md b/website/www/site/content/en/blog/beam-2.52.0.md
index 5654f16ceb3..2e604c8fabf 100644
--- a/website/www/site/content/en/blog/beam-2.52.0.md
+++ b/website/www/site/content/en/blog/beam-2.52.0.md
@@ -41,7 +41,7 @@ should handle this.
([#25252](https://github.com/apache/beam/issues/25252)).
* Add `UseDataStreamForBatch` pipeline option to the Flink runner. When it is set to true, Flink runner will run batch jobs using the DataStream API. By default the option is set to false, so the batch jobs are still executed using the DataSet API.
-* `upload_graph` as one of the Experiments options for DataflowRunner is no longer required when the graph is larger than 10MB for Java SDK ([PR#28621](https://github.com/apache/beam/pull/28621).
+* `upload_graph` as one of the Experiments options for DataflowRunner is no longer required when the graph is larger than 10MB for Java SDK ([PR#28621](https://github.com/apache/beam/pull/28621)).
* State and side input caching has been enabled with a default of 100 MB. Use `--max_cache_memory_usage_mb=X` to provide cache size for the user state API and side inputs. (Python) ([#28770](https://github.com/apache/beam/issues/28770)).
* Beam YAML stable release. Beam pipelines can now be written using YAML and leverage the Beam YAML framework, which includes a preliminary set of IOs and turnkey transforms. More information can be found in the YAML root folder and in the [README](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/README.md).
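The Beam YAML framework mentioned above lets a pipeline be declared without writing SDK code. As a rough sketch only, this is the general shape such a pipeline takes; the transform names (`ReadFromText`, `PyMap`, `WriteToText`) and config keys follow the linked README as of this release and may differ in later versions, and the paths are hypothetical:

```yaml
# Hypothetical Beam YAML pipeline: read text, upper-case each line, write out.
pipeline:
  type: chain
  transforms:
    - type: ReadFromText
      config:
        path: /tmp/input.txt    # hypothetical input path
    - type: PyMap
      config:
        fn: "str.upper"         # Python callable applied to each element
    - type: WriteToText
      config:
        path: /tmp/output.txt   # hypothetical output prefix
```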
@@ -69,69 +69,136 @@ as a workaround, a copy of "old" `CountingSource` class should be placed into a
According to git shortlog, the following people contributed to the 2.52.0
release. Thank you to all contributors!
Ahmed Abualsaud
+
Ahmet Altay
+
Aleksandr Dudko
+
Alexey Romanenko
+
Anand Inguva
+
Andrei Gurau
+
Andrey Devyatkin
+
BjornPrime
+
Bruno Volpato
+
Bulat
+
Chamikara Jayalath
+
Damon
+
Danny McCormick
+
Devansh Modi
+
Dominik Dębowczyk
+
Ferran Fernández Garrido
+
Hai Joey Tran
+
Israel Herraiz
+
Jack McCluskey
+
Jan Lukavský
+
JayajP
+
Jeff Kinard
+
Jeffrey Kinard
+
Jiangjie Qin
+
Jing
+
Joar Wandborg
+
Johanna Öjeling
+
Julien Tournay
+
Kanishk Karanawat
+
Kenneth Knowles
+
Kerry Donny-Clark
+
Luís Bianchin
+
Minbo Bae
+
Pranav Bhandari
+
Rebecca Szper
+
Reuven Lax
+
Ritesh Ghorse
+
Robert Bradshaw
+
Robert Burke
+
RyuSA
+
Shunping Huang
+
Steven van Rossum
+
Svetak Sundhar
+
Tony Tang
+
Vitaly Terentyev
+
Vivek Sumanth
+
Vlado Djerek
+
Yi Hu
+
aku019
+
brucearctor
+
caneff
+
damccorm
+
ddebowczyk92
+
dependabot[bot]
+
dpcollins-google
+
edman124
+
gabry.wu
+
illoise
+
johnjcasey
+
jonathan-lemos
+
kennknowles
+
liferoad
+
magicgoody
+
martin trieu
+
nancyxu123
+
pablo rodriguez defino
+
tvalentyn
+