(beam) branch asf-site updated: Publishing website 2024/05/30 05:37:11 at commit 16d6282

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 9cb48449947 Publishing website 2024/05/30 05:37:11 at commit 16d6282
9cb48449947 is described below

commit 9cb48449947dd34f2b676665b17e0df81d3b1250
Author: runner 
AuthorDate: Thu May 30 05:37:11 2024 +

Publishing website 2024/05/30 05:37:11 at commit 16d6282
---
 website/generated-content/sitemap.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 28e52669437..c04dc692012 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-http://www.sitemaps.org/schemas/sitemap/0.9; 
xmlns:xhtml="http://www.w3.org/1999/xhtml;>/blog/beam-2.56.0/2024-05-29T19:20:21-04:00/categories/blog/2024-05-29T19:20:21-04:00/blog/2024-05-29T19:20:21-04:00/categories/2024-05-29T19:20:21-04:00/catego
 [...]
\ No newline at end of file
+http://www.sitemaps.org/schemas/sitemap/0.9; 
xmlns:xhtml="http://www.w3.org/1999/xhtml;>/blog/beam-2.56.0/2024-05-29T20:03:39-04:00/categories/blog/2024-05-29T20:03:39-04:00/blog/2024-05-29T20:03:39-04:00/categories/2024-05-29T20:03:39-04:00/catego
 [...]
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 1c9ca4678c3 Updating config from bot
1c9ca4678c3 is described below

commit 1c9ca4678c3c95fe05da7a98fff60af79912
Author: github-actions 
AuthorDate: Thu May 30 04:25:30 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31450.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31450.json b/scripts/ci/pr-bot/state/pr-state/pr-31450.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31450.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 85b5dc95488 Updating config from bot
85b5dc95488 is described below

commit 85b5dc9548826bbf0b6c5e9b750d9ebbca209497
Author: github-actions 
AuthorDate: Thu May 30 04:24:17 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31451.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31451.json b/scripts/ci/pr-bot/state/pr-state/pr-31451.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31451.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch nightly-refs/heads/master updated (f7519774e3c -> 16d62827551)

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch nightly-refs/heads/master
in repository https://gitbox.apache.org/repos/asf/beam.git


from f7519774e3c BigQueryIO read throttling detection python (#31404)
 add 7d281558dd8 [#29697] Add prism artifact building workflow. (#31369)
 add 4daedbf5a8a [#30083] Add synthetic processing time to prism. (#30492)
 add 49a4290426d Add options to specify read and write http timeout for gcs as well as lower batching limit for rewrite operations which are copying. (#31410)
 add b1a6eb06051 [YAML] Fix simple YAML mappings type hinting (#31427)
 add 6842136e0c9 Add SDK capability to detect if the SDK Fn Harness data channel is busy.
 add ad841c6004f Regenerate Go protos.
 add 8b33e1f65c3 Merge pull request #31442 SDK protocol to detect if the SDK Fn Harness data channel is busy
 add df8bead5945 Refactor RowMutationInformation to use string type (#31323)
 add 06e103d87e8 Add ApplyBucketsWithInterpolation TFTransform (#31291)
 add 8d77c8fad07 Add try-excepts around data sampler encoding (#31396)
 add 0b5ffd7d153 Add SDK capability to detect if the SDK Fn Harness data channel is busy or not (#31420)
 add 19630e576fe Add in-memory variants of side inputs. (#31232)
 add 80d85aa38ff Add docs for YAML AssertThat. (#31448)
 add 90f020921c1 Update bigquery_tools.py (#31444)
 add 16d62827551 Update bigquery.py documentation (#31443)

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_release_candidate.yml  |  188 +-
 CHANGES.md |1 +
 .../beam/model/fn_execution/v1/beam_fn_api.proto   |5 +
 .../beam/model/pipeline/v1/beam_runner_api.proto   |5 +
 sdks/go/pkg/beam/core/runtime/exec/datasource.go   |   15 +-
 .../pkg/beam/core/runtime/exec/datasource_test.go  |8 +-
 sdks/go/pkg/beam/core/runtime/graphx/translate.go  |   12 +-
 sdks/go/pkg/beam/core/runtime/harness/harness.go   |9 +-
 .../go/pkg/beam/core/runtime/harness/monitoring.go |8 +-
 .../beam/model/fnexecution_v1/beam_fn_api.pb.go| 1582 ++
 .../beam/model/pipeline_v1/beam_runner_api.pb.go   | 3245 ++--
 .../model/pipeline_v1/external_transforms.pb.go|4 +
 .../prism/internal/engine/elementmanager.go|  284 +-
 .../runners/prism/internal/engine/engine_test.go   |   43 +
 .../beam/runners/prism/internal/engine/holds.go|   39 +-
 .../prism/internal/engine/processingtime.go|   96 +
 .../prism/internal/engine/processingtime_test.go   |  139 +
 .../runners/prism/internal/engine/teststream.go|   12 +-
 .../beam/runners/prism/internal/engine/timers.go   |  166 +
 .../runners/prism/internal/engine/timers_test.go   |  291 ++
 sdks/go/pkg/beam/runners/prism/internal/execute.go |   10 +-
 .../prism/internal/jobservices/management.go   |7 +-
 sdks/go/pkg/beam/runners/prism/internal/stage.go   |   25 +-
 .../beam/runners/prism/internal/worker/worker.go   |4 +-
 sdks/go/test/integration/integration.go|7 +-
 sdks/go/test/integration/primitives/timers.go  |  151 +
 sdks/go/test/integration/primitives/timers_test.go |   10 +-
 .../sdk/fn/data/BeamFnDataInboundObserver.java |   14 +
 .../java/org/apache/beam/sdk/transforms/View.java  |  198 +-
 .../beam/sdk/util/construction/Environments.java   |1 +
 .../apache/beam/sdk/values/PCollectionViews.java   |  488 +++
 .../org/apache/beam/sdk/transforms/ViewTest.java   |  145 +
 .../sdk/extensions/gcp/options/GcsOptions.java |   18 +
 .../beam/sdk/extensions/gcp/util/GcsUtil.java  |  142 +-
 .../gcp/util/RetryHttpRequestInitializer.java  |   15 +-
 .../beam/sdk/extensions/gcp/util/Transport.java|   41 +-
 .../beam/sdk/extensions/gcp/util/GcsUtilTest.java  |   83 +-
 .../fn/harness/control/ProcessBundleHandler.java   |3 +
 .../beam/sdk/io/gcp/bigquery/AppendClientInfo.java |2 +-
 .../AvroGenericRecordToStorageApiProto.java|   17 +-
 .../io/gcp/bigquery/BeamRowToStorageApiProto.java  |   16 +-
 .../beam/sdk/io/gcp/bigquery/RowMutation.java  |   27 +-
 .../io/gcp/bigquery/RowMutationInformation.java|  111 +-
 .../beam/sdk/io/gcp/bigquery/StorageApiCDC.java|9 +
 .../StorageApiDynamicDestinationsBeamRow.java  |4 +-
 ...StorageApiDynamicDestinationsGenericRecord.java |7 +-
 .../StorageApiDynamicDestinationsTableRow.java |4 +-
 .../io/gcp/bigquery/TableRowToStorageApiProto.java |   40 +-
 .../sdk/io/gcp/testing/FakeDatasetService.java |3 +-
 .../AvroGenericRecordToStorageApiProtoTest.java|3 +-
 .../gcp/bigquery/BeamRowToStorageApiProtoTest.java |4 +-
 .../sdk/io/gcp/bigquery/BigQueryIOWriteTest.java   |   88 +-
 .../gcp/bigquery/RowMutationInformationTest.java   |  132 +
 .../io/gcp/bigquery/StorageApiSinkRowUpdateIT.java |   63 +-
 .../bigquery/TableRowToStorageApiProtoTest.java|3 +-
 

(beam) branch master updated: Update bigquery.py documentation (#31443)

2024-05-29 Thread tvalentyn
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 16d62827551 Update bigquery.py documentation (#31443)
16d62827551 is described below

commit 16d62827551170c7af104995327ba88ddc1fcb88
Author: liferoad 
AuthorDate: Wed May 29 20:03:39 2024 -0400

Update bigquery.py documentation (#31443)

* Update bigquery.py

fix #31372

* fix lint
---
 sdks/python/apache_beam/io/gcp/bigquery.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sdks/python/apache_beam/io/gcp/bigquery.py b/sdks/python/apache_beam/io/gcp/bigquery.py
index a4d710b1288..caeed6b7b9b 100644
--- a/sdks/python/apache_beam/io/gcp/bigquery.py
+++ b/sdks/python/apache_beam/io/gcp/bigquery.py
@@ -283,7 +283,8 @@ method) could look like::
   def chain_after(result):
     try:
       # This works for FILE_LOADS, where we run load and possibly copy jobs.
-      return (result.load_jobid_pairs, result.copy_jobid_pairs) | beam.Flatten()
+      return (result.destination_load_jobid_pairs,
+              result.destination_copy_jobid_pairs) | beam.Flatten()
     except AttributeError:
       # Works for STREAMING_INSERTS, where we return the rows BigQuery rejected
       return result.failed_rows



(beam) branch asf-site updated: Publishing website 2024/05/29 23:37:23 at commit 90f0209

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new ad23bea7e65 Publishing website 2024/05/29 23:37:23 at commit 90f0209
ad23bea7e65 is described below

commit ad23bea7e6573e1d235707dc4bc4a41358a6da05
Author: runner 
AuthorDate: Wed May 29 23:37:24 2024 +

Publishing website 2024/05/29 23:37:23 at commit 90f0209
---
 website/generated-content/sitemap.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 9fd0bee61e0..28e52669437 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-http://www.sitemaps.org/schemas/sitemap/0.9; 
xmlns:xhtml="http://www.w3.org/1999/xhtml;>/blog/beam-2.56.0/2024-05-29T10:28:36-07:00/categories/blog/2024-05-29T10:28:36-07:00/blog/2024-05-29T10:28:36-07:00/categories/2024-05-29T10:28:36-07:00/catego
 [...]
\ No newline at end of file
+http://www.sitemaps.org/schemas/sitemap/0.9; 
xmlns:xhtml="http://www.w3.org/1999/xhtml;>/blog/beam-2.56.0/2024-05-29T19:20:21-04:00/categories/blog/2024-05-29T19:20:21-04:00/blog/2024-05-29T19:20:21-04:00/categories/2024-05-29T19:20:21-04:00/catego
 [...]
\ No newline at end of file



(beam) branch master updated (80d85aa38ff -> 90f020921c1)

2024-05-29 Thread tvalentyn
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


from 80d85aa38ff Add docs for YAML AssertThat. (#31448)
 add 90f020921c1 Update bigquery_tools.py (#31444)

No new revisions were added by this update.

Summary of changes:
 sdks/python/apache_beam/io/gcp/bigquery_tools.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 5074c0a06ad Updating config from bot
5074c0a06ad is described below

commit 5074c0a06add224ba706bbe3b198e262d5c1eeaf
Author: github-actions 
AuthorDate: Wed May 29 23:05:32 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31449.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31449.json b/scripts/ci/pr-bot/state/pr-state/pr-31449.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31449.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch master updated: Add docs for YAML AssertThat. (#31448)

2024-05-29 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 80d85aa38ff Add docs for YAML AssertThat. (#31448)
80d85aa38ff is described below

commit 80d85aa38ff91699a5123f14d5c5df96d826140c
Author: Robert Bradshaw 
AuthorDate: Wed May 29 15:59:26 2024 -0700

Add docs for YAML AssertThat. (#31448)

This is the first transform in the (alphabetical) list, so it'd
be good to not have it empty.

Also produce slightly nicer examples for repeated arguments.
---
 sdks/python/apache_beam/yaml/generate_yaml_docs.py | 24 +++--
 sdks/python/apache_beam/yaml/yaml_provider.py  | 25 +-
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/sdks/python/apache_beam/yaml/generate_yaml_docs.py b/sdks/python/apache_beam/yaml/generate_yaml_docs.py
index b11062cce4d..4719bc3e66a 100644
--- a/sdks/python/apache_beam/yaml/generate_yaml_docs.py
+++ b/sdks/python/apache_beam/yaml/generate_yaml_docs.py
@@ -28,6 +28,18 @@ from apache_beam.yaml import json_utils
 from apache_beam.yaml import yaml_provider
 
 
+def _singular(name):
+  # Simply removing an 's' (or 'es', or 'ies', ...) may result in surprising
+  # manglings. Better to play it safe and leave a correctly-spelled plural
+  # than a botched singular in our examples configs.
+  return {
+      'args': 'arg',
+      'attributes': 'attribute',
+      'elements': 'element',
+      'fields': 'field',
+  }.get(name, name)
+
+
 def _fake_value(name, beam_type):
   type_info = beam_type.WhichOneof("type_info")
   if type_info == "atomic_type":
@@ -38,9 +50,17 @@ def _fake_value(name, beam_type):
     else:
       return name
   elif type_info == "array_type":
-    return [_fake_value(name, beam_type.array_type.element_type), '...']
+    return [
+        _fake_value(_singular(name), beam_type.array_type.element_type),
+        _fake_value(_singular(name), beam_type.array_type.element_type),
+        '...'
+    ]
   elif type_info == "iterable_type":
-    return [_fake_value(name, beam_type.iterable_type.element_type), '...']
+    return [
+        _fake_value(_singular(name), beam_type.iterable_type.element_type),
+        _fake_value(_singular(name), beam_type.iterable_type.element_type),
+        '...'
+    ]
   elif type_info == "map_type":
     if beam_type.map_type.key_type.atomic_type == schema_pb2.STRING:
       return {
diff --git a/sdks/python/apache_beam/yaml/yaml_provider.py b/sdks/python/apache_beam/yaml/yaml_provider.py
index 5f53302028c..52452daff7e 100755
--- a/sdks/python/apache_beam/yaml/yaml_provider.py
+++ b/sdks/python/apache_beam/yaml/yaml_provider.py
@@ -557,7 +557,30 @@ def dicts_to_rows(o):
 
 class YamlProviders:
   class AssertEqual(beam.PTransform):
-    def __init__(self, elements):
+    """Asserts that the input contains exactly the elements provided.
+
+    This is primarily used for testing; it will cause the entire pipeline to
+    fail if the input to this transform is not exactly the set of `elements`
+    given in the config parameter.
+
+    As with Create, YAML/JSON-style mappings are interpreted as Beam rows,
+    e.g.::
+
+        type: AssertEqual
+        input: SomeTransform
+        config:
+          elements:
+             - {a: 0, b: "foo"}
+             - {a: 1, b: "bar"}
+
+    would ensure that `SomeTransform` produced exactly two elements with values
+    `(a=0, b="foo")` and `(a=1, b="bar")` respectively.
+
+    Args:
+        elements: The set of elements that should belong to the PCollection.
+            YAML/JSON-style mappings will be interpreted as Beam rows.
+    """
+    def __init__(self, elements: Iterable[Any]):
       self._elements = elements
 
     def expand(self, pcoll):
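
The "slightly nicer examples for repeated arguments" from the commit message can be illustrated in isolation. This is a simplified sketch, assuming every list item is rendered as its (singularized) parameter name; `singular` copies the lookup table from the diff, while `fake_list_example` is a hypothetical helper that stands in for the array and iterable branches of `_fake_value` and ignores Beam schema types entirely.

```python
def singular(name):
  # Same conservative lookup table as the diff's _singular helper:
  # unknown names are returned unchanged rather than mangled.
  return {
      'args': 'arg',
      'attributes': 'attribute',
      'elements': 'element',
      'fields': 'field',
  }.get(name, name)


def fake_list_example(name):
  # Two singular placeholders followed by an ellipsis marker, matching the
  # shape of the new list examples in the diff.
  item = singular(name)
  return [item, item, '...']


print(fake_list_example('elements'))
print(fake_list_example('classes'))
```

A name that is not in the table, such as `classes`, is repeated as-is, which matches the comment in the diff about preferring a correctly spelled plural over a botched singular.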



(beam) branch master updated (0b5ffd7d153 -> 19630e576fe)

2024-05-29 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


from 0b5ffd7d153 Add SDK capability to detect if the SDK Fn Harness data channel is busy or not (#31420)
 add 19630e576fe Add in-memory variants of side inputs. (#31232)

No new revisions were added by this update.

Summary of changes:
 CHANGES.md |   1 +
 .../java/org/apache/beam/sdk/transforms/View.java  | 198 -
 .../apache/beam/sdk/values/PCollectionViews.java   | 488 +
 .../org/apache/beam/sdk/transforms/ViewTest.java   | 145 ++
 4 files changed, 812 insertions(+), 20 deletions(-)



(beam) branch master updated (8d77c8fad07 -> 0b5ffd7d153)

2024-05-29 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


from 8d77c8fad07 Add try-excepts around data sampler encoding (#31396)
 add 0b5ffd7d153 Add SDK capability to detect if the SDK Fn Harness data channel is busy or not (#31420)

No new revisions were added by this update.

Summary of changes:
 sdks/go/pkg/beam/core/runtime/exec/datasource.go  | 15 +--
 sdks/go/pkg/beam/core/runtime/exec/datasource_test.go |  8 ++--
 sdks/go/pkg/beam/core/runtime/graphx/translate.go | 12 +++-
 sdks/go/pkg/beam/core/runtime/harness/harness.go  |  9 +
 sdks/go/pkg/beam/core/runtime/harness/monitoring.go   |  8 
 .../beam/sdk/fn/data/BeamFnDataInboundObserver.java   | 14 ++
 .../apache/beam/sdk/util/construction/Environments.java   |  1 +
 .../beam/fn/harness/control/ProcessBundleHandler.java |  3 +++
 .../python/apache_beam/runners/worker/bundle_processor.py |  9 +
 sdks/python/apache_beam/runners/worker/sdk_worker.py  |  5 -
 sdks/python/apache_beam/runners/worker/sdk_worker_test.py |  4 ++--
 sdks/python/apache_beam/transforms/environments.py|  1 +
 12 files changed, 69 insertions(+), 20 deletions(-)



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 407f7e303f6 Updating config from bot
407f7e303f6 is described below

commit 407f7e303f69b5d1a52263159f82c0d8f465d1fc
Author: github-actions 
AuthorDate: Wed May 29 22:09:28 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31448.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31448.json b/scripts/ci/pr-bot/state/pr-state/pr-31448.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31448.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 9b7a2a32232 Updating config from bot
9b7a2a32232 is described below

commit 9b7a2a3223277c10bfe88fa9bfe6f2e1d03f07be
Author: github-actions 
AuthorDate: Wed May 29 21:37:25 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31447.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31447.json b/scripts/ci/pr-bot/state/pr-state/pr-31447.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31447.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch master updated (06e103d87e8 -> 8d77c8fad07)

2024-05-29 Thread ningk
This is an automated email from the ASF dual-hosted git repository.

ningk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


from 06e103d87e8 Add ApplyBucketsWithInterpolation TFTransform (#31291)
 add 8d77c8fad07 Add try-excepts around data sampler encoding (#31396)

No new revisions were added by this update.

Summary of changes:
 .../apache_beam/runners/worker/data_sampler.py | 35 +-
 1 file changed, 21 insertions(+), 14 deletions(-)



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new e267eaa7be9 Updating config from bot
e267eaa7be9 is described below

commit e267eaa7be9ed66dd860dd94db73825a4f154e48
Author: github-actions 
AuthorDate: Wed May 29 18:41:05 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31446.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31446.json b/scripts/ci/pr-bot/state/pr-state/pr-31446.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31446.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch master updated: Add ApplyBucketsWithInterpolation TFTransform (#31291)

2024-05-29 Thread jrmccluskey
This is an automated email from the ASF dual-hosted git repository.

jrmccluskey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 06e103d87e8 Add ApplyBucketsWithInterpolation TFTransform (#31291)
06e103d87e8 is described below

commit 06e103d87e8ac883f606475dbadbefed4ba77c9a
Author: Jack McCluskey <34928439+jrmcclus...@users.noreply.github.com>
AuthorDate: Wed May 29 14:23:57 2024 -0400

Add ApplyBucketsWithInterpolation TFTransform (#31291)

* Add ApplyBucketsWithInterpolation TFTransform

* Update sdks/python/apache_beam/ml/transforms/tft.py

Co-authored-by: tvalentyn 

* add tft documentation link

* change docstring wording around bucket_boundaries

* Update sdks/python/apache_beam/ml/transforms/tft.py

Co-authored-by: tvalentyn 

-

Co-authored-by: tvalentyn 
---
 sdks/python/apache_beam/ml/transforms/tft.py  | 49 ---
 sdks/python/apache_beam/ml/transforms/tft_test.py | 30 ++
 2 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/sdks/python/apache_beam/ml/transforms/tft.py b/sdks/python/apache_beam/ml/transforms/tft.py
index 370043bc0d9..e2f02971e7c 100644
--- a/sdks/python/apache_beam/ml/transforms/tft.py
+++ b/sdks/python/apache_beam/ml/transforms/tft.py
@@ -337,16 +337,16 @@ class ApplyBuckets(TFTOperation):
       name: Optional[str] = None):
     """
     This functions is used to map the element to a positive index i for
-    which bucket_boundaries[i-1] <= element < bucket_boundaries[i],
-    if it exists. If input < bucket_boundaries[0], then element is
-    mapped to 0. If element >= bucket_boundaries[-1], then element is
+    which `bucket_boundaries[i-1] <= element < bucket_boundaries[i]`,
+    if it exists. If `input < bucket_boundaries[0]`, then element is
+    mapped to 0. If `element >= bucket_boundaries[-1]`, then element is
     mapped to len(bucket_boundaries). NaNs are mapped to
     len(bucket_boundaries).
 
     Args:
       columns: A list of column names to apply the transformation on.
-      bucket_boundaries: A rank 2 Tensor or list representing the bucket
-        boundaries sorted in ascending order.
+      bucket_boundaries: An iterable of ints or floats representing the bucket
+        boundaries. Must be sorted in ascending order.
       name: (Optional) A string that specifies the name of the operation.
     """
     super().__init__(columns)
@@ -363,6 +363,45 @@ class ApplyBuckets(TFTOperation):
     return output
 
 
+@register_input_dtype(float)
+class ApplyBucketsWithInterpolation(TFTOperation):
+  def __init__(
+      self,
+      columns: List[str],
+      bucket_boundaries: Iterable[Union[int, float]],
+      name: Optional[str] = None):
+    """Interpolates values within the provided buckets and then normalizes to
+    [0, 1].
+
+    Input values are bucketized based on the provided boundaries such that the
+    input is mapped to a positive index i for which `bucket_boundaries[i-1] <=
+    element < bucket_boundaries[i]`, if it exists. The values are then
+    normalized to the range [0,1] within the bucket, with NaN values being
+    mapped to 0.5.
+
+    For more information, see:
+    https://www.tensorflow.org/tfx/transform/api_docs/python/tft/apply_buckets_with_interpolation
+
+    Args:
+      columns: A list of column names to apply the transformation on.
+      bucket_boundaries: An iterable of ints or floats representing the bucket
+        boundaries sorted in ascending order.
+      name: (Optional) A string that specifies the name of the operation.
+    """
+    super().__init__(columns)
+    self.bucket_boundaries = [bucket_boundaries]
+    self.name = name
+
+  def apply_transform(
+      self, data: common_types.TensorType,
+      output_column_name: str) -> Dict[str, common_types.TensorType]:
+    output = {
+        output_column_name: tft.apply_buckets_with_interpolation(
+            x=data, bucket_boundaries=self.bucket_boundaries, name=self.name)
+    }
+    return output
+
+
+
+
 @register_input_dtype(float)
 class Bucketize(TFTOperation):
   def __init__(
diff --git a/sdks/python/apache_beam/ml/transforms/tft_test.py b/sdks/python/apache_beam/ml/transforms/tft_test.py
index 5c42ecc012f..f5615e9d4ad 100644
--- a/sdks/python/apache_beam/ml/transforms/tft_test.py
+++ b/sdks/python/apache_beam/ml/transforms/tft_test.py
@@ -364,6 +364,36 @@ class ApplyBucketsTest(unittest.TestCase):
   actual_output, equal_to(expected_output, equals_fn=np.array_equal))
 
 
+class ApplyBucketsWithInterpolationTest(unittest.TestCase):
+  def setUp(self) -> None:
+    self.artifact_location = tempfile.mkdtemp()
+
+  def tearDown(self):
+    shutil.rmtree(self.artifact_location)
+
+  @parameterized.expand([
+      ([-1, 9, 10, 11], [10], [0., 0., 1., 1.]),
+      ([15, 20, 25], [10, 20], [.5, 1, 1]),
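
The arithmetic behind the two parameter sets shown in this test can be sketched in plain Python. This is an illustration under stated assumptions, not the TensorFlow Transform implementation: the helper name `interpolate_buckets` is hypothetical, and the rule that the in-bucket fraction is added to the bucket index and divided by the number of buckets is an assumption chosen because it reproduces the expected values visible in the diff.

```python
import bisect
import math


def interpolate_buckets(value, bucket_boundaries):
  """Map value to [0, 1] by interpolating between ascending boundaries."""
  boundaries = list(bucket_boundaries)
  if math.isnan(value):
    # The docstring in the diff states that NaN values map to 0.5.
    return 0.5
  if value < boundaries[0]:
    return 0.0
  if value >= boundaries[-1]:
    return 1.0
  # From here on there are at least two boundaries and
  # boundaries[i - 1] <= value < boundaries[i].
  i = bisect.bisect_right(boundaries, value)
  lower, upper = boundaries[i - 1], boundaries[i]
  fraction = (value - lower) / (upper - lower)
  # Assumed normalization: spread the buckets evenly across [0, 1].
  return (i - 1 + fraction) / (len(boundaries) - 1)


print([interpolate_buckets(v, [10]) for v in [-1, 9, 10, 11]])
print([interpolate_buckets(v, [10, 20]) for v in [15, 20, 25]])
```

With a single boundary there is no interior bucket, so every value lands on 0 or 1; with `[10, 20]`, the value 15 sits halfway through the only bucket and maps to 0.5.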
+ 

(beam) branch master updated: Refactor RowMutationInformation to use string type (#31323)

2024-05-29 Thread damondouglas
This is an automated email from the ASF dual-hosted git repository.

damondouglas pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new df8bead5945 Refactor RowMutationInformation to use string type (#31323)
df8bead5945 is described below

commit df8bead5945c801854f07dce6e708b1241c94696
Author: Damon 
AuthorDate: Wed May 29 10:49:24 2024 -0700

Refactor RowMutationInformation to use string type (#31323)

* Refactor RowMutationInformation to use string type

* Remove unnecessary test

* Add javadoc

* Add segment too large test cases

* Add hex based test cases to integration test
---
 .../beam/sdk/io/gcp/bigquery/AppendClientInfo.java |   2 +-
 .../AvroGenericRecordToStorageApiProto.java|  17 ++-
 .../io/gcp/bigquery/BeamRowToStorageApiProto.java  |  16 ++-
 .../beam/sdk/io/gcp/bigquery/RowMutation.java  |  27 +++--
 .../io/gcp/bigquery/RowMutationInformation.java| 111 -
 .../beam/sdk/io/gcp/bigquery/StorageApiCDC.java|   9 ++
 .../StorageApiDynamicDestinationsBeamRow.java  |   4 +-
 ...StorageApiDynamicDestinationsGenericRecord.java |   7 +-
 .../StorageApiDynamicDestinationsTableRow.java |   4 +-
 .../io/gcp/bigquery/TableRowToStorageApiProto.java |  40 +--
 .../sdk/io/gcp/testing/FakeDatasetService.java |   3 +-
 .../AvroGenericRecordToStorageApiProtoTest.java|   3 +-
 .../gcp/bigquery/BeamRowToStorageApiProtoTest.java |   4 +-
 .../sdk/io/gcp/bigquery/BigQueryIOWriteTest.java   |  88 --
 .../gcp/bigquery/RowMutationInformationTest.java   | 132 +
 .../io/gcp/bigquery/StorageApiSinkRowUpdateIT.java |  63 +-
 .../bigquery/TableRowToStorageApiProtoTest.java|   3 +-
 17 files changed, 457 insertions(+), 76 deletions(-)

diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AppendClientInfo.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AppendClientInfo.java
index 3094af5855e..211027c12b0 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AppendClientInfo.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AppendClientInfo.java
@@ -145,7 +145,7 @@ abstract class AppendClientInfo {
 true,
 null,
 null,
--1);
+null);
 return msg.toByteString();
   }
 
diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroGenericRecordToStorageApiProto.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroGenericRecordToStorageApiProto.java
index 7141869b228..519f9391db6 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroGenericRecordToStorageApiProto.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroGenericRecordToStorageApiProto.java
@@ -162,6 +162,19 @@ public class AvroGenericRecordToStorageApiProto {
     return builder.build();
   }
 
+  /**
+   * Forwards {@param changeSequenceNum} to {@link #messageFromGenericRecord(Descriptor,
+   * GenericRecord, String, String)} via {@link Long#toHexString}.
+   */
+  public static DynamicMessage messageFromGenericRecord(
+      Descriptor descriptor,
+      GenericRecord record,
+      @Nullable String changeType,
+      long changeSequenceNum) {
+    return messageFromGenericRecord(
+        descriptor, record, changeType, Long.toHexString(changeSequenceNum));
+  }
+
   /**
    * Given an Avro {@link GenericRecord} object, returns a protocol-buffer message that can be used
    * to write data using the BigQuery Storage streaming API.
@@ -174,7 +187,7 @@ public class AvroGenericRecordToStorageApiProto {
       Descriptor descriptor,
       GenericRecord record,
       @Nullable String changeType,
-      long changeSequenceNum) {
+      @Nullable String changeSequenceNum) {
     Schema schema = record.getSchema();
     DynamicMessage.Builder builder = DynamicMessage.newBuilder(descriptor);
     for (Schema.Field field : schema.getFields()) {
@@ -195,7 +208,7 @@ public class AvroGenericRecordToStorageApiProto {
       builder.setField(
           org.apache.beam.sdk.util.Preconditions.checkStateNotNull(
               descriptor.findFieldByName(StorageApiCDC.CHANGE_SQN_COLUMN)),
-          changeSequenceNum);
+          org.apache.beam.sdk.util.Preconditions.checkStateNotNull(changeSequenceNum));
     }
     return builder.build();
   }
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BeamRowToStorageApiProto.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BeamRowToStorageApiProto.java
index 
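
The new `long` overload in AvroGenericRecordToStorageApiProto above forwards its sequence number through `Long#toHexString`. A minimal standalone sketch of that conversion step follows; the class and helper names are hypothetical and are not part of the commit, and only `Long.toHexString` is taken from the diff.

```java
public class ChangeSequenceNumberSketch {
  // Hypothetical helper: mirrors only the conversion the overload performs
  // before delegating to the String-based method.
  public static String toChangeSequenceString(long changeSequenceNum) {
    // Long.toHexString renders the value as an unsigned, lowercase hex string
    // without leading zeros.
    return Long.toHexString(changeSequenceNum);
  }

  public static void main(String[] args) {
    System.out.println(toChangeSequenceString(255L));
    // Negative values are rendered in two's-complement form (16 hex digits).
    System.out.println(toChangeSequenceString(-1L));
  }
}
```

Because negative inputs become a 16-digit two's-complement string, a sentinel such as -1 no longer reads as "unset" after conversion, which is consistent with the AppendClientInfo hunk above passing `null` instead.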

(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 29ac24b3ead Updating config from bot
29ac24b3ead is described below

commit 29ac24b3ead8ebda57543247988481d004070cca
Author: github-actions 
AuthorDate: Wed May 29 17:34:49 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/reviewers-for-label-python.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-python.json 
b/scripts/ci/pr-bot/state/reviewers-for-label-python.json
index 4cf1f46248b..ceb48eef132 100644
--- a/scripts/ci/pr-bot/state/reviewers-for-label-python.json
+++ b/scripts/ci/pr-bot/state/reviewers-for-label-python.json
@@ -9,7 +9,7 @@
 "pabloem": 1681281324703,
 "y1chi": 1667002607045,
 "damccorm": 1716812030541,
-"jrmccluskey": 1716671130850,
+"jrmccluskey": 1717004086284,
 "riteshghorse": 1716859827701,
 "liferoad": 1716514219816,
 "shunping": 1716932066289
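
The numeric values in these reviewer state files appear to be Unix epoch timestamps in milliseconds; that reading is an assumption, though it agrees with the commit's AuthorDate. A small sketch for decoding one of them (the class and method names are illustrative only):

```java
import java.time.Instant;

public class ReviewAssignmentTimestampSketch {
  // Assumption: the JSON values are milliseconds since the Unix epoch (UTC).
  public static Instant toInstant(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis);
  }

  public static void main(String[] args) {
    // Value taken from the "jrmccluskey" entry in the diff above.
    System.out.println(toInstant(1717004086284L));
  }
}
```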



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 401b23bd72d Updating config from bot
401b23bd72d is described below

commit 401b23bd72dba460c503bf3a5277233adef658d0
Author: github-actions 
AuthorDate: Wed May 29 17:34:55 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/reviewers-for-label-python.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-python.json 
b/scripts/ci/pr-bot/state/reviewers-for-label-python.json
index ceb48eef132..ed5961bd871 100644
--- a/scripts/ci/pr-bot/state/reviewers-for-label-python.json
+++ b/scripts/ci/pr-bot/state/reviewers-for-label-python.json
@@ -5,7 +5,7 @@
 "yeandy": 1665802753763,
 "TheNeuralBit": 1667896849319,
 "ryanthompson591": 1670002443548,
-"tvalentyn": 1716812026358,
+"tvalentyn": 1717004092542,
 "pabloem": 1681281324703,
 "y1chi": 1667002607045,
 "damccorm": 1716812030541,



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new f43d3de5d9e Updating config from bot
f43d3de5d9e is described below

commit f43d3de5d9ed58b73066cff05c35e443a25aa80a
Author: github-actions 
AuthorDate: Wed May 29 17:34:51 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/reviewers-for-label-io.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-io.json 
b/scripts/ci/pr-bot/state/reviewers-for-label-io.json
index 87e911904a4..77da0bdd7a3 100644
--- a/scripts/ci/pr-bot/state/reviewers-for-label-io.json
+++ b/scripts/ci/pr-bot/state/reviewers-for-label-io.json
@@ -4,7 +4,7 @@
 "chamikaramj": 1716671130850,
 "johnjcasey": 1716478475429,
 "pabloem": 1691787951165,
-"Abacn": 1716380039048,
+"Abacn": 1717004086284,
 "ahmedabu98": 1716836967201,
 "bvolpato": 1712595969392,
 "manavgarg": 1690826779210,



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new e9ffb5c1603 Updating config from bot
e9ffb5c1603 is described below

commit e9ffb5c1603287a47898def356593f700389b237
Author: github-actions 
AuthorDate: Wed May 29 17:34:47 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31444.json | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31444.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31444.json
new file mode 100644
index 000..75c264d0c37
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31444.json
@@ -0,0 +1,11 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {
+"python": "jrmccluskey",
+"io": "Abacn"
+  },
+  "nextAction": "Reviewers",
+  "stopReviewerNotifications": false,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 9e02dd5d399 Updating config from bot
9e02dd5d399 is described below

commit 9e02dd5d39997cd494b6aab43b085c35d57e8ef5
Author: github-actions 
AuthorDate: Wed May 29 17:34:53 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31443.json | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31443.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31443.json
new file mode 100644
index 000..75d3535cad2
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31443.json
@@ -0,0 +1,11 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {
+"python": "tvalentyn",
+"io": "johnjcasey"
+  },
+  "nextAction": "Reviewers",
+  "stopReviewerNotifications": false,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new dfc469c59a1 Updating config from bot
dfc469c59a1 is described below

commit dfc469c59a1f8c912672e2fb3434f69ec15d568d
Author: github-actions 
AuthorDate: Wed May 29 17:34:57 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/reviewers-for-label-io.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/reviewers-for-label-io.json 
b/scripts/ci/pr-bot/state/reviewers-for-label-io.json
index 77da0bdd7a3..82005472ffc 100644
--- a/scripts/ci/pr-bot/state/reviewers-for-label-io.json
+++ b/scripts/ci/pr-bot/state/reviewers-for-label-io.json
@@ -2,7 +2,7 @@
   "label": "io",
   "dateOfLastReviewAssignment": {
 "chamikaramj": 1716671130850,
-"johnjcasey": 1716478475429,
+"johnjcasey": 1717004092542,
 "pabloem": 1691787951165,
 "Abacn": 1717004086284,
 "ahmedabu98": 1716836967201,



(beam) branch master updated (b1a6eb06051 -> 8b33e1f65c3)

2024-05-29 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


from b1a6eb06051 [YAML] Fix simple YAML mappings type hinting (#31427)
 add 6842136e0c9 Add SDK capability to detect if the SDK Fn Harness data 
channel is busy.
 add ad841c6004f Regenerate Go protos.
 new 8b33e1f65c3 Merge pull request #31442 SDK protocol to detect if the 
SDK Fn Harness data channel is busy

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../beam/model/fn_execution/v1/beam_fn_api.proto   |5 +
 .../beam/model/pipeline/v1/beam_runner_api.proto   |5 +
 .../beam/model/fnexecution_v1/beam_fn_api.pb.go| 1582 ++
 .../beam/model/pipeline_v1/beam_runner_api.pb.go   | 3245 ++--
 .../model/pipeline_v1/external_transforms.pb.go|4 +
 5 files changed, 2592 insertions(+), 2249 deletions(-)



(beam) 01/01: Merge pull request #31442 SDK protocol to detect if the SDK Fn Harness data channel is busy

2024-05-29 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 8b33e1f65c38d7650118858d632d393bda21329a
Merge: b1a6eb06051 ad841c6004f
Author: Robert Bradshaw 
AuthorDate: Wed May 29 10:28:36 2024 -0700

Merge pull request #31442 SDK protocol to detect if the SDK Fn Harness data 
channel is busy

 .../beam/model/fn_execution/v1/beam_fn_api.proto   |5 +
 .../beam/model/pipeline/v1/beam_runner_api.proto   |5 +
 .../beam/model/fnexecution_v1/beam_fn_api.pb.go| 1582 ++
 .../beam/model/pipeline_v1/beam_runner_api.pb.go   | 3245 ++--
 .../model/pipeline_v1/external_transforms.pb.go|4 +
 5 files changed, 2592 insertions(+), 2249 deletions(-)



(beam) branch master updated (49a4290426d -> b1a6eb06051)

2024-05-29 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


from 49a4290426d Add options to specify read and write http timeout for gcs 
as well as lower batching limit for rewrite operations which are copying. 
(#31410)
 add b1a6eb06051 [YAML] Fix simple YAML mappings type hinting (#31427)

No new revisions were added by this update.

Summary of changes:
 sdks/python/apache_beam/yaml/yaml_combine.py  |  2 +-
 sdks/python/apache_beam/yaml/yaml_mapping.py  | 23 ++---
 sdks/python/apache_beam/yaml/yaml_mapping_test.py | 30 +++
 3 files changed, 50 insertions(+), 5 deletions(-)



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 25d4a203ca2 Updating config from bot
25d4a203ca2 is described below

commit 25d4a203ca21d151f280d67fa57a1625c6bee530
Author: github-actions 
AuthorDate: Wed May 29 16:36:58 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31291.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31291.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31291.json
index b6d9bb10ea1..8c2fdb423a5 100644
--- a/scripts/ci/pr-bot/state/pr-state/pr-31291.json
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31291.json
@@ -6,5 +6,5 @@
   "nextAction": "Reviewers",
   "stopReviewerNotifications": false,
   "remindAfterTestsPass": [],
-  "committerAssigned": false
+  "committerAssigned": true
 }
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new fa3c053488d Updating config from bot
fa3c053488d is described below

commit fa3c053488d7d9028eb5b606d921cf9f600458e0
Author: github-actions 
AuthorDate: Wed May 29 16:04:06 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31392.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31392.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31392.json
index 242a48d7d3b..37f7ab41238 100644
--- a/scripts/ci/pr-bot/state/pr-state/pr-31392.json
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31392.json
@@ -2,7 +2,7 @@
   "commentedAboutFailingChecks": true,
   "reviewersAssignedForLabels": {},
   "nextAction": "Author",
-  "stopReviewerNotifications": false,
+  "stopReviewerNotifications": true,
   "remindAfterTestsPass": [],
   "committerAssigned": false
 }
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 5a58ecbeeec Updating config from bot
5a58ecbeeec is described below

commit 5a58ecbeeecc2f07e58106195d20df067a9efe6a
Author: github-actions 
AuthorDate: Wed May 29 15:12:25 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31291.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31291.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31291.json
index 62e3b33e44d..b6d9bb10ea1 100644
--- a/scripts/ci/pr-bot/state/pr-state/pr-31291.json
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31291.json
@@ -3,7 +3,7 @@
   "reviewersAssignedForLabels": {
 "python": "tvalentyn"
   },
-  "nextAction": "Author",
+  "nextAction": "Reviewers",
   "stopReviewerNotifications": false,
   "remindAfterTestsPass": [],
   "committerAssigned": false



(beam) branch master updated: Add options to specify read and write http timeout for gcs as well as lower batching limit for rewrite operations which are copying. (#31410)

2024-05-29 Thread scwhittle
This is an automated email from the ASF dual-hosted git repository.

scwhittle pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 49a4290426d Add options to specify read and write http timeout for gcs 
as well as lower batching limit for rewrite operations which are copying. 
(#31410)
49a4290426d is described below

commit 49a4290426d5cfad56b5d9977f26517de5036885
Author: Sam Whittle 
AuthorDate: Wed May 29 15:37:34 2024 +0200

Add options to specify read and write http timeout for gcs as well as lower 
batching limit for rewrite operations which are copying. (#31410)
---
 .../sdk/extensions/gcp/options/GcsOptions.java |  18 +++
 .../beam/sdk/extensions/gcp/util/GcsUtil.java  | 142 +
 .../gcp/util/RetryHttpRequestInitializer.java  |  15 ++-
 .../beam/sdk/extensions/gcp/util/Transport.java|  41 +++---
 .../beam/sdk/extensions/gcp/util/GcsUtilTest.java  |  83 +---
 5 files changed, 211 insertions(+), 88 deletions(-)

diff --git 
a/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java
 
b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java
index 3b2461dcb0e..175d8f58de4 100644
--- 
a/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java
+++ 
b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java
@@ -124,6 +124,24 @@ public interface GcsOptions extends 
ApplicationNameOptions, GcpOptions, Pipeline
 
   void setGcsPerformanceMetrics(Boolean reportPerformanceMetrics);
 
+  @Description("Read timeout for gcs http requests")
+  @Nullable
+  Integer getGcsHttpRequestReadTimeout();
+
+  void setGcsHttpRequestReadTimeout(@Nullable Integer timeoutMs);
+
+  @Description("Write timeout for gcs http requests.")
+  @Nullable
+  Integer getGcsHttpRequestWriteTimeout();
+
+  void setGcsHttpRequestWriteTimeout(@Nullable Integer timeoutMs);
+
+  @Description("Batching limit for rewrite ops which will copy data.")
+  @Nullable
+  Integer getGcsRewriteDataOpBatchLimit();
+
+  void setGcsRewriteDataOpBatchLimit(@Nullable Integer timeoutMs);
+
   /**
* Returns the default {@link ExecutorService} to use within the Apache Beam 
SDK. The {@link
* ExecutorService} is compatible with AppEngine.
diff --git 
a/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
 
b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
index 60e8443d264..9e790002ecd 100644
--- 
a/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
+++ 
b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
@@ -86,6 +86,7 @@ import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.util.FluentBackoff;
 import org.apache.beam.sdk.util.MoreFutures;
 import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.annotations.VisibleForTesting;
+import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions;
 import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.ImmutableList;
 import 
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.Lists;
 import org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.Sets;
@@ -123,7 +124,8 @@ public class GcsUtil {
   gcsOptions.getExecutorService(),
   hasExperiment(options, "use_grpc_for_gcs"),
   gcsOptions.getGcpCredential(),
-  gcsOptions.getGcsUploadBufferSizeBytes());
+  gcsOptions.getGcsUploadBufferSizeBytes(),
+  gcsOptions.getGcsRewriteDataOpBatchLimit());
 }
 
 /** Returns an instance of {@link GcsUtil} based on the given parameters. 
*/
@@ -140,7 +142,8 @@ public class GcsUtil {
   executorService,
   hasExperiment(options, "use_grpc_for_gcs"),
   credentials,
-  uploadBufferSizeBytes);
+  uploadBufferSizeBytes,
+  null);
 }
   }
 
@@ -154,6 +157,8 @@ public class GcsUtil {
 
   /** Maximum number of requests permitted in a GCS batch request. */
   private static final int MAX_REQUESTS_PER_BATCH = 100;
+  /** Default maximum number of requests permitted in a GCS batch request 
where data is copied. */
+  private static final int MAX_REQUESTS_PER_COPY_BATCH = 10;
   /** Maximum number of concurrent batches of requests executing on GCS. */
   private static final int MAX_CONCURRENT_BATCHES = 256;
 
@@ -179,11 +184,13 @@ public class GcsUtil {
   // Exposed for testing.
   final ExecutorService executorService;
 
-  private Credentials 
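
The diff is cut off here, so how GcsUtil applies the new limit is not shown. As a rough sketch of the idea described in the commit message (a lower batching limit for rewrite operations that copy data), the following standalone code falls back to the default of 10 when the option is unset and splits a list of operations into batches of at most that size. The class, the method names, and the range check are assumptions for illustration; only the two constants come from the diff.

```java
import java.util.ArrayList;
import java.util.List;

public class CopyBatchLimitSketch {
  // Constants as they appear in the GcsUtil diff.
  static final int MAX_REQUESTS_PER_BATCH = 100;
  static final int MAX_REQUESTS_PER_COPY_BATCH = 10;

  // Hypothetical: resolve a nullable option value to an effective limit.
  // The range check is an assumption, not taken from the commit.
  public static int resolveCopyBatchLimit(Integer configuredLimit) {
    if (configuredLimit == null) {
      return MAX_REQUESTS_PER_COPY_BATCH;
    }
    if (configuredLimit <= 0 || configuredLimit > MAX_REQUESTS_PER_BATCH) {
      throw new IllegalArgumentException(
          "Batch limit must be in [1, " + MAX_REQUESTS_PER_BATCH + "]: " + configuredLimit);
    }
    return configuredLimit;
  }

  // Split operations into consecutive batches of at most `limit` elements.
  public static <T> List<List<T>> partition(List<T> ops, int limit) {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < ops.size(); i += limit) {
      batches.add(new ArrayList<>(ops.subList(i, Math.min(i + limit, ops.size()))));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Integer> ops = new ArrayList<>();
    for (int i = 0; i < 25; i++) {
      ops.add(i);
    }
    int limit = resolveCopyBatchLimit(null);
    for (List<Integer> batch : partition(ops, limit)) {
      System.out.println(batch.size());
    }
  }
}
```

Smaller batches mean each batch request carries fewer data-copying rewrites, trading more round trips for a lower chance that a single batch exceeds the HTTP timeout.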

(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new 96eb9cc94b3 Updating config from bot
96eb9cc94b3 is described below

commit 96eb9cc94b3ae42a45ddd49a1276355990887148
Author: github-actions 
AuthorDate: Wed May 29 12:31:43 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31436.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31436.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31436.json
new file mode 100644
index 000..9c2aa5aa212
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31436.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": false,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": true,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file



(beam) branch pr-bot-state updated: Updating config from bot

2024-05-29 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch pr-bot-state
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/pr-bot-state by this push:
 new df6cc49a199 Updating config from bot
df6cc49a199 is described below

commit df6cc49a19959b24b3d165f4bcff37bd939661bb
Author: github-actions 
AuthorDate: Wed May 29 07:06:04 2024 +

Updating config from bot
---
 scripts/ci/pr-bot/state/pr-state/pr-31435.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/ci/pr-bot/state/pr-state/pr-31435.json 
b/scripts/ci/pr-bot/state/pr-state/pr-31435.json
new file mode 100644
index 000..242a48d7d3b
--- /dev/null
+++ b/scripts/ci/pr-bot/state/pr-state/pr-31435.json
@@ -0,0 +1,8 @@
+{
+  "commentedAboutFailingChecks": true,
+  "reviewersAssignedForLabels": {},
+  "nextAction": "Author",
+  "stopReviewerNotifications": false,
+  "remindAfterTestsPass": [],
+  "committerAssigned": false
+}
\ No newline at end of file