[beam] branch nightly-refs/heads/master updated (11b1a70 -> 8724555)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch nightly-refs/heads/master in repository https://gitbox.apache.org/repos/asf/beam.git. from 11b1a70 [BEAM-13399] Move service liveness polling to Runner type (#16487) add 7f5abe8 [BEAM-13480] Increase pipeline timeout for PubSubIntegrationTest.test_streaming_data_only (#16496) add 814a10b Stronger typing inference for CoGBK. (#16465) add 5ab52a3 [BEAM-12464] Change ProtoSchemaTranslator beam schema creation to match the order for protobufs containing Oneof fields (#14974) add ad7a350 Introduce the notion of a JoinIndex for fewer shuffles. (#16101) add bd22c65 Merge pull request #16467 from [BEAM-12164]: SpannerIO DetectNewPartitions SDF add 7deb105 Merge pull request #16385 from [BEAM-13535] [Playground] add cancel execution button add 0a220a1 Merge pull request #16485 from [BEAM-13486] [Playground] For unit tests (java) if one of tests fails the output goes to stdOutput add cb08343 python sdk examples: Fixed typo in wordcount example. add 8724555 Merge pull request #16413 from blais/master No new revisions were added by this update. Summary of changes: playground/frontend/lib/l10n/app_en.arb| 12 + .../lib/modules/editor/components/run_button.dart | 20 +- .../code_repository/code_client/code_client.dart | 2 + .../code_client/grpc_code_client.dart | 6 + .../code_repository/code_repository.dart | 32 ++- .../code_repository/run_code_result.dart | 9 +- .../components/close_listener.dart}| 32 ++- .../close_listener_nonweb.dart}| 15 +- .../components/editor_textarea_wrapper.dart| 15 +- .../lib/pages/playground/playground_page.dart | 88 +++ .../pages/playground/states/playground_state.dart | 19 +- .../code_repository/code_repository_test.dart | 13 +- .../code_repository_test.mocks.dart| 5 + .../extensions/protobuf/ProtoSchemaTranslator.java | 26 +- .../protobuf/ProtoDynamicMessageSchemaTest.java| 86 +++ .../protobuf/ProtoMessageSchemaTest.java | 46 .../protobuf/ProtoSchemaTranslatorTest.java| 14 ++ .../sdk/extensions/protobuf/TestProtoSchemas.java | 125 +- .../src/test/proto/proto3_schema_messages.proto| 28 +++ .../spanner/changestreams/ChangeStreamMetrics.java | 154 .../dofn/DetectNewPartitionsDoFn.java | 264 + .../spanner/changestreams/dofn}/package-info.java | 4 +- sdks/python/apache_beam/dataframe/expressions.py | 1 + sdks/python/apache_beam/dataframe/frame_base.py| 2 +- sdks/python/apache_beam/dataframe/frames.py| 2 +- sdks/python/apache_beam/dataframe/partitionings.py | 57 - .../apache_beam/dataframe/partitionings_test.py| 11 +- sdks/python/apache_beam/dataframe/transforms.py| 79 -- .../apache_beam/dataframe/transforms_test.py | 40 sdks/python/apache_beam/examples/wordcount.py | 2 +- .../apache_beam/io/gcp/pubsub_integration_test.py | 4 +- sdks/python/apache_beam/metrics/metric.py | 5 +- sdks/python/apache_beam/runners/common.py | 3 +- .../runners/direct/direct_runner_test.py | 3 +- .../runners/portability/fn_api_runner/fn_runner.py | 13 +- .../portability/fn_api_runner/fn_runner_test.py| 10 +- .../runners/portability/portable_metrics.py| 4 - sdks/python/apache_beam/transforms/util.py | 48 +++- sdks/python/apache_beam/typehints/typehints.py | 6 +- .../python/apache_beam/typehints/typehints_test.py | 1 + 40 files changed, 1155 insertions(+), 151 deletions(-) copy playground/frontend/lib/pages/{embedded_playground/components/embedded_editor.dart => playground/components/close_listener.dart} (62%) copy playground/frontend/lib/pages/playground/{states/feedback_state.dart => components/close_listener_nonweb.dart} (80%) create mode 100644 sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/dofn/DetectNewPartitionsDoFn.java copy sdks/java/{extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/example => io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/dofn}/package-info.java (85%)
[beam] branch master updated: python sdk examples: Fixed typo in wordcount example.
This is an automated email from the ASF dual-hosted git repository. altay pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new cb08343 python sdk examples: Fixed typo in wordcount example. new 8724555 Merge pull request #16413 from blais/master cb08343 is described below commit cb083434de6be2b623bf290c69a880c2e856cfed Author: blais AuthorDate: Sat Jan 1 23:11:19 2022 -0500 python sdk examples: Fixed typo in wordcount example. --- sdks/python/apache_beam/examples/wordcount.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sdks/python/apache_beam/examples/wordcount.py b/sdks/python/apache_beam/examples/wordcount.py index b59baa6..12ba343 100644 --- a/sdks/python/apache_beam/examples/wordcount.py +++ b/sdks/python/apache_beam/examples/wordcount.py @@ -75,7 +75,7 @@ def run(argv=None, save_main_session=True): counts = ( lines | 'Split' >> (beam.ParDo(WordExtractingDoFn()).with_output_types(str)) -| 'PairWIthOne' >> beam.Map(lambda x: (x, 1)) +| 'PairWithOne' >> beam.Map(lambda x: (x, 1)) | 'GroupAndSum' >> beam.CombinePerKey(sum)) # Format the counts into a PCollection of strings.
[beam] branch master updated (7deb105 -> 0a220a1)
This is an automated email from the ASF dual-hosted git repository. pabloem pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 7deb105 Merge pull request #16385 from [BEAM-13535] [Playground] add cancel execution button add 0a220a1 Merge pull request #16485 from [BEAM-13486] [Playground] For unit tests (java) if one of tests fails the output goes to stdOutput No new revisions were added by this update. Summary of changes: .../repository/code_repository/code_repository.dart | 17 +++-- .../code_repository/code_repository_test.dart | 7 +-- 2 files changed, 20 insertions(+), 4 deletions(-)
[beam] branch master updated (bd22c65 -> 7deb105)
This is an automated email from the ASF dual-hosted git repository. pabloem pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from bd22c65 Merge pull request #16467 from [BEAM-12164]: SpannerIO DetectNewPartitions SDF add 7deb105 Merge pull request #16385 from [BEAM-13535] [Playground] add cancel execution button No new revisions were added by this update. Summary of changes: playground/frontend/lib/l10n/app_en.arb| 12 +++ .../lib/modules/editor/components/run_button.dart | 20 +++-- .../code_repository/code_client/code_client.dart | 2 + .../code_client/grpc_code_client.dart | 6 ++ .../code_repository/code_repository.dart | 15 +++- .../code_repository/run_code_result.dart | 9 ++- .../components/close_listener.dart}| 32 +--- .../close_listener_nonweb.dart}| 15 ++-- .../components/editor_textarea_wrapper.dart| 15 +++- .../lib/pages/playground/playground_page.dart | 88 +++--- .../pages/playground/states/playground_state.dart | 19 - .../code_repository/code_repository_test.dart | 6 ++ .../code_repository_test.mocks.dart| 5 ++ 13 files changed, 169 insertions(+), 75 deletions(-) copy playground/frontend/lib/pages/{embedded_playground/components/embedded_editor.dart => playground/components/close_listener.dart} (62%) copy playground/frontend/lib/pages/playground/{states/feedback_state.dart => components/close_listener_nonweb.dart} (80%)
[beam] branch master updated: Merge pull request #16467 from [BEAM-12164]: SpannerIO DetectNewPartitions SDF
This is an automated email from the ASF dual-hosted git repository. pabloem pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new bd22c65 Merge pull request #16467 from [BEAM-12164]: SpannerIO DetectNewPartitions SDF bd22c65 is described below commit bd22c65ee2788d2635ad76334d4d4c1cb653f535 Author: Thiago Nunes AuthorDate: Fri Jan 14 06:50:42 2022 +1100 Merge pull request #16467 from [BEAM-12164]: SpannerIO DetectNewPartitions SDF * [BEAM-12164]: SpannerIO DetectNewPartitions SDF Adds the DetectNewPartitions SDF. This component will be responsible for: - Emitting a watermark based on the min of all unfinished partitions in the metadata table. - Querying all partitions in the CREATED state. - Updating the created partitions to SCHEDULED state. - Emitting the scheduled partitions to the PCollection. This SDF will run periodically as based on the configured resume interval (default is 100ms, best effort). * chore: fix linting violations Co-authored-by: Hengfeng Li --- .../spanner/changestreams/ChangeStreamMetrics.java | 154 .../dofn/DetectNewPartitionsDoFn.java | 264 + .../package-info.java} | 17 +- 3 files changed, 422 insertions(+), 13 deletions(-) diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/ChangeStreamMetrics.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/ChangeStreamMetrics.java index 8f1e0fc..1b9fb7c 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/ChangeStreamMetrics.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/ChangeStreamMetrics.java @@ -18,6 +18,14 @@ package org.apache.beam.sdk.io.gcp.spanner.changestreams; import java.io.Serializable; +import java.util.HashSet; +import java.util.Set; +import org.apache.beam.sdk.io.gcp.spanner.changestreams.model.PartitionMetadata.State; +import org.apache.beam.sdk.metrics.Counter; +import org.apache.beam.sdk.metrics.Distribution; +import org.apache.beam.sdk.metrics.MetricName; +import org.apache.beam.sdk.metrics.Metrics; +import org.joda.time.Duration; /** Class to aggregate metrics related functionality. */ public class ChangeStreamMetrics implements Serializable { @@ -29,4 +37,150 @@ public class ChangeStreamMetrics implements Serializable { /** Cloud Tracing label for Partition Tokens. */ public static final String PARTITION_ID_ATTRIBUTE_LABEL = "PartitionID"; + + // + // Partition record metrics + + /** + * Counter for the total number of partitions identified during the execution of the Connector. + */ + public static final Counter PARTITION_RECORD_COUNT = + Metrics.counter(ChangeStreamMetrics.class, "partition_record_count"); + + /** + * Counter for the total number of partition splits / moves identified during the execution of the + * Connector. + */ + public static final Counter PARTITION_RECORD_SPLIT_COUNT = + Metrics.counter(ChangeStreamMetrics.class, "partition_record_split_count"); + + /** + * Counter for the total number of partition merges identified during the execution of the + * Connector. + */ + public static final Counter PARTITION_RECORD_MERGE_COUNT = + Metrics.counter(ChangeStreamMetrics.class, "partition_record_merge_count"); + + /** + * Time in milliseconds that a partition took to transition from {@link State#CREATED} to {@link + * State#SCHEDULED}. + */ + public static final Distribution PARTITION_CREATED_TO_SCHEDULED_MS = + Metrics.distribution(ChangeStreamMetrics.class, "partition_created_to_scheduled_ms"); + + /** + * Time in milliseconds that a partition took to transition from {@link State#SCHEDULED} to {@link + * State#RUNNING}. + */ + public static final Distribution PARTITION_SCHEDULED_TO_RUNNING_MS = + Metrics.distribution(ChangeStreamMetrics.class, "partition_scheduled_to_running_ms"); + + // --- + // Data record metrics + + /** + * Counter for the total number of data records identified during the execution of the Connector. + */ + public static final Counter DATA_RECORD_COUNT = + Metrics.counter(ChangeStreamMetrics.class, "data_record_count"); + + // --- + // Hearbeat record metrics + + /** + * Counter for the total number of heartbeat records identified during the execution of the + * Connector. + */ + public static final Counter HEARTBEAT_RECORD_COUNT = + Metrics.counter(ChangeStreamMetrics.class, "heartbeat_record_count"); + + /** + * If a metric is not within this set it will not be measured.
[beam] branch master updated (5ab52a3 -> ad7a350)
This is an automated email from the ASF dual-hosted git repository. robertwb pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 5ab52a3 [BEAM-12464] Change ProtoSchemaTranslator beam schema creation to match the order for protobufs containing Oneof fields (#14974) add ad7a350 Introduce the notion of a JoinIndex for fewer shuffles. (#16101) No new revisions were added by this update. Summary of changes: sdks/python/apache_beam/dataframe/expressions.py | 1 + sdks/python/apache_beam/dataframe/frame_base.py| 2 +- sdks/python/apache_beam/dataframe/frames.py| 2 +- sdks/python/apache_beam/dataframe/partitionings.py | 57 +++- .../apache_beam/dataframe/partitionings_test.py| 11 ++- sdks/python/apache_beam/dataframe/transforms.py| 79 +++--- .../apache_beam/dataframe/transforms_test.py | 40 +++ sdks/python/apache_beam/metrics/metric.py | 5 +- .../runners/direct/direct_runner_test.py | 3 +- .../runners/portability/fn_api_runner/fn_runner.py | 13 +++- .../portability/fn_api_runner/fn_runner_test.py| 10 +-- .../runners/portability/portable_metrics.py| 4 -- 12 files changed, 187 insertions(+), 40 deletions(-)
[beam] branch master updated: [BEAM-12464] Change ProtoSchemaTranslator beam schema creation to match the order for protobufs containing Oneof fields (#14974)
This is an automated email from the ASF dual-hosted git repository. ibzib pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 5ab52a3 [BEAM-12464] Change ProtoSchemaTranslator beam schema creation to match the order for protobufs containing Oneof fields (#14974) 5ab52a3 is described below commit 5ab52a3f4cfe2680098186763550b5f8ad30319c Author: Reuben van Ammers AuthorDate: Fri Jan 14 06:13:00 2022 +1100 [BEAM-12464] Change ProtoSchemaTranslator beam schema creation to match the order for protobufs containing Oneof fields (#14974) * ProtoSchemaTranslator now orders oneof fields in the resultant beam schema in accordance with their location in the protobuf definition * add reverse order protobuf * add noncontiguous oneof and some renaming * Comments and variable renaming * add reversed row tests * add noncontiguous tests * remove redundant null check * minor test comment update * update * add reversedonof test * add noncontiguous oneof test Co-authored-by: Reuben van Ammers --- .../extensions/protobuf/ProtoSchemaTranslator.java | 26 - .../protobuf/ProtoDynamicMessageSchemaTest.java| 86 ++ .../protobuf/ProtoMessageSchemaTest.java | 46 .../protobuf/ProtoSchemaTranslatorTest.java| 14 +++ .../sdk/extensions/protobuf/TestProtoSchemas.java | 125 - .../src/test/proto/proto3_schema_messages.proto| 28 + 6 files changed, 314 insertions(+), 11 deletions(-) diff --git a/sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.java b/sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.java index 91eb1bd7..ef46b59 100644 --- a/sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.java +++ b/sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.java @@ -156,13 +156,18 @@ class ProtoSchemaTranslator { } static Schema getSchema(Descriptors.Descriptor descriptor) { -Set oneOfFields = Sets.newHashSet(); +/* OneOfComponentFields refers to the field number in the protobuf where the component subfields + * are. This is needed to prevent double inclusion of the component fields.*/ +Set oneOfComponentFields = Sets.newHashSet(); +/* OneOfFieldLocation stores the field number of the first field in the OneOf. Using this, we can use the location +of the first field in the OneOf as the location of the entire OneOf.*/ +Map oneOfFieldLocation = Maps.newHashMap(); List fields = Lists.newArrayListWithCapacity(descriptor.getFields().size()); for (OneofDescriptor oneofDescriptor : descriptor.getOneofs()) { List subFields = Lists.newArrayListWithCapacity(oneofDescriptor.getFieldCount()); Map enumIds = Maps.newHashMap(); for (FieldDescriptor fieldDescriptor : oneofDescriptor.getFields()) { -oneOfFields.add(fieldDescriptor.getNumber()); +oneOfComponentFields.add(fieldDescriptor.getNumber()); // Store proto field number in a field option. FieldType fieldType = beamFieldTypeFromProtoField(fieldDescriptor); subFields.add( @@ -172,17 +177,26 @@ class ProtoSchemaTranslator { enumIds.putIfAbsent(fieldDescriptor.getName(), fieldDescriptor.getNumber()) == null); } FieldType oneOfType = FieldType.logicalType(OneOfType.create(subFields, enumIds)); - fields.add(Field.of(oneofDescriptor.getName(), oneOfType)); + oneOfFieldLocation.put( + oneofDescriptor.getFields().get(0).getNumber(), + Field.of(oneofDescriptor.getName(), oneOfType)); } for (Descriptors.FieldDescriptor fieldDescriptor : descriptor.getFields()) { - if (!oneOfFields.contains(fieldDescriptor.getNumber())) { + int fieldDescriptorNumber = fieldDescriptor.getNumber(); + if (!oneOfComponentFields.contains(fieldDescriptorNumber)) { // Store proto field number in metadata. FieldType fieldType = beamFieldTypeFromProtoField(fieldDescriptor); fields.add( -withFieldNumber( -Field.of(fieldDescriptor.getName(), fieldType), fieldDescriptor.getNumber()) +withFieldNumber(Field.of(fieldDescriptor.getName(), fieldType), fieldDescriptorNumber) .withOptions(getFieldOptions(fieldDescriptor))); +/* Note that descriptor.getFields() returns an iterator in the order of the fields in the .proto file, not + * in field number order. Therefore we can safely insert the OneOfField at the field of its first component.*/ + } else { +Field oneOfField =
[beam] branch master updated (7f5abe8 -> 814a10b)
This is an automated email from the ASF dual-hosted git repository. robertwb pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 7f5abe8 [BEAM-13480] Increase pipeline timeout for PubSubIntegrationTest.test_streaming_data_only (#16496) add 814a10b Stronger typing inference for CoGBK. (#16465) No new revisions were added by this update. Summary of changes: sdks/python/apache_beam/runners/common.py | 3 +- sdks/python/apache_beam/transforms/util.py | 48 +- sdks/python/apache_beam/typehints/typehints.py | 6 +-- .../python/apache_beam/typehints/typehints_test.py | 1 + 4 files changed, 43 insertions(+), 15 deletions(-)
[beam] branch master updated (11b1a70 -> 7f5abe8)
This is an automated email from the ASF dual-hosted git repository. bhulette pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 11b1a70 [BEAM-13399] Move service liveness polling to Runner type (#16487) add 7f5abe8 [BEAM-13480] Increase pipeline timeout for PubSubIntegrationTest.test_streaming_data_only (#16496) No new revisions were added by this update. Summary of changes: sdks/python/apache_beam/io/gcp/pubsub_integration_test.py | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)