[beam] tag nightly-master updated (92ea5f5 -> 39cf3fc)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to tag nightly-master in repository https://gitbox.apache.org/repos/asf/beam.git. *** WARNING: tag nightly-master was modified! *** from 92ea5f5 (commit) to 39cf3fc (commit) from 92ea5f5 [BEAM-11088] Add TestStream package to Go SDK testing capabilities (#15253) add 8b6f675 Merge pull request #15272:[BEAM-12712] Exclude runners that can't handle looping timers add 3a84a78 [BEAM-12703] Fix universal metrics. (#15260) add ec62170 [BEAM-12702] Pull step unique names from pipeline for metrics. (#15261) add e11fbba fix GroupIntoBatches add 03a1cca Merge pull request #15273: [BEAM-12720] fix GroupIntoBatches incorrectly clearing variables after outputting them add 74963dc [BEAM-12715] Use shard number specified by user in SnowflakeIO batch writeFiles add 3898642 Merge pull request #15255: [BEAM-12715] Snowflake IO use getShards in batch write mode add dbb44bf [BEAM-12678] Add dependency of java jars when running go VR on portable runners (#15276) add 64188aa [BEAM-12671] Mark known composite transforms native (#15236) add 2c7911b [BEAM-12670] Relocate bq client exception imports to try block and conditionally turn off tests if imports fail add 39cf3fc Merge pull request #15264 from [BEAM-12670] Relocate bq client exception imports to try block and conditionally turn off tests if imports fail No new revisions were added by this update. Summary of changes: CHANGES.md | 1 + runners/direct-java/build.gradle | 2 + runners/flink/flink_runner.gradle | 1 + .../runners/dataflow/GroupIntoBatchesOverride.java | 4 +- runners/samza/build.gradle | 1 + runners/samza/job-server/build.gradle | 5 +- .../beam/runners/samza/SamzaPipelineRunner.java| 17 ++- .../SamzaPortablePipelineTranslator.java | 27 +++- java => SamzaPortableTranslatorRegistrar.java} | 4 +- sdks/go/pkg/beam/core/metrics/metrics.go | 54 ++-- sdks/go/pkg/beam/core/metrics/metrics_test.go | 144 + .../beam/runners/dataflow/dataflowlib/execute.go | 7 +- .../beam/runners/dataflow/dataflowlib/metrics.go | 22 ++-- .../runners/dataflow/dataflowlib/metrics_test.go | 34 ++--- sdks/go/test/build.gradle | 7 + .../apache/beam/sdk/io/snowflake/SnowflakeIO.java | 3 +- sdks/python/apache_beam/io/gcp/bigquery_tools.py | 3 +- .../apache_beam/io/gcp/bigquery_tools_test.py | 11 +- .../runners/portability/samza_runner_test.py | 6 + 19 files changed, 293 insertions(+), 60 deletions(-) copy runners/samza/src/main/java/org/apache/beam/runners/samza/translation/{SamzaTranslatorRegistrar.java => SamzaPortableTranslatorRegistrar.java} (89%)
[beam] branch release-2.32.0 updated: [BEAM-12678] Add dependency of java jars when running go VR on portable runners (#15276) (#15279)
This is an automated email from the ASF dual-hosted git repository. goenka pushed a commit to branch release-2.32.0 in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/release-2.32.0 by this push: new c0cd295 [BEAM-12678] Add dependency of java jars when running go VR on portable runners (#15276) (#15279) c0cd295 is described below commit c0cd29572c2ce1490d0519fd6d5946ef46cf28fe Author: Ankur AuthorDate: Wed Aug 4 18:07:49 2021 -0700 [BEAM-12678] Add dependency of java jars when running go VR on portable runners (#15276) (#15279) --- sdks/go/test/build.gradle | 7 +++ 1 file changed, 7 insertions(+) diff --git a/sdks/go/test/build.gradle b/sdks/go/test/build.gradle index ebeb0ea..8e078fa 100644 --- a/sdks/go/test/build.gradle +++ b/sdks/go/test/build.gradle @@ -78,6 +78,8 @@ task dataflowValidatesRunner() { task flinkValidatesRunner { dependsOn ":sdks:go:test:goBuild" dependsOn ":sdks:go:container:docker" + dependsOn ":sdks:java:container:java8:docker" + dependsOn ":sdks:java:container:java11:docker" dependsOn ":runners:flink:${project.ext.latestFlinkVersion}:job-server:shadowJar" dependsOn ":sdks:java:testing:expansion-service:buildTestExpansionServiceJar" doLast { @@ -101,6 +103,8 @@ task flinkValidatesRunner { task samzaValidatesRunner { dependsOn ":sdks:go:test:goBuild" dependsOn ":sdks:go:container:docker" + dependsOn ":sdks:java:container:java8:docker" + dependsOn ":sdks:java:container:java11:docker" dependsOn ":runners:samza:job-server:shadowJar" dependsOn ":sdks:java:testing:expansion-service:buildTestExpansionServiceJar" doLast { @@ -123,6 +127,8 @@ task samzaValidatesRunner { // with Spark to validate that the runner behaves as expected. task sparkValidatesRunner { dependsOn ":sdks:go:test:goBuild" + dependsOn ":sdks:java:container:java8:docker" + dependsOn ":sdks:java:container:java11:docker" dependsOn ":runners:spark:2:job-server:shadowJar" dependsOn ":sdks:java:testing:expansion-service:buildTestExpansionServiceJar" doLast { @@ -151,6 +157,7 @@ task sparkValidatesRunner { task ulrValidatesRunner { dependsOn ":sdks:go:test:goBuild" dependsOn ":sdks:go:container:docker" + dependsOn ":sdks:java:container:java8:docker" dependsOn ":sdks:java:container:java11:docker" dependsOn "setupVirtualenv" dependsOn ":sdks:python:buildPython"
[beam] branch master updated: [BEAM-12670] Relocate bq client exception imports to try block and conditionally turn off tests if imports fail
This is an automated email from the ASF dual-hosted git repository. pabloem pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 2c7911b [BEAM-12670] Relocate bq client exception imports to try block and conditionally turn off tests if imports fail new 39cf3fc Merge pull request #15264 from [BEAM-12670] Relocate bq client exception imports to try block and conditionally turn off tests if imports fail 2c7911b is described below commit 2c7911b3268fff1aa1fe9329495d90052e0716a0 Author: Alex Amato AuthorDate: Mon Aug 2 17:53:47 2021 -0700 [BEAM-12670] Relocate bq client exception imports to try block and conditionally turn off tests if imports fail --- sdks/python/apache_beam/io/gcp/bigquery_tools.py | 3 +-- sdks/python/apache_beam/io/gcp/bigquery_tools_test.py | 11 --- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/sdks/python/apache_beam/io/gcp/bigquery_tools.py b/sdks/python/apache_beam/io/gcp/bigquery_tools.py index 114c42d..964b8f9 100644 --- a/sdks/python/apache_beam/io/gcp/bigquery_tools.py +++ b/sdks/python/apache_beam/io/gcp/bigquery_tools.py @@ -68,6 +68,7 @@ from apache_beam.utils.histogram import LinearBucket try: from apitools.base.py.transfer import Upload from apitools.base.py.exceptions import HttpError, HttpForbiddenError + from google.api_core.exceptions import ClientError, GoogleAPICallError from google.cloud import bigquery as gcp_bigquery except ImportError: gcp_bigquery = None @@ -617,8 +618,6 @@ class BigQueryWrapper(object): Docs for this BQ call: https://cloud.google.com/bigquery/docs/reference\ /rest/v2/tabledata/insertAll.""" -from google.api_core.exceptions import ClientError -from google.api_core.exceptions import GoogleAPICallError # The rows argument is a list of # bigquery.TableDataInsertAllRequest.RowsValueListEntry instances as # required by the InsertAll() method. diff --git a/sdks/python/apache_beam/io/gcp/bigquery_tools_test.py b/sdks/python/apache_beam/io/gcp/bigquery_tools_test.py index 1252ffb..84ac2f7 100644 --- a/sdks/python/apache_beam/io/gcp/bigquery_tools_test.py +++ b/sdks/python/apache_beam/io/gcp/bigquery_tools_test.py @@ -54,9 +54,14 @@ from apache_beam.options.value_provider import StaticValueProvider # pylint: disable=wrong-import-order, wrong-import-position try: from apitools.base.py.exceptions import HttpError, HttpForbiddenError + from google.api_core.exceptions import ClientError, DeadlineExceeded + from google.api_core.exceptions import InternalServerError except ImportError: + ClientError = None + DeadlineExceeded = None HttpError = None HttpForbiddenError = None + InternalServerError = None # pylint: enable=wrong-import-order, wrong-import-position @@ -459,15 +464,15 @@ class TestBigQueryWrapper(unittest.TestCase): self.assertTrue( found, "Did not find write call metric with status: %s" % status) + @unittest.skipIf(ClientError is None, 'GCP dependencies are not installed') def test_insert_rows_sets_metric_on_failure(self): -from google.api_core import exceptions MetricsEnvironment.process_wide_container().reset() client = mock.Mock() client.insert_rows_json = mock.Mock( # Fail a few times, then succeed. side_effect=[ -exceptions.DeadlineExceeded("Deadline Exceeded"), -exceptions.InternalServerError("Internal Error"), +DeadlineExceeded("Deadline Exceeded"), +InternalServerError("Internal Error"), [], ]) wrapper = beam.io.gcp.bigquery_tools.BigQueryWrapper(client)
[beam] branch master updated: [BEAM-12671] Mark known composite transforms native (#15236)
This is an automated email from the ASF dual-hosted git repository. xinyu pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 64188aa [BEAM-12671] Mark known composite transforms native (#15236) 64188aa is described below commit 64188aa8188de5a9e970fe456b9cf5187790613e Author: Ke Wu AuthorDate: Wed Aug 4 11:32:03 2021 -0700 [BEAM-12671] Mark known composite transforms native (#15236) --- runners/samza/job-server/build.gradle | 4 +++- .../beam/runners/samza/SamzaPipelineRunner.java| 17 -- .../SamzaPortablePipelineTranslator.java | 27 +- .../SamzaPortableTranslatorRegistrar.java | 25 .../runners/portability/samza_runner_test.py | 6 + 5 files changed, 75 insertions(+), 4 deletions(-) diff --git a/runners/samza/job-server/build.gradle b/runners/samza/job-server/build.gradle index bef6535..a8086c6 100644 --- a/runners/samza/job-server/build.gradle +++ b/runners/samza/job-server/build.gradle @@ -76,7 +76,7 @@ createPortableValidatesRunnerTask( excludeCategories 'org.apache.beam.sdk.testing.UsesTimersInParDo' // TODO: BEAM-12350 excludeCategories 'org.apache.beam.sdk.testing.UsesAttemptedMetrics' - +// TODO: BEAM-12681 excludeCategories 'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders' // Larger keys are possible, but they require more memory. excludeCategories 'org.apache.beam.sdk.testing.LargeKeys$Above10MB' @@ -101,6 +101,8 @@ createPortableValidatesRunnerTask( excludeCategories 'org.apache.beam.sdk.testing.UsesLoopingTimer' }, testFilter: { +// TODO(BEAM-12677) +excludeTestsMatching "org.apache.beam.sdk.transforms.FlattenTest.testFlattenWithDifferentInputAndOutputCoders2" excludeTestsMatching "org.apache.beam.sdk.transforms.FlattenTest.testEmptyFlattenAsSideInput" excludeTestsMatching "org.apache.beam.sdk.transforms.FlattenTest.testFlattenPCollectionsEmptyThenParDo" excludeTestsMatching "org.apache.beam.sdk.transforms.FlattenTest.testFlattenPCollectionsEmpty" diff --git a/runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineRunner.java b/runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineRunner.java index ef58588..bbdd1f5 100644 --- a/runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineRunner.java +++ b/runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineRunner.java @@ -20,13 +20,16 @@ package org.apache.beam.runners.samza; import org.apache.beam.model.pipeline.v1.RunnerApi; import org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline; import org.apache.beam.runners.core.construction.PTransformTranslation; +import org.apache.beam.runners.core.construction.graph.ExecutableStage; import org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser; import org.apache.beam.runners.core.construction.graph.ProtoOverrides; import org.apache.beam.runners.core.construction.graph.SplittableParDoExpander; +import org.apache.beam.runners.core.construction.graph.TrivialNativeTransformExpander; import org.apache.beam.runners.core.construction.renderer.PipelineDotRenderer; import org.apache.beam.runners.fnexecution.provisioning.JobInfo; import org.apache.beam.runners.jobsubmission.PortablePipelineResult; import org.apache.beam.runners.jobsubmission.PortablePipelineRunner; +import org.apache.beam.runners.samza.translation.SamzaPortablePipelineTranslator; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -46,9 +49,19 @@ public class SamzaPipelineRunner implements PortablePipelineRunner { pipeline, SplittableParDoExpander.createSizedReplacement()); +// Don't let the fuser fuse any subcomponents of native transforms. +Pipeline trimmedPipeline = +TrivialNativeTransformExpander.forKnownUrns( +pipelineWithSdfExpanded, SamzaPortablePipelineTranslator.knownUrns()); + // Fused pipeline proto. -final RunnerApi.Pipeline fusedPipeline = -GreedyPipelineFuser.fuse(pipelineWithSdfExpanded).toPipeline(); +// TODO: Consider supporting partially-fused graphs. +RunnerApi.Pipeline fusedPipeline = +trimmedPipeline.getComponents().getTransformsMap().values().stream() +.anyMatch(proto -> ExecutableStage.URN.equals(proto.getSpec().getUrn())) +? trimmedPipeline +: GreedyPipelineFuser.fuse(trimmedPipeline).toPipeline(); + LOG.info("Portable pipeline to run:"); LOG.info(PipelineDotRenderer.toDotString(fusedPipeline)); // the pipeline option coming from sdk will set the sdk specific runner which will break diff --git a/runners/samza/src/main/java/org/apache/beam/ru
[beam] branch master updated (3898642 -> dbb44bf)
This is an automated email from the ASF dual-hosted git repository. goenka pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 3898642 Merge pull request #15255: [BEAM-12715] Snowflake IO use getShards in batch write mode add dbb44bf [BEAM-12678] Add dependency of java jars when running go VR on portable runners (#15276) No new revisions were added by this update. Summary of changes: sdks/go/test/build.gradle | 7 +++ 1 file changed, 7 insertions(+)
[beam] 01/01: Merge pull request #15255: [BEAM-12715] Snowflake IO use getShards in batch write mode
This is an automated email from the ASF dual-hosted git repository. aromanenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 3898642abbf3017e947b5f5b38c2156b2cb99459 Merge: 03a1cca 74963dc Author: Alexey Romanenko <33895511+aromanenko-...@users.noreply.github.com> AuthorDate: Wed Aug 4 17:59:50 2021 +0200 Merge pull request #15255: [BEAM-12715] Snowflake IO use getShards in batch write mode CHANGES.md | 1 + .../src/main/java/org/apache/beam/sdk/io/snowflake/SnowflakeIO.java| 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-)
[beam] branch master updated (03a1cca -> 3898642)
This is an automated email from the ASF dual-hosted git repository. aromanenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 03a1cca Merge pull request #15273: [BEAM-12720] fix GroupIntoBatches incorrectly clearing variables after outputting them add 74963dc [BEAM-12715] Use shard number specified by user in SnowflakeIO batch writeFiles new 3898642 Merge pull request #15255: [BEAM-12715] Snowflake IO use getShards in batch write mode The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGES.md | 1 + .../src/main/java/org/apache/beam/sdk/io/snowflake/SnowflakeIO.java| 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-)
[beam] branch master updated (ec62170 -> 03a1cca)
This is an automated email from the ASF dual-hosted git repository. reuvenlax pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from ec62170 [BEAM-12702] Pull step unique names from pipeline for metrics. (#15261) add e11fbba fix GroupIntoBatches add 03a1cca Merge pull request #15273: [BEAM-12720] fix GroupIntoBatches incorrectly clearing variables after outputting them No new revisions were added by this update. Summary of changes: .../org/apache/beam/runners/dataflow/GroupIntoBatchesOverride.java| 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
[beam] branch master updated (3a84a78 -> ec62170)
This is an automated email from the ASF dual-hosted git repository. lostluck pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 3a84a78 [BEAM-12703] Fix universal metrics. (#15260) add ec62170 [BEAM-12702] Pull step unique names from pipeline for metrics. (#15261) No new revisions were added by this update. Summary of changes: .../beam/runners/dataflow/dataflowlib/execute.go | 7 ++--- .../beam/runners/dataflow/dataflowlib/metrics.go | 22 +++--- .../runners/dataflow/dataflowlib/metrics_test.go | 34 +- 3 files changed, 27 insertions(+), 36 deletions(-)
[beam] branch master updated (8b6f675 -> 3a84a78)
This is an automated email from the ASF dual-hosted git repository. lostluck pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 8b6f675 Merge pull request #15272:[BEAM-12712] Exclude runners that can't handle looping timers add 3a84a78 [BEAM-12703] Fix universal metrics. (#15260) No new revisions were added by this update. Summary of changes: sdks/go/pkg/beam/core/metrics/metrics.go | 54 -- sdks/go/pkg/beam/core/metrics/metrics_test.go | 144 ++ 2 files changed, 189 insertions(+), 9 deletions(-)