[beam] branch master updated (10e3e57 -> 98747fd)

2021-08-05 Thread tvalentyn
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 10e3e57  Merge pull request #15288 from andyxiexu/java-sdk
 add 459654c  bump FnAPI container
 new 98747fd  Bump Python FnAPI beam-master container #15283

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/runners/dataflow/internal/names.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[beam] 01/01: Bump Python FnAPI beam-master container #15283

2021-08-05 Thread tvalentyn
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 98747fd3e46f0d2b970e2d8f81f62756fa288ccc
Merge: 10e3e57 459654c
Author: tvalentyn 
AuthorDate: Thu Aug 5 23:48:57 2021 -0700

Bump Python FnAPI beam-master container #15283

 sdks/python/apache_beam/runners/dataflow/internal/names.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[beam] tag nightly-master updated (39cf3fc -> 10e3e57)

2021-08-05 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to tag nightly-master
in repository https://gitbox.apache.org/repos/asf/beam.git.


*** WARNING: tag nightly-master was modified! ***

from 39cf3fc  (commit)
  to 10e3e57  (commit)
from 39cf3fc  Merge pull request #15264 from [BEAM-12670] Relocate bq 
client exception imports to try block and conditionally turn off tests if 
imports fail
 add c4508c8  [BEAM-12591] Put Spark Structured Streaming runner sources 
back to main src folder
 add 473d187  [BEAM-12629] As spark DataSourceV2 is only available for 
spark 2, provide a DataSourceV2 based impl for spark 2 and create a structure 
for extension with a spark 3 source.
 add ad6bea8  [BEAM-12627] Deal with spark Encoders braking change between 
spark 2 and spark 3 by providing an implementation for each of them.
 add f0014d9  [BEAM-12591] move SchemaHelpers to correct package
 add fd9bb74  [BEAM-8470] Disable wait for termination in a streaming 
pipeline because it is infinite by definition
 add 94ce5d3  [BEAM-12630] Deal with breaking change in streaming pipelines 
start by introducing an AbstractTranslationContext and version specific 
implementations
 add b8dc86c  [BEAM-12629] Make source tests spark version agnostic and 
move them back to common spark module
 add b1d5dc4  [BEAM-12629] Make a spark 3 source impl
 add 75247cb  [BEAM-12591] Fix checkstyle and spotless
 add e10b2eb  [BEAM-12629] Reduce serializable to only needed classes and 
Fix schema inference
 add cc3ff98  [BEAM-12591] Add checkstyle exceptions for version specific 
classes because checkstyle does not correctly detect package files across 
multiple source directories
 add 81033b1  [BEAM-12629] Fix sources javadocs and improve impl
 add 23fd65d  [BEAM-12591] Add spark 3 to structured streaming validates 
runner tests
 add 2144cab  Merge pull request #15218 from 
echauchot/BEAM-7093-spark3-fix-for-SS-runner
 add 291aa6c  [BEAM-6516] Fixes race condition in RabbitMqIO causing 
duplicate acks (#15157)
 add 7bfeca2  [BEAM-12601] Add append-only option (#15257)
 add 57b6d77  Revert "[BEAM-11934] Remove Dataflow override of streaming 
WriteFiles with runner determined sharding (#15178)"
 add f759a5c  Revert "[BEAM-11934] Remove Dataflow override of streaming 
WriteFiles with runner determined sharding"
 add 0ac5480  Add google cloud heap profiling support to beam java sdk 
container
 add 10e3e57  Merge pull request #15288 from andyxiexu/java-sdk

No new revisions were added by this update.

Summary of changes:
 ...ValidatesRunner_SparkStructuredStreaming.groovy |   1 +
 CHANGES.md |   2 +
 .../beam/runners/dataflow/DataflowRunner.java  |  73 +++
 .../beam/runners/dataflow/DataflowRunnerTest.java  | 120 +--
 .../translation/TranslationContext.java| 240 +
 .../translation/batch/DatasetSourceBatch.java  |   9 +-
 .../translation/helpers/EncoderFactory.java|  54 +
 .../streaming/DatasetSourceStreaming.java  |   9 +-
 .../translation/batch/SimpleSourceTest.java| 101 -
 .../translation/TranslationContext.java}   |  29 ++-
 .../translation/batch/DatasetSourceBatch.java  | 240 +
 .../translation/helpers/EncoderFactory.java|  49 +
 .../streaming/DatasetSourceStreaming.java} |   9 +-
 .../spark/structuredstreaming/Constants.java}  |   8 +-
 .../SparkStructuredStreamingPipelineOptions.java   |   0
 .../SparkStructuredStreamingPipelineResult.java|   0
 .../SparkStructuredStreamingRunner.java|   6 +-
 .../SparkStructuredStreamingRunnerRegistrar.java   |   0
 .../aggregators/AggregatorsAccumulator.java|   0
 .../aggregators/NamedAggregators.java  |   0
 .../aggregators/NamedAggregatorsAccumulator.java   |   0
 .../aggregators/package-info.java  |   0
 .../structuredstreaming/examples/WordCount.java|   0
 .../metrics/AggregatorMetric.java  |   0
 .../metrics/AggregatorMetricSource.java|   0
 .../metrics/CompositeSource.java   |   0
 .../metrics/MetricsAccumulator.java|   0
 .../MetricsContainerStepMapAccumulator.java|   0
 .../metrics/SparkBeamMetric.java   |   0
 .../metrics/SparkBeamMetricSource.java |   0
 .../metrics/SparkMetricsContainerStepMap.java  |   0
 .../metrics/WithMetricsSupport.java|   0
 .../structuredstreaming/metrics/package-info.java  |   0
 .../metrics/sink/CodahaleCsvSink.java  |   0
 .../metrics/sink/CodahaleGraphiteSink.java |   0
 .../metrics/sink/package-info.java |   0
 .../spark/structuredstreaming/package-info.java|   0
 .../translation/AbstractTranslationContext.java}   |  19 +-
 .../translation/PipelineTran

[beam] branch master updated: Add google cloud heap profiling support to beam java sdk container

2021-08-05 Thread yichi
This is an automated email from the ASF dual-hosted git repository.

yichi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 0ac5480  Add google cloud heap profiling support to beam java sdk 
container
 new 10e3e57  Merge pull request #15288 from andyxiexu/java-sdk
0ac5480 is described below

commit 0ac54802ed07d962fde573208789d1106deb7de8
Author: Andy Xu 
AuthorDate: Thu Aug 5 13:37:05 2021 -0700

Add google cloud heap profiling support to beam java sdk container
---
 sdks/java/container/boot.go | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/sdks/java/container/boot.go b/sdks/java/container/boot.go
index 2c029e5..d0f108c 100644
--- a/sdks/java/container/boot.go
+++ b/sdks/java/container/boot.go
@@ -48,6 +48,13 @@ var (
semiPersistDir= flag.String("semi_persist_dir", "/tmp", "Local 
semi-persistent directory (optional).")
 )
 
+const (
+   enableGoogleCloudProfilerOption = "enable_google_cloud_profiler"
+   enableGoogleCloudHeapSamplingOption = 
"enable_google_cloud_heap_sampling"
+   googleCloudProfilerAgentBaseArgs= 
"-agentpath:/opt/google_cloud_profiler/profiler_java_agent.so=-logtostderr,-cprof_service=%s,-cprof_service_version=%s"
+   googleCloudProfilerAgentHeapArgs= googleCloudProfilerAgentBaseArgs 
+ ",-cprof_enable_heap_sampling,-cprof_heap_sampling_interval=2097152"
+)
+
 func main() {
flag.Parse()
if *id == "" {
@@ -151,12 +158,18 @@ func main() {
"-cp", strings.Join(cp, ":"),
}
 
-   enableGoogleCloudProfiler := strings.Contains(options, 
"enable_google_cloud_profiler")
+   enableGoogleCloudProfiler := strings.Contains(options, 
enableGoogleCloudProfilerOption)
+   enableGoogleCloudHeapSampling := strings.Contains(options, 
enableGoogleCloudHeapSamplingOption)
if enableGoogleCloudProfiler {
if metadata := info.GetMetadata(); metadata != nil {
if jobName, nameExists := metadata["job_name"]; 
nameExists {
if jobId, idExists := metadata["job_id"]; 
idExists {
-   args = append(args, 
fmt.Sprintf("-agentpath:/opt/google_cloud_profiler/profiler_java_agent.so=-cprof_service=%s,-cprof_service_version=%s",
 jobName, jobId))
+   if enableGoogleCloudHeapSampling {
+   args = append(args, 
fmt.Sprintf(googleCloudProfilerAgentHeapArgs, jobName, jobId))
+   } else {
+   args = append(args, 
fmt.Sprintf(googleCloudProfilerAgentBaseArgs, jobName, jobId))
+   }
+   log.Printf("Turning on Cloud Profiling. 
Profile heap: %t", enableGoogleCloudHeapSampling)
} else {
log.Println("Required job_id missing 
from metadata, profiling will not be enabled without it.")
}


[beam] branch master updated (7bfeca2 -> f759a5c)

2021-08-05 Thread tvalentyn
This is an automated email from the ASF dual-hosted git repository.

tvalentyn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 7bfeca2  [BEAM-12601] Add append-only option (#15257)
 add 57b6d77  Revert "[BEAM-11934] Remove Dataflow override of streaming 
WriteFiles with runner determined sharding (#15178)"
 add f759a5c  Revert "[BEAM-11934] Remove Dataflow override of streaming 
WriteFiles with runner determined sharding"

No new revisions were added by this update.

Summary of changes:
 .../beam/runners/dataflow/DataflowRunner.java  |  73 +
 .../beam/runners/dataflow/DataflowRunnerTest.java  | 120 ++---
 .../main/java/org/apache/beam/sdk/io/FileIO.java   |   9 +-
 3 files changed, 134 insertions(+), 68 deletions(-)


[beam] branch master updated (291aa6c -> 7bfeca2)

2021-08-05 Thread echauchot
This is an automated email from the ASF dual-hosted git repository.

echauchot pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 291aa6c  [BEAM-6516] Fixes race condition in RabbitMqIO causing 
duplicate acks (#15157)
 add 7bfeca2  [BEAM-12601] Add append-only option (#15257)

No new revisions were added by this update.

Summary of changes:
 CHANGES.md |  1 +
 .../sdk/io/elasticsearch/ElasticsearchIOTest.java  | 12 
 .../sdk/io/elasticsearch/ElasticsearchIOTest.java  | 12 
 .../sdk/io/elasticsearch/ElasticsearchIOTest.java  | 12 
 .../elasticsearch/ElasticsearchIOTestCommon.java   | 19 +
 .../beam/sdk/io/elasticsearch/ElasticsearchIO.java | 80 --
 6 files changed, 114 insertions(+), 22 deletions(-)


[beam] branch master updated (2144cab -> 291aa6c)

2021-08-05 Thread aromanenko
This is an automated email from the ASF dual-hosted git repository.

aromanenko pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 2144cab  Merge pull request #15218 from 
echauchot/BEAM-7093-spark3-fix-for-SS-runner
 add 291aa6c  [BEAM-6516] Fixes race condition in RabbitMqIO causing 
duplicate acks (#15157)

No new revisions were added by this update.

Summary of changes:
 CHANGES.md|  1 +
 .../java/org/apache/beam/sdk/io/rabbitmq/RabbitMqIO.java  | 15 ---
 2 files changed, 13 insertions(+), 3 deletions(-)


[beam] branch master updated (39cf3fc -> 2144cab)

2021-08-05 Thread echauchot
This is an automated email from the ASF dual-hosted git repository.

echauchot pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 39cf3fc  Merge pull request #15264 from [BEAM-12670] Relocate bq 
client exception imports to try block and conditionally turn off tests if 
imports fail
 new c4508c8  [BEAM-12591] Put Spark Structured Streaming runner sources 
back to main src folder
 new 473d187  [BEAM-12629] As spark DataSourceV2 is only available for 
spark 2, provide a DataSourceV2 based impl for spark 2 and create a structure 
for extension with a spark 3 source.
 new ad6bea8  [BEAM-12627] Deal with spark Encoders braking change between 
spark 2 and spark 3 by providing an implementation for each of them.
 new f0014d9  [BEAM-12591] move SchemaHelpers to correct package
 new fd9bb74  [BEAM-8470] Disable wait for termination in a streaming 
pipeline because it is infinite by definition
 new 94ce5d3  [BEAM-12630] Deal with breaking change in streaming pipelines 
start by introducing an AbstractTranslationContext and version specific 
implementations
 new b8dc86c  [BEAM-12629] Make source tests spark version agnostic and 
move them back to common spark module
 new b1d5dc4  [BEAM-12629] Make a spark 3 source impl
 new 75247cb  [BEAM-12591] Fix checkstyle and spotless
 new e10b2eb  [BEAM-12629] Reduce serializable to only needed classes and 
Fix schema inference
 new cc3ff98  [BEAM-12591] Add checkstyle exceptions for version specific 
classes because checkstyle does not correctly detect package files across 
multiple source directories
 new 81033b1  [BEAM-12629] Fix sources javadocs and improve impl
 new 23fd65d  [BEAM-12591] Add spark 3 to structured streaming validates 
runner tests
 new 2144cab  Merge pull request #15218 from 
echauchot/BEAM-7093-spark3-fix-for-SS-runner

The 32603 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ...ValidatesRunner_SparkStructuredStreaming.groovy |   1 +
 .../translation/TranslationContext.java| 240 +
 .../translation/batch/DatasetSourceBatch.java  |   9 +-
 .../translation/helpers/EncoderFactory.java|  54 +
 .../streaming/DatasetSourceStreaming.java  |   9 +-
 .../translation/batch/SimpleSourceTest.java| 101 -
 .../translation/TranslationContext.java}   |  29 ++-
 .../translation/batch/DatasetSourceBatch.java  | 240 +
 .../translation/helpers/EncoderFactory.java|  49 +
 .../streaming/DatasetSourceStreaming.java} |   9 +-
 .../spark/structuredstreaming/Constants.java}  |   8 +-
 .../SparkStructuredStreamingPipelineOptions.java   |   0
 .../SparkStructuredStreamingPipelineResult.java|   0
 .../SparkStructuredStreamingRunner.java|   6 +-
 .../SparkStructuredStreamingRunnerRegistrar.java   |   0
 .../aggregators/AggregatorsAccumulator.java|   0
 .../aggregators/NamedAggregators.java  |   0
 .../aggregators/NamedAggregatorsAccumulator.java   |   0
 .../aggregators/package-info.java  |   0
 .../structuredstreaming/examples/WordCount.java|   0
 .../metrics/AggregatorMetric.java  |   0
 .../metrics/AggregatorMetricSource.java|   0
 .../metrics/CompositeSource.java   |   0
 .../metrics/MetricsAccumulator.java|   0
 .../MetricsContainerStepMapAccumulator.java|   0
 .../metrics/SparkBeamMetric.java   |   0
 .../metrics/SparkBeamMetricSource.java |   0
 .../metrics/SparkMetricsContainerStepMap.java  |   0
 .../metrics/WithMetricsSupport.java|   0
 .../structuredstreaming/metrics/package-info.java  |   0
 .../metrics/sink/CodahaleCsvSink.java  |   0
 .../metrics/sink/CodahaleGraphiteSink.java |   0
 .../metrics/sink/package-info.java |   0
 .../spark/structuredstreaming/package-info.java|   0
 .../translation/AbstractTranslationContext.java}   |  19 +-
 .../translation/PipelineTranslator.java|   4 +-
 .../translation/SparkTransformOverrides.java   |   0
 .../translation/TransformTranslator.java   |   2 +-
 .../translation/batch/AggregatorCombiner.java  |   0
 .../batch/CombinePerKeyTranslatorBatch.java|   4 +-
 .../CreatePCollectionViewTranslatorBatch.java  |   4 +-
 .../translation/batch/DoFnFunction.java|   0
 .../translation/batch/DoFnRunnerWithMetrics.java   |   0
 .../translation/batch/FlattenTranslatorBatch.java  |   4 +-
 .../batch/GroupByKeyTranslatorBatch.java   |   4 +-
 .../translation/batch/ImpulseTranslatorBatch.java  |   4 +-
 .../translation/batch/ParDoTranslatorBatch.java|