[GitHub] [beam] dennisylyung removed a comment on pull request #12583: [BEAM-10706] Fix duplicate key error in DynamoDBIO.Write
dennisylyung removed a comment on pull request #12583: URL: https://github.com/apache/beam/pull/12583#issuecomment-725986759 @iemejia I made some changes to use a hashmap for deduplication, and to improve the test. I separated the test for duplicate key, and used mockito to check whether the call to the aws client contains duplicate key. However, as you have worried, each element is put into a single bundle. Hence each bundle consist of one element only, defeating the duplication test. I looked at the documentation of `DoFnTester` but it is deprecated. The docs suggests using `TestPipeline` instead, but I couldn't find anyway to control the bundling of elements. Any idea? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #13361: Fix NPE in CountingSource
boyuanzz commented on pull request #13361: URL: https://github.com/apache/beam/pull/13361#issuecomment-728682733 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz opened a new pull request #13361: Fix NPE in CountingSource
boyuanzz opened a new pull request #13361: URL: https://github.com/apache/beam/pull/13361 r: @y1chi Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
[GitHub] [beam] udim merged pull request #13358: Update Beam Dataflow container versions for Python
udim merged pull request #13358: URL: https://github.com/apache/beam/pull/13358 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji commented on pull request #13360: add validate runner dataflow v2 java badge
ihji commented on pull request #13360: URL: https://github.com/apache/beam/pull/13360#issuecomment-728673644 R: @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji opened a new pull request #13360: add validate runner dataflow v2 java badge
ihji opened a new pull request #13360: URL: https://github.com/apache/beam/pull/13360 **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
[GitHub] [beam] boyuanzz merged pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
boyuanzz merged pull request #13356: URL: https://github.com/apache/beam/pull/13356 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] udim commented on pull request #13358: Update Beam Dataflow container versions for Python
udim commented on pull request #13358: URL: https://github.com/apache/beam/pull/13358#issuecomment-728661421 Run Python_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] udim commented on pull request #13358: Update Beam Dataflow container versions for Python
udim commented on pull request #13358: URL: https://github.com/apache/beam/pull/13358#issuecomment-728660986 Run Python_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rezarokni commented on pull request #13112: [BEAM-11065] Apache Beam Template to ingest from Apache Kafka to Google Pub/Sub
rezarokni commented on pull request #13112: URL: https://github.com/apache/beam/pull/13112#issuecomment-728659985 @kennknowles I wont be able to look at this until the end of the week at the earliest, is there someone else who can pick this up? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji commented on pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
ihji commented on pull request #13356: URL: https://github.com/apache/beam/pull/13356#issuecomment-728653835 R: @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] tvalentyn commented on pull request #13359: [BEAM-11196] Cherry-pick #13303 to the 2.26.0 release branch.
tvalentyn commented on pull request #13359: URL: https://github.com/apache/beam/pull/13359#issuecomment-728578563 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] nehsyc commented on pull request #13292: [BEAM-10475]Add WithShardedKey variation of GroupIntoBatches transform in Python SDK.
nehsyc commented on pull request #13292: URL: https://github.com/apache/beam/pull/13292#issuecomment-728571398 R: @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #13211: [BEAM-8106] Separate Java8/11 container image build tasks
TheNeuralBit commented on pull request #13211: URL: https://github.com/apache/beam/pull/13211#issuecomment-728489376 Run Java Dataflow V2 ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #13211: [BEAM-8106] Separate Java8/11 container image build tasks
TheNeuralBit commented on pull request #13211: URL: https://github.com/apache/beam/pull/13211#issuecomment-728488362 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit merged pull request #13340: [BEAM-11262] Remove numSleeps assertion in SpannerIOWriteTest
TheNeuralBit merged pull request #13340: URL: https://github.com/apache/beam/pull/13340 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] y1chi commented on a change in pull request #13350: [BEAM-11266] Python IO MongoDB: add bucket_auto aggregation option for bundling in Atlas.
y1chi commented on a change in pull request #13350: URL: https://github.com/apache/beam/pull/13350#discussion_r524486089 ## File path: sdks/python/apache_beam/io/mongodbio.py ## @@ -241,6 +275,27 @@ def _get_split_keys(self, desired_chunk_size_in_mb, start_pos, end_pos): max={'_id': end_pos}, maxChunkSize=desired_chunk_size_in_mb)['splitKeys']) + def _get_buckets(self, desired_chunk_size, start_pos, end_pos): +if start_pos >= end_pos: + # single document not splittable + return [] +size = self.estimate_size() +bucket_count = size // desired_chunk_size Review comment: The split function will likely be called recursively for dynamic rebalancing, so for a range with start_pos and end_pos, it can be further split upon backend request, so it might not be reasonable to always use the total collection size divided by desired_chunk_size to calculate the bucket count. Is it possible to only get the buckets within the give _id range? and we can probably use an average document size times the number of documents to calculate the size of the range being split. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] y1chi commented on a change in pull request #13350: [BEAM-11266] Python IO MongoDB: add bucket_auto aggregation option for bundling in Atlas.
y1chi commented on a change in pull request #13350: URL: https://github.com/apache/beam/pull/13350#discussion_r524486089 ## File path: sdks/python/apache_beam/io/mongodbio.py ## @@ -241,6 +275,27 @@ def _get_split_keys(self, desired_chunk_size_in_mb, start_pos, end_pos): max={'_id': end_pos}, maxChunkSize=desired_chunk_size_in_mb)['splitKeys']) + def _get_buckets(self, desired_chunk_size, start_pos, end_pos): +if start_pos >= end_pos: + # single document not splittable + return [] +size = self.estimate_size() +bucket_count = size // desired_chunk_size Review comment: The split function will likely be called recursively for dynamic rebalancing, so for a range with start_pos and end_pos, it can be further split upon backend request, so it might not be reasonable to always use the total collection size divide by desired_chunk_size to calculate the bucket count. Is it possible to only get the buckets within the give _id range? and we can probably use an average document size times the number of documents to calculate the size of the range being split. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kileys closed pull request #13289: [BEAM-9444] Add GCP BOM to all java projects
kileys closed pull request #13289: URL: https://github.com/apache/beam/pull/13289 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji removed a comment on pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
ihji removed a comment on pull request #13356: URL: https://github.com/apache/beam/pull/13356#issuecomment-728433566 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji commented on pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
ihji commented on pull request #13356: URL: https://github.com/apache/beam/pull/13356#issuecomment-728433566 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] yifanmai commented on pull request #13359: [BEAM-11196] Ensure parent of fused stages is not one of its transforms
yifanmai commented on pull request #13359: URL: https://github.com/apache/beam/pull/13359#issuecomment-728433371 R: @tvalentyn This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] yifanmai opened a new pull request #13359: [BEAM-11196] Ensure parent of fused stages is not one of its transforms
yifanmai opened a new pull request #13359: URL: https://github.com/apache/beam/pull/13359 #13202 introduces a bug where the algorithm that determines the parents of fused stage can create loops, because the lowest common ancestor algorithm considers a transform to be its own parent. Fusing a single transform A will result cause the fused stage's parent to be set to A. Fusing a transform A with a descendant B will also cause fused stage's parent to be set to A. To fix this, if the parent of the fused stage would be one of transforms to be fused, set the parent to the parent of that transform instead. This is a cherry pick of #13303. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build
[GitHub] [beam] TheNeuralBit commented on pull request #13211: [BEAM-8106] Separate Java8/11 container image build tasks
TheNeuralBit commented on pull request #13211: URL: https://github.com/apache/beam/pull/13211#issuecomment-728424003 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] udim commented on pull request #13358: Update Beam Dataflow container versions for Python
udim commented on pull request #13358: URL: https://github.com/apache/beam/pull/13358#issuecomment-728411154 R: @TheNeuralBit @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] udim opened a new pull request #13358: Update Beam Dataflow container versions for Python
udim opened a new pull request #13358: URL: https://github.com/apache/beam/pull/13358 Continuation of https://github.com/apache/beam/pull/13323 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
[GitHub] [beam] veblush opened a new pull request #13357: [BEAM-8889] Upgrade GCSIO to 2.1.6 (Backport of #13311)
veblush opened a new pull request #13357: URL: https://github.com/apache/beam/pull/13357 Backport of #13311 R:@kennknowles Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
[GitHub] [beam] TheNeuralBit commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
TheNeuralBit commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728384639 Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728382855 Right, I knew this. Python and Java have dataflow in the precommits, which isn't how the Go SDK organises it's tests, which is why it's a surprise. SGTM. Merging. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck merged pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck merged pull request #13347: URL: https://github.com/apache/beam/pull/13347 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #13340: [BEAM-11262] Remove numSleeps assertion in SpannerIOWriteTest
TheNeuralBit commented on pull request #13340: URL: https://github.com/apache/beam/pull/13340#issuecomment-728382082 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on pull request #13289: [BEAM-9444] Add GCP BOM to all java projects
kennknowles commented on pull request #13289: URL: https://github.com/apache/beam/pull/13289#issuecomment-728374842 We chatted about that. I think the risk of forgetting to add the BOM to a module is less and the risk of messing up the deps of a module is higher. Eventually we will want dependency convergence but without some big picture testing and analysis strategy I would say let's skip this change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck merged pull request #13272: [BEAM-11207] Metric Extraction via proto RPC API
lostluck merged pull request #13272: URL: https://github.com/apache/beam/pull/13272 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck merged pull request #13348: Go redundant type cleanup.
lostluck merged pull request #13348: URL: https://github.com/apache/beam/pull/13348 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
TheNeuralBit commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728365075 Well I can't find any documentation stating that Python tests will fail, just that Dataflow tests will fail. But looking at the last [Python PreCommit failure](https://ci-beam.apache.org/job/beam_PreCommit_Python_Phrase/2316) it looks like the issue was hanging Dataflow tests: ``` 12:24:47 > Task :sdks:python:test-suites:dataflow:py36:preCommitIT_batch_V2 12:24:47 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:24:29.045Z: JOB_MESSAGE_BASIC: Your project already contains 100 Dataflow-created metric descriptors, so new user metrics of the form custom.googleapis.com/* will not be created. However, all user metrics are also available in the metric dataflow.googleapis.com/job/user_counter. If you rely on the custom metrics, you can delete old / unused metric descriptors. See https://developers.google.com/apis-explorer/#p/monitoring/v3/monitoring.projects.metricDescriptors.list and https://developers.google.com/apis-explorer/#p/monitoring/v3/monitoring.projects.metricDescriptors.delete 12:24:49 12:24:49 > Task :sdks:python:test-suites:dataflow:py37:preCommitIT_batch_V2 12:24:49 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:24:46.063Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 1 based on the rate of progress in the currently running stage(s). 12:25:17 12:25:17 > Task :sdks:python:test-suites:dataflow:py36:preCommitIT_batch_V2 12:25:17 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:24:47.927Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 1 based on the rate of progress in the currently running stage(s). 12:25:19 12:25:19 > Task :sdks:python:test-suites:dataflow:py37:preCommitIT_batch_V2 12:25:19 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:25:16.445Z: JOB_MESSAGE_DETAILED: Workers have started successfully. 12:25:19 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:25:16.480Z: JOB_MESSAGE_DETAILED: Workers have started successfully. 12:25:47 12:25:47 > Task :sdks:python:test-suites:dataflow:py36:preCommitIT_batch_V2 12:25:47 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:25:23.244Z: JOB_MESSAGE_DETAILED: Workers have started successfully. 12:25:47 INFO:apache_beam.runners.dataflow.dataflow_runner:2020-11-16T20:25:23.307Z: JOB_MESSAGE_DETAILED: Workers have started successfully. 12:48:35 Build was aborted ``` It looks like the same suites didn't fail on your verification branch though.. maybe something has changed? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji commented on pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
ihji commented on pull request #13356: URL: https://github.com/apache/beam/pull/13356#issuecomment-728364299 Run Java Dataflow V2 ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji commented on pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
ihji commented on pull request #13356: URL: https://github.com/apache/beam/pull/13356#issuecomment-728363769 run xvr_dataflow postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] ihji opened a new pull request #13356: [BEAM-11270] Dataflow Java on runner v2 tests are failing because sdk…
ihji opened a new pull request #13356: URL: https://github.com/apache/beam/pull/13356 … docker container is cleaned up incorrectly **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
[GitHub] [beam] TheNeuralBit commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
TheNeuralBit commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728361839 I think it's expected that Python test suites fail on the release branch until a Dataflow container release. Let me verify that This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on pull request #13128: [BEAM-11265] Update quickstart-java.md
kennknowles commented on pull request #13128: URL: https://github.com/apache/beam/pull/13128#issuecomment-728361655 Great. That's perfect. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #13128: [BEAM-11265] Update quickstart-java.md
TheNeuralBit commented on pull request #13128: URL: https://github.com/apache/beam/pull/13128#issuecomment-728359904 > There are no accounts or anything needed for this, right? See my comment above, I think it's because we didn't want the quickstart project to require the GCP extension: > I don't think authentication is an issue since apache-beam-samples is open publicly, but I think the reason we don't use gs:// in all the examples is it requires extensions:google-cloud-platform. This isn't a problem for DataflowRunner though. Ultimately we adjusted this PR, now it just: - Updates the quickstart archetype by adding a sample.txt file with some shakespeare - Changes the file paths on the website to `/path/to/input` Once 2.27.0 is out we can update the website again to point users to the new sample.txt file that they'll have locally. This will probably just be for the DirectRunner since other runners most likely won't have access to the local filesystem. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rgruener removed a comment on pull request #13355: [BEAM-11272] Fix combiner label constructor arg
rgruener removed a comment on pull request #13355: URL: https://github.com/apache/beam/pull/13355#issuecomment-728354671 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rgruener commented on pull request #13355: [BEAM-11272] Fix combiner label constructor arg
rgruener commented on pull request #13355: URL: https://github.com/apache/beam/pull/13355#issuecomment-728354671 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rgruener removed a comment on pull request #13355: [BEAM-11272] Fix combiner label constructor arg
rgruener removed a comment on pull request #13355: URL: https://github.com/apache/beam/pull/13355#issuecomment-728337397 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] mxm commented on a change in pull request #13353: [BEAM-11267] Remove unnecessary reshuffle for stateful ParDo after key…
mxm commented on a change in pull request #13353: URL: https://github.com/apache/beam/pull/13353#discussion_r524634584 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java ## @@ -971,7 +987,9 @@ public void translateNode( .transform(fullName, outputTypeInfo, (OneInputStreamOperator) doFnOperator) .uid(fullName); - context.setOutputDataStream(context.getOutput(transform), outDataStream); + final PCollection>> output = context.getOutput(transform); + context.setOutputDataStream(output, outDataStream); + context.setProducer(output, transform); Review comment: It would be nice if these explicit calls wouldn't be required. I believe `context.setOutputDataStream` internally has the current transform available. So we could perform it internally in the context. ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java ## @@ -1127,7 +1148,9 @@ public void translateNode( keyedWorkItemStream.getExecutionEnvironment().addOperator(rawFlinkTransform); -context.setOutputDataStream(context.getOutput(transform), outDataStream); +final PCollection> output = context.getOutput(transform); +context.setOutputDataStream(output, outDataStream); +context.setProducer(output, transform); Review comment: Same here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on pull request #13128: [BEAM-11265] Update quickstart-java.md
kennknowles commented on pull request #13128: URL: https://github.com/apache/beam/pull/13128#issuecomment-728348545 A couple years ago I had this same thought ("why are we using pom.xml") and ended up finding an answer to my satisfaction and not changing it... I don't remember why, but please do watch for any issues that might come up. There are no accounts or anything needed for this, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles merged pull request #13311: [BEAM-8889] Upgrade GCSIO to 2.1.6
kennknowles merged pull request #13311: URL: https://github.com/apache/beam/pull/13311 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on pull request #13311: [BEAM-8889] Upgrade GCSIO to 2.1.6
kennknowles commented on pull request #13311: URL: https://github.com/apache/beam/pull/13311#issuecomment-728345583 OK then I am happy to merge. It is an experiment and there are no linkage errors and existing tests pass. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rgruener commented on pull request #13355: [BEAM-11272] Fix combiner label constructor arg
rgruener commented on pull request #13355: URL: https://github.com/apache/beam/pull/13355#issuecomment-728339372 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rgruener commented on pull request #13355: [BEAM-11272] Fix combiner label constructor arg
rgruener commented on pull request #13355: URL: https://github.com/apache/beam/pull/13355#issuecomment-728337397 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] je-ik commented on pull request #13353: [BEAM-11267] Remove unnecessary reshuffle for stateful ParDo after key…
je-ik commented on pull request #13353: URL: https://github.com/apache/beam/pull/13353#issuecomment-728337299 Do we have a test (in flink runner) for the GBK -> stateful pardo pair? Not sure if there is one in ValidatesRunner suite. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on a change in pull request #13342: [BEAM-11260][BEAM-11261] Remove inappropriate assumptions about repo from linkage check script
kennknowles commented on a change in pull request #13342: URL: https://github.com/apache/beam/pull/13342#discussion_r524593181 ## File path: sdks/java/build-tools/beam-linkage-check.sh ## @@ -36,46 +36,60 @@ set -o pipefail set -e # These default artifacts are common causes of linkage errors. -ARTIFACTS="beam-sdks-java-core - beam-sdks-java-io-google-cloud-platform - beam-runners-google-cloud-dataflow-java - beam-sdks-java-io-hadoop-format" - -if [ ! -z "$1" ]; then - ARTIFACTS=$1 +DEFAULT_ARTIFACT_LISTS=" \ + beam-sdks-java-core \ + beam-sdks-java-io-google-cloud-platform \ + beam-runners-google-cloud-dataflow-java \ + beam-sdks-java-io-hadoop-format \ +" + +BASELINE_REF=$1 +PROPOSED_REF=$2 +ARTIFACT_LISTS=$3 + +if [ -z "$ARTIFACT_LISTS" ]; then + ARTIFACT_LISTS=$DEFAULT_ARTIFACT_LISTS fi -BRANCH_NAME=$(git symbolic-ref --short HEAD) +if [ -z "$BASELINE_REF" ] || [ -z "$PROPOSED_REF" ] || [ -z "$ARTIFACT_LISTS" ] ; then + echo "Usage: $0 [artifact lists]" + exit 1 +fi if [ ! -d buildSrc ]; then - echo "Please run this script in the Beam project root:" - echo " /bin/bash sdks/java/build-tools/beam-linkage-check.sh" - exit 1 + echo "Directory 'buildSrc' not found. Please run this script from the root directory of a clone of the Beam git repo." fi -if [ "$BRANCH_NAME" = "master" ]; then - echo "Please run this script on a branch other than master" +if [ "$BASELINE_REF" = "$PROPOSED_REF" ]; then + echo "Baseline and proposed refs are identical; cannot compare their linkage errors!" exit 1 fi -OUTPUT_DIR=build/linkagecheck -mkdir -p $OUTPUT_DIR - if [ ! -z "$(git diff)" ]; then echo "Uncommited change detected. Please commit changes and ensure 'git diff' is empty." exit 1 fi +STARTING_REF=$(git rev-parse --abbrev-ref HEAD) +function cleanup() { + git checkout $STARTING_REF +} +trap cleanup EXIT + +echo "Comparing linkage of artifact lists $ARTIFACT_LISTS using baseline $BASELINE_REF and proposal $PROPOSED_REF" + +OUTPUT_DIR=build/linkagecheck +mkdir -p $OUTPUT_DIR + ACCUMULATED_RESULT=0 function runLinkageCheck () { - COMMIT=$1 - BRANCH=$2 - MODE=$3 # "baseline" or "validate" - for ARTIFACT in $ARTIFACTS; do -echo "`date`:" "Running linkage check (${MODE}) for ${ARTIFACT} in ${BRANCH}" + MODE=$1 # "baseline" or "validate" -BASELINE_FILE=${OUTPUT_DIR}/baseline-${ARTIFACT}.xml + for ARTIFACT_LIST in $ARTIFACT_LISTS; do +echo "`date`:" "Running linkage check (${MODE}) for ${ARTIFACT_LISTS}" + +BASELINE_FILE=${OUTPUT_DIR}/baseline-${ARTIFACT_LIST}.xml Review comment: It is a stylistic "nit" really. My expectation of gradle commands is that they operate on the current source tree. There are exceptions, sure. But my thinking was: - `:checkJavaLinkage` always writes to `build/linkagecheck` and the report is the linkage report for the current state of the worktree. - `beam-linkage-check.sh` operates at level above that. It produces the above file for two separate states of the source tree and compares them. It is not that important. Just recording my thoughts so we can discuss the script a bit. ## File path: sdks/java/build-tools/beam-linkage-check.sh ## @@ -36,46 +36,60 @@ set -o pipefail set -e # These default artifacts are common causes of linkage errors. -ARTIFACTS="beam-sdks-java-core - beam-sdks-java-io-google-cloud-platform - beam-runners-google-cloud-dataflow-java - beam-sdks-java-io-hadoop-format" - -if [ ! -z "$1" ]; then - ARTIFACTS=$1 +DEFAULT_ARTIFACT_LISTS=" \ Review comment: Yes, I understand what the comma-separated list and space-separated list do. I just don't understand why we use both. I trust that there is a good reason, but I do not know it. Like if you had two subsets of artifacts that are not expected to work together? I would expect that we do want to run all the GCP modules together with the core SDK and, separately, run all the hadoop modules together with the core SDK, etc. Or maybe we just run them all together. So my proposal would be to drop the space-separated and only keep the comma-separated logic. But like I said, I trust there is a reason. So I am just making this proposal so you can tell me why it does not work. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] je-ik commented on a change in pull request #13353: [BEAM-11267] Remove unnecessary reshuffle for stateful ParDo after key…
je-ik commented on a change in pull request #13353: URL: https://github.com/apache/beam/pull/13353#discussion_r524597532 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/WorkItemKeySelector.java ## @@ -49,6 +52,6 @@ public ByteBuffer getKey(WindowedValue> value) thro @Override public TypeInformation getProducedType() { -return new GenericTypeInfo<>(ByteBuffer.class); +return new CoderTypeInformation<>(FlinkKeyUtils.ByteBufferCoder.of(), pipelineOptions.get()); Review comment: I would think that the partitioning is ensured by the `reinterpretAsKeyedStream` and will therefore be preserved from the previous shuffle phase. Is this not enough? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on pull request #13306: [BEAM-10925] Create interface for SQL Java aggregate function.
kennknowles commented on pull request #13306: URL: https://github.com/apache/beam/pull/13306#issuecomment-728330950 CombineFn is stable, but UDAF is not. One example is that a UDAF has to have a SQL type. Right now this is not represented on the UDAF object but is implied. That might change. Personally, I would make the type part of the UDAF object. If you look at any other SQL product's definition of UDAFs you will see lots of extra metadata ([postgresql](https://www.postgresql.org/docs/9.5/sql-createaggregate.html), [mssql](https://docs.microsoft.com/en-us/sql/t-sql/statements/create-aggregate-transact-sql?view=sql-server-ver15)). A decoupled interface and trivial wrapper have obvious benefits and no real downside. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] iemejia commented on pull request #12963: [BEAM-10983] Add getting started from Spark page
iemejia commented on pull request #12963: URL: https://github.com/apache/beam/pull/12963#issuecomment-728329415 Is there something important still missing to get this one merged? Maybe we can merge and ask/do minor fixes after? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] Aliraza-N edited a comment on pull request #13137: [BEAM-11073] Dicom IO Connector for Java
Aliraza-N edited a comment on pull request #13137: URL: https://github.com/apache/beam/pull/13137#issuecomment-727005597 All done! @pabloem This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] rgruener opened a new pull request #13355: [BEAM-11272] Fix combiner label constructor arg
rgruener opened a new pull request #13355: URL: https://github.com/apache/beam/pull/13355 Combiners have a label constructor argument which is not currently used correctly. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/) | [![Build
[GitHub] [beam] lostluck commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728320640 I'm not worried about the windows failures, but the precommit flake needed a quick re-run. LGTM and merging. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck edited a comment on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck edited a comment on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728320640 I'm not worried about the windows failures, but the precommit flake needed a quick re-run, which appears to be stalled? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728320898 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robertwb merged pull request #13333: Further dataframe batch consolidation.
robertwb merged pull request #1: URL: https://github.com/apache/beam/pull/1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz merged pull request #13344: [BEAM-11263] Java cleanUpDockerImages now force removes container images.
boyuanzz merged pull request #13344: URL: https://github.com/apache/beam/pull/13344 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] dmvk commented on a change in pull request #13353: [BEAM-11267] Remove unecessary reshuffle for stateful ParDo after key…
dmvk commented on a change in pull request #13353: URL: https://github.com/apache/beam/pull/13353#discussion_r524545989 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTranslationContext.java ## @@ -84,6 +85,17 @@ public void setOutputDataStream(PValue value, DataStream set) { } } + void setProducer(T value, PTransform producer) { +if (!producers.containsKey(value)) { Review comment: makes sense This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] dmvk commented on a change in pull request #13353: [BEAM-11267] Remove unecessary reshuffle for stateful ParDo after key…
dmvk commented on a change in pull request #13353: URL: https://github.com/apache/beam/pull/13353#discussion_r524545316 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/WorkItemKeySelector.java ## @@ -49,6 +52,6 @@ public ByteBuffer getKey(WindowedValue> value) thro @Override public TypeInformation getProducedType() { -return new GenericTypeInfo<>(ByteBuffer.class); +return new CoderTypeInformation<>(FlinkKeyUtils.ByteBufferCoder.of(), pipelineOptions.get()); Review comment: Not sure if this is necessary. I wanted to ensure that the new "reinterpreted partitioning" is compatible with the one used by GBK / Combine. The idea was if partitioning is not compatible, it may result in some state partitioning related glitches (eg. you wouldn't have local state for a key-group you need). Second thoughts, flink selects target partition (key group) based on "pojo hash code" (not based on binary representation), so the previous version was probably compatible enough 樂 @mxm WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] dmvk commented on a change in pull request #13353: [BEAM-11267] Remove unecessary reshuffle for stateful ParDo after key…
dmvk commented on a change in pull request #13353: URL: https://github.com/apache/beam/pull/13353#discussion_r524545316 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/WorkItemKeySelector.java ## @@ -49,6 +52,6 @@ public ByteBuffer getKey(WindowedValue> value) thro @Override public TypeInformation getProducedType() { -return new GenericTypeInfo<>(ByteBuffer.class); +return new CoderTypeInformation<>(FlinkKeyUtils.ByteBufferCoder.of(), pipelineOptions.get()); Review comment: Not sure if this is necessary. I wanted to ensure that the new "reinterpreted partitioning" is compatible with the one used by GBK / Combine. The idea was if partitioning is not compatible, it may result in some state partitioning related glitches (eg. you wouldn't have local state for a key-group you need). Second thoughts, flink selects target partition based on "pojo hash code" (not based on binary representation), so the previous version was probably compatible enough 樂 @mxm WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on pull request #13338: [BEAM-11070] Use self-checkpoint to enfore finalization happens.
boyuanzz commented on pull request #13338: URL: https://github.com/apache/beam/pull/13338#issuecomment-728292692 Run PythonDocker PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit merged pull request #13128: [BEAM-11265] Update quickstart-java.md
TheNeuralBit merged pull request #13128: URL: https://github.com/apache/beam/pull/13128 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles merged pull request #13259: Fix Java ValidatesRunner V2 task dependency.
kennknowles merged pull request #13259: URL: https://github.com/apache/beam/pull/13259 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] kennknowles commented on pull request #13259: Fix Java ValidatesRunner V2 task dependency.
kennknowles commented on pull request #13259: URL: https://github.com/apache/beam/pull/13259#issuecomment-728284902 Makes sense to me. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on a change in pull request #13211: [BEAM-8106] Separate Java8/11 container image build tasks
TheNeuralBit commented on a change in pull request #13211: URL: https://github.com/apache/beam/pull/13211#discussion_r524529168 ## File path: sdks/java/container/common.gradle ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * Build script containign common build tasks for Java SDK Docker images. Review comment: typo nit: ```suggestion * Build script containing common build tasks for Java SDK Docker images. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on a change in pull request #13211: [BEAM-8106] Separate Java8/11 container image build tasks
TheNeuralBit commented on a change in pull request #13211: URL: https://github.com/apache/beam/pull/13211#discussion_r524528475 ## File path: sdks/java/container/build.gradle ## @@ -86,51 +76,21 @@ licenseReport { renderers = [new JsonReportRenderer()] } -def imageJavaVersion = project.hasProperty('imageJavaVersion') ? project.findProperty('imageJavaVersion') : '8' -docker { Review comment: Hm ok, I'm not sure if we have any precedent to follow for this. I don't think it's worth going to a lot of trouble for. Instead you could just send an FYI email to dev@ once this is merged letting people know they may need to change :sdks:java:container:docker -> :sdks:java:container:java8:docker. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on a change in pull request #13283: [BEAM-11142] Enable CrossLanguageValidateRunner test for dataflow run…
boyuanzz commented on a change in pull request #13283: URL: https://github.com/apache/beam/pull/13283#discussion_r524523450 ## File path: runners/google-cloud-dataflow-java/build.gradle ## @@ -312,6 +313,36 @@ task validatesRunnerStreaming { )) } +createCrossLanguageValidatesRunnerTask( Review comment: This task makes all Dataflow runner v2 tests fail because the sdk container is cleaned up as soon as it is built. The cause is that in the common configuration, you have `config.startJobServer.finalizedBy config.cleanupJobServer` This will change the task dependency incorrectly. Filed jira here: https://issues.apache.org/jira/browse/BEAM-11270 cc: @tysonjh This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on a change in pull request #13283: [BEAM-11142] Enable CrossLanguageValidateRunner test for dataflow run…
boyuanzz commented on a change in pull request #13283: URL: https://github.com/apache/beam/pull/13283#discussion_r524523450 ## File path: runners/google-cloud-dataflow-java/build.gradle ## @@ -312,6 +313,36 @@ task validatesRunnerStreaming { )) } +createCrossLanguageValidatesRunnerTask( Review comment: This task makes all Dataflow runner v2 tests fail because the sdk container is cleaned up as soon as it is built. The cause is that in the common configuration, you have config.startJobServer.finalizedBy config.cleanupJobServer This will change the task dependency incorrectly. Filed jira here: https://issues.apache.org/jira/browse/BEAM-11270 cc: @tysonjh This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] chamikaramj commented on pull request #13235: Changing BigQuery insertAll error from INFO level logging to WARNING
chamikaramj commented on pull request #13235: URL: https://github.com/apache/beam/pull/13235#issuecomment-728274702 I think the next step is to try out Beam 2.24.0 or later to see if this is really needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on a change in pull request #13026: [WIP] [BEAM-7003 BEAM-8639 BEAM-8774] Update Kafka dependencies, enable IT test in Postcommit
boyuanzz commented on a change in pull request #13026: URL: https://github.com/apache/beam/pull/13026#discussion_r524517625 ## File path: sdks/java/io/kafka/build.gradle ## @@ -65,26 +76,68 @@ dependencies { testCompile library.java.junit testCompile library.java.powermock testCompile library.java.powermock_mockito + testCompile library.java.testcontainers_kafka testRuntimeOnly library.java.slf4j_jdk14 testRuntimeOnly project(path: ":runners:direct-java", configuration: "shadow") - kafkaVersion210 "org.apache.kafka:kafka-clients:2.1.0" + kafkaVersions.each {"kafkaVersion$it.key" "org.apache.kafka:kafka-clients:$it.value"} } -configurations.kafkaVersion210 { - resolutionStrategy { -force "org.apache.kafka:kafka-clients:2.1.0" +kafkaVersions.each { kv -> + configurations."kafkaVersion$kv.key" { +resolutionStrategy { + force "org.apache.kafka:kafka-clients:$kv.value" +} } } -task kafkaVersion210Test(type: Test) { Review comment: So the problem is introduced by https://github.com/apache/beam/pull/13283/files This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] KevinGG commented on a change in pull request #13335: [BEAM-10921]: Fix BEAM-10921 and underlying issues
KevinGG commented on a change in pull request #13335: URL: https://github.com/apache/beam/pull/13335#discussion_r524516876 ## File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py ## @@ -163,7 +165,8 @@ def __init__(self): # the gRPC server serves. self._test_stream_service_controllers = {} self._cached_source_signature = {} -self._tracked_user_pipelines = set() Review comment: LGTM, only one question: Does the `UserPipelineTracker` clean up the user pipeline and their derived pipelines when a user pipeline is out of scope (e.g., deleted or garbage collected)? Or are the pipelines tracked never get garbage collected at all? Is there any side effect when the user uses an outdated pipeline ref or a new pipeline ref (from re-executions) that results in the same `__hash__` or `__eq__`/`in` to be `True`? Will that give back a wrong user_pipeline when the tracker thinks the pipeline is tracked while it's not? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] KevinGG commented on a change in pull request #13335: [BEAM-10921]: Fix BEAM-10921 and underlying issues
KevinGG commented on a change in pull request #13335: URL: https://github.com/apache/beam/pull/13335#discussion_r524516876 ## File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py ## @@ -163,7 +165,8 @@ def __init__(self): # the gRPC server serves. self._test_stream_service_controllers = {} self._cached_source_signature = {} -self._tracked_user_pipelines = set() Review comment: LGTM, only one question: Does the `UserPipelineTracker` clean up the user pipeline and their derived pipelines when a user pipeline is out of scope (e.g., deleted or garbage collected). If not, is there any side effect when the user uses an outdated pipeline ref or a new pipeline ref (from re-executions) that results in the same `__hash__` or `__eq__`/`in` to be `True`? Will that give back a wrong user_pipeline when the tracker thinks the pipeline is tracked while it's not? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robertwb commented on a change in pull request #13333: Further dataframe batch consolidation.
robertwb commented on a change in pull request #1: URL: https://github.com/apache/beam/pull/1#discussion_r524508497 ## File path: sdks/python/apache_beam/dataframe/transforms.py ## @@ -410,6 +426,35 @@ def _total_memory_usage(frame): float('inf') +class _PreBatch(beam.DoFn): + def __init__(self, target_size=TARGET_PARTITION_SIZE): +self._target_size = target_size + + def start_bundle(self): +self._parts = collections.defaultdict(list) +self._running_size = 0 + + def process( + self, + part, + window=beam.DoFn.WindowParam, + timestamp=beam.DoFn.TimestampParam): +part_size = _total_memory_usage(part) +if part_size >= self._target_size: + yield part +else: + self._running_size += part_size + self._parts[window, timestamp].append(part) + if self._running_size >= self._target_size: +yield from self.finish_bundle() + + def finish_bundle(self): +for (window, timestamp), parts in self._parts.items(): + yield windowed_value.WindowedValue( + pd.concat(parts), timestamp, (window, )) +self.start_bundle() Review comment: Yeah, I was thinking exactly the same... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] codecov[bot] edited a comment on pull request #12779: [BEAM-10856] Support for NestedValueProvider for Python SDK
codecov[bot] edited a comment on pull request #12779: URL: https://github.com/apache/beam/pull/12779#issuecomment-692856347 # [Codecov](https://codecov.io/gh/apache/beam/pull/12779?src=pr=h1) Report > Merging [#12779](https://codecov.io/gh/apache/beam/pull/12779?src=pr=desc) (af2c14c) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.20%`. > The diff coverage is `93.33%`. [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12779/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12779?src=pr=tree) ```diff @@Coverage Diff @@ ## master #12779 +/- ## == - Coverage 82.48% 82.28% -0.21% == Files 455 451 -4 Lines 5487653738-1138 == - Hits4526644217-1049 + Misses 9610 9521 -89 ``` | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12779?src=pr=tree) | Coverage Δ | | |---|---|---| | [sdks/python/apache\_beam/options/value\_provider.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vb3B0aW9ucy92YWx1ZV9wcm92aWRlci5weQ==) | `91.76% <93.33%> (+0.21%)` | :arrow_up: | | [sdks/python/apache\_beam/utils/profiler.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcHJvZmlsZXIucHk=) | `32.11% <0.00%> (-54.35%)` | :arrow_down: | | [sdks/python/apache\_beam/utils/interactive\_utils.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaW50ZXJhY3RpdmVfdXRpbHMucHk=) | `83.33% <0.00%> (-9.53%)` | :arrow_down: | | [...eam/runners/interactive/options/capture\_control.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9vcHRpb25zL2NhcHR1cmVfY29udHJvbC5weQ==) | `92.00% <0.00%> (-8.00%)` | :arrow_down: | | [conftest.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-Y29uZnRlc3QucHk=) | `77.77% <0.00%> (-7.94%)` | :arrow_down: | | [...s/snippets/transforms/aggregation/combinevalues.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9jb21iaW5ldmFsdWVzLnB5) | `87.36% <0.00%> (-7.37%)` | :arrow_down: | | [sdks/python/apache\_beam/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vX19pbml0X18ucHk=) | `80.00% <0.00%> (-5.72%)` | :arrow_down: | | [sdks/python/apache\_beam/dataframe/partitionings.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL3BhcnRpdGlvbmluZ3MucHk=) | `83.60% <0.00%> (-5.44%)` | :arrow_down: | | [.../python/apache\_beam/io/gcp/bigquery\_io\_metadata.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2lvX21ldGFkYXRhLnB5) | `86.95% <0.00%> (-3.67%)` | :arrow_down: | | [...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5) | `76.02% <0.00%> (-3.51%)` | :arrow_down: | | ... and [87 more](https://codecov.io/gh/apache/beam/pull/12779/diff?src=pr=tree-more) | | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12779?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12779?src=pr=footer). Last update [eba648c...0f7961d](https://codecov.io/gh/apache/beam/pull/12779?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on a change in pull request #13272: [BEAM-11207] Metric Extraction via proto RPC API
lostluck commented on a change in pull request #13272: URL: https://github.com/apache/beam/pull/13272#discussion_r523124215 ## File path: sdks/go/pkg/beam/core/runtime/metricsx/metricsx_test.go ## @@ -0,0 +1,166 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package metricsx + +import ( + "testing" + "time" + + "github.com/apache/beam/sdks/go/pkg/beam/core/metrics" + pipepb "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1" + "github.com/google/go-cmp/cmp" +) + +func TestFromMonitoringInfos_Counters(t *testing.T) { + var value int64 = 15 + want := metrics.CounterResult{ + Attempted: 15, + Committed: -1, + Key: metrics.StepKey{ + Step: "main.customDoFn", + Name: "customCounter", + Namespace: "customDoFn", + }} + + payload, err := Int64Counter(value) + if err != nil { + t.Fatalf("Failed to encode Int64Counter: %v", err) + } + + labels := map[string]string{ + "PTRANSFORM": "main.customDoFn", + "NAMESPACE": "customDoFn", + "NAME": "customCounter", + } + + mInfo := { + Urn: UrnToString(UrnUserSumInt64), + Type:UrnToType(UrnUserSumInt64), + Labels: labels, + Payload: payload, + } + + attempted := []*pipepb.MonitoringInfo{mInfo} + committed := []*pipepb.MonitoringInfo{} + + got := FromMonitoringInfos(attempted, committed).AllMetrics().Counters() + size := len(got) + if size < 1 { + t.Fatalf("Invalid array's size: got: %v, want: %v", size, 1) + } + if d := cmp.Diff(got[0], want); d != "" { Review comment: You'll want to have want before got when using cmp.Diff. Otherwise your diff guidance is wrong. cmp.Diff(want, got[0]) means the diff will be (-want,+got) that is saying what is missing from want, and extra in got, which is a useful way to read them. cmp.Diff(got[0],want) means the diff will be (+want,-got), which will be confusing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728262843 Run Python 3.8 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728262952 Run Python 3.7 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13348: Go redundant type cleanup.
lostluck commented on pull request #13348: URL: https://github.com/apache/beam/pull/13348#issuecomment-728262368 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13347: [release-2.26.0][BEAM-11264] Add Reshuffle in pd.read_*
lostluck commented on pull request #13347: URL: https://github.com/apache/beam/pull/13347#issuecomment-728262680 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13275: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
lostluck commented on pull request #13275: URL: https://github.com/apache/beam/pull/13275#issuecomment-728261076 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on a change in pull request #13333: Further dataframe batch consolidation.
TheNeuralBit commented on a change in pull request #1: URL: https://github.com/apache/beam/pull/1#discussion_r524496438 ## File path: sdks/python/apache_beam/dataframe/transforms.py ## @@ -223,8 +235,12 @@ def expand(self, pcolls): # Actually evaluate the expressions. def evaluate(partition, stage=self.stage, **side_inputs): + def lookup(expr): +return expr.proxy( +).iloc[:0] if partition[expr._id] is None else partition[expr._id] Review comment: nit: maybe add a comment like this to clarify the intention ```suggestion # Use proxy if there's no data in this partition return expr.proxy( ).iloc[:0] if partition[expr._id] is None else partition[expr._id] ``` ## File path: sdks/python/apache_beam/dataframe/transforms.py ## @@ -410,6 +426,35 @@ def _total_memory_usage(frame): float('inf') +class _PreBatch(beam.DoFn): + def __init__(self, target_size=TARGET_PARTITION_SIZE): +self._target_size = target_size + + def start_bundle(self): +self._parts = collections.defaultdict(list) +self._running_size = 0 + + def process( + self, + part, + window=beam.DoFn.WindowParam, + timestamp=beam.DoFn.TimestampParam): +part_size = _total_memory_usage(part) +if part_size >= self._target_size: + yield part +else: + self._running_size += part_size + self._parts[window, timestamp].append(part) + if self._running_size >= self._target_size: +yield from self.finish_bundle() + + def finish_bundle(self): +for (window, timestamp), parts in self._parts.items(): + yield windowed_value.WindowedValue( + pd.concat(parts), timestamp, (window, )) +self.start_bundle() Review comment: This is frustratingly similar to _ReBatch... but I don't see a reasonable way to combine them. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13275: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
lostluck commented on pull request #13275: URL: https://github.com/apache/beam/pull/13275#issuecomment-728260568 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13275: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
lostluck commented on pull request #13275: URL: https://github.com/apache/beam/pull/13275#issuecomment-728260456 Run Release Gradle Build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] lostluck commented on pull request #13275: [DO NOT MERGE] Run all PostCommit and PreCommit Tests against Release Branch
lostluck commented on pull request #13275: URL: https://github.com/apache/beam/pull/13275#issuecomment-728260032 Hmmm it seems I was mistaken and this did not automatically run all the post commits. Doing so now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robertwb merged pull request #13341: Avoid unnecessary shuffling for single-input elementwise operations.
robertwb merged pull request #13341: URL: https://github.com/apache/beam/pull/13341 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robertwb commented on a change in pull request #13341: Avoid unnecessary shuffling for single-input elementwise operations.
robertwb commented on a change in pull request #13341: URL: https://github.com/apache/beam/pull/13341#discussion_r524496642 ## File path: sdks/python/apache_beam/dataframe/frames.py ## @@ -1917,9 +1918,12 @@ def func(df, *args, **kwargs): '__i%s__' % base, frame_base._elementwise_method('__i%s__' % base, inplace=True)) -for name in ['__lt__', '__le__', '__gt__', '__ge__', '__eq__', '__ne__']: - setattr(DeferredSeries, name, frame_base._elementwise_method(name)) - setattr(DeferredDataFrame, name, frame_base._elementwise_method(name)) +for name in ['lt', 'le', 'gt', 'ge', 'eq', 'ne']: + for p in '%s', '__%s__': +# Note that non-underscore name is used for both as the __xxx__ methods are +# order-sensitive. Review comment: It appears just to be the comparison operators that suffer from this defect. For PCollections, the order is unspecified, so in some sense one can't say that comparison of "differently-ordered" dataframes should be rejected (as the notion of "differently-ordered" is not well defined). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] robertwb commented on a change in pull request #13215: [BEAM-11151] Adds the ToStringFnRunner to Java
robertwb commented on a change in pull request #13215: URL: https://github.com/apache/beam/pull/13215#discussion_r524491784 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/ToStringFnRunner.java ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.fn.harness; + +import com.google.auto.service.AutoService; +import java.util.Map; +import org.apache.beam.model.pipeline.v1.RunnerApi.PTransform; +import org.apache.beam.runners.core.construction.PTransformTranslation; +import org.apache.beam.sdk.function.ThrowingFunction; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap; + +/** + * Translates from elements to human-readable string. + * + * Translation function: + * + * + * Input: {@code KV} + * Output: {@code KV} + * + * + * For each element, the human-readable string is returned. The nonce is used by a runner to + * associate each input with its output. The nonce is represented as an opaque set of bytes. + */ +@SuppressWarnings({ Review comment: Can you scope this more narrowly, or use instead? ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/ToStringFnRunner.java ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.fn.harness; + +import com.google.auto.service.AutoService; +import java.util.Map; +import org.apache.beam.model.pipeline.v1.RunnerApi.PTransform; +import org.apache.beam.runners.core.construction.PTransformTranslation; +import org.apache.beam.sdk.function.ThrowingFunction; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap; + +/** + * Translates from elements to human-readable string. + * + * Translation function: + * + * + * Input: {@code KV} + * Output: {@code KV} + * + * + * For each element, the human-readable string is returned. The nonce is used by a runner to + * associate each input with its output. The nonce is represented as an opaque set of bytes. + */ +@SuppressWarnings({ + "rawtypes" // TODO(https://issues.apache.org/jira/browse/BEAM-10556) +}) +public class ToStringFnRunner { + static final String URN = PTransformTranslation.TO_STRING_TRANSFORM_URN; + + /** + * A registrar which provides a factory to handle translating elements to a human readable string. + */ + @AutoService(PTransformRunnerFactory.Registrar.class) + public static class Registrar implements PTransformRunnerFactory.Registrar { + +@Override +public Map getPTransformRunnerFactories() { + return ImmutableMap.of( + URN, + MapFnRunners.forValueMapFnFactory(ToStringFnRunner::createToStringFunctionForPTransform)); +} + } + + static ThrowingFunction, KV> createToStringFunctionForPTransform( + String ptransformId, PTransform pTransform) { +return (KV input) -> { + String val = "null"; Review comment: Rather than assigning a value and then overwriting it, do if (v == null) { val = "null"; } else { val = v.toString(); } You could also just return "" + val or, even better, just use Objects.toString(input.getValue()). This is an automated message from the Apache Git Service. To respond to the message,
[GitHub] [beam] TheNeuralBit merged pull request #13331: [BEAM-11219][Website revamp] Development of All about Apache Beam component
TheNeuralBit merged pull request #13331: URL: https://github.com/apache/beam/pull/13331 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on a change in pull request #13331: [BEAM-11219][Website revamp] Development of All about Apache Beam component
TheNeuralBit commented on a change in pull request #13331: URL: https://github.com/apache/beam/pull/13331#discussion_r524493084 ## File path: website/www/site/assets/icons/extensive-icon.svg ## @@ -0,0 +1,7 @@ +http://www.w3.org/2000/svg; width="112" height="112" fill="none" viewBox="0 0 112 112"> + + + + + + Review comment: Ah neat, thanks for the explanation This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] boyuanzz commented on a change in pull request #13026: [WIP] [BEAM-7003 BEAM-8639 BEAM-8774] Update Kafka dependencies, enable IT test in Postcommit
boyuanzz commented on a change in pull request #13026: URL: https://github.com/apache/beam/pull/13026#discussion_r524491777 ## File path: sdks/java/io/kafka/build.gradle ## @@ -65,26 +76,68 @@ dependencies { testCompile library.java.junit testCompile library.java.powermock testCompile library.java.powermock_mockito + testCompile library.java.testcontainers_kafka testRuntimeOnly library.java.slf4j_jdk14 testRuntimeOnly project(path: ":runners:direct-java", configuration: "shadow") - kafkaVersion210 "org.apache.kafka:kafka-clients:2.1.0" + kafkaVersions.each {"kafkaVersion$it.key" "org.apache.kafka:kafka-clients:$it.value"} } -configurations.kafkaVersion210 { - resolutionStrategy { -force "org.apache.kafka:kafka-clients:2.1.0" +kafkaVersions.each { kv -> + configurations."kafkaVersion$kv.key" { +resolutionStrategy { + force "org.apache.kafka:kafka-clients:$kv.value" +} } } -task kafkaVersion210Test(type: Test) { Review comment: It seems like the sdk docker image is cleaned up incorrectly before test finishes. I can have a quick fix for that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] codecov[bot] edited a comment on pull request #13026: [WIP] [BEAM-7003 BEAM-8639 BEAM-8774] Update Kafka dependencies, enable IT test in Postcommit
codecov[bot] edited a comment on pull request #13026: URL: https://github.com/apache/beam/pull/13026#issuecomment-704839820 # [Codecov](https://codecov.io/gh/apache/beam/pull/13026?src=pr=h1) Report > Merging [#13026](https://codecov.io/gh/apache/beam/pull/13026?src=pr=desc) (000ac07) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **increase** coverage by `0.07%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/13026/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/13026?src=pr=tree) ```diff @@Coverage Diff @@ ## master #13026 +/- ## == + Coverage 82.48% 82.55% +0.07% == Files 455 455 Lines 5487655143 +267 == + Hits4526645526 +260 - Misses 9610 9617 +7 ``` | [Impacted Files](https://codecov.io/gh/apache/beam/pull/13026?src=pr=tree) | Coverage Δ | | |---|---|---| | [...ks/python/apache\_beam/runners/worker/data\_plane.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvZGF0YV9wbGFuZS5weQ==) | `88.32% <0.00%> (-1.20%)` | :arrow_down: | | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: | | [...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==) | `80.57% <0.00%> (-0.18%)` | :arrow_down: | | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: | | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.34% <0.00%> (ø)` | | | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | | | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: | | [sdks/python/apache\_beam/io/gcp/bigquery\_tools.py](https://codecov.io/gh/apache/beam/pull/13026/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X3Rvb2xzLnB5) | `90.69% <0.00%> (+2.90%)` | :arrow_up: | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/13026?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/13026?src=pr=footer). Last update [2f2ffda...f7d06ba](https://codecov.io/gh/apache/beam/pull/13026?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] piotr-szuberski commented on pull request #13026: [WIP] [BEAM-7003 BEAM-8639 BEAM-8774] Update Kafka dependencies, enable IT test in Postcommit
piotr-szuberski commented on pull request #13026: URL: https://github.com/apache/beam/pull/13026#issuecomment-728249571 Run Java PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] y1chi commented on a change in pull request #13350: [BEAM-11266] Python IO MongoDB: add bucket_auto aggregation option for bundling in Atlas.
y1chi commented on a change in pull request #13350: URL: https://github.com/apache/beam/pull/13350#discussion_r524486089 ## File path: sdks/python/apache_beam/io/mongodbio.py ## @@ -241,6 +275,27 @@ def _get_split_keys(self, desired_chunk_size_in_mb, start_pos, end_pos): max={'_id': end_pos}, maxChunkSize=desired_chunk_size_in_mb)['splitKeys']) + def _get_buckets(self, desired_chunk_size, start_pos, end_pos): +if start_pos >= end_pos: + # single document not splittable + return [] +size = self.estimate_size() +bucket_count = size // desired_chunk_size Review comment: The split function will likely be called recursively for dynamic rebalancing, so for a range with start_pos and end_pos, it can be further split upon backend request, so it might not reasonable to always use the total collection size divide by desired_chunk_size to calculate the bucket count. Is it possible to only get the buckets within the give _id range? and we can probably use an average document size times the number of documents to calculate the size of the range being split. ## File path: sdks/python/apache_beam/io/mongodbio.py ## @@ -241,6 +275,27 @@ def _get_split_keys(self, desired_chunk_size_in_mb, start_pos, end_pos): max={'_id': end_pos}, maxChunkSize=desired_chunk_size_in_mb)['splitKeys']) + def _get_buckets(self, desired_chunk_size, start_pos, end_pos): +if start_pos >= end_pos: + # single document not splittable + return [] +size = self.estimate_size() +bucket_count = size // desired_chunk_size +if size % desired_chunk_size != 0: + bucket_count += 1 +with beam.io.mongodbio.MongoClient(self.uri, **self.spec) as client: + buckets = list( Review comment: the return buckets should guarantee the _id range is start_pos and end_pos otherwise same document could be read multiple times. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper
piotr-szuberski commented on pull request #12611: URL: https://github.com/apache/beam/pull/12611#issuecomment-728248437 > Looks good, merging now. Thanks for all your work on this @piotr-szuberski :) Thank you too for your reviews! :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] piotr-szuberski opened a new pull request #13354: [BEAM-8569] Add changes note about Hadoop 3 support
piotr-szuberski opened a new pull request #13354: URL: https://github.com/apache/beam/pull/13354 R: @iemejia Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](htt ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/) Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build