[jira] [Work logged] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.
[ https://issues.apache.org/jira/browse/BEAM-9775?focusedWorklogId=423898=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423898 ] ASF GitHub Bot logged work on BEAM-9775: Author: ASF GitHub Bot Created on: 17/Apr/20 03:43 Start Date: 17/Apr/20 03:43 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11443: [BEAM-9775] Add Go support for SDF StandardRequirements. URL: https://github.com/apache/beam/pull/11443#issuecomment-615022471 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423898) Time Spent: 20m (was: 10m) > Add capability to run SDFs on Flink Runner and Python FnApiRunner. > -- > > Key: BEAM-9775 > URL: https://issues.apache.org/jira/browse/BEAM-9775 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > This is pretty simple: Add any missing requirements for being able to > actually execute SDFs, and _maybe_ an example SDF or something that actually > works. This can be marked completed when we can run a simple SDF with the Go > SDK. > I'm still hesitant on the example SDF because it may imply that the feature > is ready for general usage, so it might need to be bundled with the > documentation PR, but we'll see. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.
[ https://issues.apache.org/jira/browse/BEAM-9775?focusedWorklogId=423899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423899 ] ASF GitHub Bot logged work on BEAM-9775: Author: ASF GitHub Bot Created on: 17/Apr/20 03:43 Start Date: 17/Apr/20 03:43 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11443: [BEAM-9775] Add Go support for SDF StandardRequirements. URL: https://github.com/apache/beam/pull/11443#issuecomment-615022600 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423899) Time Spent: 0.5h (was: 20m) > Add capability to run SDFs on Flink Runner and Python FnApiRunner. > -- > > Key: BEAM-9775 > URL: https://issues.apache.org/jira/browse/BEAM-9775 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > This is pretty simple: Add any missing requirements for being able to > actually execute SDFs, and _maybe_ an example SDF or something that actually > works. This can be marked completed when we can run a simple SDF with the Go > SDK. > I'm still hesitant on the example SDF because it may imply that the feature > is ready for general usage, so it might need to be bundled with the > documentation PR, but we'll see. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.
[ https://issues.apache.org/jira/browse/BEAM-9775?focusedWorklogId=423897=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423897 ] ASF GitHub Bot logged work on BEAM-9775: Author: ASF GitHub Bot Created on: 17/Apr/20 03:42 Start Date: 17/Apr/20 03:42 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11443: [BEAM-9775] Add Go support for SDF StandardRequirements. URL: https://github.com/apache/beam/pull/11443 This fills in just the SDF requirement for pipelines (defined in beam_runner_api.proto). The way it's implemented should be pretty simple to expand for future requirements too. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). 
[jira] [Created] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.
Daniel Oliveira created BEAM-9775: - Summary: Add capability to run SDFs on Flink Runner and Python FnApiRunner. Key: BEAM-9775 URL: https://issues.apache.org/jira/browse/BEAM-9775 Project: Beam Issue Type: Sub-task Components: sdk-go Reporter: Daniel Oliveira Assignee: Daniel Oliveira This is pretty simple: Add any missing requirements for being able to actually execute SDFs, and _maybe_ an example SDF or something that actually works. This can be marked completed when we can run a simple SDF with the Go SDK. I'm still hesitant on the example SDF because it may imply that the feature is ready for general usage, so it might need to be bundled with the documentation PR, but we'll see. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.
[ https://issues.apache.org/jira/browse/BEAM-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Oliveira updated BEAM-9775: -- Status: Open (was: Triage Needed) > Add capability to run SDFs on Flink Runner and Python FnApiRunner. > -- > > Key: BEAM-9775 > URL: https://issues.apache.org/jira/browse/BEAM-9775 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > > This is pretty simple: Add any missing requirements for being able to > actually execute SDFs, and _maybe_ an example SDF or something that actually > works. This can be marked completed when we can run a simple SDF with the Go > SDK. > I'm still hesitant on the example SDF because it may imply that the feature > is ready for general usage, so it might need to be bundled with the > documentation PR, but we'll see. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9698) BeamUncollectRel UncollectDoFn NullPointerException
[ https://issues.apache.org/jira/browse/BEAM-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085395#comment-17085395 ] Kenneth Knowles commented on BEAM-9698: --- Looks like an issue in CAST. Per https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types arrays cannot be NULL. So this test seems to expect CAST(NULL AS ARRAY<...>) to produce an empty array. > BeamUncollectRel UncollectDoFn NullPointerException > --- > > Key: BEAM-9698 > URL: https://issues.apache.org/jira/browse/BEAM-9698 > Project: Beam > Issue Type: Bug > Components: dsl-sql-zetasql >Reporter: Andrew Pilloud >Priority: Minor > Labels: zetasql-compliance > > two failures in shard 19 > {code} > org.apache.beam.sdk.Pipeline$PipelineExecutionException: > java.lang.NullPointerException > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348) > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318) > at > org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213) > at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.runCollector(BeamEnumerableConverter.java:201) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.collectRows(BeamEnumerableConverter.java:218) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:150) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127) > at > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:329) > at > 
com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423) > at > com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711) > at > com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamUncollectRel$UncollectDoFn.process(BeamUncollectRel.java:103) > {code} > {code} > Apr 01, 2020 5:58:27 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS > ARRAY)) e > Apr 01, 2020 5:58:27 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS > ARRAY>)) e > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
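Kenneth Knowles's comment on BEAM-9698 suggests that the compliance test expects CAST(NULL AS ARRAY<...>) to behave like an empty array, so that UNNEST over it yields zero rows instead of a NullPointerException. Below is a minimal sketch of that null-to-empty behavior in plain Java. The class NullSafeUnnest and its unnest method are hypothetical names used only for illustration; this is not Beam's UncollectDoFn and not necessarily how the issue was fixed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Beam: models UNNEST over a possibly-null array.
final class NullSafeUnnest {

    // Returns one output element per array element.
    // Assumption: a NULL array is treated as an empty array, so it produces no rows.
    static <T> List<T> unnest(List<T> array) {
        if (array == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(array);
    }

    public static void main(String[] args) {
        // Corresponds to: SELECT e FROM UNNEST(CAST(NULL AS ARRAY<...>)) e
        System.out.println(unnest(null).size()); // prints 0
        System.out.println(unnest(Arrays.asList(1, 2, 3))); // prints [1, 2, 3]
    }
}
```

Whether the null check belongs in the UNNEST step or in CAST itself is the open question in the comment above; the sketch only shows the expected observable result.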
[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?
[ https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=423891=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423891 ] ASF GitHub Bot logged work on BEAM-9444: Author: ASF GitHub Bot Created on: 17/Apr/20 03:11 Start Date: 17/Apr/20 03:11 Worklog Time Spent: 10m Work Description: suztomo commented on pull request #11391: [BEAM-9444] Rebasing old version of PR 11156 (no need to review) URL: https://github.com/apache/beam/pull/11391 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423891) Time Spent: 13h 20m (was: 13h 10m) > Shall we use GCP Libraries BOM to specify Google-related library versions? > -- > > Key: BEAM-9444 > URL: https://issues.apache.org/jira/browse/BEAM-9444 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot > 2020-03-17 at 16.01.16.png > > Time Spent: 13h 20m > Remaining Estimate: 0h > > Shall we use GCP Libraries BOM to specify Google-related library versions? > > I've been working on Beam's dependency upgrades in the past few months. I > think it's time to consider a long-term solution to keep the libraries > up-to-date with small maintenance effort. To achieve that, I propose Beam to > use GCP Libraries BOM to set the Google-related library versions, rather than > trying to make changes in each of ~30 Google libraries. > > h1. Background > A BOM is pom.xml that provides dependencyManagement to importing projects. > > GCP Libraries BOM is a BOM that includes many Google Cloud related libraries > + gRPC + protobuf. 
We (Google Cloud Java Diamond Dependency team) maintain > the BOM so that the set of the libraries are compatible with each other. > > h1. Implementation > Notes for obstacles. > h2. BeamModulePlugin's "force" does not take BOM into account (thus fails) > {{forcedModules}} via version resolution strategy is playing bad. This causes > {noformat} > A problem occurred evaluating project ':sdks:java:extensions:sql'. > Could not resolve all dependencies for configuration > ':sdks:java:extensions:sql:fmppTemplates'. > Invalid format: 'com.google.cloud:google-cloud-core'. Group, name and version > cannot be empty. Correct example: 'org.gradle:gradle-core:1.0'{noformat} > !Screen Shot 2020-03-13 at 13.33.01.png|width=489,height=287! > > h2. :sdks:java:maven-archetypes:examples needs the version of > google-http-client > The task requires the version for the library: > {code:java} > 'google-http-client.version': > dependencies.create(project.library.java.google_http_client).getVersion(), > {code} > This would generate NullPointerException. Running gradlew without the > subproject: > > {code:java} > ./gradlew -p sdks/java check -x :sdks:java:maven-archetypes:examples:check > {code} > h1. Problem in Gradle-generated pom files > The generated Maven artifact POM has invalid data due to the BOM change. For > example my locally installed > {{~/.m2/repository/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.21.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.21.0-SNAPSHOT.pom}} > had the following problems. > h2. The GCP Libraries BOM showing up in dependencies section: > {noformat} > > > com.google.cloud > libraries-bom > 4.2.0 > compile > > > com.google.guava > guava-jdk5 > ... > > > {noformat} > h2. The artifact that use the BOM in Gradle is missing version in the > dependency. > {noformat} > > com.google.api > gax > > compile > ... > > {noformat} > h1. 
DependencyManagement section in generated pom.xml > How can I check whether a entry in dependencies is "platform"? > !Screen Shot 2020-03-17 at 16.01.16.png|width=504,height=344! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9764) :sdks:java:container:generateThirdPartyLicenses failing
[ https://issues.apache.org/jira/browse/BEAM-9764?focusedWorklogId=423884=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423884 ] ASF GitHub Bot logged work on BEAM-9764: Author: ASF GitHub Bot Created on: 17/Apr/20 02:45 Start Date: 17/Apr/20 02:45 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #11428: [BEAM-9764] fix Java license failures with Python2_PVR_Flink PreCommit URL: https://github.com/apache/beam/pull/11428#issuecomment-615007536 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423884) Time Spent: 50m (was: 40m) > :sdks:java:container:generateThirdPartyLicenses failing > --- > > Key: BEAM-9764 > URL: https://issues.apache.org/jira/browse/BEAM-9764 > Project: Beam > Issue Type: Bug > Components: sdk-java-core, test-failures >Reporter: Udi Meiri >Assignee: Hannah Jiang >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/774/console > The traceback is interspersed with other logs: > {code} > Traceback (most recent call last): > Successfully pulled > java_third_party_licenses/protobuf-java-util-3.11.1.jar/LICENSE from > https://opensource.org/licenses/BSD-3-Clause > Successfully pulled java_third_party_licenses/protoc-3.11.0.jar/LICENSE from > http://www.apache.org/licenses/LICENSE-2.0.txt > File "sdks/java/container/license_scripts/pull_licenses_java.py", line 138, > in > Successfully pulled java_third_party_licenses/protoc-3.11.1.jar/LICENSE from > http://www.apache.org/licenses/LICENSE-2.0.txt > license_url = dep['moduleLicenseUrl'] > Successfully pulled java_third_party_licenses/zetasketch-0.1.0.jar/LICENSE > from 
http://www.apache.org/licenses/LICENSE-2.0.txt > KeyError: 'moduleLicenseUrl' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-9773: --- Assignee: Steve Koonce > Update Dataflow Debug Capture to use Google API client Jackson 2 > > > Key: BEAM-9773 > URL: https://issues.apache.org/jira/browse/BEAM-9773 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Steve Koonce >Assignee: Steve Koonce >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > DebugCapture is using an old version of the JacksonFactory from Google API > client. Update it to use the latest to match the rest of the Dataflow runner > and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-9773: Status: Open (was: Triage Needed) > Update Dataflow Debug Capture to use Google API client Jackson 2 > > > Key: BEAM-9773 > URL: https://issues.apache.org/jira/browse/BEAM-9773 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Steve Koonce >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > DebugCapture is using an old version of the JacksonFactory from Google API > client. Update it to use the latest to match the rest of the Dataflow runner > and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?focusedWorklogId=423882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423882 ] ASF GitHub Bot logged work on BEAM-9773: Author: ASF GitHub Bot Created on: 17/Apr/20 02:37 Start Date: 17/Apr/20 02:37 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11442: [BEAM-9773]: Update Dataflow Debug Capture to use Google API client J… URL: https://github.com/apache/beam/pull/11442#issuecomment-615005670 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423882) Time Spent: 50m (was: 40m) > Update Dataflow Debug Capture to use Google API client Jackson 2 > > > Key: BEAM-9773 > URL: https://issues.apache.org/jira/browse/BEAM-9773 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Steve Koonce >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > DebugCapture is using an old version of the JacksonFactory from Google API > client. Update it to use the latest to match the rest of the Dataflow runner > and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP
[ https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=423881=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423881 ] ASF GitHub Bot logged work on BEAM-8603: Author: ASF GitHub Bot Created on: 17/Apr/20 02:37 Start Date: 17/Apr/20 02:37 Worklog Time Spent: 10m Work Description: ihji commented on pull request #10055: [BEAM-8603] Add Python SqlTransform URL: https://github.com/apache/beam/pull/10055#discussion_r409958523 ## File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy ## @@ -1779,6 +1779,33 @@ class BeamModulePlugin implements Plugin { cleanupTask.mustRunAfter pythonTask config.cleanupJobServer.mustRunAfter pythonTask } + // Task for running testcases in Python SDK + def testOpts = [ +"--attr=UsesSqlExpansionService" + ] + def pipelineOpts = [ +"--runner=PortableRunner", +"--environment_cache_millis=1", +"--job_endpoint=${config.jobEndpoint}" + ] + def beamPythonTestPipelineOptions = [ +"pipeline_opts": pipelineOpts, +"test_opts": testOpts, +"suite": "xlangSqlValidateRunner" + ] + def cmdArgs = project.project(':sdks:python').mapToArgString(beamPythonTestPipelineOptions) + def pythonSqlTask = project.tasks.create(name: config.name+"PythonUsingSql", type: Exec) { Review comment: If it's outside of the each block then it should be fine. The advantage of using existing testing expansion service here is you don't need to start and stop another expansion service inside the test code. The expansion service starts and finishes at the beginning and the end of the whole test suite. You can just pass the jar name and the port number by environment variables. Still there's no harm in launching your own expansion service though. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423881) Time Spent: 9.5h (was: 9h 20m) > Add Python SqlTransform MVP > --- > > Key: BEAM-8603 > URL: https://issues.apache.org/jira/browse/BEAM-8603 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Time Spent: 9.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9743) TFRecordCodec not attempt to fully read/write
[ https://issues.apache.org/jira/browse/BEAM-9743?focusedWorklogId=423880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423880 ] ASF GitHub Bot logged work on BEAM-9743: Author: ASF GitHub Bot Created on: 17/Apr/20 02:36 Start Date: 17/Apr/20 02:36 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11397: [BEAM-9743] Fix TFRecordCodec to try harder to read/write URL: https://github.com/apache/beam/pull/11397#issuecomment-615005288 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423880) Time Spent: 3h (was: 2h 50m) > TFRecordCodec not attempt to fully read/write > - > > Key: BEAM-9743 > URL: https://issues.apache.org/jira/browse/BEAM-9743 > Project: Beam > Issue Type: Bug > Components: io-java-tfrecord, sdk-java-core >Reporter: Kyoungha Min >Assignee: Kyoungha Min >Priority: Critical > Time Spent: 3h > Remaining Estimate: 0h > > The same issue has been reported before and those issues were marked resolved, but > parts of the problem remain: > https://issues.apache.org/jira/browse/BEAM-5412?jql=text%20~%20%22tfrecord%22 > > Issue # 1: TFRecordCodec only tries once to read the header/footer. This is > likely to fail around the end of the channel buffer. > Issue # 2: (minor) TFRecordCodec currently does not check how much it > writes. > > It seems to happen only with Zstd compression (or any other picky input > stream that refuses to read fully). ZstdInputStream seems very picky about giving > out data. 
> The affected code is at > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L672] > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L699] > > The write path is not much of a problem within the Beam application (as all (or most) of the > WritableByteChannels in beam-java-sdk-core are backed by some OutputStream), > but it still does not follow the WritableByteChannel specification: > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L720-L727] > > The ReadableByteChannel/WritableByteChannel Javadoc specifies that they are not > required to read/write fully, and can refuse to read/write from time to time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
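The point in the BEAM-9743 description, that a single read or write call on a channel may transfer fewer bytes than requested, can be illustrated with a small self-contained sketch. The class ChannelIo and the oneByteAtATime test channel are hypothetical names invented here; they are not the code in TFRecordIO.java, and the loops assume blocking channels (a non-blocking channel that keeps returning 0 would make them spin).

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helpers for illustration only; assumes blocking channels.
final class ChannelIo {

    // Keeps reading until the buffer is full; fails if the channel ends first.
    static void readFully(ReadableByteChannel in, ByteBuffer bb) throws IOException {
        int expected = bb.remaining();
        while (bb.hasRemaining()) {
            if (in.read(bb) < 0) {
                int actual = expected - bb.remaining();
                throw new EOFException(
                    String.format("expected %d bytes, but got %d", expected, actual));
            }
        }
    }

    // Keeps writing until the buffer is drained, since write() may be partial.
    static void writeFully(WritableByteChannel out, ByteBuffer bb) throws IOException {
        while (bb.hasRemaining()) {
            out.write(bb);
        }
    }

    // A deliberately "picky" channel that hands out at most one byte per read call.
    static ReadableByteChannel oneByteAtATime(byte[] data) {
        ReadableByteChannel delegate = Channels.newChannel(new ByteArrayInputStream(data));
        return new ReadableByteChannel() {
            @Override
            public int read(ByteBuffer dst) throws IOException {
                if (!dst.hasRemaining()) {
                    return 0;
                }
                ByteBuffer one = ByteBuffer.allocate(1);
                int n = delegate.read(one);
                if (n <= 0) {
                    return n; // end of stream (-1) or nothing available
                }
                one.flip();
                dst.put(one);
                return n;
            }

            @Override
            public boolean isOpen() {
                return delegate.isOpen();
            }

            @Override
            public void close() throws IOException {
                delegate.close();
            }
        };
    }
}
```

With oneByteAtATime, a single in.read(header) call would fill only one byte of a 12-byte header buffer, which is the kind of short read the issue attributes to ZstdInputStream; readFully loops until the buffer is full or the stream ends.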
[jira] [Created] (BEAM-9774) apache-beam[gcp] install failed on arch linux ( 64bit)
Sunil Menon created BEAM-9774: - Summary: apache-beam[gcp] install failed on arch linux ( 64bit) Key: BEAM-9774 URL: https://issues.apache.org/jira/browse/BEAM-9774 Project: Beam Issue Type: Bug Components: dependencies Environment: NVIDIA Nano aarch64 python version: 3.7.1 python version: 3.6.8 pip version: 20.0.2 Reporter: Sunil Menon My system details are as follows: NVIDIA Nano aarch64 python version: 3.7.1 python version: 3.6.8 pip version: 20.0.2 I ran the commands from the Google git folder training-data-analyst/courses/data_analysis/lab2/python/install_packages.sh install_packages.sh contains the following: apt-get install python3-pip pip3 install apache-beam[gcp] pip3 install oauth2client==3.0.0 pip3 install -U pip The apache-beam install fails with an error while installing pyarrow, with the following message (truncated): copying pyarrow/tests/data/parquet/v0.7.1.some-named-index.parquet -> build/lib.linux-aarch64-3.6/pyarrow/tests/data/parquet running build_ext creating /tmp/pip-req-build-bavmmcwx/build/temp.linux-aarch64-3.6 -- Running cmake for pyarrow cmake -DPYTHON_EXECUTABLE=/home/sunil/gcpbeam/bin/python -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=off -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on -DCMAKE_BUILD_TYPE=release /tmp/pip-req-build-bavmmcwx error: command 'cmake' failed with exit status 1 ERROR: Failed building wheel for pyarrow Failed to build pyarrow ERROR: Could not build wheels for pyarrow which use PEP 517 and cannot be installed directly When I run the same commands in a Google Cloud Shell, they work; I am not sure why. 
I also downloaded arrow-apache-arrow-0.16.0.tar.gzip and tried to build it locally, but it gave the same error. Can someone help me get this fixed? The problem is seen on both python 3.7.1 and python 3.6.8 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9743) TFRecordCodec not attempt to fully read/write
[ https://issues.apache.org/jira/browse/BEAM-9743?focusedWorklogId=423874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423874 ] ASF GitHub Bot logged work on BEAM-9743: Author: ASF GitHub Bot Created on: 17/Apr/20 02:15 Start Date: 17/Apr/20 02:15 Worklog Time Spent: 10m Work Description: lukemin89 commented on pull request #11397: [BEAM-9743] Fix TFRecordCodec to try harder to read/write URL: https://github.com/apache/beam/pull/11397#discussion_r409952691 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java ## @@ -717,14 +715,38 @@ public void write(WritableByteChannel outChannel, byte[] data) throws IOExceptio header.clear(); header.putLong(data.length).putInt(maskedCrc32OfLength); header.rewind(); - outChannel.write(header); + writeFully(outChannel, header); - outChannel.write(ByteBuffer.wrap(data)); + writeFully(outChannel, ByteBuffer.wrap(data)); footer.clear(); footer.putInt(maskedCrc32OfData); footer.rewind(); - outChannel.write(footer); + writeFully(outChannel, footer); +} + +@VisibleForTesting +static void readFully(ReadableByteChannel in, ByteBuffer bb) throws IOException { + int expected = bb.remaining(); + int actual = read(in, bb); + if (expected != actual) { +throw new IOException(String.format("expected %d, but got %d", expected, actual)); + } +} + +private static int read(ReadableByteChannel in, ByteBuffer bb) throws IOException { + int n, read = 0; + while (bb.hasRemaining() && (n = in.read(bb)) >= 0) { +read += n; + } + return read; +} + +@VisibleForTesting +static void writeFully(WritableByteChannel channel, ByteBuffer buffer) throws IOException { + while (buffer.hasRemaining()) { +channel.write(buffer); + } Review comment: Thanks for the confirmation :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423874) Time Spent: 2h 50m (was: 2h 40m) > TFRecordCodec not attempt to fully read/write > - > > Key: BEAM-9743 > URL: https://issues.apache.org/jira/browse/BEAM-9743 > Project: Beam > Issue Type: Bug > Components: io-java-tfrecord, sdk-java-core >Reporter: Kyoungha Min >Assignee: Kyoungha Min >Priority: Critical > Time Spent: 2h 50m > Remaining Estimate: 0h > > The same issue has been pointed out before and was marked resolved, but > parts of it remain: > https://issues.apache.org/jira/browse/BEAM-5412?jql=text%20~%20%22tfrecord%22 > > Issue # 1: TFRecordCodec tries only once to read the header/footer. This is > likely to fail near the end of the channel buffer. > Issue # 2: (minor) TFRecordCodec currently does not check how much it > writes. > > This seems to happen only with Zstd compression (or any other picky input > stream that refuses to read fully). ZstdInputStream seems very picky about giving > out data. > The affected parts are > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L672] > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L699] > > This is not much of a problem within the Beam application (as all, or most, > WritableByteChannels in beam-java-sdk-core are backed by some OutputStream), > but it still does not follow the WritableByteChannel specification: > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L720-L727] > > The ReadableByteChannel/WritableByteChannel Javadoc specifies that they are not > required to read/write fully, and can refuse to read/write from time to time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9743) TFRecordCodec not attempt to fully read/write
[ https://issues.apache.org/jira/browse/BEAM-9743?focusedWorklogId=423873=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423873 ] ASF GitHub Bot logged work on BEAM-9743: Author: ASF GitHub Bot Created on: 17/Apr/20 02:12 Start Date: 17/Apr/20 02:12 Worklog Time Spent: 10m Work Description: lukemin89 commented on pull request #11397: [BEAM-9743] Fix TFRecordCodec to try harder to read/write URL: https://github.com/apache/beam/pull/11397#discussion_r409952123
## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java ##
@@ -440,4 +456,115 @@ public void processElement(ProcessContext c) {
       c.output(c.element().getBytes(Charsets.UTF_8));
     }
   }
+
+  static boolean maybeThisTime() {
+    return ThreadLocalRandom.current().nextBoolean();
+  }
+
+  static class PickyReadChannel extends FilterInputStream implements ReadableByteChannel {
+    protected PickyReadChannel(InputStream in) {
+      super(in);
+    }
+
+    @Override
+    public int read(byte[] b, int off, int len) {
+      throw new UnsupportedOperationException();
+    }
+
+    @Override
+    public int read(ByteBuffer dst) throws IOException {
+      if (!maybeThisTime() || !dst.hasRemaining()) {
+        return 0;
+      }
+      int n = read();
+      if (n == -1) {
+        return -1;
+      }
+      dst.put((byte) n);
+      return 1;
+    }
+
+    @Override
+    public boolean isOpen() {
+      throw new UnsupportedOperationException();
+    }
+  }
+
+  static class PickyWriteChannel extends FilterOutputStream implements WritableByteChannel {
+    @Override
+    public void write(byte[] b, int off, int len) throws IOException {
+      throw new UnsupportedOperationException();
+    }
+
+    public PickyWriteChannel(OutputStream out) {
+      super(out);
+    }
+
+    @Override
+    public int write(ByteBuffer src) throws IOException {
+      if (!maybeThisTime() || !src.hasRemaining()) {
+        return 0;
+      }
+      write(src.get());
+      return 1;
+    }
+
+    @Override
+    public boolean isOpen() {
+      throw new UnsupportedOperationException();
+    }
+  }
+
+  @Test
+  public void testReadFully() throws IOException {
+    byte[] data = "Hello World".getBytes(StandardCharsets.UTF_8);
+    ReadableByteChannel chan = new PickyReadChannel(new ByteArrayInputStream(data));
+
+    ByteBuffer buffer = ByteBuffer.allocate(data.length);
+    TFRecordCodec.readFully(chan, buffer);
+
+    assertArrayEquals(data, buffer.array());
+  }
+
+  @Test(expected = IOException.class)
Review comment: done! changed to use `ExpectedException` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423873) Time Spent: 2h 40m (was: 2.5h) > TFRecordCodec not attempt to fully read/write > - > > Key: BEAM-9743 > URL: https://issues.apache.org/jira/browse/BEAM-9743 > Project: Beam > Issue Type: Bug > Components: io-java-tfrecord, sdk-java-core >Reporter: Kyoungha Min >Assignee: Kyoungha Min >Priority: Critical > Time Spent: 2h 40m > Remaining Estimate: 0h > > The same issue has been pointed out before and was marked resolved, but > parts of it remain: > https://issues.apache.org/jira/browse/BEAM-5412?jql=text%20~%20%22tfrecord%22 > > Issue # 1: TFRecordCodec tries only once to read the header/footer. This is > likely to fail near the end of the channel buffer. > Issue # 2: (minor) TFRecordCodec currently does not check how much it > writes. > > This seems to happen only with Zstd compression (or any other picky input > stream that refuses to read fully). ZstdInputStream seems very picky about giving > out data. > The affected parts are > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L672] > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L699] > > This is not much of a problem within the Beam application (as all, or most, > WritableByteChannels in beam-java-sdk-core are backed by some OutputStream), > but it still does not follow the WritableByteChannel specification: > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L720-L727] > >
[jira] [Work logged] (BEAM-9743) TFRecordCodec not attempt to fully read/write
[ https://issues.apache.org/jira/browse/BEAM-9743?focusedWorklogId=423872=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423872 ] ASF GitHub Bot logged work on BEAM-9743: Author: ASF GitHub Bot Created on: 17/Apr/20 02:12 Start Date: 17/Apr/20 02:12 Worklog Time Spent: 10m Work Description: lukemin89 commented on pull request #11397: [BEAM-9743] Fix TFRecordCodec to try harder to read/write URL: https://github.com/apache/beam/pull/11397#discussion_r409952021
## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java ##
@@ -717,14 +716,38 @@ public void write(WritableByteChannel outChannel, byte[] data) throws IOExceptio
       header.clear();
       header.putLong(data.length).putInt(maskedCrc32OfLength);
       header.rewind();
-      outChannel.write(header);
+      writeFully(outChannel, header);

-      outChannel.write(ByteBuffer.wrap(data));
+      writeFully(outChannel, ByteBuffer.wrap(data));

       footer.clear();
       footer.putInt(maskedCrc32OfData);
       footer.rewind();
-      outChannel.write(footer);
+      writeFully(outChannel, footer);
+    }
+
+    @VisibleForTesting
+    static void readFully(ReadableByteChannel in, ByteBuffer bb) throws IOException {
+      int expected = bb.remaining();
+      int actual = read(in, bb);
+      if (expected != actual) {
+        throw new IOException(String.format("expected %d, but got %d", expected, actual));
+      }
+    }
+
+    private static int read(ReadableByteChannel in, ByteBuffer bb) throws IOException {
+      int n, read = 0;
+      while (bb.hasRemaining() && (n = in.read(bb)) >= 0) {
+        read += n;
+      }
+      return read;
+    }
Review comment: Done! (`spotlessApply` forced me to change bracket loc) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423872) Time Spent: 2.5h (was: 2h 20m) > TFRecordCodec not attempt to fully read/write > - > > Key: BEAM-9743 > URL: https://issues.apache.org/jira/browse/BEAM-9743 > Project: Beam > Issue Type: Bug > Components: io-java-tfrecord, sdk-java-core >Reporter: Kyoungha Min >Assignee: Kyoungha Min >Priority: Critical > Time Spent: 2.5h > Remaining Estimate: 0h > > The same issue has been pointed out before and was marked resolved, but > parts of it remain: > https://issues.apache.org/jira/browse/BEAM-5412?jql=text%20~%20%22tfrecord%22 > > Issue # 1: TFRecordCodec tries only once to read the header/footer. This is > likely to fail near the end of the channel buffer. > Issue # 2: (minor) TFRecordCodec currently does not check how much it > writes. > > This seems to happen only with Zstd compression (or any other picky input > stream that refuses to read fully). ZstdInputStream seems very picky about giving > out data. > The affected parts are > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L672] > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L699] > > This is not much of a problem within the Beam application (as all, or most, > WritableByteChannels in beam-java-sdk-core are backed by some OutputStream), > but it still does not follow the WritableByteChannel specification: > [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L720-L727] > > The ReadableByteChannel/WritableByteChannel Javadoc specifies that they are not > required to read/write fully, and can refuse to read/write from time to time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
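The partial read/write behaviour described in BEAM-9743 can be illustrated in isolation. The following is a minimal standalone sketch, not the code merged in the pull request: the class name ChannelIoSketch is hypothetical, and raising EOFException on a short read is an assumption made here for illustration (the quoted diff throws a plain IOException). It loops until the buffer is filled or drained, since a single call on a channel may transfer fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Apache Beam.
final class ChannelIoSketch {

  // Keeps reading until the buffer is full or the channel reports end-of-stream.
  // A channel may legitimately return 0 bytes on a call, so one read() is not enough.
  static void readFully(ReadableByteChannel in, ByteBuffer bb) throws IOException {
    int expected = bb.remaining();
    int actual = 0;
    while (bb.hasRemaining()) {
      int n = in.read(bb);
      if (n < 0) {
        break; // end of stream before the buffer was filled
      }
      actual += n;
    }
    if (actual != expected) {
      // Assumption: EOFException is used here only to make the failure mode explicit.
      throw new EOFException(String.format("expected %d bytes, but got %d", expected, actual));
    }
  }

  // Keeps writing until the buffer is drained; write() may accept only part of it.
  static void writeFully(WritableByteChannel out, ByteBuffer bb) throws IOException {
    while (bb.hasRemaining()) {
      out.write(bb);
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "Hello World".getBytes(StandardCharsets.UTF_8);

    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    writeFully(Channels.newChannel(sink), ByteBuffer.wrap(data));

    ByteBuffer buffer = ByteBuffer.allocate(data.length);
    readFully(Channels.newChannel(new ByteArrayInputStream(sink.toByteArray())), buffer);

    System.out.println(new String(buffer.array(), StandardCharsets.UTF_8));
  }
}
```

Note that both loops spin if a blocking channel keeps returning 0 without ever making progress; the sketch does not guard against that case.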
[jira] [Work logged] (BEAM-9739) SpannerIO - Retry on Aborted Exception during schema change
[ https://issues.apache.org/jira/browse/BEAM-9739?focusedWorklogId=423867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423867 ] ASF GitHub Bot logged work on BEAM-9739: Author: ASF GitHub Bot Created on: 17/Apr/20 01:43 Start Date: 17/Apr/20 01:43 Worklog Time Spent: 10m Work Description: allenpradeep commented on issue #11392: [BEAM-9739] Retry SpannerIO write on Schema change URL: https://github.com/apache/beam/pull/11392#issuecomment-614989387 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423867) Time Spent: 1h 20m (was: 1h 10m) > SpannerIO - Retry on Aborted Exception during schema change > --- > > Key: BEAM-9739 > URL: https://issues.apache.org/jira/browse/BEAM-9739 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Allen Pradeep Xavier >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Spanner aborts all transactions in flight when there is a schema change and > returns an Aborted Exception. The client is expected to retry the transaction > silently. > SpannerIO does not handle the exception and this is propagated to the > pipeline. This bug is to track the changes to retry on Aborted Exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)
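The retry behaviour requested in BEAM-9739 can be sketched in general terms. The following is a hedged illustration only, not the change made in the pull request and not the Spanner client API: RetryOnAbortSketch, its nested AbortedException, and the maxAttempts parameter are all hypothetical stand-ins. It re-runs a transaction body when it is aborted and gives up after a bounded number of attempts.

```java
import java.util.function.Supplier;

// Hypothetical illustration class; not part of Apache Beam or the Spanner client.
final class RetryOnAbortSketch {

  // Hypothetical stand-in for the Aborted Exception mentioned in the issue.
  static final class AbortedException extends RuntimeException {
    AbortedException(String message) {
      super(message);
    }
  }

  // Runs the transaction body, retrying silently when it is aborted.
  // After maxAttempts aborted attempts the last exception is rethrown.
  static <T> T retryOnAborted(Supplier<T> transaction, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    AbortedException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return transaction.get();
      } catch (AbortedException e) {
        last = e; // aborted, e.g. by a concurrent schema change: try again
      }
    }
    throw last;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    String result =
        retryOnAborted(
            () -> {
              // Simulate two aborted attempts followed by a successful one.
              if (calls[0]++ < 2) {
                throw new AbortedException("schema change in progress");
              }
              return "committed";
            },
            5);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

A production version would normally add a backoff delay between attempts; that is left out here to keep the sketch short.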
[jira] [Work logged] (BEAM-9700) Support ValueProvider for HCatalogIO
[ https://issues.apache.org/jira/browse/BEAM-9700?focusedWorklogId=423862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423862 ] ASF GitHub Bot logged work on BEAM-9700: Author: ASF GitHub Bot Created on: 17/Apr/20 01:31 Start Date: 17/Apr/20 01:31 Worklog Time Spent: 10m Work Description: aaltay commented on issue #11316: [BEAM-9700] [WIP] add integration for Hive IO at DataflowTemplate URL: https://github.com/apache/beam/pull/11316#issuecomment-614985745 Is this ready for a review? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423862) Remaining Estimate: 0h Time Spent: 10m > Support ValueProvider for HCatalogIO > > > Key: BEAM-9700 > URL: https://issues.apache.org/jira/browse/BEAM-9700 > Project: Beam > Issue Type: Improvement > Components: io-java-hcatalog >Reporter: chie hayashida >Assignee: chie hayashida >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > I'd like to integrate modules which use Hive as input for > [DataflowTemplates|https://github.com/GoogleCloudPlatform/DataflowTemplates]. > However, the current HCatalogIO.java doesn't support ValueProvider. > I'd like to add ValueProvider support to HCatalogIO. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9739) SpannerIO - Retry on Aborted Exception during schema change
[ https://issues.apache.org/jira/browse/BEAM-9739?focusedWorklogId=423858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423858 ] ASF GitHub Bot logged work on BEAM-9739: Author: ASF GitHub Bot Created on: 17/Apr/20 01:17 Start Date: 17/Apr/20 01:17 Worklog Time Spent: 10m Work Description: allenpradeep commented on issue #11392: [BEAM-9739] Retry SpannerIO write on Schema change URL: https://github.com/apache/beam/pull/11392#issuecomment-614981384 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423858) Time Spent: 1h 10m (was: 1h) > SpannerIO - Retry on Aborted Exception during schema change > --- > > Key: BEAM-9739 > URL: https://issues.apache.org/jira/browse/BEAM-9739 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Allen Pradeep Xavier >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Spanner aborts all transactions in flight when there is a schema change and > returns an Aborted Exception. The client is expected to retry the transaction > silently. > SpannerIO does not handle the exception and this is propagated to the > pipeline. This bug is to track the changes to retry on Aborted Exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9703) Create py validations runner test for metrics
[ https://issues.apache.org/jira/browse/BEAM-9703?focusedWorklogId=423847=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423847 ] ASF GitHub Bot logged work on BEAM-9703: Author: ASF GitHub Bot Created on: 17/Apr/20 00:34 Start Date: 17/Apr/20 00:34 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11319: [BEAM-9703]Include user distritribution into metric-dedicated validate runner test. URL: https://github.com/apache/beam/pull/11319 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423847) Time Spent: 4h 40m (was: 4.5h) > Create py validations runner test for metrics > - > > Key: BEAM-9703 > URL: https://issues.apache.org/jira/browse/BEAM-9703 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Time Spent: 4h 40m > Remaining Estimate: 0h > > Some of the metrics are not covered by a dedicated validation runner test. > We would like to create these tests if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=423849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423849 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 17/Apr/20 00:34 Start Date: 17/Apr/20 00:34 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423849) Time Spent: 18h 50m (was: 18h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9739) SpannerIO - Retry on Aborted Exception during schema change
[ https://issues.apache.org/jira/browse/BEAM-9739?focusedWorklogId=423841=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423841 ] ASF GitHub Bot logged work on BEAM-9739: Author: ASF GitHub Bot Created on: 16/Apr/20 23:51 Start Date: 16/Apr/20 23:51 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #11392: [BEAM-9739] Retry SpannerIO write on Schema change URL: https://github.com/apache/beam/pull/11392#issuecomment-614955150 Run Java PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423841) Time Spent: 1h (was: 50m) > SpannerIO - Retry on Aborted Exception during schema change > --- > > Key: BEAM-9739 > URL: https://issues.apache.org/jira/browse/BEAM-9739 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Allen Pradeep Xavier >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Spanner aborts all transactions in flight when there is a schema change and > returns an Aborted Exception. The client is expected to retry the transaction > silently. > SpannerIO does not handle the exception and this is propagated to the > pipeline. This bug is to track the changes to retry on Aborted Exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085321#comment-17085321 ] Kyle Weaver commented on BEAM-9745: --- Reassigning this to Cham who has more domain knowledge. > [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to > deserialize Custom DoFns and Custom Coders. > - > > Key: BEAM-9745 > URL: https://issues.apache.org/jira/browse/BEAM-9745 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, java-fn-execution, sdk-java-harness, > test-failures >Reporter: Daniel Oliveira >Assignee: Chamikara Madhusanka Jayalath >Priority: Blocker > Labels: currently-failing > Fix For: 2.21.0 > > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/] > * [Gradle Build > Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project] > Initial investigation: > The bug appears to be popping up on BigQuery tests mostly, but also a > BigTable and a Datastore test. > Here's an example stacktrace of the two errors, showing _only_ the error > messages themselves. Source: > [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe] > {noformat} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -191: > java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With > Execution Info > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -191: java.lang.IllegalArgumentException: unable to deserialize > Custom DoFn With Execution Info > ... 
> Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -206: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom > Coder Bytes > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -206: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom > Coder Bytes > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder > ... 
> {noformat} > Update: Looks like this has been failing as far back as [Apr > 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] > after a long period where the test was consistently timing out since [Mar > 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. > So it's hard to narrow down what commit may have caused this. Plus, the test > was failing due to a completely different BigQuery failure before anyway, so > it seems like this test will need to be completely fixed from scratch, > instead of tracking down a specific breaking change. > > _After you've filled out the above details, please [assign the issue to an > individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist]. > Assignee should [treat test failures as > high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test], > helping to fix the issue or find a more appropriate owner. See [Apache Beam > Post-Commit > Policies|https://beam.apache.org/contribute/postcommits-policies]._ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver reassigned BEAM-9745: - Assignee: Chamikara Madhusanka Jayalath (was: Kyle Weaver) > [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to > deserialize Custom DoFns and Custom Coders. > - > > Key: BEAM-9745 > URL: https://issues.apache.org/jira/browse/BEAM-9745 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, java-fn-execution, sdk-java-harness, > test-failures >Reporter: Daniel Oliveira >Assignee: Chamikara Madhusanka Jayalath >Priority: Blocker > Labels: currently-failing > Fix For: 2.21.0 > > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/] > * [Gradle Build > Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project] > Initial investigation: > The bug appears to be popping up on BigQuery tests mostly, but also a > BigTable and a Datastore test. > Here's an example stacktrace of the two errors, showing _only_ the error > messages themselves. Source: > [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe] > {noformat} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -191: > java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With > Execution Info > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -191: java.lang.IllegalArgumentException: unable to deserialize > Custom DoFn With Execution Info > ... 
> Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -206: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom > Coder Bytes > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -206: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom > Coder Bytes > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder > ... 
> {noformat} > Update: Looks like this has been failing as far back as [Apr > 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] > after a long period where the test was consistently timing out since [Mar > 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. > So it's hard to narrow down what commit may have caused this. Plus, the test > was failing due to a completely different BigQuery failure before anyway, so > it seems like this test will need to be completely fixed from scratch, > instead of tracking down a specific breaking change. > > _After you've filled out the above details, please [assign the issue to an > individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist]. > Assignee should [treat test failures as > high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test], > helping to fix the issue or find a more appropriate owner. See [Apache Beam > Post-Commit > Policies|https://beam.apache.org/contribute/postcommits-policies]._ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9739) SpannerIO - Retry on Aborted Exception during schema change
[ https://issues.apache.org/jira/browse/BEAM-9739?focusedWorklogId=423838=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423838 ] ASF GitHub Bot logged work on BEAM-9739: Author: ASF GitHub Bot Created on: 16/Apr/20 23:49 Start Date: 16/Apr/20 23:49 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #11392: [BEAM-9739] Retry SpannerIO write on Schema change URL: https://github.com/apache/beam/pull/11392#issuecomment-614954744 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423838) Time Spent: 50m (was: 40m) > SpannerIO - Retry on Aborted Exception during schema change > --- > > Key: BEAM-9739 > URL: https://issues.apache.org/jira/browse/BEAM-9739 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Allen Pradeep Xavier >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Spanner aborts all transactions in flight when there is a schema change and > returns an Aborted Exception. The client is expected to retry the transaction > silently. > SpannerIO does not handle the exception and this is propagated to the > pipeline. This bug is to track the changes to retry on Aborted Exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (BEAM-7981) ParDo function wrapper doesn't support Iterable output types
[ https://issues.apache.org/jira/browse/BEAM-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udi Meiri reopened BEAM-7981:
-
Missed a TODO in typed_pipeline_test

> ParDo function wrapper doesn't support Iterable output types
>
> Key: BEAM-7981
> URL: https://issues.apache.org/jira/browse/BEAM-7981
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Udi Meiri
> Assignee: Udi Meiri
> Priority: Major
> Fix For: 2.18.0
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> I believe the bug is in CallableWrapperDoFn.default_type_hints, which converts Iterable[str] to str.
> This test will be included (commented out) in https://github.com/apache/beam/pull/9283
> {code}
> def test_typed_callable_iterable_output(self):
>   @typehints.with_input_types(int)
>   @typehints.with_output_types(typehints.Iterable[str])
>   def do_fn(element):
>     return [[str(element)] * 2]
>   result = [1, 2] | beam.ParDo(do_fn)
>   self.assertEqual([['1', '1'], ['2', '2']], sorted(result))
> {code}
> Result:
> {code}
> ==
> ERROR: test_typed_callable_iterable_output (apache_beam.typehints.typed_pipeline_test.MainInputTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 104, in test_typed_callable_iterable_output
>     result = [1, 2] | beam.ParDo(do_fn)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/transforms/ptransform.py", line 519, in __ror__
>     p.run().wait_until_finish()
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", line 406, in run
>     self._options).run(False)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", line 419, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/direct/direct_runner.py", line 129, in run_pipeline
>     return runner.run_pipeline(pipeline, options)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 366, in run_pipeline
>     default_environment=self._default_environment))
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 373, in run_via_runner_api
>     return self.run_stages(stage_context, stages)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 455, in run_stages
>     stage_context.safe_coders)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 733, in _run_stage
>     result, splits = bundle_manager.process_bundle(data_input, data_output)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1663, in process_bundle
>     part, expected_outputs), part_inputs):
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in result_iterator
>     yield fs.pop().result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 432, in result
>     return self.__get_result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
>     raise self._exception
>   File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
>     result = self.fn(*self.args, **self.kwargs)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1663, in <lambda>
>     part, expected_outputs), part_inputs):
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1601, in process_bundle
>     result_future = self._worker_handler.control_conn.push(process_bundle_req)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1080, in push
>     response = self.worker.do_instruction(request)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 343, in do_instruction
>     request.instruction_id)
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 369, in process_bundle
>     bundle_processor.process_bundle(instruction_id))
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 593, in process_bundle
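The behaviour suspected in BEAM-7981 can be illustrated without Beam. The sketch below is a simplified stand-in and not Beam's actual `CallableWrapperDoFn.default_type_hints` code: `strip_iterable` and `matches` are hypothetical helpers, and the reading of the bug (one level of `Iterable[...]` being removed from a hint that already describes the output element type) is an assumption based on the report.

```python
import collections.abc
import typing


def strip_iterable(hint):
    """Hypothetical helper: unwrap one level of Iterable[T], returning T."""
    if typing.get_origin(hint) is collections.abc.Iterable:
        (element_type,) = typing.get_args(hint)
        return element_type
    return hint


def matches(value, hint):
    """Tiny checker that only understands str and Iterable[str]."""
    if hint is str:
        return isinstance(value, str)
    if hint == typing.Iterable[str]:
        return (not isinstance(value, str)
                and all(isinstance(item, str) for item in value))
    raise NotImplementedError(hint)


declared = typing.Iterable[str]   # the declared output element type
outputs = [['1', '1']]            # what do_fn(1) returns: one list element

# Checking the elements against the declared hint succeeds ...
print(all(matches(out, declared) for out in outputs))
# ... but once one level of Iterable is stripped, the same elements fail.
print(all(matches(out, strip_iterable(declared)) for out in outputs))
```

Under these assumptions, the list elements produced by `do_fn` satisfy `Iterable[str]` but not `str`, which would be consistent with a type-check failure once the hint has been converted.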
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=423835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423835 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 16/Apr/20 23:40 Start Date: 16/Apr/20 23:40 Worklog Time Spent: 10m Work Description: ibzib commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-614951136 What remains to be done on this PR? Test coverage? I'd like to get this merged as soon as possible to unblock the 2.21 release. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423835) Time Spent: 3.5h (was: 3h 20m) > ImpulseSourceFunction does not emit a final watermark > - > > Key: BEAM-9733 > URL: https://issues.apache.org/jira/browse/BEAM-9733 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, > unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the > flag is used in tests to shutdown the pipeline after reading all data). Most > pipelines will be long-running and thus do not specify the flag. > Not sending out the final watermark causes GroupByKey to hold back the data > of event time windows until the pipeline is shut down (the final watermark is > always emitted on pipeline shutdown which is why using the above flag works). -- This message was sent by Atlassian Jira (v8.3.4#803005)
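The effect described in BEAM-9733 can be pictured with a toy model. This is a sketch only and uses neither the Flink nor the Beam API: `fired_windows` and `FINAL_WATERMARK` are hypothetical names, and the rule "a window fires once the watermark reaches its end" is a simplification of event-time triggering.

```python
import math

# Stand-in for the maximum timestamp carried by a final watermark.
FINAL_WATERMARK = math.inf


def fired_windows(window_ends, watermarks):
    """Return the window end times that fire, in order, for a watermark sequence."""
    pending = sorted(window_ends)
    fired = []
    for watermark in watermarks:
        # A window's buffered data is released once the watermark passes its end.
        while pending and pending[0] <= watermark:
            fired.append(pending.pop(0))
    return fired


# The source stops without a final watermark: the window stays pending.
print(fired_windows([60], [0]))                   # []
# The source emits a final watermark: the window fires.
print(fired_windows([60], [0, FINAL_WATERMARK]))  # [60]
```

In this model the grouped data is held until something advances the watermark past the window end, which matches the report that results only appear on pipeline shutdown.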
[jira] [Commented] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085313#comment-17085313 ] Kyle Weaver commented on BEAM-9745: --- The old cause for failures is BEAM-9390. When I diff the list of currently failing tests with the test failures reported in BEAM-9390, it looks like there are five newly failing tests: org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaUpdateOptionsIT.testAllowFieldAddition org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaUpdateOptionsIT.testAllowFieldRelaxation org.apache.beam.sdk.io.gcp.bigtable.BigtableWriteIT.testE2EBigtableWrite org.apache.beam.sdk.io.gcp.datastore.V1WriteIT.testE2EV1WriteWithLargeEntities org.apache.beam.sdk.io.gcp.datastore.V1WriteIT.testE2EV1Write For the 2.21 release, we probably won't fix BEAM-9390, but we should at least make sure the newer failures don't represent a regression. > [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to > deserialize Custom DoFns and Custom Coders. > - > > Key: BEAM-9745 > URL: https://issues.apache.org/jira/browse/BEAM-9745 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, java-fn-execution, sdk-java-harness, > test-failures >Reporter: Daniel Oliveira >Assignee: Kyle Weaver >Priority: Blocker > Labels: currently-failing > Fix For: 2.21.0 > > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/] > * [Gradle Build > Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project] > Initial investigation: > The bug appears to be popping up on BigQuery tests mostly, but also a > BigTable and a Datastore test. > Here's an example stacktrace of the two errors, showing _only_ the error > messages themselves. 
Source: > [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe] > {noformat} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -191: > java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With > Execution Info > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -191: java.lang.IllegalArgumentException: unable to deserialize > Custom DoFn With Execution Info > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -206: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom > Coder Bytes > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -206: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... 
> Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes > ... > Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom > Coder Bytes > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder > ... > {noformat} > Update: Looks like this has been failing as far back as [Apr > 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] > after a long period where the test was consistently timing out since [Mar > 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. > So it's hard to narrow down what commit may have caused this. Plus, the test > was failing due to a completely different BigQuery failure before anyway, so > it seems like this test will need to be
[jira] [Work logged] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?focusedWorklogId=423834=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423834 ] ASF GitHub Bot logged work on BEAM-9773: Author: ASF GitHub Bot Created on: 16/Apr/20 23:36 Start Date: 16/Apr/20 23:36 Worklog Time Spent: 10m Work Description: scwhittle commented on issue #11442: [BEAM-9773]: Update Dataflow Debug Capture to use Google API client J… URL: https://github.com/apache/beam/pull/11442#issuecomment-614949937 Thanks! I think that the description would be better if it included why the test passes without this change and where this causes a failure. IIRC you mentioned that you had to add some logging to see this was what was causing the failure. Can you include that logging in the PR too? I'm also not a merger so maybe ask reuvenlax after making those changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423834) Time Spent: 40m (was: 0.5h) > Update Dataflow Debug Capture to use Google API client Jackson 2 > > > Key: BEAM-9773 > URL: https://issues.apache.org/jira/browse/BEAM-9773 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Steve Koonce >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > DebugCapture is using an old version of the JacksonFactory from Google API > client. Update it to use the latest to match the rest of the Dataflow runner > and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver resolved BEAM-9769. --- Resolution: Fixed > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?focusedWorklogId=423830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423830 ] ASF GitHub Bot logged work on BEAM-9769: Author: ASF GitHub Bot Created on: 16/Apr/20 23:18 Start Date: 16/Apr/20 23:18 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #11435: [release-2.21.0][BEAM-9769] Ensuring JSON is the default export format for BQ sink URL: https://github.com/apache/beam/pull/11435 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423830) Time Spent: 4h (was: 3h 50m) > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?focusedWorklogId=423828=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423828 ] ASF GitHub Bot logged work on BEAM-9773: Author: ASF GitHub Bot Created on: 16/Apr/20 23:09 Start Date: 16/Apr/20 23:09 Worklog Time Spent: 10m Work Description: stevekoonce commented on issue #11442: [BEAM-9773]: Update Dataflow Debug Capture to use Google API client J… URL: https://github.com/apache/beam/pull/11442#issuecomment-614942121 R: @scwhittle This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423828) Time Spent: 0.5h (was: 20m) > Update Dataflow Debug Capture to use Google API client Jackson 2 > > > Key: BEAM-9773 > URL: https://issues.apache.org/jira/browse/BEAM-9773 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Steve Koonce >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > DebugCapture is using an old version of the JacksonFactory from Google API > client. Update it to use the latest to match the rest of the Dataflow runner > and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?focusedWorklogId=423824=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423824 ] ASF GitHub Bot logged work on BEAM-9773: Author: ASF GitHub Bot Created on: 16/Apr/20 23:06 Start Date: 16/Apr/20 23:06 Worklog Time Spent: 10m Work Description: stevekoonce commented on issue #11442: [BEAM-9773]: Update Dataflow Debug Capture to use Google API client J… URL: https://github.com/apache/beam/pull/11442#issuecomment-614941168 Run Dataflow ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423824) Time Spent: 20m (was: 10m) > Update Dataflow Debug Capture to use Google API client Jackson 2 > > > Key: BEAM-9773 > URL: https://issues.apache.org/jira/browse/BEAM-9773 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Steve Koonce >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > DebugCapture is using an old version of the JacksonFactory from Google API > client. Update it to use the latest to match the rest of the Dataflow runner > and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
[ https://issues.apache.org/jira/browse/BEAM-9773?focusedWorklogId=423821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423821 ] ASF GitHub Bot logged work on BEAM-9773: Author: ASF GitHub Bot Created on: 16/Apr/20 23:04 Start Date: 16/Apr/20 23:04 Worklog Time Spent: 10m Work Description: stevekoonce commented on pull request #11442: [BEAM-9773]: Update Dataflow Debug Capture to use Google API client J… URL: https://github.com/apache/beam/pull/11442 …ackson 2 This change updates the DebugCapture within the Dataflow runner to use the latest Google API client JacksonFactory.
[jira] [Created] (BEAM-9773) Update Dataflow Debug Capture to use Google API client Jackson 2
Steve Koonce created BEAM-9773: -- Summary: Update Dataflow Debug Capture to use Google API client Jackson 2 Key: BEAM-9773 URL: https://issues.apache.org/jira/browse/BEAM-9773 Project: Beam Issue Type: Bug Components: runner-dataflow Reporter: Steve Koonce DebugCapture is using an old version of the JacksonFactory from Google API client. Update it to use the latest to match the rest of the Dataflow runner and the Java SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9703) Create py validations runner test for metrics
[ https://issues.apache.org/jira/browse/BEAM-9703?focusedWorklogId=423818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423818 ] ASF GitHub Bot logged work on BEAM-9703: Author: ASF GitHub Bot Created on: 16/Apr/20 22:56 Start Date: 16/Apr/20 22:56 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11319: [BEAM-9703]Include user distritribution into metric-dedicated validate runner test. URL: https://github.com/apache/beam/pull/11319#issuecomment-614937874 Run PythonLint PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423818) Time Spent: 4.5h (was: 4h 20m) > Create py validations runner test for metrics > - > > Key: BEAM-9703 > URL: https://issues.apache.org/jira/browse/BEAM-9703 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Time Spent: 4.5h > Remaining Estimate: 0h > > Some of the metrics are not covered by a dedicated validation runner test. > Would like to create these if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8646) PR #9814 appears to cause failures in fnapi_runner tests on Windows
[ https://issues.apache.org/jira/browse/BEAM-8646?focusedWorklogId=423819=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423819 ] ASF GitHub Bot logged work on BEAM-8646: Author: ASF GitHub Bot Created on: 16/Apr/20 22:56 Start Date: 16/Apr/20 22:56 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11431: [BEAM-8646] Fix external environment on OS X as well. URL: https://github.com/apache/beam/pull/11431#issuecomment-614937962 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423819) Time Spent: 1.5h (was: 1h 20m) > PR #9814 appears to cause failures in fnapi_runner tests on Windows > --- > > Key: BEAM-8646 > URL: https://issues.apache.org/jira/browse/BEAM-8646 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Wanqi Lyu >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > It appears that changes in > > https://github.com/apache/beam/commit/d6bcb03f586b5430c30f6ca4a1af9e42711e529c > cause test failures in Beam test suite on Windows, for example: > python setup.py nosetests --tests > apache_beam/runners/portability/portable_runner_test.py:PortableRunnerTestWithExternalEnv.test_callbacks_with_exception > > does not finish on a Windows VM machine within at least 60 seconds but passes > within a second if we change host_from_worker to return 'localhost' in [1]. > [~violalyu] , do you think you could take a look? Thanks! > cc: [~chadrik] [~thw] > [1] > https://github.com/apache/beam/blob/808cb35018cd228a59b152234b655948da2455fa/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1377. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
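The workaround mentioned in the issue (having `host_from_worker` return 'localhost') could look roughly like the sketch below. This is not the code from PR #11431: the standalone function, its `platform` parameter, and the `socket.gethostname()` fallback are assumptions made for illustration.

```python
import socket
import sys


def host_from_worker(platform=sys.platform):
    """Hypothetical sketch: choose the host name a worker uses to reach the runner."""
    # On Windows and OS X, assume the worker runs on the same machine
    # and skip any host name lookup.
    if platform in ('win32', 'darwin'):
        return 'localhost'
    # Assumed fallback for other platforms; the real runner may differ.
    return socket.gethostname()


print(host_from_worker('win32'))   # localhost
print(host_from_worker('darwin'))  # localhost
```

Passing the platform as a parameter is only there to make the branch easy to exercise on any machine.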
[jira] [Work logged] (BEAM-9703) Create py validations runner test for metrics
[ https://issues.apache.org/jira/browse/BEAM-9703?focusedWorklogId=423816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423816 ] ASF GitHub Bot logged work on BEAM-9703: Author: ASF GitHub Bot Created on: 16/Apr/20 22:55 Start Date: 16/Apr/20 22:55 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11319: [BEAM-9703]Include user distritribution into metric-dedicated validate runner test. URL: https://github.com/apache/beam/pull/11319#issuecomment-614937799 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423816) Time Spent: 4h 10m (was: 4h) > Create py validations runner test for metrics > - > > Key: BEAM-9703 > URL: https://issues.apache.org/jira/browse/BEAM-9703 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Time Spent: 4h 10m > Remaining Estimate: 0h > > Some of the metrics are not covered by a dedicated validation runner test. > Would like to create these if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8646) PR #9814 appears to cause failures in fnapi_runner tests on Windows
[ https://issues.apache.org/jira/browse/BEAM-8646?focusedWorklogId=423815=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423815 ] ASF GitHub Bot logged work on BEAM-8646: Author: ASF GitHub Bot Created on: 16/Apr/20 22:55 Start Date: 16/Apr/20 22:55 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11431: [BEAM-8646] Fix external environment on OS X as well. URL: https://github.com/apache/beam/pull/11431#issuecomment-614937771 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423815) Time Spent: 1h 20m (was: 1h 10m) > PR #9814 appears to cause failures in fnapi_runner tests on Windows > --- > > Key: BEAM-8646 > URL: https://issues.apache.org/jira/browse/BEAM-8646 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Wanqi Lyu >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > It appears that changes in > > https://github.com/apache/beam/commit/d6bcb03f586b5430c30f6ca4a1af9e42711e529c > cause test failures in Beam test suite on Windows, for example: > python setup.py nosetests --tests > apache_beam/runners/portability/portable_runner_test.py:PortableRunnerTestWithExternalEnv.test_callbacks_with_exception > > does not finish on a Windows VM machine within at least 60 seconds but passes > within a second if we change host_from_worker to return 'localhost' in [1]. > [~violalyu] , do you think you could take a look? Thanks! > cc: [~chadrik] [~thw] > [1] > https://github.com/apache/beam/blob/808cb35018cd228a59b152234b655948da2455fa/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1377. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9703) Create py validations runner test for metrics
[ https://issues.apache.org/jira/browse/BEAM-9703?focusedWorklogId=423817=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423817 ] ASF GitHub Bot logged work on BEAM-9703: Author: ASF GitHub Bot Created on: 16/Apr/20 22:55 Start Date: 16/Apr/20 22:55 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11319: [BEAM-9703]Include user distritribution into metric-dedicated validate runner test. URL: https://github.com/apache/beam/pull/11319#issuecomment-614937833 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423817) Time Spent: 4h 20m (was: 4h 10m) > Create py validations runner test for metrics > - > > Key: BEAM-9703 > URL: https://issues.apache.org/jira/browse/BEAM-9703 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Time Spent: 4h 20m > Remaining Estimate: 0h > > Some of the metrics are not covered by a dedicated validation runner test. > Would like to create these if needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9764) :sdks:java:container:generateThirdPartyLicenses failing
[ https://issues.apache.org/jira/browse/BEAM-9764?focusedWorklogId=423813=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423813 ] ASF GitHub Bot logged work on BEAM-9764: Author: ASF GitHub Bot Created on: 16/Apr/20 22:53 Start Date: 16/Apr/20 22:53 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #11428: [BEAM-9764] fix Java license failures with Python2_PVR_Flink PreCommit URL: https://github.com/apache/beam/pull/11428#issuecomment-614937155 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423813) Time Spent: 40m (was: 0.5h) > :sdks:java:container:generateThirdPartyLicenses failing > --- > > Key: BEAM-9764 > URL: https://issues.apache.org/jira/browse/BEAM-9764 > Project: Beam > Issue Type: Bug > Components: sdk-java-core, test-failures >Reporter: Udi Meiri >Assignee: Hannah Jiang >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/774/console > The traceback is interspersed with other logs: > {code} > Traceback (most recent call last): > Successfully pulled > java_third_party_licenses/protobuf-java-util-3.11.1.jar/LICENSE from > https://opensource.org/licenses/BSD-3-Clause > Successfully pulled java_third_party_licenses/protoc-3.11.0.jar/LICENSE from > http://www.apache.org/licenses/LICENSE-2.0.txt > File "sdks/java/container/license_scripts/pull_licenses_java.py", line 138, > in <module> > Successfully pulled java_third_party_licenses/protoc-3.11.1.jar/LICENSE from > http://www.apache.org/licenses/LICENSE-2.0.txt > license_url = dep['moduleLicenseUrl'] > Successfully pulled java_third_party_licenses/zetasketch-0.1.0.jar/LICENSE > from
http://www.apache.org/licenses/LICENSE-2.0.txt > KeyError: 'moduleLicenseUrl' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
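The `KeyError: 'moduleLicenseUrl'` above comes from indexing a dependency record that lacks that key. One possible way to guard against it is sketched below. This is not necessarily what PR #11428 does: `license_url_for` is a hypothetical helper, and the `moduleName` key and the sample records are made up for illustration.

```python
def license_url_for(dep, default=None):
    """Return the dependency's license URL, or `default` when the key is absent."""
    return dep.get('moduleLicenseUrl', default)


# Illustrative records only; real dependency reports may have other fields.
deps = [
    {'moduleName': 'example-a',
     'moduleLicenseUrl': 'http://www.apache.org/licenses/LICENSE-2.0.txt'},
    {'moduleName': 'example-b'},  # no license URL recorded
]

# Collect the dependencies that need manual attention instead of crashing.
missing = [dep['moduleName'] for dep in deps if license_url_for(dep) is None]
print(missing)  # ['example-b']
```

Collecting the incomplete records first makes it possible to report all of them in one run rather than failing on the first.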
[jira] [Work logged] (BEAM-9729) Cleanup bundle registration now that SDKs can pull.
[ https://issues.apache.org/jira/browse/BEAM-9729?focusedWorklogId=423812=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423812 ] ASF GitHub Bot logged work on BEAM-9729: Author: ASF GitHub Bot Created on: 16/Apr/20 22:49 Start Date: 16/Apr/20 22:49 Worklog Time Spent: 10m Work Description: pabloem commented on issue #11358: [BEAM-9729, BEAM-8486] Runner-side bundle registration cleanup. URL: https://github.com/apache/beam/pull/11358#issuecomment-614935695 > For my edification - what happened here? Are bundles registered with the process bundle request? Robert has since informed me that this is pull-based now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423812) Time Spent: 2h 10m (was: 2h) > Cleanup bundle registration now that SDKs can pull. > --- > > Key: BEAM-9729 > URL: https://issues.apache.org/jira/browse/BEAM-9729 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Once all runners (in particular dataflow) support pull descriptors, we can > clean things up by removing the push registration code. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=423805=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423805 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Apr/20 22:44 Start Date: 16/Apr/20 22:44 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#issuecomment-614934489 Run RAT PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423805) Time Spent: 18h 40m (was: 18.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5379) Go Modules versioning support
[ https://issues.apache.org/jira/browse/BEAM-5379?focusedWorklogId=423803=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423803 ] ASF GitHub Bot logged work on BEAM-5379: Author: ASF GitHub Bot Created on: 16/Apr/20 22:41 Start Date: 16/Apr/20 22:41 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11441: [BEAM-5379] Ignore go.sum files in RAT checks URL: https://github.com/apache/beam/pull/11441 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423803) Time Spent: 5h 10m (was: 5h) > Go Modules versioning support > - > > Key: BEAM-5379 > URL: https://issues.apache.org/jira/browse/BEAM-5379 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Robert Burke >Assignee: Robert Burke >Priority: Major > Time Spent: 5h 10m > Remaining Estimate: 0h > > This would make it easier for non-Go developers to update and test changes to > the Go SDK without jumping through hoops to set up Go Paths at first. > Right now, we use the gogradle plugin for gradle to handle reproducible > builds. Without doing something with the GO_PATH relative to a user's local > git repo though, changes made in the user's repo are not represented when > gradle is invoked to test everything. > One of at least the following needs to be accomplished: > * gogradle moves to support the Go Modules experiment in Go 1.11, and the SDK > migrates to that > * or we re-implement our gradle go rules ourselves to use them, > * or some third option, that moves away from the GO_PATH nit. > This issue should be resolved after deciding and implementing a clear > versioning story for the SDK, ideally along Go best practices. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423802=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423802 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 22:40 Start Date: 16/Apr/20 22:40 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614933272 R: @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423802) Time Spent: 3h 40m (was: 3.5h) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5379) Go Modules versioning support
[ https://issues.apache.org/jira/browse/BEAM-5379?focusedWorklogId=423801=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423801 ] ASF GitHub Bot logged work on BEAM-5379: Author: ASF GitHub Bot Created on: 16/Apr/20 22:39 Start Date: 16/Apr/20 22:39 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11441: [BEAM-5379] Ignore go.sum files in RAT checks URL: https://github.com/apache/beam/pull/11441#issuecomment-614932942 Since this only impacts the RAT check and is blocking all other PRs I suggest merging when that is green. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423801) Time Spent: 5h (was: 4h 50m) > Go Modules versioning support > - > > Key: BEAM-5379 > URL: https://issues.apache.org/jira/browse/BEAM-5379 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Robert Burke >Assignee: Robert Burke >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > This would make it easier for non-Go developers to update and test changes to > the Go SDK without jumping through hoops to set up Go Paths at first. > Right now, we use the gogradle plugin for gradle to handle reproducible > builds. Without doing something with the GO_PATH relative to a user's local > git repo though, changes made in the user's repo are not represented when > gradle is invoked to test everything. > One of at least the following needs to be accomplished: > * gogradle moves to support the Go Modules experiment in Go 1.11, and the SDK > migrates to that > * or we re-implement our gradle go rules ourselves to use them, > * or some third option, that moves away from the GO_PATH nit. 
> This issue should be resolved after deciding and implementing a clear > versioning story for the SDK, ideally along Go best practices. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9771) colab links in example notebooks don't work
[ https://issues.apache.org/jira/browse/BEAM-9771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9771: --- Status: Open (was: Triage Needed) > colab links in example notebooks don't work > --- > > Key: BEAM-9771 > URL: https://issues.apache.org/jira/browse/BEAM-9771 > Project: Beam > Issue Type: Bug > Components: examples-python >Reporter: Ahmet Altay >Assignee: David Cavazos >Priority: Major > > Example: > https://github.com/apache/beam/blob/master/examples/notebooks/documentation/transforms/python/elementwise/map-py.ipynb > Error: > Notebook not found > There was an error loading this notebook. Ensure that the file is accessible > and try again. > Ensure that you have permission to view this notebook in GitHub and authorize > Colaboratory to use the GitHub API. > https://github.com/apache/beam/blob/master/Users/dcavazos/src/beam/examples/notebooks/documentation/transforms/python/elementwise/map-py.ipynb > I believe this is true for all files in that folder at least. I did not check > other places. -- This message was sent by Atlassian Jira (v8.3.4#803005)
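For illustration only: the broken link above contains a local checkout path (`Users/dcavazos/src/beam/`) in front of the repository path, which suggests an absolute path was used where a repo-relative one was expected. A minimal sketch of deriving the repo-relative link follows. The helper name `github_blob_url` and the checkout root `/Users/dcavazos/src/beam` (inferred from the broken URL) are assumptions; how the notebook links are actually generated is not stated in this issue.

```python
import posixpath

GITHUB_BLOB_PREFIX = 'https://github.com/apache/beam/blob/master/'


def github_blob_url(notebook_path, repo_root):
    """Build a GitHub blob URL from a notebook path inside a local checkout."""
    # Strip the local checkout prefix so only the repo-relative part remains.
    rel = posixpath.relpath(notebook_path, start=repo_root)
    if rel == '..' or rel.startswith('../'):
        raise ValueError('notebook is outside the repository: %s' % notebook_path)
    return GITHUB_BLOB_PREFIX + rel


url = github_blob_url(
    '/Users/dcavazos/src/beam/examples/notebooks/documentation/'
    'transforms/python/elementwise/map-py.ipynb',
    '/Users/dcavazos/src/beam',  # assumed checkout root
)
```

With these inputs the result matches the working "Example" URL quoted in the issue rather than the one that embeds the local path.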
[jira] [Work logged] (BEAM-5379) Go Modules versioning support
[ https://issues.apache.org/jira/browse/BEAM-5379?focusedWorklogId=423800=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423800 ] ASF GitHub Bot logged work on BEAM-5379: Author: ASF GitHub Bot Created on: 16/Apr/20 22:37 Start Date: 16/Apr/20 22:37 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11441: [BEAM-5379] Ignore go.sum files in RAT checks URL: https://github.com/apache/beam/pull/11441 R: @youngoli cc: @boyuanzz @lukecwik @damondouglas
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423795=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423795 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 22:31 Start Date: 16/Apr/20 22:31 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614930497 R: @amaliujia This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423795) Time Spent: 3.5h (was: 3h 20m) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-6860) WriteToText crash with "GlobalWindow -> ._IntervalWindowBase"
[ https://issues.apache.org/jira/browse/BEAM-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri updated BEAM-6860: Status: Open (was: Triage Needed) > WriteToText crash with "GlobalWindow -> ._IntervalWindowBase" > - > > Key: BEAM-6860 > URL: https://issues.apache.org/jira/browse/BEAM-6860 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.11.0 > Environment: macOS, DirectRunner, python 2.7.15 via > pyenv/pyenv-virtualenv >Reporter: Henrik >Assignee: Udi Meiri >Priority: Major > Labels: newbie > Fix For: 2.16.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Main error: > > Cannot convert GlobalWindow to > > apache_beam.utils.windowed_value._IntervalWindowBase > This is very hard for me to debug. Doing a DoPar call before, printing the > input, gives me just what I want; so the lines of data to serialise are > "alright"; just JSON strings, in fact. > Stacktrace: > {code:java} > Traceback (most recent call last): > File "./okr_end_ride.py", line 254, in > run() > File "./okr_end_ride.py", line 250, in run > run_pipeline(pipeline_options, known_args) > File "./okr_end_ride.py", line 198, in run_pipeline > | 'write_all' >> WriteToText(known_args.output, > file_name_suffix=".txt") > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", > line 426, in __exit__ > self.run().wait_until_finish() > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", > line 406, in run > self._options).run(False) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", > line 419, in run > return self.runner.run_pipeline(self, self._options) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py", > line 132, in run_pipeline > return runner.run_pipeline(pipeline, options) > 
File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", > line 275, in run_pipeline > default_environment=self._default_environment)) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", > line 278, in run_via_runner_api > return self.run_stages(*self.create_stages(pipeline_proto)) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", > line 354, in run_stages > stage_context.safe_coders) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", > line 509, in run_stage > data_input, data_output) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", > line 1206, in process_bundle > result_future = self._controller.control_handler.push(process_bundle) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", > line 821, in push > response = self.worker.do_instruction(request) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 265, in do_instruction > request.instruction_id) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 281, in process_bundle > delayed_applications = bundle_processor.process_bundle(instruction_id) > File > "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 552, in process_bundle > op.finish() > File "apache_beam/runners/worker/operations.py", line 549, in > 
apache_beam.runners.worker.operations.DoOperation.finish > File "apache_beam/runners/worker/operations.py", line 550, in > apache_beam.runners.worker.operations.DoOperation.finish > File "apache_beam/runners/worker/operations.py", line 551, in > apache_beam.runners.worker.operations.DoOperation.finish > File "apache_beam/runners/common.py", line 758, in > apache_beam.runners.common.DoFnRunner.finish > File "apache_beam/runners/common.py", line 752, in > apache_beam.runners.common.DoFnRunner._invoke_bundle_method > File "apache_beam/runners/common.py", line 777, in > apache_beam.runners.common.DoFnRunner._reraise_augmented >
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423787=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423787 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 22:07 Start Date: 16/Apr/20 22:07 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614921757 Run Website PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423787) Time Spent: 3h 20m (was: 3h 10m) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423784=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423784 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 22:04 Start Date: 16/Apr/20 22:04 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614920456 Run Full Website Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423784) Time Spent: 3h 10m (was: 3h) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085285#comment-17085285 ] Udi Meiri commented on BEAM-9737: - Magics: https://lists.apache.org/thread.html/r9f9b7d4c502141e2e973f23d9e7329dcbd11fb64487b77a02d42e144%40%3Cbuilds.apache.org%3E > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423778 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 21:48 Start Date: 16/Apr/20 21:48 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614914483 Run Website_Stage_GCS PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423778) Time Spent: 3h (was: 2h 50m) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423777=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423777 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 21:45 Start Date: 16/Apr/20 21:45 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614913280 Run Full Website Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423777) Time Spent: 2h 50m (was: 2h 40m) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423776 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 21:45 Start Date: 16/Apr/20 21:45 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614913238 Run Website PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423776) Time Spent: 2h 40m (was: 2.5h) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8646) PR #9814 appears to cause failures in fnapi_runner tests on Windows
[ https://issues.apache.org/jira/browse/BEAM-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085272#comment-17085272 ] Valentyn Tymofieiev commented on BEAM-8646: --- Yes, there is https://github.com/apache/beam/pull/11431 out for this purpose. For long term fix, if you need a windows VM to test the changes, we can probably create one in apache-beam-testing. But if the issue on Mac OS is of the same nature, testing on Mac should be more accessible. > PR #9814 appears to cause failures in fnapi_runner tests on Windows > --- > > Key: BEAM-8646 > URL: https://issues.apache.org/jira/browse/BEAM-8646 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Wanqi Lyu >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > It appears that changes in > > https://github.com/apache/beam/commit/d6bcb03f586b5430c30f6ca4a1af9e42711e529c > cause test failures in Beam test suite on Windows, for example: > python setup.py nosetests --tests > apache_beam/runners/portability/portable_runner_test.py:PortableRunnerTestWithExternalEnv.test_callbacks_with_exception > > does not finish on a Windows VM machine within at least 60 seconds but passes > within a second if we change host_from_worker to return 'localhost' in [1]. > [~violalyu] , do you think you could take a look? Thanks! > cc: [~chadrik] [~thw] > [1] > https://github.com/apache/beam/blob/808cb35018cd228a59b152234b655948da2455fa/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1377. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing
[ https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=423756=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423756 ] ASF GitHub Bot logged work on BEAM-9737: Author: ASF GitHub Bot Created on: 16/Apr/20 21:15 Start Date: 16/Apr/20 21:15 Worklog Time Spent: 10m Work Description: udim commented on issue #11386: [BEAM-9737] Fix website postcommit URL: https://github.com/apache/beam/pull/11386#issuecomment-614900277 Run Full Website Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423756) Time Spent: 2.5h (was: 2h 20m) > beam_PostCommit_Website_Test failing > > > Key: BEAM-9737 > URL: https://issues.apache.org/jira/browse/BEAM-9737 > Project: Beam > Issue Type: Bug > Components: test-failures, website >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > Also failing: beam_PostCommit_Website_Publish (same failure) > {code} > > Task :website:buildLocalWebsite > `/` is not writable. > Bundler will use `/tmp/bundler/home/unknown' as your home directory > temporarily. > Configuration file: /repo/website/_config.yml > Configuration file: /repo/website/_config_test.yml > Configuration file: /tmp/_config_branch_repo.yml > Source: /repo/website/src >Destination: generated-local-content > Incremental build: enabled > Generating... > jekyll 3.6.3 | Error: Permission denied @ dir_s_mkdir - > /repo/build/website/generated-local-content/security > {code} > https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console > Possible culprit: https://github.com/apache/beam/pull/11232/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?focusedWorklogId=423754=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423754 ] ASF GitHub Bot logged work on BEAM-9769: Author: ASF GitHub Bot Created on: 16/Apr/20 21:13 Start Date: 16/Apr/20 21:13 Worklog Time Spent: 10m Work Description: chunyang commented on issue #11433: [BEAM-9769] Ensuring JSON is the default export format for BQ sink URL: https://github.com/apache/beam/pull/11433#issuecomment-614899392 We backported the transform into a library that we ship with our Dataflow jobs, I didn't know that there were Python snapshots :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423754) Time Spent: 3h 50m (was: 3h 40m) > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423748=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423748 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:59 Start Date: 16/Apr/20 20:59 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409844505 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ## @@ -19,6 +19,7 @@ import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertNull; import static org.junit.Assert.assertTrue; import org.apache.beam.sdk.io.range.OffsetRange; Review comment: Yes. You're right. Need more sleep. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423748) Time Spent: 3h (was: 2h 50m) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9742) Add ability to pass FluentBackoff to JdbcIo.Write
[ https://issues.apache.org/jira/browse/BEAM-9742?focusedWorklogId=423743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423743 ] ASF GitHub Bot logged work on BEAM-9742: Author: ASF GitHub Bot Created on: 16/Apr/20 20:57 Start Date: 16/Apr/20 20:57 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11396: [BEAM-9742] Add Configurable FluentBackoff to JdbcIO Write URL: https://github.com/apache/beam/pull/11396#discussion_r409843690 ## File path: sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java ## @@ -257,6 +258,28 @@ public boolean apply(SQLException e) { } } + /** + * This is the default {@link FluentBackoffConfiguration} that we use to retry when a {@link + * SQLException} occurs. + */ + public static class DefaultFluentBackoffConfiguration implements FluentBackoffConfiguration { Review comment: Waiting on updates on ML thread: https://lists.apache.org/thread.html/r7fde7b5c87c6008689a013fc113d869d3ac15cbc7a35e4534469b9ab%40%3Cdev.beam.apache.org%3E This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423743) Time Spent: 50m (was: 40m) > Add ability to pass FluentBackoff to JdbcIo.Write > - > > Key: BEAM-9742 > URL: https://issues.apache.org/jira/browse/BEAM-9742 > Project: Beam > Issue Type: Improvement > Components: io-java-jdbc >Reporter: Akshay Iyangar >Assignee: Akshay Iyangar >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Currently, the FluentBackoff is hardcoded with `maxRetries` and > `initialBackoff` . > It would be helpful if the client were able to pass these values. -- This message was sent by Atlassian Jira (v8.3.4#803005)
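For readers skimming the BEAM-9742 request above: the idea of letting the caller supply `maxRetries` and `initialBackoff` instead of hardcoding them can be sketched roughly as follows. This is only an illustration in Python, not the JdbcIO or FluentBackoff API; `BackoffConfig`, `run_with_retries`, the `exponent` field and the default values are all hypothetical names and numbers chosen for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class BackoffConfig:
    """Hypothetical retry settings a caller could pass in (not a Beam class)."""
    max_retries: int = 5
    initial_backoff_s: float = 1.0
    exponent: float = 1.5  # assumed growth factor, for illustration only

    def delays(self) -> Iterator[float]:
        """Yield the wait before each retry, growing geometrically."""
        delay = self.initial_backoff_s
        for _ in range(self.max_retries):
            yield delay
            delay *= self.exponent


def run_with_retries(operation: Callable[[], T],
                     config: BackoffConfig,
                     sleep: Callable[[float], None] = lambda s: None) -> T:
    """Call operation(); on failure wait and retry until the delays run out."""
    delays = config.delays()
    while True:
        try:
            return operation()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the last error
            sleep(delay)
```

The `sleep` parameter is injectable only so the sketch can be exercised without real waiting; a caller would normally pass `time.sleep`.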
[jira] [Work logged] (BEAM-9767) test_streaming_wordcount flaky timeouts
[ https://issues.apache.org/jira/browse/BEAM-9767?focusedWorklogId=423742=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423742 ]

ASF GitHub Bot logged work on BEAM-9767:

Author: ASF GitHub Bot
Created on: 16/Apr/20 20:56
Start Date: 16/Apr/20 20:56
Worklog Time Spent: 10m
Work Description: rohdesamuel commented on issue #11440: [BEAM-9767] Add a timeout to the TestStream GRPC and fix the Streaming cache timeout
URL: https://github.com/apache/beam/pull/11440#issuecomment-614891171

Looks like an unrelated error in the RAT (go learning katas have a bad license)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 423742)
Time Spent: 40m (was: 0.5h)

> test_streaming_wordcount flaky timeouts
> ---
>
> Key: BEAM-9767
> URL: https://issues.apache.org/jira/browse/BEAM-9767
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, test-failures
> Reporter: Udi Meiri
> Assignee: Sam Rohde
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Timed out after 600s, typically completes in 2.8s on my workstation.
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/
> {code}
> self = > testMethod=test_streaming_wordcount>
>
> @unittest.skipIf(
>     sys.version_info < (3, 5, 3),
>     'The tests require at least Python 3.6 to work.')
> def test_streaming_wordcount(self):
>   class WordExtractingDoFn(beam.DoFn):
>     def process(self, element):
>       text_line = element.strip()
>       words = text_line.split()
>       return words
>
>   # Add the TestStream so that it can be cached.
>   ib.options.capturable_sources.add(TestStream)
>   ib.options.capture_duration = timedelta(seconds=5)
>
>   p = beam.Pipeline(
>       runner=interactive_runner.InteractiveRunner(),
>       options=StandardOptions(streaming=True))
>
>   data = (
>       p
>       | TestStream()
>           .advance_watermark_to(0)
>           .advance_processing_time(1)
>           .add_elements(['to', 'be', 'or', 'not', 'to', 'be'])
>           .advance_watermark_to(20)
>           .advance_processing_time(1)
>           .add_elements(['that', 'is', 'the', 'question'])
>       | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable
>
>   counts = (
>       data
>       | 'split' >> beam.ParDo(WordExtractingDoFn())
>       | 'pair_with_one' >> beam.Map(lambda x: (x, 1))
>       | 'group' >> beam.GroupByKey()
>       | 'count' >> beam.Map(lambda wordones: (wordones[0], sum(wordones[1]))))
>
>   # Watch the local scope for Interactive Beam so that referenced PCollections
>   # will be cached.
>   ib.watch(locals())
>
>   # This is normally done in the interactive_utils when a transform is
>   # applied but needs an IPython environment. So we manually run this here.
>   ie.current_env().track_user_pipelines()
>
>   # Create a fake limiter that cancels the BCJ once the main job receives the
>   # expected amount of results.
>   class FakeLimiter:
>     def __init__(self, p, pcoll):
>       self.p = p
>       self.pcoll = pcoll
>
>     def is_triggered(self):
>       result = ie.current_env().pipeline_result(self.p)
>       if result:
>         try:
>           results = result.get(self.pcoll)
>         except ValueError:
>           return False
>         return len(results) >= 10
>       return False
>
>   # This sets the limiters to stop reading when the test receives 10 elements
>   # or after 5 seconds have elapsed (to eliminate the possibility of hanging).
>   ie.current_env().options.capture_control.set_limiters_for_test(
>       [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))])
>
>   # This tests that the data was correctly cached.
>   pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0)
>   expected_data_df = pd.DataFrame([
>       ('to', 0, [IntervalWindow(0, 10)], pane_info),
>       ('be', 0, [IntervalWindow(0, 10)], pane_info),
>       ('or', 0, [IntervalWindow(0, 10)], pane_info),
>       ('not', 0, [IntervalWindow(0, 10)], pane_info),
>       ('to', 0, [IntervalWindow(0, 10)], pane_info),
>       ('be', 0, [IntervalWindow(0, 10)], pane_info),
>
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423710=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423710 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:35 Start Date: 16/Apr/20 20:35 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409832148 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ## @@ -47,16 +48,9 @@ public void testTryClaim() throws Exception { @Test public void testCheckpointUnstarted() throws Exception { OffsetRangeTracker tracker = new OffsetRangeTracker(new OffsetRange(100, 200)); -expected.expect(IllegalStateException.class); -tracker.trySplit(0).getResidual(); - } - - @Test - public void testCheckpointOnlyFailedClaim() throws Exception { -OffsetRangeTracker tracker = new OffsetRangeTracker(new OffsetRange(100, 200)); -assertFalse(tracker.tryClaim(250L)); -expected.expect(IllegalStateException.class); -OffsetRange checkpoint = tracker.trySplit(0).getResidual(); +SplitResult res = tracker.trySplit(0); +assertEquals(new OffsetRange(100, 100), res.getPrimary()); +assertEquals(new OffsetRange(100, 200), res.getResidual()); Review comment: Thanks, that was my mistake, I read both lines as being `[100, 200)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423710) Time Spent: 2h 50m (was: 2h 40m) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9678) Introduction Kata | Go SDK Code Katas
[ https://issues.apache.org/jira/browse/BEAM-9678?focusedWorklogId=423708=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423708 ] ASF GitHub Bot logged work on BEAM-9678: Author: ASF GitHub Bot Created on: 16/Apr/20 20:27 Start Date: 16/Apr/20 20:27 Worklog Time Spent: 10m Work Description: boyuanzz commented on issue #11340: [BEAM-9678] Create Go SDK introduction kata URL: https://github.com/apache/beam/pull/11340#issuecomment-614877141 There are 2 go.sum files without an Apache license header, which breaks the RAT check. Could you please take a look at it? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423708) Time Spent: 5h 10m (was: 5h) > Introduction Kata | Go SDK Code Katas > - > > Key: BEAM-9678 > URL: https://issues.apache.org/jira/browse/BEAM-9678 > Project: Beam > Issue Type: Sub-task > Components: katas, sdk-go >Reporter: Damon Douglas >Assignee: Damon Douglas >Priority: Major > Time Spent: 5h 10m > Remaining Estimate: 0h > > An Introduction kata patterned after > [https://github.com/apache/beam/tree/master/learning/katas/java/Introduction] > where the takeaway is an individual's ability to start an Apache Beam > pipeline using the Golang SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423706 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:24 Start Date: 16/Apr/20 20:24 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409826545 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ## @@ -19,6 +19,7 @@ import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertNull; import static org.junit.Assert.assertTrue; import org.apache.beam.sdk.io.range.OffsetRange; Review comment: `trySplit` right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423706) Time Spent: 2.5h (was: 2h 20m) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423707 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:24 Start Date: 16/Apr/20 20:24 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409826636 ## File path: runners/core-java/src/main/java/org/apache/beam/runners/core/OutputAndTimeBoundedSplittableProcessElementInvoker.java ## @@ -210,9 +210,10 @@ public FinishBundleContext finishBundleContext(DoFn doFn) { // the call says that not the whole restriction has been processed. So we need to take // a checkpoint now: checkpoint() guarantees that the primary restriction describes exactly // the work that was done in the current ProcessElement call, and returns a residual -// restriction that describes exactly the work that wasn't done in the current call. +// restriction that describes exactly the work that wasn't done in the current call. The +// residual is null when the entire restriction has been processed. if (processContext.numClaimedBlocks > 0) { - residual = checkNotNull(processContext.takeCheckpointNow()); + residual = processContext.takeCheckpointNow(); Review comment: I guess the original assumption was that a checkpoint should happen after at least one `tryClaim` has been called. Since we are changing that assumption, `numClaimedBlocks` can also be removed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423707) Time Spent: 2h 40m (was: 2.5h) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9648) DirectRunner waitUntilFinish does not return null on timeout
[ https://issues.apache.org/jira/browse/BEAM-9648?focusedWorklogId=423705=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423705 ] ASF GitHub Bot logged work on BEAM-9648: Author: ASF GitHub Bot Created on: 16/Apr/20 20:21 Start Date: 16/Apr/20 20:21 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11275: [BEAM-9648]: DirectRunner should return null on timeout URL: https://github.com/apache/beam/pull/11275 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423705) Time Spent: 1h 50m (was: 1h 40m) > DirectRunner waitUntilFinish does not return null on timeout > > > Key: BEAM-9648 > URL: https://issues.apache.org/jira/browse/BEAM-9648 > Project: Beam > Issue Type: Bug > Components: runner-direct >Affects Versions: 2.19.0 >Reporter: Filipe Regadas >Assignee: Filipe Regadas >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > According to PipelineResult, if waitUntilFinish(Duration) is supported, it > should return null on timeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
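The contract discussed in BEAM-9648 (a bounded wait that yields a null result when the timeout elapses, and the terminal state otherwise) can be illustrated with a small stand-alone model. This is a sketch in Python under stated assumptions, not the DirectRunner code: `ToyPipelineResult`, `finish` and the state strings are hypothetical and exist only for this example.

```python
import threading
from typing import Optional


class ToyPipelineResult:
    """Hypothetical stand-in for a pipeline result handle (not a Beam class)."""

    def __init__(self) -> None:
        self._done = threading.Event()
        self._state = "RUNNING"

    def finish(self, state: str = "DONE") -> None:
        """Record a terminal state and wake up any waiters."""
        self._state = state
        self._done.set()

    def wait_until_finish(self, timeout_s: Optional[float] = None) -> Optional[str]:
        """Return the terminal state, or None if the timeout elapsed first."""
        finished = self._done.wait(timeout_s)
        return self._state if finished else None
```

A caller can then tell "still running after the timeout" (None) apart from a terminal state without relying on an exception.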
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423704 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:20 Start Date: 16/Apr/20 20:20 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409824692 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ## @@ -47,16 +48,9 @@ public void testTryClaim() throws Exception { @Test public void testCheckpointUnstarted() throws Exception { OffsetRangeTracker tracker = new OffsetRangeTracker(new OffsetRange(100, 200)); -expected.expect(IllegalStateException.class); -tracker.trySplit(0).getResidual(); - } - - @Test - public void testCheckpointOnlyFailedClaim() throws Exception { -OffsetRangeTracker tracker = new OffsetRangeTracker(new OffsetRange(100, 200)); -assertFalse(tracker.tryClaim(250L)); -expected.expect(IllegalStateException.class); -OffsetRange checkpoint = tracker.trySplit(0).getResidual(); +SplitResult res = tracker.trySplit(0); +assertEquals(new OffsetRange(100, 100), res.getPrimary()); +assertEquals(new OffsetRange(100, 200), res.getResidual()); Review comment: In this test case, the expected primary is [100, 100) and the expected residual is [100, 200) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423704) Time Spent: 2h 20m (was: 2h 10m) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
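The split behaviour asserted in the test diff above (primary [100, 100) and residual [100, 200) for an unstarted tracker, and a null split after a failed claim) can be reproduced with a toy model. This is a simplified Python sketch, not the Java OffsetRangeTracker: `ToyOffsetTracker`, its method names and the rounding of the split point are assumptions made for illustration, and the real tracker performs additional validation that is omitted here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OffsetRange:
    start: int  # inclusive
    stop: int   # exclusive


class ToyOffsetTracker:
    """Hypothetical, simplified model of a splittable offset-range tracker."""

    def __init__(self, rng: OffsetRange) -> None:
        self.range = rng
        self.last_attempted: Optional[int] = None

    def try_claim(self, offset: int) -> bool:
        """Remember the attempted offset; the claim succeeds only inside the range."""
        self.last_attempted = offset
        return offset < self.range.stop

    def try_split(self, fraction: float) -> Optional[Tuple[OffsetRange, OffsetRange]]:
        """Split the unprocessed remainder; fraction 0 behaves like a checkpoint."""
        # Treat "nothing attempted yet" as sitting just before the range start.
        cur = self.range.start - 1 if self.last_attempted is None else self.last_attempted
        # Always move at least one offset past the last attempted position.
        split = cur + max(1, int((self.range.stop - cur) * fraction))
        if split >= self.range.stop:
            return None  # nothing left to hand back as a residual
        residual = OffsetRange(split, self.range.stop)
        self.range = OffsetRange(self.range.start, split)
        return self.range, residual
```

Under this model a fraction of 0 after claiming offset 110 leaves [100, 111) as the primary and [111, 200) as the residual, while a fraction of 0.5 moves the split point to 155.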
[jira] [Assigned] (BEAM-7195) BigQuery - 404 errors for 'table not found' when using dynamic destinations - sometimes, new table fails to get created
[ https://issues.apache.org/jira/browse/BEAM-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-7195: --- Assignee: Guangyu Chen > BigQuery - 404 errors for 'table not found' when using dynamic destinations - > sometimes, new table fails to get created > --- > > Key: BEAM-7195 > URL: https://issues.apache.org/jira/browse/BEAM-7195 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.5.0 > Environment: Windows >Reporter: Chris >Assignee: Guangyu Chen >Priority: Critical > > See the following StackOverflow question, which describes the details: > > [https://stackoverflow.com/questions/55932291/apache-beam-for-google-cloud-dataflow-404-errors-when-using-bigqueryio-write-c] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423701=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423701 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:12 Start Date: 16/Apr/20 20:12 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409819571 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ## @@ -19,6 +19,7 @@ import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertNull; import static org.junit.Assert.assertTrue; import org.apache.beam.sdk.io.range.OffsetRange; Review comment: Can we add tests to verify tryClaim(0), tryClaim(0.1), tryClaim(1) on an empty range like [100, 100) Can we also add tests to verify the behavior of tryClaim(1) on range [100, 200) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423701) Time Spent: 2h 10m (was: 2h) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423697=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423697 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:11 Start Date: 16/Apr/20 20:11 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409806889 ## File path: runners/core-java/src/main/java/org/apache/beam/runners/core/OutputAndTimeBoundedSplittableProcessElementInvoker.java ## @@ -210,9 +210,10 @@ public FinishBundleContext finishBundleContext(DoFn doFn) { // the call says that not the whole restriction has been processed. So we need to take // a checkpoint now: checkpoint() guarantees that the primary restriction describes exactly // the work that was done in the current ProcessElement call, and returns a residual -// restriction that describes exactly the work that wasn't done in the current call. +// restriction that describes exactly the work that wasn't done in the current call. The +// residual is null when the entire restriction has been processed. if (processContext.numClaimedBlocks > 0) { - residual = checkNotNull(processContext.takeCheckpointNow()); + residual = processContext.takeCheckpointNow(); Review comment: takeCheckpointNow should work regardless of whether numClaimedBlocks > 0. Even if tryClaim never happens, the watermark may advance. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423697) Time Spent: 1h 50m (was: 1h 40m) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423699=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423699 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:11 Start Date: 16/Apr/20 20:11 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409807914 ## File path: runners/core-java/src/main/java/org/apache/beam/runners/core/OutputAndTimeBoundedSplittableProcessElementInvoker.java ## @@ -210,9 +210,10 @@ public FinishBundleContext finishBundleContext(DoFn doFn) { // the call says that not the whole restriction has been processed. So we need to take // a checkpoint now: checkpoint() guarantees that the primary restriction describes exactly // the work that was done in the current ProcessElement call, and returns a residual -// restriction that describes exactly the work that wasn't done in the current call. +// restriction that describes exactly the work that wasn't done in the current call. The +// residual is null when the entire restriction has been processed. if (processContext.numClaimedBlocks > 0) { - residual = checkNotNull(processContext.takeCheckpointNow()); + residual = processContext.takeCheckpointNow(); processContext.tracker.checkDone(); } else { Review comment: The comments below will likely need updating This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423699) Time Spent: 2h (was: 1h 50m) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423698=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423698 ] ASF GitHub Bot logged work on BEAM-8872: Author: ASF GitHub Bot Created on: 16/Apr/20 20:11 Start Date: 16/Apr/20 20:11 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker URL: https://github.com/apache/beam/pull/11418#discussion_r409818060 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ## @@ -96,9 +90,24 @@ public void testCheckpointAfterFailedClaim() throws Exception { assertTrue(tracker.tryClaim(110L)); assertTrue(tracker.tryClaim(160L)); assertFalse(tracker.tryClaim(240L)); -OffsetRange checkpoint = tracker.trySplit(0).getResidual(); -assertEquals(new OffsetRange(100, 161), tracker.currentRestriction()); -assertEquals(new OffsetRange(161, 200), checkpoint); +assertNull(tracker.trySplit(0)); + } Review comment: ```suggestion tracker.checkDone(); } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423698) > Add support for splitting at fractions > 0 to > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker > -- > > Key: BEAM-8872 > URL: https://issues.apache.org/jira/browse/BEAM-8872 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only > supports checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423700 ]

ASF GitHub Bot logged work on BEAM-8872:

Author: ASF GitHub Bot
Created on: 16/Apr/20 20:12
Start Date: 16/Apr/20 20:12
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r409820103

## File path: runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableProcessElementInvoker.java ##
@@ -51,11 +51,6 @@ public Result(
         @Nullable WatermarkEstimatorStateT futureWatermarkEstimatorState) {
       checkArgument(continuation != null, "continuation must not be null");
       this.continuation = continuation;
-      if (continuation.shouldResume()) {
-        checkArgument(
-            residualRestriction != null,
-            "residual restriction must not be null if continuation indicate it should resume");
-      }
       this.residualRestriction = residualRestriction;
       this.futureOutputWatermark = futureOutputWatermark;
       this.futureWatermarkEstimatorState = futureWatermarkEstimatorState;

Review comment:
I believe the comment below could be incorrect. If we get stop(), we shouldn't have a residual restriction.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 423700)
Time Spent: 2h 10m (was: 2h)

> Add support for splitting at fractions > 0 to
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-java-core
> Reporter: Luke Cwik
> Assignee: Boyuan Zhang
> Priority: Major
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only
> supports checkpointing.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423696=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423696 ]

ASF GitHub Bot logged work on BEAM-8872:

Author: ASF GitHub Bot
Created on: 16/Apr/20 20:11
Start Date: 16/Apr/20 20:11
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r409806021

## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java ##
@@ -47,16 +48,9 @@ public void testTryClaim() throws Exception {
   @Test
   public void testCheckpointUnstarted() throws Exception {
     OffsetRangeTracker tracker = new OffsetRangeTracker(new OffsetRange(100, 200));
-    expected.expect(IllegalStateException.class);
-    tracker.trySplit(0).getResidual();
-  }
-
-  @Test
-  public void testCheckpointOnlyFailedClaim() throws Exception {
-    OffsetRangeTracker tracker = new OffsetRangeTracker(new OffsetRange(100, 200));
-    assertFalse(tracker.tryClaim(250L));
-    expected.expect(IllegalStateException.class);
-    OffsetRange checkpoint = tracker.trySplit(0).getResidual();
+    SplitResult res = tracker.trySplit(0);
+    assertEquals(new OffsetRange(100, 100), res.getPrimary());
+    assertEquals(new OffsetRange(100, 200), res.getResidual());

Review comment:
primary and residual shouldn't have the same value, primary should be an empty range like `[100, 100)`

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 423696)
Time Spent: 1h 40m (was: 1.5h)

> Add support for splitting at fractions > 0 to
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-java-core
> Reporter: Luke Cwik
> Assignee: Boyuan Zhang
> Priority: Major
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only
> supports checkpointing.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
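The BEAM-8872 review comments above turn on what a split at fraction 0 should return in three situations: before any claim, after a successful claim, and after a failed claim past the end of the range. The following is a minimal Python sketch of that behavior, not Beam's implementation. `OffsetRange` and `try_split` here are hypothetical stand-ins, and the rounding rule (truncating the fractional remainder) is an assumption chosen only to reproduce the values quoted in the tests.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OffsetRange:
    """Half-open range [start, stop); a hypothetical stand-in, not Beam's class."""
    start: int
    stop: int


def try_split(restriction: OffsetRange,
              last_attempted: Optional[int],
              fraction: float) -> Optional[Tuple[OffsetRange, OffsetRange]]:
    """Split the unprocessed remainder; return (primary, residual) or None."""
    # If nothing has been attempted yet, the remainder starts at the range start.
    next_offset = restriction.start if last_attempted is None else last_attempted + 1
    remaining = restriction.stop - next_offset
    if remaining <= 0:
        # An attempted claim at or past the end means no work is left to hand off.
        return None
    # Keep `fraction` of the remainder in the primary (assumed truncating rounding).
    split_at = next_offset + int(remaining * fraction)
    if split_at >= restriction.stop:
        return None
    primary = OffsetRange(restriction.start, split_at)
    residual = OffsetRange(split_at, restriction.stop)
    return primary, residual
```

Under these assumptions an unstarted tracker over `[100, 200)` checkpoints into an empty primary `[100, 100)` and a residual `[100, 200)`, while a failed claim at 240 leaves nothing to split, which matches the `assertNull` in the first diff.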
[jira] [Commented] (BEAM-9192) BigQuery IO on Dataflow runner fails (java.lang.ClassCastException) with --experiment=beam_fn_api
[ https://issues.apache.org/jira/browse/BEAM-9192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085217#comment-17085217 ] Bo Shi commented on BEAM-9192: -- CC [~angoenka] > BigQuery IO on Dataflow runner fails (java.lang.ClassCastException) with > --experiment=beam_fn_api > - > > Key: BEAM-9192 > URL: https://issues.apache.org/jira/browse/BEAM-9192 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0 >Reporter: Bo Shi >Priority: Major > > {noformat} > python repro.py \ > --project=CHANGEME \ > --runner=DataflowRunner \ > --temp_location=gs://change-me/bshi/tmp \ > --staging_location=gs://change-me/bshi/stg \ > --experiment=beam_fn_api > --save_main_function > {noformat} > The same repro code works with --runner=Direct. On Dataflow, the error is > {noformat} > java.util.concurrent.ExecutionException: java.lang.ClassCastException: [B > cannot be cast to org.apache.beam.sdk.values.KV > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > at > org.apache.beam.sdk.fn.data.CompletableFutureInboundDataClient.awaitCompletion(CompletableFutureInboundDataClient.java:48) > at > org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.awaitCompletion(BeamFnDataInboundObserver.java:87) > at > org.apache.beam.runners.dataflow.worker.fn.data.BeamFnDataGrpcService$DeferredInboundDataClient.awaitCompletion(BeamFnDataGrpcService.java:134) > at > org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortReadOperation.finish(RemoteGrpcPortReadOperation.java:83) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85) > at > org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125) > at > 
org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:411) > at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:380) > at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:305) > at > org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195) > at > org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123) > Suppressed: java.lang.IllegalStateException: Already closed. > at > org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:93) > at > org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:220) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:91) > ... 6 more > Caused by: java.lang.ClassCastException: [B cannot be cast to > org.apache.beam.sdk.values.KV > at > org.apache.beam.runners.dataflow.worker.ReifyTimestampAndWindowsParDoFnFactory$ReifyTimestampAndWindowsParDoFn.processElement(ReifyTimestampAndWindowsParDoFnFactory.java:72) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49) > at > org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortReadOperation.consumeOutput(RemoteGrpcPortReadOperation.java:103) > at > org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:78) > at > org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31) > at > org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:138) > at > 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:125) > at > org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:249) > at > org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) > at > org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76) > at > org.apache.beam.vendor.grpc.v1p21p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:297) >
[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=423675=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423675 ]

ASF GitHub Bot logged work on BEAM-8872:

Author: ASF GitHub Bot
Created on: 16/Apr/20 19:33
Start Date: 16/Apr/20 19:33
Worklog Time Spent: 10m
Work Description: boyuanzz commented on issue #11418: [BEAM-8872] Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#issuecomment-614853115

Run RAT PreCommit

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 423675)
Time Spent: 1.5h (was: 1h 20m)

> Add support for splitting at fractions > 0 to
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-java-core
> Reporter: Luke Cwik
> Assignee: Boyuan Zhang
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only
> supports checkpointing.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (BEAM-8646) PR #9814 appears to cause failures in fnapi_runner tests on Windows
[ https://issues.apache.org/jira/browse/BEAM-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085198#comment-17085198 ]

Wanqi Lyu edited comment on BEAM-8646 at 4/16/20, 7:27 PM:
---

[~robertwb] Hi Robert, should we for now add `darwin` in the special casing that was added in PR#10100? something like this:
{code:java}
#TODO(BEAM-8646): Reconcile the behavior on Windows / MacOS platform.
if sys.platform in ('win32', 'darwin'):
  return 'localhost'
{code}
?

was (Author: violalyu):
[~robertwb] Hi Robert, should we for now add `darwin` in the special casing that was added in PR#10100? something like this:
```
# TODO(BEAM-8646): Reconcile the behavior on Windows / MacOS platform.
if sys.platform in ('win32', 'darwin'):
  return 'localhost'
```
?

> PR #9814 appears to cause failures in fnapi_runner tests on Windows
> ---
>
> Key: BEAM-8646
> URL: https://issues.apache.org/jira/browse/BEAM-8646
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Wanqi Lyu
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> It appears that changes in
> https://github.com/apache/beam/commit/d6bcb03f586b5430c30f6ca4a1af9e42711e529c
> cause test failures in Beam test suite on Windows, for example:
> python setup.py nosetests --tests
> apache_beam/runners/portability/portable_runner_test.py:PortableRunnerTestWithExternalEnv.test_callbacks_with_exception
>
> does not finish on a Windows VM machine within at least 60 seconds but passes
> within a second if we change host_from_worker to return 'localhost' in [1].
> [~violalyu] , do you think you could take a look? Thanks!
> cc: [~chadrik] [~thw]
> [1]
> https://github.com/apache/beam/blob/808cb35018cd228a59b152234b655948da2455fa/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1377.
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8646) PR #9814 appears to cause failures in fnapi_runner tests on Windows
[ https://issues.apache.org/jira/browse/BEAM-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085198#comment-17085198 ]

Wanqi Lyu commented on BEAM-8646:
-

[~robertwb] Hi Robert, should we for now add `darwin` in the special casing that was added in PR#10100? something like this:
```
# TODO(BEAM-8646): Reconcile the behavior on Windows / MacOS platform.
if sys.platform in ('win32', 'darwin'):
  return 'localhost'
```
?

> PR #9814 appears to cause failures in fnapi_runner tests on Windows
> ---
>
> Key: BEAM-8646
> URL: https://issues.apache.org/jira/browse/BEAM-8646
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Wanqi Lyu
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> It appears that changes in
> https://github.com/apache/beam/commit/d6bcb03f586b5430c30f6ca4a1af9e42711e529c
> cause test failures in Beam test suite on Windows, for example:
> python setup.py nosetests --tests
> apache_beam/runners/portability/portable_runner_test.py:PortableRunnerTestWithExternalEnv.test_callbacks_with_exception
>
> does not finish on a Windows VM machine within at least 60 seconds but passes
> within a second if we change host_from_worker to return 'localhost' in [1].
> [~violalyu] , do you think you could take a look? Thanks!
> cc: [~chadrik] [~thw]
> [1]
> https://github.com/apache/beam/blob/808cb35018cd228a59b152234b655948da2455fa/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1377.
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
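For context, the special-casing proposed in the BEAM-8646 comment could look roughly like the standalone function below. This is a sketch only: the real `host_from_worker` lives in `fn_api_runner.py` and may differ, the `platform` parameter is an assumption added here so the branch can be exercised without patching `sys.platform`, and falling back to `socket.getfqdn()` on other platforms is likewise an assumption.

```python
import socket
import sys


def host_from_worker(platform: str = sys.platform) -> str:
    """Return the host name a worker should use to reach the runner.

    Hypothetical standalone version of the proposed special-casing; the
    `platform` argument exists only to make the branch testable.
    """
    # TODO(BEAM-8646): Reconcile the behavior on Windows / MacOS platform.
    if platform in ('win32', 'darwin'):
        return 'localhost'
    # Assumption: other platforms can resolve their fully qualified name.
    return socket.getfqdn()
```

Taking the platform as an argument keeps the decision in one place and lets a test cover the Windows and macOS branches on any machine.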
[jira] [Work logged] (BEAM-9767) test_streaming_wordcount flaky timeouts
[ https://issues.apache.org/jira/browse/BEAM-9767?focusedWorklogId=423666=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423666 ] ASF GitHub Bot logged work on BEAM-9767: Author: ASF GitHub Bot Created on: 16/Apr/20 19:13 Start Date: 16/Apr/20 19:13 Worklog Time Spent: 10m Work Description: pabloem commented on issue #11440: [BEAM-9767] Add a timeout to the TestStream GRPC and fix the Streaming cache timeout URL: https://github.com/apache/beam/pull/11440#issuecomment-614844071 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423666) Time Spent: 0.5h (was: 20m) > test_streaming_wordcount flaky timeouts > --- > > Key: BEAM-9767 > URL: https://issues.apache.org/jira/browse/BEAM-9767 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures >Reporter: Udi Meiri >Assignee: Sam Rohde >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Timed out after 600s, typically completes in 2.8s on my workstation. > https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/ > {code} > self = > testMethod=test_streaming_wordcount> > @unittest.skipIf( > sys.version_info < (3, 5, 3), > 'The tests require at least Python 3.6 to work.') > def test_streaming_wordcount(self): > class WordExtractingDoFn(beam.DoFn): > def process(self, element): > text_line = element.strip() > words = text_line.split() > return words > > # Add the TestStream so that it can be cached. 
> ib.options.capturable_sources.add(TestStream) > ib.options.capture_duration = timedelta(seconds=5) > > p = beam.Pipeline( > runner=interactive_runner.InteractiveRunner(), > options=StandardOptions(streaming=True)) > > data = ( > p > | TestStream() > .advance_watermark_to(0) > .advance_processing_time(1) > .add_elements(['to', 'be', 'or', 'not', 'to', 'be']) > .advance_watermark_to(20) > .advance_processing_time(1) > .add_elements(['that', 'is', 'the', 'question']) > | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable > > counts = ( > data > | 'split' >> beam.ParDo(WordExtractingDoFn()) > | 'pair_with_one' >> beam.Map(lambda x: (x, 1)) > | 'group' >> beam.GroupByKey() > | 'count' >> beam.Map(lambda wordones: (wordones[0], > sum(wordones[1] > > # Watch the local scope for Interactive Beam so that referenced > PCollections > # will be cached. > ib.watch(locals()) > > # This is normally done in the interactive_utils when a transform is > # applied but needs an IPython environment. So we manually run this > here. > ie.current_env().track_user_pipelines() > > # Create a fake limiter that cancels the BCJ once the main job receives > the > # expected amount of results. > class FakeLimiter: > def __init__(self, p, pcoll): > self.p = p > self.pcoll = pcoll > > def is_triggered(self): > result = ie.current_env().pipeline_result(self.p) > if result: > try: > results = result.get(self.pcoll) > except ValueError: > return False > return len(results) >= 10 > return False > > # This sets the limiters to stop reading when the test receives 10 > elements > # or after 5 seconds have elapsed (to eliminate the possibility of > hanging). > ie.current_env().options.capture_control.set_limiters_for_test( > [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))]) > > # This tests that the data was correctly cached. 
> pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0) > expected_data_df = pd.DataFrame([ > ('to', 0, [IntervalWindow(0, 10)], pane_info), > ('be', 0, [IntervalWindow(0, 10)], pane_info), > ('or', 0, [IntervalWindow(0, 10)], pane_info), > ('not', 0, [IntervalWindow(0, 10)], pane_info), > ('to', 0, [IntervalWindow(0, 10)], pane_info), > ('be', 0, [IntervalWindow(0, 10)], pane_info), > ('that', 2000, [IntervalWindow(20, 30)], pane_info), >
[jira] [Updated] (BEAM-9772) Pipeline validation errors log at WARN level
[ https://issues.apache.org/jira/browse/BEAM-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kyle Weaver updated BEAM-9772:
--
Status: Open (was: Triage Needed)

> Pipeline validation errors log at WARN level
>
>
> Key: BEAM-9772
> URL: https://issues.apache.org/jira/browse/BEAM-9772
> Project: Beam
> Issue Type: Bug
> Components: runner-flink, runner-spark
> Reporter: Kyle Weaver
> Assignee: Kyle Weaver
> Priority: Major
>
> Validation errors mean the pipeline can't be run, so we should log at ERROR.
> https://github.com/apache/beam/blob/072dd4bfcd3074f57a28a0a05f3a6813dd2104a6/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/InMemoryJobService.java#L225

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9772) Pipeline validation errors log at WARN level
Kyle Weaver created BEAM-9772:
-

Summary: Pipeline validation errors log at WARN level
Key: BEAM-9772
URL: https://issues.apache.org/jira/browse/BEAM-9772
Project: Beam
Issue Type: Bug
Components: runner-flink, runner-spark
Reporter: Kyle Weaver
Assignee: Kyle Weaver

Validation errors mean the pipeline can't be run, so we should log at ERROR.
https://github.com/apache/beam/blob/072dd4bfcd3074f57a28a0a05f3a6813dd2104a6/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/InMemoryJobService.java#L225

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9767) test_streaming_wordcount flaky timeouts
[ https://issues.apache.org/jira/browse/BEAM-9767?focusedWorklogId=423661=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423661 ] ASF GitHub Bot logged work on BEAM-9767: Author: ASF GitHub Bot Created on: 16/Apr/20 19:04 Start Date: 16/Apr/20 19:04 Worklog Time Spent: 10m Work Description: rohdesamuel commented on issue #11440: [BEAM-9767] Add a timeout to the TestStream GRPC and fix the Streaming cache timeout URL: https://github.com/apache/beam/pull/11440#issuecomment-614840044 R: @pabloem This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423661) Time Spent: 20m (was: 10m) > test_streaming_wordcount flaky timeouts > --- > > Key: BEAM-9767 > URL: https://issues.apache.org/jira/browse/BEAM-9767 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures >Reporter: Udi Meiri >Assignee: Sam Rohde >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Timed out after 600s, typically completes in 2.8s on my workstation. > https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/ > {code} > self = > testMethod=test_streaming_wordcount> > @unittest.skipIf( > sys.version_info < (3, 5, 3), > 'The tests require at least Python 3.6 to work.') > def test_streaming_wordcount(self): > class WordExtractingDoFn(beam.DoFn): > def process(self, element): > text_line = element.strip() > words = text_line.split() > return words > > # Add the TestStream so that it can be cached. 
> ib.options.capturable_sources.add(TestStream) > ib.options.capture_duration = timedelta(seconds=5) > > p = beam.Pipeline( > runner=interactive_runner.InteractiveRunner(), > options=StandardOptions(streaming=True)) > > data = ( > p > | TestStream() > .advance_watermark_to(0) > .advance_processing_time(1) > .add_elements(['to', 'be', 'or', 'not', 'to', 'be']) > .advance_watermark_to(20) > .advance_processing_time(1) > .add_elements(['that', 'is', 'the', 'question']) > | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable > > counts = ( > data > | 'split' >> beam.ParDo(WordExtractingDoFn()) > | 'pair_with_one' >> beam.Map(lambda x: (x, 1)) > | 'group' >> beam.GroupByKey() > | 'count' >> beam.Map(lambda wordones: (wordones[0], > sum(wordones[1] > > # Watch the local scope for Interactive Beam so that referenced > PCollections > # will be cached. > ib.watch(locals()) > > # This is normally done in the interactive_utils when a transform is > # applied but needs an IPython environment. So we manually run this > here. > ie.current_env().track_user_pipelines() > > # Create a fake limiter that cancels the BCJ once the main job receives > the > # expected amount of results. > class FakeLimiter: > def __init__(self, p, pcoll): > self.p = p > self.pcoll = pcoll > > def is_triggered(self): > result = ie.current_env().pipeline_result(self.p) > if result: > try: > results = result.get(self.pcoll) > except ValueError: > return False > return len(results) >= 10 > return False > > # This sets the limiters to stop reading when the test receives 10 > elements > # or after 5 seconds have elapsed (to eliminate the possibility of > hanging). > ie.current_env().options.capture_control.set_limiters_for_test( > [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))]) > > # This tests that the data was correctly cached. 
> pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0) > expected_data_df = pd.DataFrame([ > ('to', 0, [IntervalWindow(0, 10)], pane_info), > ('be', 0, [IntervalWindow(0, 10)], pane_info), > ('or', 0, [IntervalWindow(0, 10)], pane_info), > ('not', 0, [IntervalWindow(0, 10)], pane_info), > ('to', 0, [IntervalWindow(0, 10)], pane_info), > ('be', 0, [IntervalWindow(0, 10)], pane_info), > ('that', 2000, [IntervalWindow(20, 30)], pane_info), >
[jira] [Work logged] (BEAM-9767) test_streaming_wordcount flaky timeouts
[ https://issues.apache.org/jira/browse/BEAM-9767?focusedWorklogId=423660=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423660 ]

ASF GitHub Bot logged work on BEAM-9767:

Author: ASF GitHub Bot
Created on: 16/Apr/20 19:04
Start Date: 16/Apr/20 19:04
Worklog Time Spent: 10m
Work Description: rohdesamuel commented on pull request #11440: [BEAM-9767] Add a timeout to the TestStream GRPC and fix the Streaming cache timeout
URL: https://github.com/apache/beam/pull/11440

Change-Id: I33908eab8313a90829a2115029f87b7f2f454f1b

The TestStream read from GRPC method didn't have a timeout and had the possibility of hanging indefinitely. This adds a 30s inactivity timeout. This also fixes the StreamingCache waiting for file timeout which didn't work.

This doesn't necessarily fix BEAM-9767 because it looks more like a problem interacting with the test environment. This will make it so that the test times out after 30s instead of 600s.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
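The pull request description above mentions adding an inactivity timeout to a streaming read so that a stalled source fails after 30s instead of hanging. One generic way to get that behavior is sketched below in Python. This is not the code from PR #11440: `read_with_inactivity_timeout` is a hypothetical helper, and the background-thread-plus-queue approach is an assumption about how such a timeout could be built around any blocking iterator.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar('T')
_DONE = object()  # Sentinel marking the end of the source.


def read_with_inactivity_timeout(source: Iterable[T],
                                 timeout_secs: float) -> Iterator[T]:
    """Yield items from `source`; raise TimeoutError if it stalls too long."""
    buffer: queue.Queue = queue.Queue()

    def pump() -> None:
        # Drain the blocking source on a daemon thread so the consumer can
        # give up without waiting for the source to return.
        try:
            for item in source:
                buffer.put(item)
        finally:
            buffer.put(_DONE)

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            # The timer restarts on every item, so this bounds inactivity,
            # not the total read time.
            item = buffer.get(timeout=timeout_secs)
        except queue.Empty:
            raise TimeoutError(
                'no element received for %s seconds' % timeout_secs) from None
        if item is _DONE:
            return
        yield item
```

Because the timeout applies per element, a long but steadily progressing stream is unaffected, while a stream that goes silent fails quickly. Note that an exception raised inside the source is not re-raised to the consumer in this sketch; it only ends the stream.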
[jira] [Work logged] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?focusedWorklogId=423658=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423658 ]

ASF GitHub Bot logged work on BEAM-9769:

Author: ASF GitHub Bot
Created on: 16/Apr/20 19:02
Start Date: 16/Apr/20 19:02
Worklog Time Spent: 10m
Work Description: pabloem commented on issue #11433: [BEAM-9769] Ensuring JSON is the default export format for BQ sink
URL: https://github.com/apache/beam/pull/11433#issuecomment-614839093

@chunyang I reviewed the releases, and I saw that this is not in 2.20 - so I'm curious. Do you guys use Beam snapshots?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 423658)
Time Spent: 3h 40m (was: 3.5h)

> Ensure JSON imports are the default behavior for BigQuerySink and
> WriteToBigQuery in Python
> ---
>
> Key: BEAM-9769
> URL: https://issues.apache.org/jira/browse/BEAM-9769
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Reporter: Pablo Estrada
> Assignee: Pablo Estrada
> Priority: Major
> Fix For: 2.21.0
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?focusedWorklogId=423657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423657 ]

ASF GitHub Bot logged work on BEAM-9769:

Author: ASF GitHub Bot
Created on: 16/Apr/20 19:02
Start Date: 16/Apr/20 19:02
Worklog Time Spent: 10m
Work Description: pabloem commented on pull request #11433: [BEAM-9769] Ensuring JSON is the default export format for BQ sink
URL: https://github.com/apache/beam/pull/11433#discussion_r409783068

## File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads_test.py ##
@@ -176,7 +176,7 @@ def test_many_files(self):
     file length is very small, so only a couple records fit in each file.
     """
-    fn = bqfl.WriteRecordsToFile(schema=_ELEMENTS_SCHEMA, max_file_size=300)
+    fn = bqfl.WriteRecordsToFile(schema=_ELEMENTS_SCHEMA, max_file_size=50)

Review comment:
50 and 300 are the file sizes that force the transform to spill out after a couple elements in JSON and AVRO respectively.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 423657)
Time Spent: 3.5h (was: 3h 20m)

> Ensure JSON imports are the default behavior for BigQuerySink and
> WriteToBigQuery in Python
> ---
>
> Key: BEAM-9769
> URL: https://issues.apache.org/jira/browse/BEAM-9769
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Reporter: Pablo Estrada
> Assignee: Pablo Estrada
> Priority: Major
> Fix For: 2.21.0
>
> Time Spent: 3.5h
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?focusedWorklogId=423656=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423656 ] ASF GitHub Bot logged work on BEAM-9769: Author: ASF GitHub Bot Created on: 16/Apr/20 19:01 Start Date: 16/Apr/20 19:01 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11433: [BEAM-9769] Ensuring JSON is the default export format for BQ sink URL: https://github.com/apache/beam/pull/11433 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423656) Time Spent: 3h 20m (was: 3h 10m) > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9739) SpannerIO - Retry on Aborted Exception during schema change
[ https://issues.apache.org/jira/browse/BEAM-9739?focusedWorklogId=423642=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423642 ] ASF GitHub Bot logged work on BEAM-9739: Author: ASF GitHub Bot Created on: 16/Apr/20 18:10 Start Date: 16/Apr/20 18:10 Worklog Time Spent: 10m Work Description: allenpradeep commented on issue #11392: [BEAM-9739] Retry SpannerIO write on Schema change URL: https://github.com/apache/beam/pull/11392#issuecomment-614812243 Run Java PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423642) Time Spent: 40m (was: 0.5h) > SpannerIO - Retry on Aborted Exception during schema change > --- > > Key: BEAM-9739 > URL: https://issues.apache.org/jira/browse/BEAM-9739 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Allen Pradeep Xavier >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Spanner aborts all transactions in flight when there is a schema change and > returns an Aborted Exception. The client is expected to retry the transaction > silently. > SpannerIO does not handle the exception and this is propagated to the > pipeline. This bug is to track the changes to retry on Aborted Exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)
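The issue description above expects the client to retry an aborted transaction silently. A minimal sketch of that pattern is shown here in plain Python. It is not the SpannerIO change from the pull request: `AbortedError`, `run_with_retry`, and the backoff parameters are hypothetical names chosen for illustration.

```python
import time


class AbortedError(Exception):
    """Hypothetical stand-in for a transaction-aborted error."""


def run_with_retry(transaction, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call transaction() and retry it when it raises AbortedError.

    Waits base_delay * 2**(attempt - 1) seconds between attempts and
    re-raises the error once max_attempts is exhausted. Any other
    exception propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except AbortedError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: a made-up transaction that is aborted twice, then commits.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise AbortedError("aborted by schema change")
    return "committed"


result = run_with_retry(flaky_write, base_delay=0.0)
```

Bounding the number of attempts keeps a persistent failure visible to the pipeline instead of retrying forever, while transient aborts never reach the caller.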
[jira] [Work logged] (BEAM-9678) Introduction Kata | Go SDK Code Katas
[ https://issues.apache.org/jira/browse/BEAM-9678?focusedWorklogId=423626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423626 ] ASF GitHub Bot logged work on BEAM-9678: Author: ASF GitHub Bot Created on: 16/Apr/20 18:04 Start Date: 16/Apr/20 18:04 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11340: [BEAM-9678] Create Go SDK introduction kata URL: https://github.com/apache/beam/pull/11340 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423626) Time Spent: 5h (was: 4h 50m) > Introduction Kata | Go SDK Code Katas > - > > Key: BEAM-9678 > URL: https://issues.apache.org/jira/browse/BEAM-9678 > Project: Beam > Issue Type: Sub-task > Components: katas, sdk-go >Reporter: Damon Douglas >Assignee: Damon Douglas >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > An Introduction kata patterns after > [https://github.com/apache/beam/tree/master/learning/katas/java/Introduction] > where the take away is an individual's ability to start an Apache Beam > pipeline using the Golang SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9678) Introduction Kata | Go SDK Code Katas
[ https://issues.apache.org/jira/browse/BEAM-9678?focusedWorklogId=423625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423625 ] ASF GitHub Bot logged work on BEAM-9678: Author: ASF GitHub Bot Created on: 16/Apr/20 18:03 Start Date: 16/Apr/20 18:03 Worklog Time Spent: 10m Work Description: lostluck commented on issue #11340: [BEAM-9678] Create Go SDK introduction kata URL: https://github.com/apache/beam/pull/11340#issuecomment-614808287 Thank you very much! I look forward to the next lessons. This can wait for a later PR, but having a small README on how to keep the Go side of the katas up to date (e.g., commands to run, like go mod tidy, or pointers to instructions for how to upload them to Stepik) would be valuable for ongoing maintenance and improvement as the SDK has more features added (like Trigger support). I know that when the SDK becomes module compatible and is no longer experimental, we'll definitely need to update these files :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423625) Time Spent: 4h 50m (was: 4h 40m) > Introduction Kata | Go SDK Code Katas > - > > Key: BEAM-9678 > URL: https://issues.apache.org/jira/browse/BEAM-9678 > Project: Beam > Issue Type: Sub-task > Components: katas, sdk-go >Reporter: Damon Douglas >Assignee: Damon Douglas >Priority: Major > Time Spent: 4h 50m > Remaining Estimate: 0h > > An Introduction kata patterns after > [https://github.com/apache/beam/tree/master/learning/katas/java/Introduction] > where the take away is an individual's ability to start an Apache Beam > pipeline using the Golang SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9685) Don't release Go SDK container until Go is officially supported.
[ https://issues.apache.org/jira/browse/BEAM-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085137#comment-17085137 ] Kyle Weaver commented on BEAM-9685: --- Thanks for addressing that Hannah. Ahmet, I think it's okay to consider this closed then. > Don't release Go SDK container until Go is officially supported. > > > Key: BEAM-9685 > URL: https://issues.apache.org/jira/browse/BEAM-9685 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > 1. Remove Go SDK container from release process. > 2. Update document about it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9685) Don't release Go SDK container until Go is officially supported.
[ https://issues.apache.org/jira/browse/BEAM-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085132#comment-17085132 ] Hannah Jiang commented on BEAM-9685: I think it's ok to keep already released images. I added a comment at the docker page to say Go SDK is experimental and will not be released from 2.21.0. > Don't release Go SDK container until Go is officially supported. > > > Key: BEAM-9685 > URL: https://issues.apache.org/jira/browse/BEAM-9685 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > 1. Remove Go SDK container from release process. > 2. Update document about it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6860) WriteToText crash with "GlobalWindow -> ._IntervalWindowBase"
[ https://issues.apache.org/jira/browse/BEAM-6860?focusedWorklogId=423609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423609 ] ASF GitHub Bot logged work on BEAM-6860: Author: ASF GitHub Bot Created on: 16/Apr/20 17:50 Start Date: 16/Apr/20 17:50 Worklog Time Spent: 10m Work Description: udim commented on issue #11439: [BEAM-6860] Fix iobase non-global windows bug URL: https://github.com/apache/beam/pull/11439#issuecomment-614800987 R: @chamikaramj @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423609) Time Spent: 1.5h (was: 1h 20m) > WriteToText crash with "GlobalWindow -> ._IntervalWindowBase" > - > > Key: BEAM-6860 > URL: https://issues.apache.org/jira/browse/BEAM-6860 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.11.0 > Environment: macOS, DirectRunner, python 2.7.15 via > pyenv/pyenv-virtualenv >Reporter: Henrik >Assignee: Udi Meiri >Priority: Major > Labels: newbie > Fix For: 2.16.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Main error: > > Cannot convert GlobalWindow to > > apache_beam.utils.windowed_value._IntervalWindowBase > This is very hard for me to debug. Doing a DoPar call before, printing the > input, gives me just what I want; so the lines of data to serialise are > "alright"; just JSON strings, in fact.
> Stacktrace:
> {code:java}
> Traceback (most recent call last):
>   File "./okr_end_ride.py", line 254, in <module>
>     run()
>   File "./okr_end_ride.py", line 250, in run
>     run_pipeline(pipeline_options, known_args)
>   File "./okr_end_ride.py", line 198, in run_pipeline
>     | 'write_all' >> WriteToText(known_args.output, file_name_suffix=".txt")
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", line 406, in run
>     self._options).run(False)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", line 419, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 132, in run_pipeline
>     return runner.run_pipeline(pipeline, options)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 275, in run_pipeline
>     default_environment=self._default_environment))
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 278, in run_via_runner_api
>     return self.run_stages(*self.create_stages(pipeline_proto))
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 354, in run_stages
>     stage_context.safe_coders)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 509, in run_stage
>     data_input, data_output)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1206, in process_bundle
>     result_future = self._controller.control_handler.push(process_bundle)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 821, in push
>     response = self.worker.do_instruction(request)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 265, in do_instruction
>     request.instruction_id)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 281, in process_bundle
>     delayed_applications = bundle_processor.process_bundle(instruction_id)
>   File
[jira] [Work logged] (BEAM-6860) WriteToText crash with "GlobalWindow -> ._IntervalWindowBase"
[ https://issues.apache.org/jira/browse/BEAM-6860?focusedWorklogId=423608=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423608 ] ASF GitHub Bot logged work on BEAM-6860: Author: ASF GitHub Bot Created on: 16/Apr/20 17:49 Start Date: 16/Apr/20 17:49 Worklog Time Spent: 10m Work Description: udim commented on pull request #11439: [BEAM-6860] Fix iobase non-global windows bug URL: https://github.com/apache/beam/pull/11439 WriteImpl writes elements to files and outputs `writer.close()` return values. It assumed (when num_shards=0) that the PCollection window was global. The fix is to set a global window before doing the writes. Added WriteToText test, where this issue manifested.
[jira] [Work logged] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?focusedWorklogId=423603=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423603 ] ASF GitHub Bot logged work on BEAM-9769: Author: ASF GitHub Bot Created on: 16/Apr/20 17:43 Start Date: 16/Apr/20 17:43 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #11433: [BEAM-9769] Ensuring JSON is the default export format for BQ sink URL: https://github.com/apache/beam/pull/11433#discussion_r409736653
## File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads_test.py ##
@@ -176,7 +176,7 @@ def test_many_files(self):
     file length is very small, so only a couple records fit in each file.
     """
-    fn = bqfl.WriteRecordsToFile(schema=_ELEMENTS_SCHEMA, max_file_size=300)
+    fn = bqfl.WriteRecordsToFile(schema=_ELEMENTS_SCHEMA, max_file_size=50)
Review comment: Why change these? (Not opposed just curious for the record)
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423603) Time Spent: 3h 10m (was: 3h) > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver resolved BEAM-9136. --- Resolution: Fixed > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 22h 10m > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9505) SpannerIO spurious error message with empty bundles
[ https://issues.apache.org/jira/browse/BEAM-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085116#comment-17085116 ] Kyle Weaver commented on BEAM-9505: --- FYI I'm removing the fix version on this issue because the Beam 2.21 release branch is already cut, meaning we would have to cherry-pick a fix for this to get it in 2.21, which doesn't seem warranted. Please let me know if you disagree. > SpannerIO spurious error message with empty bundles > --- > > Key: BEAM-9505 > URL: https://issues.apache.org/jira/browse/BEAM-9505 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.18.0, 2.19.0, 2.20.0 >Reporter: Niel Markwick >Assignee: Niel Markwick >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > -When using DataflowRunner in streaming mode. DoFn.StartBundle is called > multiple times for the same bundle.- > -This does not occur with DirectRunner.- > -This breaks DoFn's which require per-bundle setup and teardown procedures.- > When a bundle is empty (such as in streaming if a window is empty), SpannerIO > will report a spurious error message: > {{IllegalStateException: Sorter should be null here}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9505) SpannerIO spurious error message with empty bundles
[ https://issues.apache.org/jira/browse/BEAM-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver updated BEAM-9505: -- Fix Version/s: (was: 2.21.0) > SpannerIO spurious error message with empty bundles > --- > > Key: BEAM-9505 > URL: https://issues.apache.org/jira/browse/BEAM-9505 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.18.0, 2.19.0, 2.20.0 >Reporter: Niel Markwick >Assignee: Niel Markwick >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > -When using DataflowRunner in streaming mode. DoFn.StartBundle is called > multiple times for the same bundle.- > -This does not occur with DirectRunner.- > -This breaks DoFn's which require per-bundle setup and teardown procedures.- > When a bundle is empty (such as in streaming if a window is empty), SpannerIO > will report a spurious error message: > {{IllegalStateException: Sorter should be null here}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (BEAM-6860) WriteToText crash with "GlobalWindow -> ._IntervalWindowBase"
[ https://issues.apache.org/jira/browse/BEAM-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri reopened BEAM-6860: - This is still an issue. In the latest Beam it manifests as:
{code}
AttributeError: 'GlobalWindow' object has no attribute '_end_micros' [while running 'Write/Write/WriteImpl/WriteBundles']
{code}
> WriteToText crash with "GlobalWindow -> ._IntervalWindowBase" > - > > Key: BEAM-6860 > URL: https://issues.apache.org/jira/browse/BEAM-6860 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.11.0 > Environment: macOS, DirectRunner, python 2.7.15 via > pyenv/pyenv-virtualenv >Reporter: Henrik >Assignee: Udi Meiri >Priority: Major > Labels: newbie > Fix For: 2.16.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Main error: > > Cannot convert GlobalWindow to > > apache_beam.utils.windowed_value._IntervalWindowBase > This is very hard for me to debug. Doing a DoPar call before, printing the > input, gives me just what I want; so the lines of data to serialise are > "alright"; just JSON strings, in fact.
> Stacktrace:
> {code:java}
> Traceback (most recent call last):
>   File "./okr_end_ride.py", line 254, in <module>
>     run()
>   File "./okr_end_ride.py", line 250, in run
>     run_pipeline(pipeline_options, known_args)
>   File "./okr_end_ride.py", line 198, in run_pipeline
>     | 'write_all' >> WriteToText(known_args.output, file_name_suffix=".txt")
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", line 406, in run
>     self._options).run(False)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py", line 419, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 132, in run_pipeline
>     return runner.run_pipeline(pipeline, options)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 275, in run_pipeline
>     default_environment=self._default_environment))
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 278, in run_via_runner_api
>     return self.run_stages(*self.create_stages(pipeline_proto))
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 354, in run_stages
>     stage_context.safe_coders)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 509, in run_stage
>     data_input, data_output)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1206, in process_bundle
>     result_future = self._controller.control_handler.push(process_bundle)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 821, in push
>     response = self.worker.do_instruction(request)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 265, in do_instruction
>     request.instruction_id)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 281, in process_bundle
>     delayed_applications = bundle_processor.process_bundle(instruction_id)
>   File "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 552, in process_bundle
>     op.finish()
>   File "apache_beam/runners/worker/operations.py", line 549, in apache_beam.runners.worker.operations.DoOperation.finish
>   File "apache_beam/runners/worker/operations.py", line 550, in apache_beam.runners.worker.operations.DoOperation.finish
>   File "apache_beam/runners/worker/operations.py", line 551, in apache_beam.runners.worker.operations.DoOperation.finish
>   File "apache_beam/runners/common.py", line 758, in apache_beam.runners.common.DoFnRunner.finish
>   File "apache_beam/runners/common.py", line 752, in