[jira] [Work logged] (BEAM-7810) Allow ValueProvider arguments to ReadFromDatastore
[ https://issues.apache.org/jira/browse/BEAM-7810?focusedWorklogId=378091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378091 ] ASF GitHub Bot logged work on BEAM-7810: Author: ASF GitHub Bot Created on: 28/Jan/20 08:08 Start Date: 28/Jan/20 08:08 Worklog Time Spent: 10m Work Description: EDjur commented on issue #10683: [BEAM-7810] Added ValueProvider support for Datastore query namespaces URL: https://github.com/apache/beam/pull/10683#issuecomment-579127724 Cheers for opening and fixing that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378091) Time Spent: 1h 10m (was: 1h) > Allow ValueProvider arguments to ReadFromDatastore > -- > > Key: BEAM-7810 > URL: https://issues.apache.org/jira/browse/BEAM-7810 > Project: Beam > Issue Type: New Feature > Components: io-py-gcp >Reporter: Udi Meiri >Assignee: Elias Djurfeldt >Priority: Minor > Fix For: 2.20.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > From: > https://stackoverflow.com/questions/56748893/trying-to-achieve-runtime-value-of-namespace-of-datastore-in-dataflow-template -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378095 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 08:23 Start Date: 28/Jan/20 08:23 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579132447 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378095) Time Spent: 1h 20m (was: 1h 10m) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378096 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 08:23 Start Date: 28/Jan/20 08:23 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579132447 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378096) Time Spent: 1.5h (was: 1h 20m) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378097 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 08:25 Start Date: 28/Jan/20 08:25 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579133146 @suztomo maybe you know a way I can run the linkagechecker analysis in the full set of Beam modules? I think is more scalable to have a task for that that we invoke during PRs to validate that no regressions are included as suggested by Luke. (I can do that in Maven but my gradle-fu is still not good enough). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378097) Time Spent: 1h 40m (was: 1.5h) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits
Ismaël Mejía created BEAM-9204: -- Summary: HBase SDF @SplitRestriction does not take the range input into account to restrict splits Key: BEAM-9204 URL: https://issues.apache.org/jira/browse/BEAM-9204 Project: Beam Issue Type: Bug Components: io-java-hbase Reporter: Ismaël Mejía Assignee: Ismaël Mejía -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits
[ https://issues.apache.org/jira/browse/BEAM-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9204: --- Status: Open (was: Triage Needed) > HBase SDF @SplitRestriction does not take the range input into account to > restrict splits > - > > Key: BEAM-9204 > URL: https://issues.apache.org/jira/browse/BEAM-9204 > Project: Beam > Issue Type: Bug > Components: io-java-hbase >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > > This is an issue because it is common if split is called multiple times work > this will produce repeated work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits
[ https://issues.apache.org/jira/browse/BEAM-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9204: --- Description: This is an issue because it is common if split is called multiple times work this will produce repeated work. > HBase SDF @SplitRestriction does not take the range input into account to > restrict splits > - > > Key: BEAM-9204 > URL: https://issues.apache.org/jira/browse/BEAM-9204 > Project: Beam > Issue Type: Bug > Components: io-java-hbase >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > > This is an issue because it is common if split is called multiple times work > this will produce repeated work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-4735) Make HBaseIO.read() based on SDF
[ https://issues.apache.org/jira/browse/BEAM-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024990#comment-17024990 ] Ismaël Mejía commented on BEAM-4735: Oh interesting finding! Just filled BEAM-9204 to tackle it. > Make HBaseIO.read() based on SDF > > > Key: BEAM-4735 > URL: https://issues.apache.org/jira/browse/BEAM-4735 > Project: Beam > Issue Type: Improvement > Components: io-java-hbase >Reporter: Ismaël Mejía >Priority: Minor > > BEAM-4020 introduces HBaseIO reads based on SDF. So far the read() method > still uses the Source based API for two reasons: > 1. Most distributed runners don't supports Bounded SDF today. > 2. SDF does not support Dynamic Work Rebalancing but the Source API of HBase > already supports it so changing it means losing some functionality. > Once there is improvements in both (1) and (2) we should consider moving the > main read() function to use the SDF API and remove the Source based > implementation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits
[ https://issues.apache.org/jira/browse/BEAM-9204?focusedWorklogId=378121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378121 ] ASF GitHub Bot logged work on BEAM-9204: Author: ASF GitHub Bot Created on: 28/Jan/20 09:45 Start Date: 28/Jan/20 09:45 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #10700: [BEAM-9204] Fix HBase SplitRestriction to be based on provided Range URL: https://github.com/apache/beam/pull/10700 R: @lukecwik This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378121) Remaining Estimate: 0h Time Spent: 10m > HBase SDF @SplitRestriction does not take the range input into account to > restrict splits > - > > Key: BEAM-9204 > URL: https://issues.apache.org/jira/browse/BEAM-9204 > Project: Beam > Issue Type: Bug > Components: io-java-hbase >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > This is an issue because it is common if split is called multiple times work > this will produce repeated work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits
[ https://issues.apache.org/jira/browse/BEAM-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9204: --- Description: This is an issue because that the split is called multiple times and in this cas it will produce repeated work. (was: This is an issue because it is common if split is called multiple times work this will produce repeated work.) > HBase SDF @SplitRestriction does not take the range input into account to > restrict splits > - > > Key: BEAM-9204 > URL: https://issues.apache.org/jira/browse/BEAM-9204 > Project: Beam > Issue Type: Bug > Components: io-java-hbase >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > This is an issue because that the split is called multiple times and in this > cas it will produce repeated work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9205) Regression in validates runner tests configuration in spark module
Etienne Chauchot created BEAM-9205: -- Summary: Regression in validates runner tests configuration in spark module Key: BEAM-9205 URL: https://issues.apache.org/jira/browse/BEAM-9205 Project: Beam Issue Type: Test Components: runner-spark Reporter: Etienne Chauchot Assignee: Etienne Chauchot Not all the metrics tests are run: at least MetricsPusher is no more run -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9205) Regression in validates runner tests configuration in spark module
[ https://issues.apache.org/jira/browse/BEAM-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Etienne Chauchot updated BEAM-9205: --- Parent: BEAM-3310 Issue Type: Sub-task (was: Test) > Regression in validates runner tests configuration in spark module > -- > > Key: BEAM-9205 > URL: https://issues.apache.org/jira/browse/BEAM-9205 > Project: Beam > Issue Type: Sub-task > Components: runner-spark >Reporter: Etienne Chauchot >Assignee: Etienne Chauchot >Priority: Major > > Not all the metrics tests are run: at least MetricsPusher is no more run -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-9205) Regression in validates runner tests configuration in spark module
[ https://issues.apache.org/jira/browse/BEAM-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Etienne Chauchot closed BEAM-9205. -- Fix Version/s: 2.20.0 Resolution: Fixed > Regression in validates runner tests configuration in spark module > -- > > Key: BEAM-9205 > URL: https://issues.apache.org/jira/browse/BEAM-9205 > Project: Beam > Issue Type: Sub-task > Components: runner-spark >Reporter: Etienne Chauchot >Assignee: Etienne Chauchot >Priority: Major > Fix For: 2.20.0 > > > Not all the metrics tests are run: at least MetricsPusher is no more run -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode
[ https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378151 ] ASF GitHub Bot logged work on BEAM-8972: Author: ASF GitHub Bot Created on: 28/Jan/20 10:59 Start Date: 28/Jan/20 10:59 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add Jenkins job with Combine test for portable Java URL: https://github.com/apache/beam/pull/10386#issuecomment-579191281 Run seed job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378151) Time Spent: 4.5h (was: 4h 20m) > Add a Jenkins job running Combine load test on Java with Flink in Portability > mode > -- > > Key: BEAM-8972 > URL: https://issues.apache.org/jira/browse/BEAM-8972 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Time Spent: 4.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode
[ https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378152 ] ASF GitHub Bot logged work on BEAM-8972: Author: ASF GitHub Bot Created on: 28/Jan/20 11:01 Start Date: 28/Jan/20 11:01 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add Jenkins job with Combine test for portable Java URL: https://github.com/apache/beam/pull/10386#issuecomment-579192080 Run Load Tests Java Combine Portable Flink Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378152) Time Spent: 4h 40m (was: 4.5h) > Add a Jenkins job running Combine load test on Java with Flink in Portability > mode > -- > > Key: BEAM-8972 > URL: https://issues.apache.org/jira/browse/BEAM-8972 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Time Spent: 4h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode
[ https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378154&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378154 ] ASF GitHub Bot logged work on BEAM-8972: Author: ASF GitHub Bot Created on: 28/Jan/20 11:02 Start Date: 28/Jan/20 11:02 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add Jenkins job with Combine test for portable Java URL: https://github.com/apache/beam/pull/10386#issuecomment-57919 Run Load Tests Java Combine Portable Flink Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378154) Time Spent: 5h (was: 4h 50m) > Add a Jenkins job running Combine load test on Java with Flink in Portability > mode > -- > > Key: BEAM-8972 > URL: https://issues.apache.org/jira/browse/BEAM-8972 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Time Spent: 5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode
[ https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378153 ] ASF GitHub Bot logged work on BEAM-8972: Author: ASF GitHub Bot Created on: 28/Jan/20 11:02 Start Date: 28/Jan/20 11:02 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add Jenkins job with Combine test for portable Java URL: https://github.com/apache/beam/pull/10386#issuecomment-579192080 Run Load Tests Java Combine Portable Flink Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378153) Time Spent: 4h 50m (was: 4h 40m) > Add a Jenkins job running Combine load test on Java with Flink in Portability > mode > -- > > Key: BEAM-8972 > URL: https://issues.apache.org/jira/browse/BEAM-8972 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Time Spent: 4h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-8925) Beam Dependency Update Request: org.apache.tika:tika-core
[ https://issues.apache.org/jira/browse/BEAM-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colm O hEigeartaigh reassigned BEAM-8925: - Assignee: Colm O hEigeartaigh > Beam Dependency Update Request: org.apache.tika:tika-core > - > > Key: BEAM-8925 > URL: https://issues.apache.org/jira/browse/BEAM-8925 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Colm O hEigeartaigh >Priority: Major > > - 2019-12-09 12:20:22.212496 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-12-23 12:20:53.356760 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-12-30 14:15:58.081400 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2020-01-06 12:19:33.456649 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2020-01-13 12:18:38.940974 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2020-01-20 12:16:03.428169 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2020-01-27 12:17:01.302466 > - > Please consider upgrading the dependency org.apache.tika:tika-core. > The current version is 1.20. The latest version is 1.23 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.
[ https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378163 ] ASF GitHub Bot logged work on BEAM-9063: Author: ASF GitHub Bot Created on: 28/Jan/20 11:38 Start Date: 28/Jan/20 11:38 Worklog Time Spent: 10m Work Description: mxm commented on issue #10612: [DO NOT MERGE][BEAM-9063] migrate docker images to apache URL: https://github.com/apache/beam/pull/10612#issuecomment-579205196 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378163) Time Spent: 5.5h (was: 5h 20m) > Migrate docker images to apache namespace. > -- > > Key: BEAM-9063 > URL: https://issues.apache.org/jira/browse/BEAM-9063 > Project: Beam > Issue Type: Task > Components: beam-community >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: Not applicable > > Time Spent: 5.5h > Remaining Estimate: 0h > > https://hub.docker.com/u/apache -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.
[ https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378167&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378167 ] ASF GitHub Bot logged work on BEAM-9063: Author: ASF GitHub Bot Created on: 28/Jan/20 11:44 Start Date: 28/Jan/20 11:44 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10612: [DO NOT MERGE][BEAM-9063] migrate docker images to apache URL: https://github.com/apache/beam/pull/10612#discussion_r371752564 ## File path: release/src/main/scripts/build_release_candidate.sh ## @@ -45,6 +45,9 @@ PYTHON_ARTIFACTS_DIR=python BEAM_ROOT_DIR=beam WEBSITE_ROOT_DIR=beam-site +DOCKER_IMAGE_DEFAULT_REPO_ROOT=apache +DOCKER_IMAGE_DEFAULT_REPO_PREFIX=beam_ Review comment: Would it be worth sourcing these from a script? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378167) Time Spent: 5h 40m (was: 5.5h) > Migrate docker images to apache namespace. > -- > > Key: BEAM-9063 > URL: https://issues.apache.org/jira/browse/BEAM-9063 > Project: Beam > Issue Type: Task > Components: beam-community >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: Not applicable > > Time Spent: 5h 40m > Remaining Estimate: 0h > > https://hub.docker.com/u/apache -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.
[ https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378168 ] ASF GitHub Bot logged work on BEAM-9063: Author: ASF GitHub Bot Created on: 28/Jan/20 11:44 Start Date: 28/Jan/20 11:44 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10612: [DO NOT MERGE][BEAM-9063] migrate docker images to apache URL: https://github.com/apache/beam/pull/10612#discussion_r371752615 ## File path: release/src/main/scripts/publish_docker_images.sh ## @@ -24,6 +24,9 @@ set -e +DOCKER_IMAGE_DEFAULT_REPO_ROOT=apache +DOCKER_IMAGE_DEFAULT_REPO_PREFIX=beam_ Review comment: Would it be worth sourcing these from a script? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378168) Time Spent: 5h 40m (was: 5.5h) > Migrate docker images to apache namespace. > -- > > Key: BEAM-9063 > URL: https://issues.apache.org/jira/browse/BEAM-9063 > Project: Beam > Issue Type: Task > Components: beam-community >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: Not applicable > > Time Spent: 5h 40m > Remaining Estimate: 0h > > https://hub.docker.com/u/apache -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode
[ https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378184 ] ASF GitHub Bot logged work on BEAM-8972: Author: ASF GitHub Bot Created on: 28/Jan/20 12:30 Start Date: 28/Jan/20 12:30 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add Jenkins job with Combine test for portable Java URL: https://github.com/apache/beam/pull/10386#issuecomment-579223188 run seed job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378184) Time Spent: 5h 10m (was: 5h) > Add a Jenkins job running Combine load test on Java with Flink in Portability > mode > -- > > Key: BEAM-8972 > URL: https://issues.apache.org/jira/browse/BEAM-8972 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Time Spent: 5h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8550) @RequiresTimeSortedInput DoFn annotation
[ https://issues.apache.org/jira/browse/BEAM-8550?focusedWorklogId=378199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378199 ] ASF GitHub Bot logged work on BEAM-8550: Author: ASF GitHub Bot Created on: 28/Jan/20 12:51 Start Date: 28/Jan/20 12:51 Worklog Time Spent: 10m Work Description: je-ik commented on issue #8774: [BEAM-8550] Requires time sorted input URL: https://github.com/apache/beam/pull/8774#issuecomment-579230688 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378199) Time Spent: 9h 20m (was: 9h 10m) > @RequiresTimeSortedInput DoFn annotation > > > Key: BEAM-8550 > URL: https://issues.apache.org/jira/browse/BEAM-8550 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > > Implement new annotation {{@RequiresTimeSortedInput}} for stateful DoFn as > described in [design > document|https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/edit?usp=sharing]. > First implementation will assume that: > - time is defined by timestamp in associated WindowedValue > - allowed lateness is explicitly zero and all late elements are dropped > (due to being out of order) > The above properties are considered temporary and will be resolved by > subsequent extensions (backwards compatible). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378206 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 12:57 Start Date: 28/Jan/20 12:57 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579232716 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378206) Time Spent: 13h (was: 12h 50m) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 13h > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378207 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 12:57 Start Date: 28/Jan/20 12:57 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579232716 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378207) Time Spent: 13h 10m (was: 13h) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 13h 10m > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378208 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 12:57 Start Date: 28/Jan/20 12:57 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579232967 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378208) Time Spent: 13h 20m (was: 13h 10m) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 13h 20m > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode
[ https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378212 ] ASF GitHub Bot logged work on BEAM-8972: Author: ASF GitHub Bot Created on: 28/Jan/20 13:06 Start Date: 28/Jan/20 13:06 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add Jenkins job with Combine test for portable Java URL: https://github.com/apache/beam/pull/10386#issuecomment-579236550 Run Load Tests Java Combine Portable Flink Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378212) Time Spent: 5h 20m (was: 5h 10m) > Add a Jenkins job running Combine load test on Java with Flink in Portability > mode > -- > > Key: BEAM-8972 > URL: https://issues.apache.org/jira/browse/BEAM-8972 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Time Spent: 5h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8298) Implement state caching for side inputs
[ https://issues.apache.org/jira/browse/BEAM-8298?focusedWorklogId=378222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378222 ] ASF GitHub Bot logged work on BEAM-8298: Author: ASF GitHub Bot Created on: 28/Jan/20 13:21 Start Date: 28/Jan/20 13:21 Worklog Time Spent: 10m Work Description: mxm commented on issue #10679: [BEAM-8298] Fully specify the necessary details to support side input caching. URL: https://github.com/apache/beam/pull/10679#issuecomment-579242154 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378222) Time Spent: 1h (was: 50m) > Implement state caching for side inputs > --- > > Key: BEAM-8298 > URL: https://issues.apache.org/jira/browse/BEAM-8298 > Project: Beam > Issue Type: Improvement > Components: runner-core, sdk-py-harness >Reporter: Maximilian Michels >Assignee: Jing Chen >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Caching is currently only implemented for user state. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378223 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 13:25 Start Date: 28/Jan/20 13:25 Worklog Time Spent: 10m Work Description: jbonofre commented on issue #10644: [BEAM-7427] Refactore JmsCheckpointMark to be usage via Coder URL: https://github.com/apache/beam/pull/10644#issuecomment-579243908 As discussed, I've keep `JmsCheckpointMark` in a dedicated class. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378223) Time Spent: 8h 10m (was: 8h) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Mourad >Priority: Major > Time Spent: 8h 10m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378225 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 13:26 Start Date: 28/Jan/20 13:26 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579244488 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378225) Time Spent: 13.5h (was: 13h 20m) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 13.5h > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378227 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 13:26 Start Date: 28/Jan/20 13:26 Worklog Time Spent: 10m Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579232967 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378227) Time Spent: 13h 40m (was: 13.5h) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 13h 40m > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378228&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378228 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 13:28 Start Date: 28/Jan/20 13:28 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579245093 Run Direct ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378228) Time Spent: 13h 50m (was: 13h 40m) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 13h 50m > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9146) [Python] PTransform that integrates Video Intelligence functionality
[ https://issues.apache.org/jira/browse/BEAM-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025109#comment-17025109 ] Elias Djurfeldt commented on BEAM-9146: --- Are there any other PTransforms that call external API's in Beam? I'm working on implementing this but running into some design considerations regarding mocking the video intelligence API for test purposes. > [Python] PTransform that integrates Video Intelligence functionality > > > Key: BEAM-9146 > URL: https://issues.apache.org/jira/browse/BEAM-9146 > Project: Beam > Issue Type: Sub-task > Components: io-py-gcp >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > > The goal is to create a PTransform that integrates Google Cloud Video > Intelligence functionality [1]. > The transform should be able to take both video GCS location or video data > bytes as an input. > The transform should be put into _sdks/python/apache_beam/io/gcp/ai_ folder. > [1] https://cloud.google.com/video-intelligence/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378246 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 13:50 Start Date: 28/Jan/20 13:50 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10644: [BEAM-7427] Refactore JmsCheckpointMark to be usage via Coder URL: https://github.com/apache/beam/pull/10644#issuecomment-579254727 Sorry for the delay, the extra commit with the fixes looks good. I was thinking that since the stored messages are not needed to restore the progress of the reads on `UnboundedJmsReader` maybe the simplest fix is just to let them transient as you proposed. About the State changes maybe let's do those in a subsequent PR so we can get this fix out more quickly. WDYT If you agree just let the class as it was before and then I will merge. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378246) Time Spent: 8h 20m (was: 8h 10m) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Mourad >Priority: Major > Time Spent: 8h 20m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378249&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378249 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 13:52 Start Date: 28/Jan/20 13:52 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10644: [BEAM-7427] Refactore JmsCheckpointMark to be usage via Coder URL: https://github.com/apache/beam/pull/10644#issuecomment-579255395 Oh you already get rid of state hehe, my bad ok looking again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378249) Time Spent: 8.5h (was: 8h 20m) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Mourad >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness
[ https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378253 ] ASF GitHub Bot logged work on BEAM-8618: Author: ASF GitHub Bot Created on: 28/Jan/20 14:06 Start Date: 28/Jan/20 14:06 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear down unused DoFns periodically in Python SDK harness. URL: https://github.com/apache/beam/pull/10655#discussion_r371817547 ## File path: sdks/python/apache_beam/runners/worker/sdk_worker.py ## @@ -280,6 +283,7 @@ def get(self, instruction_id, bundle_descriptor_id): try: # pop() is threadsafe processor = self.cached_bundle_processors[bundle_descriptor_id].pop() + self.last_access_time[bundle_descriptor_id] = time.time() except IndexError: Review comment: This won't update the access time when we first create the processor in the except block. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378253) Time Spent: 40m (was: 0.5h) > Tear down unused DoFns periodically in Python SDK harness > - > > Key: BEAM-8618 > URL: https://issues.apache.org/jira/browse/BEAM-8618 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.20.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Per the discussion in the ML, detail can be found [1], the teardown of DoFns > should be supported in the portability framework. It happens at two places: > 1) Upon the control service termination > 2) Tear down the unused DoFns periodically > The aim of this JIRA is to add support for tear down the unused DoFns > periodically in Python SDK harness. > [1] > https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness
[ https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378255 ] ASF GitHub Bot logged work on BEAM-8618: Author: ASF GitHub Bot Created on: 28/Jan/20 14:06 Start Date: 28/Jan/20 14:06 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear down unused DoFns periodically in Python SDK harness. URL: https://github.com/apache/beam/pull/10655#discussion_r371817978 ## File path: sdks/python/apache_beam/runners/worker/sdk_worker.py ## @@ -315,18 +319,47 @@ def release(self, instruction_id): """ descriptor_id, processor = self.active_bundle_processors.pop(instruction_id) processor.reset() +self.last_access_time[descriptor_id] = time.time() self.cached_bundle_processors[descriptor_id].append(processor) def shutdown(self): """ Shutdown all ``BundleProcessor``s in the cache. """ +if self.periodic_shutdown: + self.periodic_shutdown.cancel() + self.periodic_shutdown.join() + self.periodic_shutdown = None + for instruction_id in self.active_bundle_processors: self.active_bundle_processors[instruction_id][1].shutdown() del self.active_bundle_processors[instruction_id] for cached_bundle_processors in self.cached_bundle_processors.values(): - while len(cached_bundle_processors) > 0: -cached_bundle_processors.pop().shutdown() + BundleProcessorCache._shutdown_cached_bundle_processors( + cached_bundle_processors) + + def _schedule_periodic_shutdown(self): +def shutdown_inactive_bundle_processors(): + for descriptor_id, last_access_time in self.last_access_time.items(): +if time.time() - last_access_time > 60: Review comment: We may want to make this configurable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378255) Time Spent: 1h (was: 50m) > Tear down unused DoFns periodically in Python SDK harness > - > > Key: BEAM-8618 > URL: https://issues.apache.org/jira/browse/BEAM-8618 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Per the discussion in the ML, detail can be found [1], the teardown of DoFns > should be supported in the portability framework. It happens at two places: > 1) Upon the control service termination > 2) Tear down the unused DoFns periodically > The aim of this JIRA is to add support for tear down the unused DoFns > periodically in Python SDK harness. > [1] > https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness
[ https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378257 ] ASF GitHub Bot logged work on BEAM-8618: Author: ASF GitHub Bot Created on: 28/Jan/20 14:06 Start Date: 28/Jan/20 14:06 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear down unused DoFns periodically in Python SDK harness. URL: https://github.com/apache/beam/pull/10655#discussion_r371819018 ## File path: sdks/python/apache_beam/runners/worker/sdk_worker.py ## @@ -315,18 +319,47 @@ def release(self, instruction_id): """ descriptor_id, processor = self.active_bundle_processors.pop(instruction_id) processor.reset() +self.last_access_time[descriptor_id] = time.time() self.cached_bundle_processors[descriptor_id].append(processor) def shutdown(self): """ Shutdown all ``BundleProcessor``s in the cache. """ +if self.periodic_shutdown: + self.periodic_shutdown.cancel() + self.periodic_shutdown.join() + self.periodic_shutdown = None + for instruction_id in self.active_bundle_processors: self.active_bundle_processors[instruction_id][1].shutdown() del self.active_bundle_processors[instruction_id] for cached_bundle_processors in self.cached_bundle_processors.values(): - while len(cached_bundle_processors) > 0: -cached_bundle_processors.pop().shutdown() + BundleProcessorCache._shutdown_cached_bundle_processors( + cached_bundle_processors) + + def _schedule_periodic_shutdown(self): +def shutdown_inactive_bundle_processors(): + for descriptor_id, last_access_time in self.last_access_time.items(): +if time.time() - last_access_time > 60: + BundleProcessorCache._shutdown_cached_bundle_processors( + self.cached_bundle_processors[descriptor_id]) + +from apache_beam.runners.worker.data_plane import PeriodicThread Review comment: I think we should move this to the import section. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378257) Time Spent: 1h 10m (was: 1h) > Tear down unused DoFns periodically in Python SDK harness > - > > Key: BEAM-8618 > URL: https://issues.apache.org/jira/browse/BEAM-8618 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Per the discussion in the ML, detail can be found [1], the teardown of DoFns > should be supported in the portability framework. It happens at two places: > 1) Upon the control service termination > 2) Tear down the unused DoFns periodically > The aim of this JIRA is to add support for tear down the unused DoFns > periodically in Python SDK harness. > [1] > https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness
[ https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378254 ] ASF GitHub Bot logged work on BEAM-8618: Author: ASF GitHub Bot Created on: 28/Jan/20 14:06 Start Date: 28/Jan/20 14:06 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear down unused DoFns periodically in Python SDK harness. URL: https://github.com/apache/beam/pull/10655#discussion_r371820931 ## File path: sdks/python/apache_beam/runners/worker/sdk_worker.py ## @@ -315,18 +319,47 @@ def release(self, instruction_id): """ descriptor_id, processor = self.active_bundle_processors.pop(instruction_id) processor.reset() +self.last_access_time[descriptor_id] = time.time() self.cached_bundle_processors[descriptor_id].append(processor) def shutdown(self): """ Shutdown all ``BundleProcessor``s in the cache. """ +if self.periodic_shutdown: + self.periodic_shutdown.cancel() + self.periodic_shutdown.join() + self.periodic_shutdown = None + for instruction_id in self.active_bundle_processors: self.active_bundle_processors[instruction_id][1].shutdown() del self.active_bundle_processors[instruction_id] for cached_bundle_processors in self.cached_bundle_processors.values(): - while len(cached_bundle_processors) > 0: -cached_bundle_processors.pop().shutdown() + BundleProcessorCache._shutdown_cached_bundle_processors( + cached_bundle_processors) + + def _schedule_periodic_shutdown(self): +def shutdown_inactive_bundle_processors(): + for descriptor_id, last_access_time in self.last_access_time.items(): +if time.time() - last_access_time > 60: + BundleProcessorCache._shutdown_cached_bundle_processors( + self.cached_bundle_processors[descriptor_id]) Review comment: Don't we have to remove the bundle processor list from the dictionary? Otherwise we may access a cached shutdown bundle processor. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378254) Time Spent: 50m (was: 40m) > Tear down unused DoFns periodically in Python SDK harness > - > > Key: BEAM-8618 > URL: https://issues.apache.org/jira/browse/BEAM-8618 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.20.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Per the discussion in the ML, detail can be found [1], the teardown of DoFns > should be supported in the portability framework. It happens at two places: > 1) Upon the control service termination > 2) Tear down the unused DoFns periodically > The aim of this JIRA is to add support for tear down the unused DoFns > periodically in Python SDK harness. > [1] > https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness
[ https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378256&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378256 ] ASF GitHub Bot logged work on BEAM-8618: Author: ASF GitHub Bot Created on: 28/Jan/20 14:06 Start Date: 28/Jan/20 14:06 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear down unused DoFns periodically in Python SDK harness. URL: https://github.com/apache/beam/pull/10655#discussion_r371818162 ## File path: sdks/python/apache_beam/runners/worker/sdk_worker.py ## @@ -315,18 +319,47 @@ def release(self, instruction_id): """ descriptor_id, processor = self.active_bundle_processors.pop(instruction_id) processor.reset() +self.last_access_time[descriptor_id] = time.time() self.cached_bundle_processors[descriptor_id].append(processor) def shutdown(self): """ Shutdown all ``BundleProcessor``s in the cache. """ +if self.periodic_shutdown: + self.periodic_shutdown.cancel() + self.periodic_shutdown.join() + self.periodic_shutdown = None + for instruction_id in self.active_bundle_processors: self.active_bundle_processors[instruction_id][1].shutdown() del self.active_bundle_processors[instruction_id] for cached_bundle_processors in self.cached_bundle_processors.values(): - while len(cached_bundle_processors) > 0: -cached_bundle_processors.pop().shutdown() + BundleProcessorCache._shutdown_cached_bundle_processors( + cached_bundle_processors) + + def _schedule_periodic_shutdown(self): +def shutdown_inactive_bundle_processors(): + for descriptor_id, last_access_time in self.last_access_time.items(): +if time.time() - last_access_time > 60: + BundleProcessorCache._shutdown_cached_bundle_processors( + self.cached_bundle_processors[descriptor_id]) + +from apache_beam.runners.worker.data_plane import PeriodicThread +self.periodic_shutdown = PeriodicThread( +60, shutdown_inactive_bundle_processors) Review comment: Same here. Should be configurable or at least be extracted to a variable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378256) Time Spent: 1h (was: 50m) > Tear down unused DoFns periodically in Python SDK harness > - > > Key: BEAM-8618 > URL: https://issues.apache.org/jira/browse/BEAM-8618 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Per the discussion in the ML, detail can be found [1], the teardown of DoFns > should be supported in the portability framework. It happens at two places: > 1) Upon the control service termination > 2) Tear down the unused DoFns periodically > The aim of this JIRA is to add support for tear down the unused DoFns > periodically in Python SDK harness. > [1] > https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7427: --- Fix Version/s: 2.20.0 > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 8.5h > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7427: --- Affects Version/s: 2.19.0 2.13.0 2.14.0 2.15.0 2.16.0 2.17.0 2.18.0 > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía reassigned BEAM-7427: -- Assignee: Jean-Baptiste Onofré (was: Mourad) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378258&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378258 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 14:15 Start Date: 28/Jan/20 14:15 Worklog Time Spent: 10m Work Description: jbonofre commented on issue #10644: [BEAM-7427] Refactore JmsCheckpointMark to be usage via Coder URL: https://github.com/apache/beam/pull/10644#issuecomment-579265247 @iemejia thanks ! I will switch to other IOs improvements ;) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378258) Time Spent: 8h 40m (was: 8.5h) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 8h 40m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378269 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 14:26 Start Date: 28/Jan/20 14:26 Worklog Time Spent: 10m Work Description: suztomo commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579270595 I want that job too! The challenge is that because of the many existing linkage errors, I'd have to compare - linkage errors in a PR, and - linkage errors in origin/master Like a code coverage report. As I don't know how to do that, I'm still doing it `diff` command in my local environment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378269) Time Spent: 1h 50m (was: 1h 40m) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378278 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 14:38 Start Date: 28/Jan/20 14:38 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579276082 Well the manual comparison is not ideal but we can cope with that for the moment, what I don't want is to type the command for the 31 modules of this PR and then have to change it for other dependency upgrade. I just want some sort of `./gradlew :checkJavaLinkage` that works for the whole set of modules of the project. Is this 'feasible' with gradlew + Beam? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378278) Time Spent: 2h (was: 1h 50m) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378280 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 14:48 Start Date: 28/Jan/20 14:48 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #10644: [BEAM-7427] Refactore JmsCheckpointMark to be usage via Coder URL: https://github.com/apache/beam/pull/10644#issuecomment-579280837 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378280) Time Spent: 8h 50m (was: 8h 40m) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 8h 50m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378282 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 14:50 Start Date: 28/Jan/20 14:50 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10644: [BEAM-7427] Refactor JmsCheckpointMark to use SerializableCoder URL: https://github.com/apache/beam/pull/10644#issuecomment-579281715 Merged manually, thanks again JB! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378282) Time Spent: 9h 10m (was: 9h) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 9h 10m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378281 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 14:50 Start Date: 28/Jan/20 14:50 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #10644: [BEAM-7427] Refactor JmsCheckpointMark to use SerializableCoder URL: https://github.com/apache/beam/pull/10644 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378281) Time Spent: 9h (was: 8h 50m) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 9h > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía resolved BEAM-7427. Resolution: Fixed > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 9h 10m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378284 ] ASF GitHub Bot logged work on BEAM-7427: Author: ASF GitHub Bot Created on: 28/Jan/20 14:52 Start Date: 28/Jan/20 14:52 Worklog Time Spent: 10m Work Description: iemejia commented on issue #8757: [BEAM-7427] Fix JmsCheckpointMark Avro Encoding URL: https://github.com/apache/beam/pull/8757#issuecomment-579282755 For info #10644 was merged today. The fix will be part of Beam 2.20.0 since the vote for 2.19.0 has already started. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378284) Time Spent: 9h 20m (was: 9h 10m) > JmsCheckpointMark Avro Serialization issue with UnboundedSource > --- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 9h 20m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7691) Improve checkpoints documentation
[ https://issues.apache.org/jira/browse/BEAM-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7691: --- Summary: Improve checkpoints documentation (was: UnboundedSource.CheckpointMark should mention that implementations should be Serializable or have have an associated Coder) > Improve checkpoints documentation > - > > Key: BEAM-7691 > URL: https://issues.apache.org/jira/browse/BEAM-7691 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Ismaël Mejía >Priority: Minor > Labels: documentation, javadoc, newbie, starter > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7691) Improve checkpoints documentation
[ https://issues.apache.org/jira/browse/BEAM-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7691: --- Component/s: website > Improve checkpoints documentation > - > > Key: BEAM-7691 > URL: https://issues.apache.org/jira/browse/BEAM-7691 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core, website >Reporter: Ismaël Mejía >Priority: Minor > Labels: documentation, javadoc, newbie, starter > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7691) Improve checkpoints documentation
[ https://issues.apache.org/jira/browse/BEAM-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7691: --- Description: UnboundedSource.CheckpointMark should mention that implementations it should be encodable (have an associated Coder). Also maybe it is a good idea to explain a bit more the checkpointing semantics on Beam. > Improve checkpoints documentation > - > > Key: BEAM-7691 > URL: https://issues.apache.org/jira/browse/BEAM-7691 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core, website >Reporter: Ismaël Mejía >Priority: Minor > Labels: documentation, javadoc, newbie, starter > > UnboundedSource.CheckpointMark should mention that implementations it should > be encodable (have an associated Coder). Also maybe it is a good idea to > explain a bit more the checkpointing semantics on Beam. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle
[ https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378292&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378292 ] ASF GitHub Bot logged work on BEAM-9132: Author: ASF GitHub Bot Created on: 28/Jan/20 15:19 Start Date: 28/Jan/20 15:19 Worklog Time Spent: 10m Work Description: mxm commented on issue #10694: [BEAM-9132] Use a unique bundle id across all SDK workers bound to the same environment URL: https://github.com/apache/beam/pull/10694#issuecomment-579296135 This does not fix the issue because the id generation scheme is effectively the same. I'm still seeing the same errors, but I have a new trace. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378292) Time Spent: 1h 20m (was: 1h 10m) > State request handler is removed prematurely when closing ActiveBundle > -- > > Key: BEAM-9132 > URL: https://issues.apache.org/jira/browse/BEAM-9132 > Project: Beam > Issue Type: Bug > Components: java-fn-execution >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > We have observed these errors in a state-intense application: > {noformat} > Error processing instruction 107. Original traceback is > Traceback (most recent call last): > File "apache_beam/runners/common.py", line 780, in > apache_beam.runners.common.DoFnRunner.process > File "apache_beam/runners/common.py", line 587, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > File "apache_beam/runners/common.py", line 659, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > File "apache_beam/runners/common.py", line 880, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "apache_beam/runners/common.py", line 895, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "redacted.py", line 56, in process > recent_events_map = load_recent_events_map(recent_events_state) > File "redacted.py", line 128, in _load_recent_events_map > items_in_recent_events_bag = list(recent_events_state.read()) > File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__ > for elem in self.first: > File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__ > self._state_key, self._coder_impl, is_cached=self._is_cached) > File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get > self._materialize_iter(state_key, coder)) > File "apache_beam/runners/worker/sdk_worker.py", line 723, in > _materialize_iter > self._underlying.get_raw(state_key, continuation_token) > File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw > continuation_token=continuation_token))) > File "apache_beam/runners/worker/sdk_worker.py", line 637, in > _blocking_request > raise RuntimeError(response.error) > RuntimeError: Unknown process bundle instruction id '107' > {noformat} > Notice that the error is thrown on the Runner side. It seems to originate > from the {{ActiveBundle}} de-registering the state request handler too early > when the processing may still be going on in the SDK Harness. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7427) JmsCheckpointMark can not be correctly encoded
[ https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7427: --- Summary: JmsCheckpointMark can not be correctly encoded (was: JmsCheckpointMark Avro Serialization issue with UnboundedSource) > JmsCheckpointMark can not be correctly encoded > -- > > Key: BEAM-7427 > URL: https://issues.apache.org/jira/browse/BEAM-7427 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, > 2.19.0 > Environment: Message Broker : solace > JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0 >Reporter: Mourad >Assignee: Jean-Baptiste Onofré >Priority: Major > Fix For: 2.20.0 > > Time Spent: 9h 20m > Remaining Estimate: 0h > > I get the following exception when reading from unbounded JMS Source: > > {code:java} > Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0 > at org.apache.avro.Schema.validateName(Schema.java:1151) > at org.apache.avro.Schema.access$200(Schema.java:81) > at org.apache.avro.Schema$Field.(Schema.java:403) > at org.apache.avro.Schema$Field.(Schema.java:396) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622) > at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740) > at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218) > at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215) > at > avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313) > at > avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) > {code} > > The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to > generate schema. > JmsIO config : > > {code:java} > PCollection messages = pipeline.apply("read messages from the > events broker", JmsIO.readMessage() > .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) > .withMessageMapper(new DFAMessageMapper()) > .withCoder(AvroCoder.of(DFAMessage.class))); > {code} > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK
[ https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378293 ] ASF GitHub Bot logged work on BEAM-9175: Author: ASF GitHub Bot Created on: 28/Jan/20 15:20 Start Date: 28/Jan/20 15:20 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #10684: [BEAM-9175] Introduce an autoformatting tool to Python SDK URL: https://github.com/apache/beam/pull/10684#discussion_r371867218 ## File path: sdks/python/apache_beam/typehints/trivial_inference.py ## @@ -68,24 +68,23 @@ def instance_to_type(o): return typehints.Tuple[[instance_to_type(item) for item in o]] elif t == list: if len(o) > 0: - return typehints.List[ - typehints.Union[[instance_to_type(item) for item in o]] - ] + return typehints.List[typehints.Union[[ + instance_to_type(item) for item in o + ]]] else: return typehints.List[typehints.Any] elif t == set: if len(o) > 0: - return typehints.Set[ - typehints.Union[[instance_to_type(item) for item in o]] - ] + return typehints.Set[typehints.Union[[ + instance_to_type(item) for item in o + ]]] else: return typehints.Set[typehints.Any] elif t == dict: if len(o) > 0: return typehints.Dict[ typehints.Union[[instance_to_type(k) for k, v in o.items()]], - typehints.Union[[instance_to_type(v) for k, v in o.items()]], - ] + typehints.Union[[instance_to_type(v) for k, v in o.items()]], ] Review comment: It's OK now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378293) Time Spent: 2h 50m (was: 2h 40m) > Introduce an autoformatting tool to Python SDK > -- > > Key: BEAM-9175 > URL: https://issues.apache.org/jira/browse/BEAM-9175 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core, sdk-py-harness >Reporter: Michał Walenia >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > It seems there are three main options: > * black - very simple, but not configurable at all (except for line length), > would drastically change code style > * yapf - more options to tweak, can omit parts of code > * autopep8 - more similar to spotless - only touches code that breaks > formatting guidelines, can use pycodestyle and flake8 as configuration > The rigidity of Black makes it unusable for Beam. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK
[ https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378294&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378294 ] ASF GitHub Bot logged work on BEAM-9175: Author: ASF GitHub Bot Created on: 28/Jan/20 15:20 Start Date: 28/Jan/20 15:20 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #10684: [BEAM-9175] Introduce an autoformatting tool to Python SDK URL: https://github.com/apache/beam/pull/10684#discussion_r371867339 ## File path: sdks/python/apache_beam/typehints/trivial_inference.py ## @@ -303,10 +302,8 @@ def infer_return_type(c, input_types, debug=False, depth=5): elif inspect.isclass(c): if c in typehints.DISALLOWED_PRIMITIVE_TYPES: return { -list: typehints.List[Any], -set: typehints.Set[Any], -tuple: typehints.Tuple[Any, ...], -dict: typehints.Dict[Any, Any] +list: typehints.List[Any], set: typehints.Set[Any], tuple: Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378294) Time Spent: 3h (was: 2h 50m) > Introduce an autoformatting tool to Python SDK > -- > > Key: BEAM-9175 > URL: https://issues.apache.org/jira/browse/BEAM-9175 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core, sdk-py-harness >Reporter: Michał Walenia >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > It seems there are three main options: > * black - very simple, but not configurable at all (except for line length), > would drastically change code style > * yapf - more options to tweak, can omit parts of code > * autopep8 - more similar to spotless - only touches code that breaks > formatting guidelines, can use pycodestyle and flake8 as configuration > The rigidity of Black makes it unusable for Beam. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources
[ https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=378322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378322 ] ASF GitHub Bot logged work on BEAM-9188: Author: ASF GitHub Bot Created on: 28/Jan/20 15:24 Start Date: 28/Jan/20 15:24 Worklog Time Spent: 10m Work Description: stankiewicz commented on pull request #10701: [BEAM-9188] CassandraIO split performance improvement - cache size of the table URL: https://github.com/apache/beam/pull/10701 Splitting CassandraIO source into multiple sources works fast as it uses one connection pool to Cassandra cluster but after that dataflow.worker.WorkerCustomSources is calling CassandraSource.getEstimatedSizeBytes for each source which setups and tears down connection to Cassandra cluster to calculate same size of table. This optimization introduces caching of size internally just to avoid additional queries. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStrea
[jira] [Resolved] (BEAM-4409) NoSuchMethodException reading from JmsIO
[ https://issues.apache.org/jira/browse/BEAM-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía resolved BEAM-4409. Fix Version/s: 2.20.0 Resolution: Fixed > NoSuchMethodException reading from JmsIO > > > Key: BEAM-4409 > URL: https://issues.apache.org/jira/browse/BEAM-4409 > Project: Beam > Issue Type: Bug > Components: io-java-jms >Affects Versions: 2.4.0 > Environment: Linux, Java 1.8, Beam 2.4, Direct Runner, ActiveMQ >Reporter: Edward Pricer >Priority: Major > Fix For: 2.20.0 > > > Running with the DirectRunner, and reading from a queue with JmsIO as an > unbounded source will produce a NoSuchMethodException. This occurs as the > UnboundedReadEvaluatorFactory.UnboundedReadEvaluator attempts to clone the > JmsCheckpointMark with the default (Avro) coder. > The following trivial code on the reader side reproduces the error > (DirectRunner must be in path). The messages on the queue for this test case > were simple TextMessages. I found this exception is triggered more readily > when messages are published rapidly (~200/second) > {code:java} > Pipeline p = > Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create()); > // read from the queue > ConnectionFactory factory = new > ActiveMQConnectionFactory("tcp://localhost:61616"); > PCollection inputStrings = p.apply("Read from queue", > JmsIO.readMessage() .withConnectionFactory(factory) > .withQueue("somequeue") .withCoder(StringUtf8Coder.of()) > .withMessageMapper((JmsIO.MessageMapper) message -> > ((TextMessage) message).getText())); > // decode > PCollection asStrings = inputStrings.apply("Decode Message", > ParDo.of(new DoFn() { @ProcessElement public > void processElement(ProcessContext context) { > System.out.println(context.element()); > context.output(context.element()); } })); > p.run(); > {code} > Stack trace: > {code:java} > Exception in thread "main" java.lang.RuntimeException: > java.lang.NoSuchMethodException: javax.jms.Message.() at > org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:353) at > org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:369) at > org.apache.avro.reflect.ReflectData.newRecord(ReflectData.java:901) at > org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:212) > at > org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175) > at > org.apache.avro.reflect.ReflectDatumReader.readCollection(ReflectDatumReader.java:219) > at > org.apache.avro.reflect.ReflectDatumReader.readArray(ReflectDatumReader.java:137) > at > org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177) > at > org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:302) > at > org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222) > at > org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175) > at > org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) > at > org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145) > at org.apache.beam.sdk.coders.AvroCoder.decode(AvroCoder.java:318) at > org.apache.beam.sdk.coders.Coder.decode(Coder.java:170) at > org.apache.beam.sdk.util.CoderUtils.decodeFromSafeStream(CoderUtils.java:122) > at > org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:105) > at > org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:99) > at org.apache.beam.sdk.util.CoderUtils.clone(CoderUtils.java:148) at > org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:194) > at > org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:124) > at > org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:161) > at > org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:125) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) Caused by: > java.lang.NoSuchMethodException: javax.jms.Message.() at > java.lang.Class.getConstructor0(Class.java:3082) at > java.lang.Class.getDeclaredConstructor(Class.java:2178) at > org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:347) > {code} > > And a more contrived exampl
[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK
[ https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378323&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378323 ] ASF GitHub Bot logged work on BEAM-9175: Author: ASF GitHub Bot Created on: 28/Jan/20 15:30 Start Date: 28/Jan/20 15:30 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #10684: [BEAM-9175] Introduce an autoformatting tool to Python SDK URL: https://github.com/apache/beam/pull/10684#issuecomment-579301469 @robertwb I managed to add all knobs you asked for. I think the code looks better now (much less weird lines, that's for sure). I also added a pre-commit job that runs yapf with --diff option. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378323) Time Spent: 3h 10m (was: 3h) > Introduce an autoformatting tool to Python SDK > -- > > Key: BEAM-9175 > URL: https://issues.apache.org/jira/browse/BEAM-9175 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core, sdk-py-harness >Reporter: Michał Walenia >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > It seems there are three main options: > * black - very simple, but not configurable at all (except for line length), > would drastically change code style > * yapf - more options to tweak, can omit parts of code > * autopep8 - more similar to spotless - only touches code that breaks > formatting guidelines, can use pycodestyle and flake8 as configuration > The rigidity of Black makes it unusable for Beam. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK
[ https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378326 ] ASF GitHub Bot logged work on BEAM-9175: Author: ASF GitHub Bot Created on: 28/Jan/20 15:33 Start Date: 28/Jan/20 15:33 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #10684: [BEAM-9175] Introduce an autoformatting tool to Python SDK URL: https://github.com/apache/beam/pull/10684#issuecomment-579303250 There is still a number of pylint issues `Wrong continued indentation`. Most of them appear because of how lambda is formatted. For example: https://github.com/apache/beam/blob/7db61fbf2dd6eac4ffb542e684260edf0d892fea/sdks/python/apache_beam/io/gcp/bigquery_test.py#L908-L910 I don't know yet how to deal with it. Unless there is a knob for this, I'll just put a # yapf: disable comments in these places. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378326) Time Spent: 3h 20m (was: 3h 10m) > Introduce an autoformatting tool to Python SDK > -- > > Key: BEAM-9175 > URL: https://issues.apache.org/jira/browse/BEAM-9175 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core, sdk-py-harness >Reporter: Michał Walenia >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > It seems there are three main options: > * black - very simple, but not configurable at all (except for line length), > would drastically change code style > * yapf - more options to tweak, can omit parts of code > * autopep8 - more similar to spotless - only touches code that breaks > formatting guidelines, can use pycodestyle and flake8 as configuration > The rigidity of Black makes it unusable for Beam. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378332 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 15:47 Start Date: 28/Jan/20 15:47 Worklog Time Spent: 10m Work Description: suztomo commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579313211 Let me think about that this week. https://issues.apache.org/jira/browse/BEAM-9206 For this PR, I would only check the modules that use jackson: `find . -name 'build.gradle' | xargs grep library.java.jackson_`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378332) Time Spent: 2h 10m (was: 2h) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9206) Easy way to run checkJavaLinkage?
Tomo Suzuki created BEAM-9206: - Summary: Easy way to run checkJavaLinkage? Key: BEAM-9206 URL: https://issues.apache.org/jira/browse/BEAM-9206 Project: Beam Issue Type: Bug Components: build-system Reporter: Tomo Suzuki Assignee: Tomo Suzuki Follow up of iemejia's comment: https://github.com/apache/beam/pull/10643#issuecomment-579276082 bq. I just want some sort of ./gradlew :checkJavaLinkage that works for the whole set of modules of the project. Is this 'feasible' with gradlew + Beam? h1. Considerations * Something that can run on Jenkins * Comparison with the result of origin/master * Simple way to run checkJavaLinkage for all modules -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9008) Add readAll() method to CassandraIO
[ https://issues.apache.org/jira/browse/BEAM-9008?focusedWorklogId=378338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378338 ] ASF GitHub Bot logged work on BEAM-9008: Author: ASF GitHub Bot Created on: 28/Jan/20 15:55 Start Date: 28/Jan/20 15:55 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10546: [BEAM-9008] Add CassandraIO readAll method URL: https://github.com/apache/beam/pull/10546#issuecomment-579317145 > @iemejia I think that could work, thanks for your patience as I try to understand what you're thinking. Some questions: No, thanks to you who has been the patient one during this discussion. > 1. If our `ReadFn extends DoFn, A>` and the only way we have connection information is from the `Read` passed in to the processElement, that means we need to re-establish a DB connection for each batch of queries we run? As in, the connection would be established in the `processElement` method and could not be in `setup` method? Yes exactly this will make the method simpler and the cost of starting a connection gets amortized by the processElement producing multiple outputs from a single connection. > 2. How would that work for the end user of a `PTransform, PCollection>`? Here is what I did in the test and would wand to document how end users could generate 'queries', > https://github.com/apache/beam/pull/10546/files#diff-8ba4ea3b09d563a67e29ff8584269d35R499 >Would we instead want to return a `PCollection>` by using something like `return CassandraIO.read().withRingRange(new RingRange(start, finish))`? If we do that however, we'd need to do the `withHosts` and all the other connection information, no? The other option is establishing one `ReadAll` PTransform that maps over the `Read` input and enriches the db connection information? You have a point here!. We need `class ReadAll extends PTransform, PCollection>` and there we read as intended with `ReadFn`. You would have to modify however the `expand` of `Read` to do `input.apply(Create.of(this)).apply(CassandraIO.readAll())` where `ReadAll` should expand into `input.apply(ParDo.of(splitFn)).apply(Reshuffle).apply(Read)` users should deal with building the PCollection of `Reads` before passing that collection to `ReadAll`. > 3. Originally I had wanted to have the ReadFn operate on a _collection_ of 'query' objects to ensure a way to enforce linearizability with our queries (mainly so we don't oversaturate a single node/shard). Currently the groupBy function a user passes in operates on the `RingRange` object, would we keep it that way and just, under the hood, allow for a single `Read` to hold a collection of RingRanges? If I understand this correctly this is covered by following the Create -> Split -> Reshuffle -> Read pattern mentioned above (in the mentioned IOs). So Split is the one who will generate a collection of `Read`s for each given `RingRange` then we use Reshuffle to guarantee that reads are redistributed and finally each read request is read by one worker. Hope this helps, don't hesitate to ask me more questions if still. I will try to answer quickly this time. Hope this helps This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378338) Time Spent: 4h (was: 3h 50m) > Add readAll() method to CassandraIO > --- > > Key: BEAM-9008 > URL: https://issues.apache.org/jira/browse/BEAM-9008 > Project: Beam > Issue Type: New Feature > Components: io-java-cassandra >Affects Versions: 2.16.0 >Reporter: vincent marquez >Assignee: vincent marquez >Priority: Minor > Time Spent: 4h > Remaining Estimate: 0h > > When querying a large cassandra database, it's often *much* more useful to > programatically generate the queries needed to to be run rather than reading > all partitions and attempting some filtering. > As an example: > {code:java} > public class Event { >@PartitionKey(0) public UUID accountId; >@PartitionKey(1)public String yearMonthDay; >@ClusteringKey public UUID eventId; >//other data... > }{code} > If there is ten years worth of data, you may want to only query one year's > worth. Here each token range would represent one 'token' but all events for > the day. > {code:java} > Set accounts = getRelevantAccounts(); > Set dateRange = generateDateRang
[jira] [Work logged] (BEAM-9008) Add readAll() method to CassandraIO
[ https://issues.apache.org/jira/browse/BEAM-9008?focusedWorklogId=378339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378339 ] ASF GitHub Bot logged work on BEAM-9008: Author: ASF GitHub Bot Created on: 28/Jan/20 15:58 Start Date: 28/Jan/20 15:58 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10546: [BEAM-9008] Add CassandraIO readAll method URL: https://github.com/apache/beam/pull/10546#issuecomment-579318866 Forgot to mention that in the above comment that in the Split function you have to split in every case save if the user provided a specific RingRange to read from. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378339) Time Spent: 4h 10m (was: 4h) > Add readAll() method to CassandraIO > --- > > Key: BEAM-9008 > URL: https://issues.apache.org/jira/browse/BEAM-9008 > Project: Beam > Issue Type: New Feature > Components: io-java-cassandra >Affects Versions: 2.16.0 >Reporter: vincent marquez >Assignee: vincent marquez >Priority: Minor > Time Spent: 4h 10m > Remaining Estimate: 0h > > When querying a large cassandra database, it's often *much* more useful to > programatically generate the queries needed to to be run rather than reading > all partitions and attempting some filtering. > As an example: > {code:java} > public class Event { >@PartitionKey(0) public UUID accountId; >@PartitionKey(1)public String yearMonthDay; >@ClusteringKey public UUID eventId; >//other data... > }{code} > If there is ten years worth of data, you may want to only query one year's > worth. Here each token range would represent one 'token' but all events for > the day. > {code:java} > Set accounts = getRelevantAccounts(); > Set dateRange = generateDateRange("2018-01-01", "2019-01-01"); > PCollection tokens = generateTokens(accounts, dateRange); > {code} > > I propose an additional _readAll()_ PTransform that can take a PCollection > of token ranges and can return a PCollection of what the query would > return. > *Question: How much code should be in common between both methods?* > Currently the read connector already groups all partitions into a List of > Token Ranges, so it would be simple to refactor the current read() based > method to a 'ParDo' based one and have them both share the same function. > Reasons against sharing code between read and readAll > * Not having the read based method return a BoundedSource connector would > mean losing the ability to know the size of the data returned > * Currently the CassandraReader executes all the grouped TokenRange queries > *asynchronously* which is (maybe?) fine when all that's happening is > splitting up all the partition ranges but terrible for executing potentially > millions of queries. > Reasons _for_ sharing code would be simplified code base and that both of > the above issues would most likely have a negligable performance impact. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9206) Easy way to run checkJavaLinkage?
[ https://issues.apache.org/jira/browse/BEAM-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9206: --- Status: Open (was: Triage Needed) > Easy way to run checkJavaLinkage? > - > > Key: BEAM-9206 > URL: https://issues.apache.org/jira/browse/BEAM-9206 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > > Follow up of iemejia's comment: > https://github.com/apache/beam/pull/10643#issuecomment-579276082 > bq. I just want some sort of ./gradlew :checkJavaLinkage that works for the > whole set of modules of the project. Is this 'feasible' with gradlew + Beam? > h1. Considerations > * Something that can run on Jenkins > * Comparison with the result of origin/master > * Simple way to run checkJavaLinkage for all modules -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378340 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 16:00 Start Date: 28/Jan/20 16:00 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579320038 Hehe so the 31 that I mentioned above, mmm not an easy to sell proposition. On the other hand I can help with the jenkins part if you get to do an incantation that works locally for all modules. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378340) Time Spent: 2h 20m (was: 2h 10m) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 2h 20m > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8298) Implement state caching for side inputs
[ https://issues.apache.org/jira/browse/BEAM-8298?focusedWorklogId=378345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378345 ] ASF GitHub Bot logged work on BEAM-8298: Author: ASF GitHub Bot Created on: 28/Jan/20 16:10 Start Date: 28/Jan/20 16:10 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10679: [BEAM-8298] Fully specify the necessary details to support side input caching. URL: https://github.com/apache/beam/pull/10679#issuecomment-579326907 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378345) Time Spent: 1h 10m (was: 1h) > Implement state caching for side inputs > --- > > Key: BEAM-8298 > URL: https://issues.apache.org/jira/browse/BEAM-8298 > Project: Beam > Issue Type: Improvement > Components: runner-core, sdk-py-harness >Reporter: Maximilian Michels >Assignee: Jing Chen >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Caching is currently only implemented for user state. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8542) Add async write to AWS SNS IO & remove retry logic
[ https://issues.apache.org/jira/browse/BEAM-8542?focusedWorklogId=378362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378362 ] ASF GitHub Bot logged work on BEAM-8542: Author: ASF GitHub Bot Created on: 28/Jan/20 17:28 Start Date: 28/Jan/20 17:28 Worklog Time Spent: 10m Work Description: Akshay-Iyangar commented on issue #10078: [BEAM-8542] Change write to async in AWS SNS IO & remove retry logic URL: https://github.com/apache/beam/pull/10078#issuecomment-579364116 HI @aromanenko-dev - I'll be fixing the remaining stuff in this PR and will let you know as soon at is ready. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378362) Time Spent: 4h 40m (was: 4.5h) > Add async write to AWS SNS IO & remove retry logic > -- > > Key: BEAM-8542 > URL: https://issues.apache.org/jira/browse/BEAM-8542 > Project: Beam > Issue Type: Improvement > Components: io-java-aws >Reporter: Ajo Thomas >Assignee: Ajo Thomas >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > - While working with SNS IO for one of my work-related projects, I found that > the IO uses synchronous publishes during writes. I had a simple mock pipeline > where I was reading from a kinesis stream and publishing it to SNS using > Beam's SNS IO. For comparison, I also had a lamdba which did the same using > asynchronous publishes but was about 5x faster. Changing the SNS IO to use > async publishes would improve publish latencies. > - SNS IO also has some retry logic which isn't required as SNS clients can > handle retries. The retry logic in the SNS client is user-configurable and > therefore, an explicit retry logic in SNS IO is not required > I have a working version of the IO with these changes, will create a PR > linking this ticket to it once I get some feedback here. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378366 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 17:41 Start Date: 28/Jan/20 17:41 Worklog Time Spent: 10m Work Description: pabloem commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579369953 Run PythonLint PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378366) Time Spent: 43h 10m (was: 43h) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 43h 10m > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378367 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 17:42 Start Date: 28/Jan/20 17:42 Worklog Time Spent: 10m Work Description: pabloem commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579369985 Run PythonLint PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378367) Time Spent: 43h 20m (was: 43h 10m) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 43h 20m > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8298) Implement state caching for side inputs
[ https://issues.apache.org/jira/browse/BEAM-8298?focusedWorklogId=378369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378369 ] ASF GitHub Bot logged work on BEAM-8298: Author: ASF GitHub Bot Created on: 28/Jan/20 17:48 Start Date: 28/Jan/20 17:48 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10679: [BEAM-8298] Fully specify the necessary details to support side input caching. URL: https://github.com/apache/beam/pull/10679 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378369) Time Spent: 1h 20m (was: 1h 10m) > Implement state caching for side inputs > --- > > Key: BEAM-8298 > URL: https://issues.apache.org/jira/browse/BEAM-8298 > Project: Beam > Issue Type: Improvement > Components: runner-core, sdk-py-harness >Reporter: Maximilian Michels >Assignee: Jing Chen >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Caching is currently only implemented for user state. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378378 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 17:54 Start Date: 28/Jan/20 17:54 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702 Defined a new @Restriction parameter type to pass in the restriction. Updated GetSize/GetInitialRestriction/SplitRestriction/NewTracker to take these new DoFn style parameters. Updated lots of documentation and existing implementations to use the new DoFn argument passing. Fixed ByteBuddyDoFnInvokerFactory to support return values that aren't references. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378379 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 17:56 Start Date: 28/Jan/20 17:56 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-579376066 R: @TheNeuralBit @boyuanzz CC: @iemejia @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378379) Time Spent: 8h 40m (was: 8.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9207) Create a script to define all variables used by release scripts
Hannah Jiang created BEAM-9207: -- Summary: Create a script to define all variables used by release scripts Key: BEAM-9207 URL: https://issues.apache.org/jira/browse/BEAM-9207 Project: Beam Issue Type: Task Components: dependencies Reporter: Hannah Jiang Now we are defining variables with each script and this cause the definitions are duplicated at each script. We should have a place which defines all these variables and shared by all scripts for release. * put it to dependencies component, because there is no release component. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378380 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 17:57 Start Date: 28/Jan/20 17:57 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-579376564 I'm going to wait on #10702 and update this PR after that goes in since it is necessary to support the new DoFn style parameter passing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378380) Time Spent: 8h 50m (was: 8h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.
[ https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378381 ] ASF GitHub Bot logged work on BEAM-9063: Author: ASF GitHub Bot Created on: 28/Jan/20 17:58 Start Date: 28/Jan/20 17:58 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #10612: [DO NOT MERGE][BEAM-9063] migrate docker images to apache URL: https://github.com/apache/beam/pull/10612#discussion_r371964232 ## File path: release/src/main/scripts/build_release_candidate.sh ## @@ -45,6 +45,9 @@ PYTHON_ARTIFACTS_DIR=python BEAM_ROOT_DIR=beam WEBSITE_ROOT_DIR=beam-site +DOCKER_IMAGE_DEFAULT_REPO_ROOT=apache +DOCKER_IMAGE_DEFAULT_REPO_PREFIX=beam_ Review comment: Yeap, that's a great idea and I think it's better to have a script which holds all variables and shared by all release scripts, so I haven't used a script this time. I created a ticket to fix it. [BEAM-9207](https://issues.apache.org/jira/browse/BEAM-9207) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378381) Time Spent: 5h 50m (was: 5h 40m) > Migrate docker images to apache namespace. > -- > > Key: BEAM-9063 > URL: https://issues.apache.org/jira/browse/BEAM-9063 > Project: Beam > Issue Type: Task > Components: beam-community >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: Not applicable > > Time Spent: 5h 50m > Remaining Estimate: 0h > > https://hub.docker.com/u/apache -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7746) Add type hints to python code
[ https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=378383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378383 ] ASF GitHub Bot logged work on BEAM-7746: Author: ASF GitHub Bot Created on: 28/Jan/20 18:08 Start Date: 28/Jan/20 18:08 Worklog Time Spent: 10m Work Description: chadrik commented on issue #10592: [BEAM-7746] Introduce a protocol to handle various types of partitioning buffers URL: https://github.com/apache/beam/pull/10592#issuecomment-579381109 All tests have passed. This is ready to go. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378383) Time Spent: 58.5h (was: 58h 20m) > Add type hints to python code > - > > Key: BEAM-7746 > URL: https://issues.apache.org/jira/browse/BEAM-7746 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Chad Dombrova >Assignee: Chad Dombrova >Priority: Major > Time Spent: 58.5h > Remaining Estimate: 0h > > As a developer of the beam source code, I would like the code to use pep484 > type hints so that I can clearly see what types are required, get completion > in my IDE, and enforce code correctness via a static analyzer like mypy. > This may be considered a precursor to BEAM-7060 > Work has been started here: [https://github.com/apache/beam/pull/9056] > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6957) Spark Portable Runner: Support metrics
[ https://issues.apache.org/jira/browse/BEAM-6957?focusedWorklogId=378384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378384 ] ASF GitHub Bot logged work on BEAM-6957: Author: ASF GitHub Bot Created on: 28/Jan/20 18:12 Start Date: 28/Jan/20 18:12 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #10693: [BEAM-6957] Enable Counter/Distribution metrics tests for Portable Spark Runner URL: https://github.com/apache/beam/pull/10693 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378384) Time Spent: 2h 20m (was: 2h 10m) > Spark Portable Runner: Support metrics > -- > > Key: BEAM-6957 > URL: https://issues.apache.org/jira/browse/BEAM-6957 > Project: Beam > Issue Type: Improvement > Components: runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-spark > Fix For: 2.13.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9203) Programmatically determine if SQL exception is user error, unsupported, or bug
[ https://issues.apache.org/jira/browse/BEAM-9203?focusedWorklogId=378386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378386 ] ASF GitHub Bot logged work on BEAM-9203: Author: ASF GitHub Bot Created on: 28/Jan/20 18:20 Start Date: 28/Jan/20 18:20 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #10699: [BEAM-9203] Clarify exceptions in SQL modules URL: https://github.com/apache/beam/pull/10699#issuecomment-579385899 tag @amaliujia so it appears on my pull request list. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378386) Time Spent: 20m (was: 10m) > Programmatically determine if SQL exception is user error, unsupported, or bug > -- > > Key: BEAM-9203 > URL: https://issues.apache.org/jira/browse/BEAM-9203 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, dsl-sql-zetasql >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Right now there are a lot exceptions thrown by the Calcite SQL dialect and > ZetaSQL dialect of Beam SQL. It is hard to catch just the errors that are > user errors, or just the errors that are unsupported operations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378387 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 18:20 Start Date: 28/Jan/20 18:20 Worklog Time Spent: 10m Work Description: rehmanmuradali commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579386009 @reuvenlax really needs your input with the failed test cases as the functionality is completed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378387) Time Spent: 14h (was: 13h 50m) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 14h > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers
[ https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378388 ] ASF GitHub Bot logged work on BEAM-2535: Author: ASF GitHub Bot Created on: 28/Jan/20 18:20 Start Date: 28/Jan/20 18:20 Worklog Time Spent: 10m Work Description: rehmanmuradali commented on issue #10627: [BEAM-2535] Support outputTimestamp and watermark holds in processing timers. URL: https://github.com/apache/beam/pull/10627#issuecomment-579386009 @reuvenlax really need your input with the failed test cases as the functionality is completed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378388) Time Spent: 14h 10m (was: 14h) > Allow explicit output time independent of firing specification for all timers > - > > Key: BEAM-2535 > URL: https://issues.apache.org/jira/browse/BEAM-2535 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-java-core >Reporter: Kenneth Knowles >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 14h 10m > Remaining Estimate: 0h > > Today, we have insufficient control over the event time timestamp of elements > output from a timer callback. > 1. For an event time timer, it is the timestamp of the timer itself. > 2. For a processing time timer, it is the current input watermark at the > time of processing. > But for both of these, we may want to reserve the right to output a > particular time, aka set a "watermark hold". > A naive implementation of a {{TimerWithWatermarkHold}} would work for making > sure output is not droppable, but does not fully explain window expiration > and late data/timer dropping. > In the natural interpretation of a timer as a feedback loop on a transform, > timers should be viewed as another channel of input, with a watermark, and > items on that channel _all need event time timestamps even if they are > delivered according to a different time domain_. > I propose that the specification for when a timer should fire should be > separated (with nice defaults) from the specification of the event time of > resulting outputs. These timestamps will determine a side channel with a new > "timer watermark" that constrains the output watermark. > - We still need to fire event time timers according to the input watermark, > so that event time timers fire. > - Late data dropping and window expiration will be in terms of the minimum > of the input watermark and the timer watermark. In this way, whenever a timer > is set, the window is not going to be garbage collected. > - We will need to make sure we have a way to "wake up" a window once it is > expired; this may be as simple as exhausting the timer channel as soon as the > input watermark indicates expiration of a window > This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It > seems reasonable to use timers as an implementation detail (e.g. in > runners-core utilities) without wanting any of this additional machinery. For > example, if there is no possibility of output from the timer callback. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2
[ https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378390 ] ASF GitHub Bot logged work on BEAM-9162: Author: ASF GitHub Bot Created on: 28/Jan/20 18:26 Start Date: 28/Jan/20 18:26 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10643: [BEAM-9162] Upgrade Jackson to version 2.10.2 URL: https://github.com/apache/beam/pull/10643#issuecomment-579388508 If the linkage checker had a way to ignore *pre-existing* linkage failures, we could turn it into a test by enumerating all known failures statically. The linkage checker would complain if there was a *new* failure that wasn't pre-existing or if the pre-existing failure wasn't being reported anymore (allowing us to maintain the list over time). The vendored gRPC 1.26.0 reduced the number of warnings in beam-sdks-java-core down to 4. Also, running the linkage checker per module would be useful and I can help with the Gradle bit if there was some good way to have linkage checker main return non zero status code on linkage errors and also if it supported enumerating pre-existing somehow. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378390) Time Spent: 2.5h (was: 2h 20m) > Upgrade Jackson to version 2.10.2 > - > > Key: BEAM-9162 > URL: https://issues.apache.org/jira/browse/BEAM-9162 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 2.5h > Remaining Estimate: 0h > > Jackson has a new way to deal with [deserialization security > issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in > 2.10.x so worth the upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378391 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 18:28 Start Date: 28/Jan/20 18:28 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-579389168 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378391) Time Spent: 9h (was: 8h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378400 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 18:51 Start Date: 28/Jan/20 18:51 Worklog Time Spent: 10m Work Description: KevinGG commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579398909 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378400) Time Spent: 43h 40m (was: 43.5h) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 43h 40m > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378399 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 18:51 Start Date: 28/Jan/20 18:51 Worklog Time Spent: 10m Work Description: KevinGG commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579398856 Run PythonLint PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378399) Time Spent: 43.5h (was: 43h 20m) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 43.5h > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9203) Programmatically determine if SQL exception is user error, unsupported, or bug
[ https://issues.apache.org/jira/browse/BEAM-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-9203: --- Status: Open (was: Triage Needed) > Programmatically determine if SQL exception is user error, unsupported, or bug > -- > > Key: BEAM-9203 > URL: https://issues.apache.org/jira/browse/BEAM-9203 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, dsl-sql-zetasql >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Right now there are a lot exceptions thrown by the Calcite SQL dialect and > ZetaSQL dialect of Beam SQL. It is hard to catch just the errors that are > user errors, or just the errors that are unsupported operations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7746) Add type hints to python code
[ https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=378404&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378404 ] ASF GitHub Bot logged work on BEAM-7746: Author: ASF GitHub Bot Created on: 28/Jan/20 19:01 Start Date: 28/Jan/20 19:01 Worklog Time Spent: 10m Work Description: udim commented on pull request #10592: [BEAM-7746] Introduce a protocol to handle various types of partitioning buffers URL: https://github.com/apache/beam/pull/10592 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378404) Time Spent: 58h 40m (was: 58.5h) > Add type hints to python code > - > > Key: BEAM-7746 > URL: https://issues.apache.org/jira/browse/BEAM-7746 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Chad Dombrova >Assignee: Chad Dombrova >Priority: Major > Time Spent: 58h 40m > Remaining Estimate: 0h > > As a developer of the beam source code, I would like the code to use pep484 > type hints so that I can clearly see what types are required, get completion > in my IDE, and enforce code correctness via a static analyzer like mypy. > This may be considered a precursor to BEAM-7060 > Work has been started here: [https://github.com/apache/beam/pull/9056] > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle
[ https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378410&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378410 ] ASF GitHub Bot logged work on BEAM-9132: Author: ASF GitHub Bot Created on: 28/Jan/20 19:10 Start Date: 28/Jan/20 19:10 Worklog Time Spent: 10m Work Description: tweise commented on pull request #10694: [BEAM-9132] Avoid logging misleading error messages during pipeline failure URL: https://github.com/apache/beam/pull/10694#discussion_r372000633 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java ## @@ -464,6 +474,8 @@ ServerInfo getServerInfo() { } public void close() { Review comment: Isn't close only called from unref? If so, how does this change the behavior? (Possibly some more explanation needs to be added.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378410) Time Spent: 1.5h (was: 1h 20m) > State request handler is removed prematurely when closing ActiveBundle > -- > > Key: BEAM-9132 > URL: https://issues.apache.org/jira/browse/BEAM-9132 > Project: Beam > Issue Type: Bug > Components: java-fn-execution >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > We have observed these errors in a state-intense application: > {noformat} > Error processing instruction 107. Original traceback is > Traceback (most recent call last): > File "apache_beam/runners/common.py", line 780, in > apache_beam.runners.common.DoFnRunner.process > File "apache_beam/runners/common.py", line 587, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > File "apache_beam/runners/common.py", line 659, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > File "apache_beam/runners/common.py", line 880, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "apache_beam/runners/common.py", line 895, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "redacted.py", line 56, in process > recent_events_map = load_recent_events_map(recent_events_state) > File "redacted.py", line 128, in _load_recent_events_map > items_in_recent_events_bag = list(recent_events_state.read()) > File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__ > for elem in self.first: > File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__ > self._state_key, self._coder_impl, is_cached=self._is_cached) > File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get > self._materialize_iter(state_key, coder)) > File "apache_beam/runners/worker/sdk_worker.py", line 723, in > _materialize_iter > self._underlying.get_raw(state_key, continuation_token) > File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw > continuation_token=continuation_token))) > File "apache_beam/runners/worker/sdk_worker.py", line 637, in > _blocking_request > raise RuntimeError(response.error) > RuntimeError: Unknown process bundle instruction id '107' > {noformat} > Notice that the error is thrown on the Runner side. It seems to originate > from the {{ActiveBundle}} de-registering the state request handler too early > when the processing may still be going on in the SDK Harness. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle
[ https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378411 ] ASF GitHub Bot logged work on BEAM-9132: Author: ASF GitHub Bot Created on: 28/Jan/20 19:12 Start Date: 28/Jan/20 19:12 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10694: [BEAM-9132] Avoid logging misleading error messages during pipeline failure URL: https://github.com/apache/beam/pull/10694#discussion_r372001232 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java ## @@ -464,6 +474,8 @@ ServerInfo getServerInfo() { } public void close() { Review comment: It's now also called from here: https://github.com/apache/beam/pull/10694/files#diff-e80c769f0011537cc2b60d3e7898cf5aR260 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378411) Time Spent: 1h 40m (was: 1.5h) > State request handler is removed prematurely when closing ActiveBundle > -- > > Key: BEAM-9132 > URL: https://issues.apache.org/jira/browse/BEAM-9132 > Project: Beam > Issue Type: Bug > Components: java-fn-execution >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > We have observed these errors in a state-intense application: > {noformat} > Error processing instruction 107. Original traceback is > Traceback (most recent call last): > File "apache_beam/runners/common.py", line 780, in > apache_beam.runners.common.DoFnRunner.process > File "apache_beam/runners/common.py", line 587, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > File "apache_beam/runners/common.py", line 659, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > File "apache_beam/runners/common.py", line 880, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "apache_beam/runners/common.py", line 895, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "redacted.py", line 56, in process > recent_events_map = load_recent_events_map(recent_events_state) > File "redacted.py", line 128, in _load_recent_events_map > items_in_recent_events_bag = list(recent_events_state.read()) > File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__ > for elem in self.first: > File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__ > self._state_key, self._coder_impl, is_cached=self._is_cached) > File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get > self._materialize_iter(state_key, coder)) > File "apache_beam/runners/worker/sdk_worker.py", line 723, in > _materialize_iter > self._underlying.get_raw(state_key, continuation_token) > File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw > continuation_token=continuation_token))) > File "apache_beam/runners/worker/sdk_worker.py", line 637, in > _blocking_request > raise RuntimeError(response.error) > RuntimeError: Unknown process bundle instruction id '107' > {noformat} > Notice that the error is thrown on the Runner side. It seems to originate > from the {{ActiveBundle}} de-registering the state request handler too early > when the processing may still be going on in the SDK Harness. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378414 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 19:13 Start Date: 28/Jan/20 19:13 Worklog Time Spent: 10m Work Description: KevinGG commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579408205 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378414) Time Spent: 44h (was: 43h 50m) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 44h > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378413 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 19:13 Start Date: 28/Jan/20 19:13 Worklog Time Spent: 10m Work Description: KevinGG commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579408104 Retest this please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378413) Time Spent: 43h 50m (was: 43h 40m) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 43h 50m > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle
[ https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378412 ] ASF GitHub Bot logged work on BEAM-9132: Author: ASF GitHub Bot Created on: 28/Jan/20 19:13 Start Date: 28/Jan/20 19:13 Worklog Time Spent: 10m Work Description: tweise commented on pull request #10694: [BEAM-9132] Avoid logging misleading error messages during pipeline failure URL: https://github.com/apache/beam/pull/10694#discussion_r372001730 ## File path: runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactoryTest.java ## @@ -360,6 +360,19 @@ public void closesEnvironmentOnCleanup() throws Exception { verify(remoteEnvironment).close(); } + @Test + public void closesEnvironmentOnCleanupWithPendingRefs() throws Exception { +try (DefaultJobBundleFactory bundleFactory = +createDefaultJobBundleFactory(envFactoryProviderMap)) { + DefaultJobBundleFactory.SimpleStageBundleFactory stageBundleFactory = + (DefaultJobBundleFactory.SimpleStageBundleFactory) + bundleFactory.forStage(getExecutableStage(environment)); + // The client is still being used, e.g. when the pipeline fails and is shut down + stageBundleFactory.currentClient.wrappedClient.ref(); Review comment: What would cause this extra ref in the actual operator lifecycle? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378412) Time Spent: 1h 50m (was: 1h 40m) > State request handler is removed prematurely when closing ActiveBundle > -- > > Key: BEAM-9132 > URL: https://issues.apache.org/jira/browse/BEAM-9132 > Project: Beam > Issue Type: Bug > Components: java-fn-execution >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > We have observed these errors in a state-intense application: > {noformat} > Error processing instruction 107. Original traceback is > Traceback (most recent call last): > File "apache_beam/runners/common.py", line 780, in > apache_beam.runners.common.DoFnRunner.process > File "apache_beam/runners/common.py", line 587, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > File "apache_beam/runners/common.py", line 659, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > File "apache_beam/runners/common.py", line 880, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "apache_beam/runners/common.py", line 895, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "redacted.py", line 56, in process > recent_events_map = load_recent_events_map(recent_events_state) > File "redacted.py", line 128, in _load_recent_events_map > items_in_recent_events_bag = list(recent_events_state.read()) > File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__ > for elem in self.first: > File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__ > self._state_key, self._coder_impl, is_cached=self._is_cached) > File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get > self._materialize_iter(state_key, coder)) > File "apache_beam/runners/worker/sdk_worker.py", line 723, in > _materialize_iter > self._underlying.get_raw(state_key, continuation_token) > File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw > continuation_token=continuation_token))) > File "apache_beam/runners/worker/sdk_worker.py", line 637, in > _blocking_request > raise RuntimeError(response.error) > RuntimeError: Unknown process bundle instruction id '107' > {noformat} > Notice that the error is thrown on the Runner side. It seems to originate > from the {{ActiveBundle}} de-registering the state request handler too early > when the processing may still be going on in the SDK Harness. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow
[ https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378415&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378415 ] ASF GitHub Bot logged work on BEAM-7926: Author: ASF GitHub Bot Created on: 28/Jan/20 19:13 Start Date: 28/Jan/20 19:13 Worklog Time Spent: 10m Work Description: KevinGG commented on issue #10346: [BEAM-7926] Data-centric Interactive Part2 URL: https://github.com/apache/beam/pull/10346#issuecomment-579408305 Run PythonLint PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378415) Time Spent: 44h 10m (was: 44h) > Show PCollection with Interactive Beam in a data-centric user flow > -- > > Key: BEAM-7926 > URL: https://issues.apache.org/jira/browse/BEAM-7926 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 44h 10m > Remaining Estimate: 0h > > Support auto plotting / charting of materialized data of a given PCollection > with Interactive Beam. > Say an Interactive Beam pipeline defined as > > {code:java} > p = beam.Pipeline(InteractiveRunner()) > pcoll = p | 'Transform' >> transform() > pcoll2 = ... > pcoll3 = ...{code} > The use can call a single function and get auto-magical charting of the data. > e.g., > {code:java} > show(pcoll, pcoll2) > {code} > Throughout the process, a pipeline fragment is built to include only > transforms necessary to produce the desired pcolls (pcoll and pcoll2) and > execute that fragment. > This makes the Interactive Beam user flow data-centric. > > Detailed > [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9208) Add support for mapping columns to pubsub message attributes in flat schemas DDL
Brian Hulette created BEAM-9208: --- Summary: Add support for mapping columns to pubsub message attributes in flat schemas DDL Key: BEAM-9208 URL: https://issues.apache.org/jira/browse/BEAM-9208 Project: Beam Issue Type: Improvement Components: dsl-sql Reporter: Brian Hulette Assignee: Brian Hulette Context: https://lists.apache.org/thread.html/bf4c37f21bda194d7f8c40f6e7b9a776262415755cc1658412af3c76%40%3Cdev.beam.apache.org%3E The syntax should look something like this (proposed by [~alexvanboxel]): {{ CREATE TABLE people ( my_timestamp TIMESTAMP *OPTION(ref="pubsub:event_timestamp)*, my_id VARCHAR *OPTION(ref="pubsub:attributes['id_name']")*, name VARCHAR, age INTEGER ) TYPE 'pubsub' LOCATION 'projects/my-project/topics/my-topic' }} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle
[ https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378417&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378417 ] ASF GitHub Bot logged work on BEAM-9132: Author: ASF GitHub Bot Created on: 28/Jan/20 19:18 Start Date: 28/Jan/20 19:18 Worklog Time Spent: 10m Work Description: tweise commented on pull request #10694: [BEAM-9132] Avoid logging misleading error messages during pipeline failure URL: https://github.com/apache/beam/pull/10694#discussion_r372004299 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java ## @@ -255,6 +255,14 @@ public void close() throws Exception { // Clear the cache. This closes all active environments. // note this may cause open calls to be cancelled by the peer for (LoadingCache environmentCache : environmentCaches) { + for (WrappedSdkHarnessClient client : environmentCache.asMap().values()) { +try { + client.close(); Review comment: Worth mentioning that this is added to close the environments irrespective of open bundles, since this will occur only during shutdown? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378417) Time Spent: 2h (was: 1h 50m) > State request handler is removed prematurely when closing ActiveBundle > -- > > Key: BEAM-9132 > URL: https://issues.apache.org/jira/browse/BEAM-9132 > Project: Beam > Issue Type: Bug > Components: java-fn-execution >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > We have observed these errors in a state-intense application: > {noformat} > Error processing instruction 107. Original traceback is > Traceback (most recent call last): > File "apache_beam/runners/common.py", line 780, in > apache_beam.runners.common.DoFnRunner.process > File "apache_beam/runners/common.py", line 587, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > File "apache_beam/runners/common.py", line 659, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > File "apache_beam/runners/common.py", line 880, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "apache_beam/runners/common.py", line 895, in > apache_beam.runners.common._OutputProcessor.process_outputs > File "redacted.py", line 56, in process > recent_events_map = load_recent_events_map(recent_events_state) > File "redacted.py", line 128, in _load_recent_events_map > items_in_recent_events_bag = list(recent_events_state.read()) > File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__ > for elem in self.first: > File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__ > self._state_key, self._coder_impl, is_cached=self._is_cached) > File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get > self._materialize_iter(state_key, coder)) > File "apache_beam/runners/worker/sdk_worker.py", line 723, in > _materialize_iter > self._underlying.get_raw(state_key, continuation_token) > File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw > continuation_token=continuation_token))) > File "apache_beam/runners/worker/sdk_worker.py", line 637, in > _blocking_request > raise RuntimeError(response.error) > RuntimeError: Unknown process bundle instruction id '107' > {noformat} > Notice that the error is thrown on the Runner side. It seems to originate > from the {{ActiveBundle}} de-registering the state request handler too early > when the processing may still be going on in the SDK Harness. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6703) Support Java 11 in Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6703?focusedWorklogId=378418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378418 ] ASF GitHub Bot logged work on BEAM-6703: Author: ASF GitHub Bot Created on: 28/Jan/20 19:18 Start Date: 28/Jan/20 19:18 Worklog Time Spent: 10m Work Description: Ardagan commented on pull request #10689: [BEAM-6703] Make Dataflow ValidatesRunner test use Java 11 in test execution URL: https://github.com/apache/beam/pull/10689#discussion_r372004381 ## File path: .test-infra/jenkins/job_PostCommit_Java_ValidatesRunner_Dataflow_Java11.groovy ## @@ -20,26 +20,40 @@ import CommonJobProperties as commonJobProperties import PostcommitJobBuilder -PostcommitJobBuilder.postCommitJob('beam_PostCommit_Java11_ValidatesRunner_Dataflow', +PostcommitJobBuilder.postCommitJob('beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11', 'Run Dataflow ValidatesRunner Java 11', 'Google Cloud Dataflow Runner ValidatesRunner Tests On Java 11', this) { Review comment: I suggest use trigger phrase "Run beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11" it is same in terms of being descriptive, but much easier to figure out. Or "Run Google Cloud Dataflow Runner ValidatesRunner Tests On Java 11". This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378418) Time Spent: 16.5h (was: 16h 20m) > Support Java 11 in Jenkins > -- > > Key: BEAM-6703 > URL: https://issues.apache.org/jira/browse/BEAM-6703 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow, runner-direct >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Fix For: Not applicable > > Time Spent: 16.5h > Remaining Estimate: 0h > > In this issue I'll create a Jenkins job that compiles Dataflow and Direct > runners with tests using Java 8 and runs Validates Runner suites with Java 11 > Runtime. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9209) Add support for mapping columns to pubsub message event_timestamp when using flat schemas DDL
Brian Hulette created BEAM-9209: --- Summary: Add support for mapping columns to pubsub message event_timestamp when using flat schemas DDL Key: BEAM-9209 URL: https://issues.apache.org/jira/browse/BEAM-9209 Project: Beam Issue Type: Improvement Components: dsl-sql Reporter: Brian Hulette Assignee: Brian Hulette Context: https://lists.apache.org/thread.html/bf4c37f21bda194d7f8c40f6e7b9a776262415755cc1658412af3c76%40%3Cdev.beam.apache.org%3E The syntax should look something like this (proposed by [~alexvanboxel]): {code} CREATE TABLE people ( my_timestamp TIMESTAMP *OPTION(ref="pubsub:event_timestamp)*, my_id VARCHAR *OPTION(ref="pubsub:attributes['id_name']")*, name VARCHAR, age INTEGER ) TYPE 'pubsub' LOCATION 'projects/my-project/topics/my-topic' {code} This jira pertains specifically to the timestamp portion of this syntax. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6703) Support Java 11 in Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6703?focusedWorklogId=378420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378420 ] ASF GitHub Bot logged work on BEAM-6703: Author: ASF GitHub Bot Created on: 28/Jan/20 19:20 Start Date: 28/Jan/20 19:20 Worklog Time Spent: 10m Work Description: Ardagan commented on issue #10689: [BEAM-6703] Make Dataflow ValidatesRunner test use Java 11 in test execution URL: https://github.com/apache/beam/pull/10689#issuecomment-579411180 R: @markflyhigh, @lukecwik This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378420) Time Spent: 16h 40m (was: 16.5h) > Support Java 11 in Jenkins > -- > > Key: BEAM-6703 > URL: https://issues.apache.org/jira/browse/BEAM-6703 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow, runner-direct >Reporter: Michał Walenia >Assignee: Michał Walenia >Priority: Minor > Fix For: Not applicable > > Time Spent: 16h 40m > Remaining Estimate: 0h > > In this issue I'll create a Jenkins job that compiles Dataflow and Direct > runners with tests using Java 8 and runs Validates Runner suites with Java 11 > Runtime. -- This message was sent by Atlassian Jira (v8.3.4#803005)