[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=423849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423849 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 17/Apr/20 00:34 Start Date: 17/Apr/20 00:34 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423849) Time Spent: 18h 50m (was: 18h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=423805=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423805 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Apr/20 22:44 Start Date: 16/Apr/20 22:44 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#issuecomment-614934489 Run RAT PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423805) Time Spent: 18h 40m (was: 18.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=423145=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423145 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Apr/20 00:07 Start Date: 16/Apr/20 00:07 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#discussion_r409204369 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java ## @@ -761,72 +795,111 @@ public void processElementForElementAndRestriction( continue; } -// Make sure to get the output watermark before we split to ensure that the lower bound -// applies to both the primary and residual. -KV watermarkAndState = -currentWatermarkEstimator.getWatermarkAndState(); -SplitResult result = currentTracker.trySplit(0); +// Attempt to checkpoint the current restriction. +HandlesSplits.SplitResult splitResult = Review comment: Got it. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 423145) Time Spent: 18.5h (was: 18h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422778 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 15/Apr/20 15:13 Start Date: 15/Apr/20 15:13 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#issuecomment-614099306 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 422778) Time Spent: 18h 20m (was: 18h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422418=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422418 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Apr/20 23:10 Start Date: 14/Apr/20 23:10 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#discussion_r408487984 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java ## @@ -761,72 +795,111 @@ public void processElementForElementAndRestriction( continue; } -// Make sure to get the output watermark before we split to ensure that the lower bound -// applies to both the primary and residual. -KV watermarkAndState = -currentWatermarkEstimator.getWatermarkAndState(); -SplitResult result = currentTracker.trySplit(0); +// Attempt to checkpoint the current restriction. +HandlesSplits.SplitResult splitResult = Review comment: The DoFn returned `ProcessContinuation#resume` and not `ProcessContinuation#stop`. The current restriction within the restriction tracker represents the entire restriction so splitting it provides the remainder and should update the restriction tracker so that the primary is in a done state so that `checkDone` down below can succeed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 422418) Time Spent: 18h 10m (was: 18h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422408=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422408 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Apr/20 22:53 Start Date: 14/Apr/20 22:53 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#discussion_r408482142 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java ## @@ -761,72 +795,111 @@ public void processElementForElementAndRestriction( continue; } -// Make sure to get the output watermark before we split to ensure that the lower bound -// applies to both the primary and residual. -KV watermarkAndState = -currentWatermarkEstimator.getWatermarkAndState(); -SplitResult result = currentTracker.trySplit(0); +// Attempt to checkpoint the current restriction. +HandlesSplits.SplitResult splitResult = Review comment: Why do we want to checkpoint after processElement finishing? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 422408) Time Spent: 18h (was: 17h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 18h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422201=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422201 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Apr/20 18:04 Start Date: 14/Apr/20 18:04 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#issuecomment-613593412 CC: @youngoli This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 422201) Time Spent: 17h 50m (was: 17h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 17h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422196=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422196 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Apr/20 17:57 Start Date: 14/Apr/20 17:57 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414 The next step is to plumb the split request to the BeamFnDataReadRunner which will forward it to the FnApiDoFnRunner. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422197=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422197 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Apr/20 17:57 Start Date: 14/Apr/20 17:57 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11414: [BEAM-5605, BEAM-2939] Add support for FnApiDoFnRunner to handle split calls. URL: https://github.com/apache/beam/pull/11414#issuecomment-613590197 R: @ihji @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 422197) Time Spent: 17h 40m (was: 17.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 17h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=390833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390833 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 21/Feb/20 18:26 Start Date: 21/Feb/20 18:26 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10920: [BEAM-5605] Eagerly close the BoundedReader once we have read everything or have failed. URL: https://github.com/apache/beam/pull/10920 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390833) Time Spent: 17h 20m (was: 17h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 17h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=390346=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390346 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 21/Feb/20 00:58 Start Date: 21/Feb/20 00:58 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10920: [BEAM-5605] Eagerly close the BoundedReader once we have read everything or have failed. URL: https://github.com/apache/beam/pull/10920#issuecomment-589446799 R: @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390346) Time Spent: 17h 10m (was: 17h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 17h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=390338=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390338 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 21/Feb/20 00:38 Start Date: 21/Feb/20 00:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10920: [BEAM-5605] Eagerly close the BoundedReader once we have read everything or have failed. URL: https://github.com/apache/beam/pull/10920#issuecomment-589441993 R: @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390338) Time Spent: 17h (was: 16h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 17h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=390337=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390337 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 21/Feb/20 00:37 Start Date: 21/Feb/20 00:37 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10920: [BEAM-5605] Eagerly close the BoundedReader once we have read everything or have failed. URL: https://github.com/apache/beam/pull/10920 If the bundle throws an exception for some other reason, the currentReader will go out of scope and will not be closed cleanly. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=390333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390333 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 21/Feb/20 00:26 Start Date: 21/Feb/20 00:26 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10893: [BEAM-5605] Honor the bounded source timestamps timestamp in the SDF wrapper. URL: https://github.com/apache/beam/pull/10893 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390333) Time Spent: 16h 40m (was: 16.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 16h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=390125=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390125 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 20/Feb/20 18:07 Start Date: 20/Feb/20 18:07 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10893: [BEAM-5605] Honor the bounded source timestamps timestamp in the SDF wrapper. URL: https://github.com/apache/beam/pull/10893#discussion_r382168112 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -242,18 +243,19 @@ public void splitRestriction( } @NewTracker -public RestrictionTracker, Object[]> restrictionTracker( +public RestrictionTracker, TimestampedValue[]> restrictionTracker( @Restriction BoundedSource restriction, PipelineOptions pipelineOptions) { return new BoundedSourceAsSDFRestrictionTracker<>(restriction, pipelineOptions); } @ProcessElement public void processElement( -RestrictionTracker, Object[]> tracker, OutputReceiver receiver) +RestrictionTracker, TimestampedValue[]> tracker, +OutputReceiver receiver) throws IOException { - Object[] out = new Object[1]; + TimestampedValue[] out = new TimestampedValue[1]; while (tracker.tryClaim(out)) { -receiver.output((T) out[0]); +receiver.outputWithTimestamp(out[0].getValue(), out[0].getTimestamp()); Review comment: yes This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390125) Time Spent: 16.5h (was: 16h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 16.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=389586=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389586 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 19/Feb/20 19:51 Start Date: 19/Feb/20 19:51 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #10893: [BEAM-5605] Honor the bounded source timestamps timestamp in the SDF wrapper. URL: https://github.com/apache/beam/pull/10893#discussion_r381507574 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -242,18 +243,19 @@ public void splitRestriction( } @NewTracker -public RestrictionTracker, Object[]> restrictionTracker( +public RestrictionTracker, TimestampedValue[]> restrictionTracker( @Restriction BoundedSource restriction, PipelineOptions pipelineOptions) { return new BoundedSourceAsSDFRestrictionTracker<>(restriction, pipelineOptions); } @ProcessElement public void processElement( -RestrictionTracker, Object[]> tracker, OutputReceiver receiver) +RestrictionTracker, TimestampedValue[]> tracker, +OutputReceiver receiver) throws IOException { - Object[] out = new Object[1]; + TimestampedValue[] out = new TimestampedValue[1]; while (tracker.tryClaim(out)) { -receiver.output((T) out[0]); +receiver.outputWithTimestamp(out[0].getValue(), out[0].getTimestamp()); Review comment: Is the timestamp for the output timestamp of records from BoundedSource? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389586) Time Spent: 16h 20m (was: 16h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 16h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=389242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389242 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 19/Feb/20 01:02 Start Date: 19/Feb/20 01:02 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10893: [BEAM-5605] Honor the bounded source timestamps timestamp in the SDF wrapper. URL: https://github.com/apache/beam/pull/10893#issuecomment-587981986 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389242) Time Spent: 16h 10m (was: 16h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 16h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=389138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389138 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 18/Feb/20 22:24 Start Date: 18/Feb/20 22:24 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10893: [BEAM-5605] Honor the bounded source timestamps timestamp in the SDF wrapper. URL: https://github.com/apache/beam/pull/10893#issuecomment-587934667 R: @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389138) Time Spent: 16h (was: 15h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 16h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=389137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389137 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 18/Feb/20 22:24 Start Date: 18/Feb/20 22:24 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10893: [BEAM-5605] Honor the bounded source timestamps timestamp in the SDF wrapper. URL: https://github.com/apache/beam/pull/10893 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387484=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387484 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 17:21 Start Date: 14/Feb/20 17:21 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586384650 Oh I see this is for the portable runner. Sorry for the extra noise This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387484) Time Spent: 15h 40m (was: 15.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 15h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387482=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387482 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 17:19 Start Date: 14/Feb/20 17:19 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586383904 I filed https://issues.apache.org/jira/browse/BEAM-9317 to track this one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387482) Time Spent: 15.5h (was: 15h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 15.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387481=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387481 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 17:16 Start Date: 14/Feb/20 17:16 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586382756 The PVR failures I linked above definitely seem to have started when this PR went in: https://github.com/apache/beam/commit/c1d054faf043677d8700de9ed6eada03cccf55e8 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387481) Time Spent: 15h 20m (was: 15h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 15h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387480 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 17:15 Start Date: 14/Feb/20 17:15 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586382380 @iemejia Your both right in two different things being broken. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387480) Time Spent: 15h 10m (was: 15h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 15h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387472 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 16:54 Start Date: 14/Feb/20 16:54 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586373893 Apparently the fix of Dynamic Timers for Dataflow broke the ValidatesRunner tests for Flink so this is unrelated. See my post in the mailing list. Tests are failing since the exact commit for the fix: 7719708a04d5d0eff3048dbd58ac1337889f8ba5 For details on the exception: https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/6608/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387472) Time Spent: 14h 50m (was: 14h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 14h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387474=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387474 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 16:54 Start Date: 14/Feb/20 16:54 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586373893 Apparently the fix of Dynamic Timers for Dataflow broke the ValidatesRunner tests for Flink so this is unrelated to this PR. See my post in the mailing list. Tests are failing since the exact commit for the fix: 7719708a04d5d0eff3048dbd58ac1337889f8ba5 For details on the exception: https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/6608/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387474) Time Spent: 15h (was: 14h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 15h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387466 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 16:34 Start Date: 14/Feb/20 16:34 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586365799 AFAICT Spark_Batch was already flaky, but now its failing consistently, and the Flink ones started failing at the same time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387466) Time Spent: 14h 40m (was: 14.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 14h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387457=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387457 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 16:27 Start Date: 14/Feb/20 16:27 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586363000 > It looks like this broke PVR_{Spark,Flink} > > https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/4092/ > https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/4104/ > https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/2089/ Taking a look. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387457) Time Spent: 14.5h (was: 14h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 14.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=387450=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387450 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Feb/20 16:11 Start Date: 14/Feb/20 16:11 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-586356166 It looks like this broke PVR_{Spark,Flink} https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/4092/ https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/4104/ https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/2089/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 387450) Time Spent: 14h 20m (was: 14h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 14h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386875 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Feb/20 21:02 Start Date: 13/Feb/20 21:02 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386875) Time Spent: 14h 10m (was: 14h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 14h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386815=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386815 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Feb/20 19:05 Start Date: 13/Feb/20 19:05 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585921415 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386815) Time Spent: 14h (was: 13h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 14h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386173=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386173 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 19:37 Start Date: 12/Feb/20 19:37 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585380318 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386173) Time Spent: 13.5h (was: 13h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 13.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386175=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386175 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 19:37 Start Date: 12/Feb/20 19:37 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585380436 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386175) Time Spent: 13h 50m (was: 13h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 13h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386174=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386174 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 19:37 Start Date: 12/Feb/20 19:37 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585380395 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386174) Time Spent: 13h 40m (was: 13.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 13h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386094=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386094 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 17:39 Start Date: 12/Feb/20 17:39 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585327146 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386094) Time Spent: 13h 20m (was: 13h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 13h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386091=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386091 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 17:38 Start Date: 12/Feb/20 17:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585325730 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386091) Time Spent: 13h (was: 12h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 13h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386090=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386090 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 17:38 Start Date: 12/Feb/20 17:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585325143 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386090) Time Spent: 12h 50m (was: 12h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 12h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=386092=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386092 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 12/Feb/20 17:38 Start Date: 12/Feb/20 17:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-585325987 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 386092) Time Spent: 13h 10m (was: 13h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 13h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=385506=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385506 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Feb/20 22:47 Start Date: 11/Feb/20 22:47 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r377945680 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/JavaReadViaImpulse.java ## @@ -1,176 +0,0 @@ -/* Review comment: Gotcha. I already approved this, but consider it double-approved. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 385506) Time Spent: 12h 40m (was: 12.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 12h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=385436=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385436 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Feb/20 20:55 Start Date: 11/Feb/20 20:55 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r377892829 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/JavaReadViaImpulse.java ## @@ -1,176 +0,0 @@ -/* Review comment: Yes This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 385436) Time Spent: 12.5h (was: 12h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 12.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=385434=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385434 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Feb/20 20:49 Start Date: 11/Feb/20 20:49 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r37788 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -177,4 +205,128 @@ public void populateDisplayData(DisplayData.Builder builder) { .include("source", source); } } + + /** + * A splittable {@link DoFn} which executes a {@link BoundedSource}. + * + * We model the element as the original source and the restriction as the sub-source. This + * allows us to split the sub-source over and over yet still receive "source" objects as inputs. + */ + static class BoundedSourceAsSDFWrapperFn extends DoFn, T> { +private static final long DEFAULT_DESIRED_BUNDLE_SIZE_BYTES = 64 * (1 << 20); + +@GetInitialRestriction +public BoundedSource initialRestriction(@Element BoundedSource element) { + return element; +} + +@GetSize +public double getSize( +@Restriction BoundedSource restriction, PipelineOptions pipelineOptions) +throws Exception { + return restriction.getEstimatedSizeBytes(pipelineOptions); +} + +@SplitRestriction +public void splitRestriction( +@Restriction BoundedSource restriction, +OutputReceiver> receiver, +PipelineOptions pipelineOptions) +throws Exception { + for (BoundedSource split : + restriction.split(DEFAULT_DESIRED_BUNDLE_SIZE_BYTES, pipelineOptions)) { +receiver.output(split); + } +} + +@NewTracker +public RestrictionTracker, Object[]> restrictionTracker( +@Restriction BoundedSource restriction, PipelineOptions pipelineOptions) { + return new BoundedSourceAsSDFRestrictionTracker<>(restriction, pipelineOptions); +} + +@ProcessElement +public void processElement( +RestrictionTracker, Object[]> tracker, OutputReceiver receiver) +throws IOException { + Object[] out = new Object[1]; + while (tracker.tryClaim(out)) { +receiver.output((T) out[0]); + } +} + +@GetRestrictionCoder +public Coder> restrictionCoder() { + return SerializableCoder.of(new TypeDescriptor>() {}); +} + +/** + * A fake restriction tracker which adapts to the {@link BoundedSource} API. The restriction + * object is used to advance the underlying source and to "return" the current element. + */ +private static class BoundedSourceAsSDFRestrictionTracker Review comment: :clap: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 385434) Time Spent: 12h 20m (was: 12h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 12h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=385394=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385394 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Feb/20 19:34 Start Date: 11/Feb/20 19:34 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-584811885 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 385394) Time Spent: 12h 10m (was: 12h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 12h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=385388=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-385388 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Feb/20 19:25 Start Date: 11/Feb/20 19:25 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r377847499 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/JavaReadViaImpulse.java ## @@ -1,176 +0,0 @@ -/* Review comment: Just to make sure I'm understanding correctly: This transform was only used for portable pipelines, right? So it isn't needed anymore now that we execute bounded reads by wrapping them in an SDF? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 385388) Time Spent: 12h (was: 11h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 12h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=384826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384826 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 10/Feb/20 23:12 Start Date: 10/Feb/20 23:12 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r377372981 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource source) { public final PCollection expand(PBegin input) { source.validate(); + if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), "beam_fn_api")) { Review comment: I was able to remove JavaReadViaImpulse. It required fixing up some tests that weren't setting the `beam_fn_api` experiment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 384826) Time Spent: 11h 50m (was: 11h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 11h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=382422=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382422 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 05/Feb/20 18:23 Start Date: 05/Feb/20 18:23 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 382422) Time Spent: 11h 40m (was: 11.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 11h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=380251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380251 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 31/Jan/20 23:00 Start Date: 31/Jan/20 23:00 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r373722630 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -871,13 +960,20 @@ public Duration getAllowedTimestampSkew() { * Annotation for the method that creates a new {@link RestrictionTracker} for the restriction of * a https://s.apache.org/splittable-do-fn;>splittable {@link DoFn}. * - * Signature: {@code MyRestrictionTracker newTracker(RestrictionT restriction, );} where {@code MyRestrictionTracker} must be a subtype of {@code - * RestrictionTracker}. + * Signature: {@code MyRestrictionTracker newTracker();} * - * The optional arguments are allowed to be: + * This method must satisfy the following constraints: * * + * The return type must be a subtype of {@code RestrictionTracker}. + * It is suggested to use as narrow of a return type definition as possible (for example + * prefer to use a square type over a shape type as a square is a type of a shape). + * If one of its arguments is tagged with the {@link Element} annotation, then it will be Review comment: No. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 380251) Time Spent: 11.5h (was: 11h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 11.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=380239=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380239 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 31/Jan/20 22:39 Start Date: 31/Jan/20 22:39 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372145661 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -871,13 +960,20 @@ public Duration getAllowedTimestampSkew() { * Annotation for the method that creates a new {@link RestrictionTracker} for the restriction of * a https://s.apache.org/splittable-do-fn;>splittable {@link DoFn}. * - * Signature: {@code MyRestrictionTracker newTracker(RestrictionT restriction, );} where {@code MyRestrictionTracker} must be a subtype of {@code - * RestrictionTracker}. + * Signature: {@code MyRestrictionTracker newTracker();} * - * The optional arguments are allowed to be: + * This method must satisfy the following constraints: * * + * The return type must be a subtype of {@code RestrictionTracker}. + * It is suggested to use as narrow of a return type definition as possible (for example + * prefer to use a square type over a shape type as a square is a type of a shape). + * If one of its arguments is tagged with the {@link Element} annotation, then it will be Review comment: Does creating tracker require passing in `element`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 380239) Time Spent: 11h 20m (was: 11h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 11h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=380224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380224 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 31/Jan/20 22:09 Start Date: 31/Jan/20 22:09 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-580932788 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 380224) Time Spent: 11h 10m (was: 11h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 11h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=379500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379500 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 30/Jan/20 17:38 Start Date: 30/Jan/20 17:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-580369182 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 379500) Time Spent: 11h (was: 10h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 11h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=379499=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379499 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 30/Jan/20 17:38 Start Date: 30/Jan/20 17:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-580369130 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 379499) Time Spent: 10h 50m (was: 10h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 10h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378951 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 17:59 Start Date: 29/Jan/20 17:59 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372540565 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -758,11 +804,19 @@ public Duration getAllowedTimestampSkew() { * Annotation for the method that maps an element to an initial restriction for a https://s.apache.org/splittable-do-fn;>splittable {@link DoFn}. * - * Signature: {@code RestrictionT getInitialRestriction(InputT element, );} + * Signature: {@code RestrictionT getInitialRestriction();} * - * The optional arguments are allowed to be: + * This method must satisfy the following constraints: * * + * The return type {@code RestrictionT} defines the restriction type used within this + * splittable DoFn. All other methods that use a {@link Restriction @Restriction} parameter + * must use the same type that is used here. It is suggested to use as narrow of a return + * type definition as possible (for example prefer to use a square type over a shape type as + * a square is a type of a shape). + * If one of its arguments is tagged with the {@link Element} annotation, then it will be + * passed the current element being processed; the argument must be of type {@code InputT}. + * Note that schema element parameters are currently unsupported. Review comment: good point. I filed [BEAM-9217](https://issues.apache.org/jira/browse/BEAM-9217) to follow-up with that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378951) Time Spent: 10h 40m (was: 10.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 10h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378929 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 17:13 Start Date: 29/Jan/20 17:13 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372516471 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -758,11 +804,19 @@ public Duration getAllowedTimestampSkew() { * Annotation for the method that maps an element to an initial restriction for a https://s.apache.org/splittable-do-fn;>splittable {@link DoFn}. * - * Signature: {@code RestrictionT getInitialRestriction(InputT element, );} + * Signature: {@code RestrictionT getInitialRestriction();} * - * The optional arguments are allowed to be: + * This method must satisfy the following constraints: * * + * The return type {@code RestrictionT} defines the restriction type used within this + * splittable DoFn. All other methods that use a {@link Restriction @Restriction} parameter + * must use the same type that is used here. It is suggested to use as narrow of a return + * type definition as possible (for example prefer to use a square type over a shape type as + * a square is a type of a shape). + * If one of its arguments is tagged with the {@link Element} annotation, then it will be + * passed the current element being processed; the argument must be of type {@code InputT}. + * Note that schema element parameters are currently unsupported. Review comment: I would also suggest updating the DoFn documentation related to schemas since the docs around what is FieldAccess and how @Element parameters interact seems to be incorrect since the documentation currently states that "the argument type must match the input type of this DoFn" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378929) Time Spent: 10.5h (was: 10h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 10.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378928 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 17:11 Start Date: 29/Jan/20 17:11 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372515532 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. * It may define a {@link NewTracker} method returning a subtype of {@code * RestrictionTracker} where {@code R} is the restriction type returned by {@link * GetInitialRestriction}. This method is optional in case the restriction type returned by * {@link GetInitialRestriction} implements {@link HasDefaultTracker}. * It may define a {@link GetRestrictionCoder} method. * The type of restrictions used by all of these methods must be the same. - * Its {@link ProcessElement} method may return a {@link ProcessContinuation} to - * indicate whether there is more work to be done for the current element. - * Its {@link ProcessElement} method must not use any extra context parameters, such - * as {@link BoundedWindow}. * The {@link DoFn} itself may be annotated with {@link BoundedPerElement} or {@link * UnboundedPerElement}, but not both at the same time. If it's not annotated with either of * these, it's assumed to be {@link BoundedPerElement} if its {@link ProcessElement} method * returns {@code void} and {@link UnboundedPerElement} if it returns a {@link * ProcessContinuation}. + * Timers and state must not be used. * * * A non-splittable {@link DoFn} must not define any of these methods. + * + * This method must satisfy the following constraints: Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378928) Time Spent: 10h 20m (was: 10h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 10h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378927=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378927 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 17:09 Start Date: 29/Jan/20 17:09 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372514429 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. * It may define a {@link NewTracker} method returning a subtype of {@code * RestrictionTracker} where {@code R} is the restriction type returned by {@link * GetInitialRestriction}. This method is optional in case the restriction type returned by * {@link GetInitialRestriction} implements {@link HasDefaultTracker}. * It may define a {@link GetRestrictionCoder} method. * The type of restrictions used by all of these methods must be the same. Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378927) Time Spent: 10h 10m (was: 10h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 10h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378925 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 17:09 Start Date: 29/Jan/20 17:09 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372514093 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. * It may define a {@link NewTracker} method returning a subtype of {@code * RestrictionTracker} where {@code R} is the restriction type returned by {@link * GetInitialRestriction}. This method is optional in case the restriction type returned by * {@link GetInitialRestriction} implements {@link HasDefaultTracker}. Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378925) Time Spent: 10h (was: 9h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 10h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378924=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378924 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 17:08 Start Date: 29/Jan/20 17:08 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372514047 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378924) Time Spent: 9h 50m (was: 9h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378612 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 01:17 Start Date: 29/Jan/20 01:17 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372141617 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. * It may define a {@link NewTracker} method returning a subtype of {@code * RestrictionTracker} where {@code R} is the restriction type returned by {@link * GetInitialRestriction}. This method is optional in case the restriction type returned by * {@link GetInitialRestriction} implements {@link HasDefaultTracker}. * It may define a {@link GetRestrictionCoder} method. * The type of restrictions used by all of these methods must be the same. - * Its {@link ProcessElement} method may return a {@link ProcessContinuation} to - * indicate whether there is more work to be done for the current element. - * Its {@link ProcessElement} method must not use any extra context parameters, such - * as {@link BoundedWindow}. * The {@link DoFn} itself may be annotated with {@link BoundedPerElement} or {@link * UnboundedPerElement}, but not both at the same time. If it's not annotated with either of * these, it's assumed to be {@link BoundedPerElement} if its {@link ProcessElement} method * returns {@code void} and {@link UnboundedPerElement} if it returns a {@link * ProcessContinuation}. + * Timers and state must not be used. * * * A non-splittable {@link DoFn} must not define any of these methods. + * + * This method must satisfy the following constraints: Review comment: Would be helpful to add an "If this DoFn is splittable," here just to be very clear. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378612) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378614=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378614 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 01:17 Start Date: 29/Jan/20 01:17 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372137192 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. * It may define a {@link NewTracker} method returning a subtype of {@code * RestrictionTracker} where {@code R} is the restriction type returned by {@link * GetInitialRestriction}. This method is optional in case the restriction type returned by * {@link GetInitialRestriction} implements {@link HasDefaultTracker}. * It may define a {@link GetRestrictionCoder} method. * The type of restrictions used by all of these methods must be the same. Review comment: nit: maybe this should be first and should read `The type of restrictions, {@code R}, used in all restriction methods must be the same.` Then you can reference R elsewhere without having to define it again. Up to you, just an idea I had that may increase clarity This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378614) Time Spent: 9h 40m (was: 9.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378611 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 01:17 Start Date: 29/Jan/20 01:17 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372140813 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -758,11 +804,19 @@ public Duration getAllowedTimestampSkew() { * Annotation for the method that maps an element to an initial restriction for a https://s.apache.org/splittable-do-fn;>splittable {@link DoFn}. * - * Signature: {@code RestrictionT getInitialRestriction(InputT element, );} + * Signature: {@code RestrictionT getInitialRestriction();} * - * The optional arguments are allowed to be: + * This method must satisfy the following constraints: * * + * The return type {@code RestrictionT} defines the restriction type used within this + * splittable DoFn. All other methods that use a {@link Restriction @Restriction} parameter + * must use the same type that is used here. It is suggested to use as narrow of a return + * type definition as possible (for example prefer to use a square type over a shape type as + * a square is a type of a shape). + * If one of its arguments is tagged with the {@link Element} annotation, then it will be + * passed the current element being processed; the argument must be of type {@code InputT}. + * Note that schema element parameters are currently unsupported. Review comment: What does it mean that schema element parameters aren't supported - that you can't use `FieldAccess`? Or that we won't automatically convert types that have equivalent schemas? both (seems likely)? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378611) Time Spent: 9.5h (was: 9h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378610 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 01:17 Start Date: 29/Jan/20 01:17 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372130170 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. Review comment: What happens if one chooses not to implement these? Might be useful to describe that here to encourage users to do it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378610) Time Spent: 9h 20m (was: 9h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378613=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378613 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 01:17 Start Date: 29/Jan/20 01:17 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#discussion_r372136072 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java ## @@ -614,45 +612,93 @@ public Duration getAllowedTimestampSkew() { * See https://s.apache.org/splittable-do-fn;>the proposal for an overview of the * involved concepts (splittable DoFn, restriction, restriction tracker). * - * If a {@link DoFn} is splittable, the following constraints must be respected: + * A splittable {@link DoFn} must obey the following constraints: * * * It must define a {@link GetInitialRestriction} method. - * It may define a {@link GetSize} method. - * It may define a {@link SplitRestriction} method. + * It should define a {@link GetSize} method or ensure that the {@link + * RestrictionTracker} implements {@link Sizes.HasSize}. + * It should define a {@link SplitRestriction} method. * It may define a {@link NewTracker} method returning a subtype of {@code * RestrictionTracker} where {@code R} is the restriction type returned by {@link * GetInitialRestriction}. This method is optional in case the restriction type returned by * {@link GetInitialRestriction} implements {@link HasDefaultTracker}. Review comment: Is this similar to the `GetSize` requirement in that one _should_ do one or the other? Or is it required that the user does exactly one of them? Maybe this is just because I'm unfamiliar with SDF, but I think these either/or requirements could be clearer. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378613) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378576=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378576 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 29/Jan/20 00:02 Start Date: 29/Jan/20 00:02 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-579531443 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378576) Time Spent: 9h 10m (was: 9h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378391=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378391 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 18:28 Start Date: 28/Jan/20 18:28 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-579389168 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378391) Time Spent: 9h (was: 8h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 9h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378380 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 17:57 Start Date: 28/Jan/20 17:57 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-579376564 I'm going to wait on #10702 and update this PR after that goes in since it is necessary to support the new DoFn style parameter passing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378380) Time Spent: 8h 50m (was: 8h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378379 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 17:56 Start Date: 28/Jan/20 17:56 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702#issuecomment-579376066 R: @TheNeuralBit @boyuanzz CC: @iemejia @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 378379) Time Spent: 8h 40m (was: 8.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378378=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378378 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 28/Jan/20 17:54 Start Date: 28/Jan/20 17:54 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10702: [BEAM-5605] Migrate splittable DoFn methods to use "new" DoFn style argument providing. URL: https://github.com/apache/beam/pull/10702 Defined a new @Restriction parameter type to pass in the restriction. Updated GetSize/GetInitialRestriction/SplitRestriction/NewTracker to take these new DoFn style parameters. Updated lots of documentation and existing implementations to use the new DoFn argument passing. Fixed ByteBuddyDoFnInvokerFactory to support return values that aren't references. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=375822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-375822 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 22/Jan/20 19:21 Start Date: 22/Jan/20 19:21 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r369754696 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource source) { public final PCollection expand(PBegin input) { source.validate(); + if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), "beam_fn_api")) { Review comment: Users/framework will need to ensure that the `beam_fn_api` experiment is always used. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 375822) Time Spent: 8h 20m (was: 8h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=373167=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373167 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Jan/20 17:41 Start Date: 16/Jan/20 17:41 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r367558277 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource source) { public final PCollection expand(PBegin input) { source.validate(); + if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), "beam_fn_api")) { Review comment: I thought the only application of this override is with the `beam_fn_api` flag enabled. The legacy translation does not support Impulse. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 373167) Time Spent: 8h 10m (was: 8h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=373143=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373143 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Jan/20 16:51 Start Date: 16/Jan/20 16:51 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r367532741 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource source) { public final PCollection expand(PBegin input) { source.validate(); + if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), "beam_fn_api")) { Review comment: Eventually yes but for now we could exclude the override if "beam_fn_api" was used as an experiment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 373143) Time Spent: 8h (was: 7h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 8h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=373141=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373141 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Jan/20 16:50 Start Date: 16/Jan/20 16:50 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r367532741 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource source) { public final PCollection expand(PBegin input) { source.validate(); + if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), "beam_fn_api")) { Review comment: We could exclude the override if "beam_fn_api" was used as an experiment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 373141) Time Spent: 7h 50m (was: 7h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 7h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=372902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-372902 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Jan/20 09:34 Start Date: 16/Jan/20 09:34 Worklog Time Spent: 10m Work Description: mxm commented on pull request #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#discussion_r367300103 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java ## @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource source) { public final PCollection expand(PBegin input) { source.validate(); + if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), "beam_fn_api")) { Review comment: Does that mean we can remove https://github.com/apache/beam/blob/e1852ca6af92d61467bfa7dac84e96a2924eefac/runners/portability/java/src/main/java/org/apache/beam/runners/portability/PortableRunner.java#L154 ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 372902) Time Spent: 7h 40m (was: 7.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 7h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=372690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-372690 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 16/Jan/20 01:29 Start Date: 16/Jan/20 01:29 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-574938033 please don't merge as I'm testing this within Google to see if this negatively impacts internal pipelines. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 372690) Time Spent: 7.5h (was: 7h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 7.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371928 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Jan/20 22:17 Start Date: 14/Jan/20 22:17 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment. URL: https://github.com/apache/beam/pull/10576#issuecomment-574400656 R: @youngoli @boyuanzz CC: @tweise @mxm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 371928) Time Spent: 7h 20m (was: 7h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 7h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371722 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Jan/20 17:26 Start Date: 14/Jan/20 17:26 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10576: [WIP, BEAM-5605] Convert all BoundedSources to SplittableDoFns when using portable pipelines. URL: https://github.com/apache/beam/pull/10576#issuecomment-574284697 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 371722) Time Spent: 7h 10m (was: 7h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 7h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371244=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371244 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 14/Jan/20 00:19 Start Date: 14/Jan/20 00:19 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10576: [WIP, BEAM-5605] Convert all BoundedSources to SplittableDoFns when using portable pipelines. URL: https://github.com/apache/beam/pull/10576 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371198=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371198 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 23:29 Start Date: 13/Jan/20 23:29 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10556: [BEAM-5605] Add support for additional parameters to SplittableDofn methods URL: https://github.com/apache/beam/pull/10556 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 371198) Time Spent: 6h 50m (was: 6h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 6h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371197=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371197 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 23:27 Start Date: 13/Jan/20 23:27 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 371197) Time Spent: 6h 40m (was: 6.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 6h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371153=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371153 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 22:18 Start Date: 13/Jan/20 22:18 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#issuecomment-573899466 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 371153) Time Spent: 6.5h (was: 6h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 6.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=371024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-371024 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 19:33 Start Date: 13/Jan/20 19:33 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#issuecomment-573832008 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 371024) Time Spent: 6h 20m (was: 6h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 6h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370961 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 17:55 Start Date: 13/Jan/20 17:55 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365943009 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -170,7 +181,109 @@ public void registerInputLocation() { apiServiceDescriptor, LogicalEndpoint.of(processBundleInstructionIdSupplier.get(), pTransformId), coder, -consumer); +this::forwardElementToConsumer); + } + + public void forwardElementToConsumer(WindowedValue element) throws Exception { +synchronized (splittingLock) { + if (index == stopIndex - 1) { +return; + } + index += 1; +} +consumer.accept(element); + } + + public void split( + ProcessBundleSplitRequest request, ProcessBundleSplitResponse.Builder response) { +DesiredSplit desiredSplit = request.getDesiredSplitsMap().get(pTransformId); +if (desiredSplit == null) { + return; +} + +long totalBufferSize = desiredSplit.getEstimatedInputElements(); + +HandlesSplits splittingConsumer = null; +if (consumer instanceof HandlesSplits) { + splittingConsumer = ((HandlesSplits) consumer); +} + +synchronized (splittingLock) { + // Since we hold the splittingLock, we guarantee that we will not pass the next element + // to the downstream consumer. We still have a race where the downstream consumer may + // have yet to see the element or has completed processing the element by the time + // we ask it to split (even after we have asked for its progress). + + // If the split request we received was delayed and is less then the known number of elements + // then use "index + 1" as the total size. Similarly, if we have already split and the + // split request is bounded incorrectly, use the stop index as the upper bound. + if (totalBufferSize < index + 1) { +totalBufferSize = index + 1; + } else if (totalBufferSize > stopIndex) { +totalBufferSize = stopIndex; + } + + // In the case where we have yet to process an element, set the current element progress to 1. + double currentElementProgress = 1; + + // If we have started processing at least one element, attempt to get the downstream + // progress defaulting to 0.5 if no progress was able to get fetched. + if (index >= 0) { +if (splittingConsumer != null) { + currentElementProgress = splittingConsumer.getProgress(); +} else { + currentElementProgress = 0.5; +} + } + + checkArgument( + desiredSplit.getAllowedSplitPointsList().isEmpty(), + "TODO: BEAM-3836, support split point restrictions."); + + // Now figure out where to split. + // + // The units here (except for keepOfElementRemainder) are all in terms of number or + // (possibly fractional) elements. + + // Compute the amount of "remaining" work that we know of. + double remainder = totalBufferSize - index - currentElementProgress; + // Compute the fraction of work that we should "keep". Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370961) Time Spent: 6h 10m (was: 6h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 6h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370959=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370959 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 17:54 Start Date: 13/Jan/20 17:54 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365942512 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -131,6 +137,11 @@ private final BeamFnDataClient beamFnDataClient; private final Coder> coder; + private final Object splittingLock = new Object(); + // 0-based count of the number of elements + private long index = -1; + // 0-based count of the number of elements Review comment: first element to not process. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370959) Time Spent: 6h (was: 5h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 6h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370936=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370936 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 13/Jan/20 17:32 Start Date: 13/Jan/20 17:32 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365932235 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -170,7 +181,109 @@ public void registerInputLocation() { apiServiceDescriptor, LogicalEndpoint.of(processBundleInstructionIdSupplier.get(), pTransformId), coder, -consumer); +this::forwardElementToConsumer); + } + + public void forwardElementToConsumer(WindowedValue element) throws Exception { +synchronized (splittingLock) { + if (index == stopIndex - 1) { +return; + } + index += 1; +} +consumer.accept(element); + } + + public void split( + ProcessBundleSplitRequest request, ProcessBundleSplitResponse.Builder response) { +DesiredSplit desiredSplit = request.getDesiredSplitsMap().get(pTransformId); +if (desiredSplit == null) { + return; +} + +long totalBufferSize = desiredSplit.getEstimatedInputElements(); + +HandlesSplits splittingConsumer = null; +if (consumer instanceof HandlesSplits) { + splittingConsumer = ((HandlesSplits) consumer); +} + +synchronized (splittingLock) { + // Since we hold the splittingLock, we guarantee that we will not pass the next element + // to the downstream consumer. We still have a race where the downstream consumer may + // have yet to see the element or has completed processing the element by the time + // we ask it to split (even after we have asked for its progress). + + // If the split request we received was delayed and is less then the known number of elements + // then use "index + 1" as the total size. Similarly, if we have already split and the + // split request is bounded incorrectly, use the stop index as the upper bound. + if (totalBufferSize < index + 1) { +totalBufferSize = index + 1; + } else if (totalBufferSize > stopIndex) { +totalBufferSize = stopIndex; + } + + // In the case where we have yet to process an element, set the current element progress to 1. Review comment: Its logically the same where the else clause is the default based upon what we initialize. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370936) Time Spent: 5h 50m (was: 5h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370222 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Jan/20 01:49 Start Date: 11/Jan/20 01:49 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365483403 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -170,7 +181,109 @@ public void registerInputLocation() { apiServiceDescriptor, LogicalEndpoint.of(processBundleInstructionIdSupplier.get(), pTransformId), coder, -consumer); +this::forwardElementToConsumer); + } + + public void forwardElementToConsumer(WindowedValue element) throws Exception { +synchronized (splittingLock) { + if (index == stopIndex - 1) { +return; + } + index += 1; +} +consumer.accept(element); + } + + public void split( + ProcessBundleSplitRequest request, ProcessBundleSplitResponse.Builder response) { +DesiredSplit desiredSplit = request.getDesiredSplitsMap().get(pTransformId); +if (desiredSplit == null) { + return; +} + +long totalBufferSize = desiredSplit.getEstimatedInputElements(); + +HandlesSplits splittingConsumer = null; +if (consumer instanceof HandlesSplits) { + splittingConsumer = ((HandlesSplits) consumer); +} + +synchronized (splittingLock) { + // Since we hold the splittingLock, we guarantee that we will not pass the next element + // to the downstream consumer. We still have a race where the downstream consumer may + // have yet to see the element or has completed processing the element by the time + // we ask it to split (even after we have asked for its progress). + + // If the split request we received was delayed and is less then the known number of elements + // then use "index + 1" as the total size. Similarly, if we have already split and the + // split request is bounded incorrectly, use the stop index as the upper bound. + if (totalBufferSize < index + 1) { +totalBufferSize = index + 1; + } else if (totalBufferSize > stopIndex) { +totalBufferSize = stopIndex; + } + + // In the case where we have yet to process an element, set the current element progress to 1. Review comment: Should this be an else clause below? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370222) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370221=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370221 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Jan/20 01:49 Start Date: 11/Jan/20 01:49 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365488343 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -170,7 +181,109 @@ public void registerInputLocation() { apiServiceDescriptor, LogicalEndpoint.of(processBundleInstructionIdSupplier.get(), pTransformId), coder, -consumer); +this::forwardElementToConsumer); + } + + public void forwardElementToConsumer(WindowedValue element) throws Exception { +synchronized (splittingLock) { + if (index == stopIndex - 1) { +return; + } + index += 1; +} +consumer.accept(element); + } + + public void split( + ProcessBundleSplitRequest request, ProcessBundleSplitResponse.Builder response) { +DesiredSplit desiredSplit = request.getDesiredSplitsMap().get(pTransformId); +if (desiredSplit == null) { + return; +} + +long totalBufferSize = desiredSplit.getEstimatedInputElements(); + +HandlesSplits splittingConsumer = null; +if (consumer instanceof HandlesSplits) { + splittingConsumer = ((HandlesSplits) consumer); +} + +synchronized (splittingLock) { + // Since we hold the splittingLock, we guarantee that we will not pass the next element + // to the downstream consumer. We still have a race where the downstream consumer may + // have yet to see the element or has completed processing the element by the time + // we ask it to split (even after we have asked for its progress). + + // If the split request we received was delayed and is less then the known number of elements + // then use "index + 1" as the total size. Similarly, if we have already split and the + // split request is bounded incorrectly, use the stop index as the upper bound. + if (totalBufferSize < index + 1) { +totalBufferSize = index + 1; + } else if (totalBufferSize > stopIndex) { +totalBufferSize = stopIndex; + } + + // In the case where we have yet to process an element, set the current element progress to 1. + double currentElementProgress = 1; + + // If we have started processing at least one element, attempt to get the downstream + // progress defaulting to 0.5 if no progress was able to get fetched. + if (index >= 0) { +if (splittingConsumer != null) { + currentElementProgress = splittingConsumer.getProgress(); +} else { + currentElementProgress = 0.5; +} + } + + checkArgument( + desiredSplit.getAllowedSplitPointsList().isEmpty(), + "TODO: BEAM-3836, support split point restrictions."); + + // Now figure out where to split. + // + // The units here (except for keepOfElementRemainder) are all in terms of number or + // (possibly fractional) elements. + + // Compute the amount of "remaining" work that we know of. + double remainder = totalBufferSize - index - currentElementProgress; + // Compute the fraction of work that we should "keep". Review comment: Compute the number of elements that we should "keep." This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370221) Time Spent: 5h 40m (was: 5.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370220 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Jan/20 01:49 Start Date: 11/Jan/20 01:49 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365479961 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -131,6 +137,11 @@ private final BeamFnDataClient beamFnDataClient; private final Coder> coder; + private final Object splittingLock = new Object(); + // 0-based count of the number of elements Review comment: 0 based index of the current element being processed(?). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370220) Time Spent: 5.5h (was: 5h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370219 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Jan/20 01:49 Start Date: 11/Jan/20 01:49 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #10501: [BEAM-5605] Add support for channel splitting to the gRPC read "source" and propagate "split" calls to the downstream receiver URL: https://github.com/apache/beam/pull/10501#discussion_r365480062 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java ## @@ -131,6 +137,11 @@ private final BeamFnDataClient beamFnDataClient; private final Coder> coder; + private final Object splittingLock = new Object(); + // 0-based count of the number of elements + private long index = -1; + // 0-based count of the number of elements Review comment: 0-based index of the first element to not process. (Or is this the last element to process?) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370219) Time Spent: 5.5h (was: 5h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370182 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Jan/20 00:30 Start Date: 11/Jan/20 00:30 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10556: [BEAM-5605] Add support for additional parameters to SplittableDofn methods URL: https://github.com/apache/beam/pull/10556 This adds PipelineOptions, BoundedWindow, PaneInfo and the elements timestamp as additional valid parameters to SplittableDoFns getInitialRestriction/splitRestriction/newTracker methods. Note that this is not intended to be the full set of allowed optional parameters. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370183=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370183 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 11/Jan/20 00:30 Start Date: 11/Jan/20 00:30 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10556: [BEAM-5605] Add support for additional parameters to SplittableDofn methods URL: https://github.com/apache/beam/pull/10556#issuecomment-573257356 R: @youngoli @boyuanzz This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370183) Time Spent: 5h 20m (was: 5h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=370004=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-370004 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 10/Jan/20 19:07 Start Date: 10/Jan/20 19:07 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10535: [BEAM-5605] Add support for executing pair with restriction, split restriction, split and size restriction, process element and restriction and process sized element and restriction within the Java SDK harness. URL: https://github.com/apache/beam/pull/10535 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 370004) Time Spent: 5h (was: 4h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=369941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369941 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 10/Jan/20 17:38 Start Date: 10/Jan/20 17:38 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10535: [BEAM-5605] Add support for executing pair with restriction, split restriction, split and size restriction, process element and restriction and process sized element and restriction within the Java SDK harness. URL: https://github.com/apache/beam/pull/10535#discussion_r365350253 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java ## @@ -162,9 +475,127 @@ public void output(OutputT output, Instant timestamp, BoundedWindow window) { outputTo(consumers, WindowedValue.of(output, timestamp, window, PaneInfo.NO_FIRING)); } }; +switch (context.pTransform.getSpec().getUrn()) { + case PTransformTranslation.SPLITTABLE_SPLIT_RESTRICTION_URN: +this.outputSplitRestrictionReceiver = +new OutputReceiver() { + + @Override + public void output(RestrictionT output) { +outputTo( +mainOutputConsumers, +(WindowedValue) + currentElement.withValue(KV.of(currentElement.getValue(), output))); + } + + @Override + public void outputWithTimestamp(RestrictionT output, Instant timestamp) { +outputTo( +mainOutputConsumers, +(WindowedValue) +WindowedValue.of( +KV.of(currentElement.getValue(), output), +timestamp, +currentWindow, +currentElement.getPane())); + } +}; +break; + case PTransformTranslation.SPLITTABLE_SPLIT_AND_SIZE_RESTRICTIONS_URN: +this.outputSplitRestrictionReceiver = +new OutputReceiver() { + + @Override + public void output(RestrictionT output) { +RestrictionTracker outputTracker = +doFnInvoker.invokeNewTracker(output); +outputTo( +mainOutputConsumers, +(WindowedValue) +currentElement.withValue( +KV.of( +KV.of(currentElement.getValue(), output), +outputTracker instanceof HasSize +? ((HasSize) outputTracker).getSize() +: 1.0))); + } + + @Override + public void outputWithTimestamp(RestrictionT output, Instant timestamp) { +outputTo( +mainOutputConsumers, +(WindowedValue) +WindowedValue.of( +KV.of(currentElement.getValue(), output), Review comment: Yes, fixed in the version below. I'm relying on the migration to using SDF everywhere will catch the edge cases via the validates runner tests since unit testing the execution of a single instance is quite verbose. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 369941) Time Spent: 4h 50m (was: 4h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 4h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=369515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369515 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 10/Jan/20 01:23 Start Date: 10/Jan/20 01:23 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10535: [BEAM-5605] Add support for executing pair with restriction, split restriction, split and size restriction, process element and restriction and process sized element and restriction within the Java SDK harness. URL: https://github.com/apache/beam/pull/10535#discussion_r365025326 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java ## @@ -162,9 +475,127 @@ public void output(OutputT output, Instant timestamp, BoundedWindow window) { outputTo(consumers, WindowedValue.of(output, timestamp, window, PaneInfo.NO_FIRING)); } }; +switch (context.pTransform.getSpec().getUrn()) { + case PTransformTranslation.SPLITTABLE_SPLIT_RESTRICTION_URN: +this.outputSplitRestrictionReceiver = +new OutputReceiver() { + + @Override + public void output(RestrictionT output) { +outputTo( +mainOutputConsumers, +(WindowedValue) + currentElement.withValue(KV.of(currentElement.getValue(), output))); + } + + @Override + public void outputWithTimestamp(RestrictionT output, Instant timestamp) { +outputTo( +mainOutputConsumers, +(WindowedValue) +WindowedValue.of( +KV.of(currentElement.getValue(), output), +timestamp, +currentWindow, +currentElement.getPane())); + } +}; +break; + case PTransformTranslation.SPLITTABLE_SPLIT_AND_SIZE_RESTRICTIONS_URN: +this.outputSplitRestrictionReceiver = +new OutputReceiver() { + + @Override + public void output(RestrictionT output) { +RestrictionTracker outputTracker = +doFnInvoker.invokeNewTracker(output); +outputTo( +mainOutputConsumers, +(WindowedValue) +currentElement.withValue( +KV.of( +KV.of(currentElement.getValue(), output), +outputTracker instanceof HasSize +? ((HasSize) outputTracker).getSize() +: 1.0))); + } + + @Override + public void outputWithTimestamp(RestrictionT output, Instant timestamp) { +outputTo( +mainOutputConsumers, +(WindowedValue) +WindowedValue.of( +KV.of(currentElement.getValue(), output), Review comment: Shouldn't this method also produce sizes, like the `output` method above? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 369515) Time Spent: 4h 40m (was: 4.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 4h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=368467=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-368467 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 08/Jan/20 22:58 Start Date: 08/Jan/20 22:58 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10535: [BEAM-5605] Add support for executing pair with restriction, split restriction, split and size restriction, process element and restriction and process sized element and restriction within the Java SDK harness. URL: https://github.com/apache/beam/pull/10535#issuecomment-572298643 R: @youngoli This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 368467) Time Spent: 4.5h (was: 4h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 4.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=368431=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-368431 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 08/Jan/20 21:59 Start Date: 08/Jan/20 21:59 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10535: [BEAM-5605] Add support for executing pair with restriction, split restriction, split and size restriction, process element and restriction and process sized element and restriction within the Java SDK harness. URL: https://github.com/apache/beam/pull/10535 I removed the SplittableProcessElementsRunner and call the DoFnInvoker directly. I left the SplittableElementsInvoker as is since it is used by non portable SplittableDoFns. I plan to have a follow-up change that removes the "context" object as it is no longer necessary. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365597=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365597 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 03/Jan/20 01:48 Start Date: 03/Jan/20 01:48 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10490: [BEAM-5605] Fix type used to describe channel splits to match type used on estimated_input_elements. URL: https://github.com/apache/beam/pull/10490 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365597) Time Spent: 4h (was: 3h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 4h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365599=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365599 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 03/Jan/20 01:48 Start Date: 03/Jan/20 01:48 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10489: [BEAM-5605] Ensure that split calls are routed to the active bundle processor for the bundle id. URL: https://github.com/apache/beam/pull/10489 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365599) Time Spent: 4h 10m (was: 4h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 4h 10m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365561=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365561 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 03/Jan/20 00:31 Start Date: 03/Jan/20 00:31 Worklog Time Spent: 10m Work Description: lostluck commented on issue #10490: [BEAM-5605] Fix type used to describe channel splits to match type used on estimated_input_elements. URL: https://github.com/apache/beam/pull/10490#issuecomment-570414665 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365561) Time Spent: 3h 50m (was: 3h 40m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 3h 50m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365553=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365553 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 03/Jan/20 00:12 Start Date: 03/Jan/20 00:12 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10489: [BEAM-5605] Ensure that split calls are routed to the active bundle processor for the bundle id. URL: https://github.com/apache/beam/pull/10489#issuecomment-570409980 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365553) Time Spent: 3h 40m (was: 3.5h) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 3h 40m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365522=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365522 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 02/Jan/20 22:39 Start Date: 02/Jan/20 22:39 Worklog Time Spent: 10m Work Description: lostluck commented on issue #10490: [BEAM-5605] Fix type used to describe channel splits to match type used on estimated_input_elements. URL: https://github.com/apache/beam/pull/10490#issuecomment-570381879 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365522) Time Spent: 3.5h (was: 3h 20m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 3.5h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365513=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365513 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 02/Jan/20 22:14 Start Date: 02/Jan/20 22:14 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10490: [BEAM-5605] Fix type used to describe channel splits to match type used on estimated_input_elements. URL: https://github.com/apache/beam/pull/10490#issuecomment-570373620 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365513) Time Spent: 3h 20m (was: 3h 10m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 3h 20m > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365512=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365512 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 02/Jan/20 22:14 Start Date: 02/Jan/20 22:14 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10490: [BEAM-5605] Fix type used to describe channel splits to match type used on estimated_input_elements. URL: https://github.com/apache/beam/pull/10490 All these values need to support the same "range" of values. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch
[ https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=365508=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365508 ] ASF GitHub Bot logged work on BEAM-5605: Author: ASF GitHub Bot Created on: 02/Jan/20 21:59 Start Date: 02/Jan/20 21:59 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10489: [BEAM-5605] Ensure that split calls are routed to the active bundle processor for the bundle id. URL: https://github.com/apache/beam/pull/10489#discussion_r362646025 ## File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java ## @@ -294,6 +296,19 @@ private void createRunnerAndConsumersForPTransformRecursively( return BeamFnApi.InstructionResponse.newBuilder().setProcessBundle(response); } + /** Splits an active bundle. */ Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365508) Time Spent: 3h (was: 2h 50m) > Support Portable SplittableDoFn for batch > - > > Key: BEAM-5605 > URL: https://issues.apache.org/jira/browse/BEAM-5605 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 3h > Remaining Estimate: 0h > > Roll-up item tracking work towards supporting portable SplittableDoFn for > batch -- This message was sent by Atlassian Jira (v8.3.4#803005)