[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=473096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473096 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Aug/20 22:04 Start Date: 20/Aug/20 22:04 Worklog Time Spent: 10m Work Description: youngoli merged pull request #12628: URL: https://github.com/apache/beam/pull/12628 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 473096) Time Spent: 13h 40m (was: 13.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 13h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472700&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472700 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Aug/20 00:20 Start Date: 20/Aug/20 00:20 Worklog Time Spent: 10m Work Description: youngoli commented on a change in pull request #12628: URL: https://github.com/apache/beam/pull/12628#discussion_r473466587 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -167,6 +167,23 @@ const ( // TODO: ViewFn, etc. ) +var methodNames = []string{ Review comment: Yeah I think I'll go with the set approach instead. Give it another look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472700) Time Spent: 13.5h (was: 13h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 13.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472699&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472699 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Aug/20 00:19 Start Date: 20/Aug/20 00:19 Worklog Time Spent: 10m Work Description: youngoli commented on a change in pull request #12628: URL: https://github.com/apache/beam/pull/12628#discussion_r473466587 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -167,6 +167,23 @@ const ( // TODO: ViewFn, etc. ) +var methodNames = []string{ Review comment: Yeah I think I'll go with the set approach instead. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472699) Time Spent: 13h 20m (was: 13h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 13h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472519 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 16:37 Start Date: 19/Aug/20 16:37 Worklog Time Spent: 10m Work Description: lostluck merged pull request #12629: URL: https://github.com/apache/beam/pull/12629 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472519) Time Spent: 13h 10m (was: 13h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 13h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472501 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 16:16 Start Date: 19/Aug/20 16:16 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #12629: URL: https://github.com/apache/beam/pull/12629#issuecomment-676523334 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472501) Time Spent: 13h (was: 12h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 13h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472500&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472500 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 16:15 Start Date: 19/Aug/20 16:15 Worklog Time Spent: 10m Work Description: lostluck commented on a change in pull request #12628: URL: https://github.com/apache/beam/pull/12628#discussion_r473145499 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -167,6 +167,23 @@ const ( // TODO: ViewFn, etc. ) +var methodNames = []string{ Review comment: Consider using a map[string]struct{} instead instead of looping over a slice. Alternatively, given that this is a combination of both doFnNames and sdfNames, consider var methodNames = append(append(nil, doFnNames...), sdfNames...) which avoids duplicating the constant lists. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472500) Time Spent: 12h 50m (was: 12h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 12h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472289 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 04:11 Start Date: 19/Aug/20 04:11 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #12629: URL: https://github.com/apache/beam/pull/12629#issuecomment-675840689 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472289) Time Spent: 12h 40m (was: 12.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 12h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472288 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 04:11 Start Date: 19/Aug/20 04:11 Worklog Time Spent: 10m Work Description: youngoli opened a new pull request #12629: URL: https://github.com/apache/beam/pull/12629 This just means including the RestrictionEncoder. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBui
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472279&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472279 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 04:01 Start Date: 19/Aug/20 04:01 Worklog Time Spent: 10m Work Description: youngoli opened a new pull request #12628: URL: https://github.com/apache/beam/pull/12628 Wasn't sure how to test this before creating the PR. Will the precommit test starcgen changes to make sure it doesn't break anything? Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2 --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | --- Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https:/
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=472280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-472280 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Aug/20 04:01 Start Date: 19/Aug/20 04:01 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #12628: URL: https://github.com/apache/beam/pull/12628#issuecomment-675838164 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 472280) Time Spent: 12h 20m (was: 12h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: P2 > Time Spent: 12h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=412609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412609 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 30/Mar/20 21:40 Start Date: 30/Mar/20 21:40 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#discussion_r400511146 ## File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers.go ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package exec + +import ( + "github.com/apache/beam/sdks/go/pkg/beam/core/funcx" + "github.com/apache/beam/sdks/go/pkg/beam/core/sdf" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx" + "github.com/apache/beam/sdks/go/pkg/beam/internal/errors" + "reflect" +) + +func invokeCreateInitialRestriction(fn *funcx.Fn, elms *FullValue) (rest interface{}, err error) { + if fn == nil { + return nil, nil + } + + switch fnT := fn.Fn.(type) { + case reflectx.Func1x1: + return fnT.Call1x1(elms.Elm), nil + case reflectx.Func2x1: + return fnT.Call2x1(elms.Elm, elms.Elm2), nil + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 1: + inputs = []interface{}{elms.Elm} + case 2: + inputs = []interface{}{elms.Elm, elms.Elm2} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf CreateInitialRestriction invoker") + } + rets := fn.Fn.Call(inputs) + return rets[0], nil + } +} + +func invokeSplitRestriction(fn *funcx.Fn, elms *FullValue, rest interface{}) ([]interface{}, error) { Review comment: Yeah, I was going to document the abstract signatures in the frontend comment for SDF, which will most likely be in the same comment that currently documents DoFns (I think it's in beam/pardo.go). That said, I can definitely leave a comment here, with a TODO to replace it when the frontend documentation is in. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 412609) Time Spent: 12h (was: 11h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 12h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=412496&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412496 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 30/Mar/20 18:17 Start Date: 30/Mar/20 18:17 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#discussion_r400397429 ## File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers_test.go ## @@ -0,0 +1,262 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package exec + +import ( + "github.com/apache/beam/sdks/go/pkg/beam/core/graph" + "github.com/google/go-cmp/cmp" + "testing" +) + +func init() { +} + +// TestInvokes runs tests on each SDF method invoker, using the SDFs defined +// in this file. Tests both single-element and KV element cases. +func TestInvokes(t *testing.T) { + // Setup. + dfn, err := graph.NewDoFn(&Sdf{}, graph.NumMainInputs(graph.MainSingle)) + if err != nil { + t.Fatalf("invalid function: %v", err) + } + sdf := (*graph.SplittableDoFn)(dfn) + + dfn, err = graph.NewDoFn(&KvSdf{}, graph.NumMainInputs(graph.MainKv)) + if err != nil { + t.Fatalf("invalid function: %v", err) + } + kvsdf := (*graph.SplittableDoFn)(dfn) + + // Tests. + t.Run("invokeCreateInitialRestriction", func(t *testing.T) { + tests := []struct { + name string + sdf *graph.SplittableDoFn + elms *FullValue + want Restriction + }{ + {"SingleElem", sdf, &FullValue{Elm: 5}, Restriction{5}}, + {"KvElem", kvsdf, &FullValue{Elm: 5, Elm2: 2}, Restriction{7}}, + } + for _, test := range tests { + test := test + fn := test.sdf.CreateInitialRestrictionFn() + t.Run(test.name, func(t *testing.T) { + got, err := invokeCreateInitialRestriction(fn, test.elms) + if err != nil { + t.Fatalf("invokeCreateInitialRestriction failed: %v", err) + } + if !cmp.Equal(got, test.want) { + t.Errorf("invokeCreateInitialRestriction(%v) has incorrect output: got: %v, want: %v", test.elms, got, test.want) + } + }) + } + }) + + t.Run("invokeSplitRestriction", func(t *testing.T) { + tests := []struct { + name string + sdf *graph.SplittableDoFn + elms *FullValue + rest Restriction + want []interface{} + }{ + { + "SingleElem", + sdf, + &FullValue{Elm: 5}, + Restriction{3}, + []interface{}{Restriction{8}, Restriction{9}}, + }, { + "KvElem", + kvsdf, + &FullValue{Elm: 5, Elm2: 2}, + Restriction{3}, + []interface{}{Restriction{8}, Restriction{5}}, + }, + } + for _, test := range tests { + test := test + fn := test.sdf.SplitRestrictionFn() + t.Run(test.name, func(t *testing.T) { + got, err := invokeSplitRestriction(fn, test.elms, test.rest) + if err != nil { + t.Fatalf("invokeSplitRestriction failed: %v", e
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=412495&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412495 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 30/Mar/20 18:17 Start Date: 30/Mar/20 18:17 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#discussion_r400396738 ## File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers.go ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package exec + +import ( + "github.com/apache/beam/sdks/go/pkg/beam/core/funcx" + "github.com/apache/beam/sdks/go/pkg/beam/core/sdf" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx" + "github.com/apache/beam/sdks/go/pkg/beam/internal/errors" + "reflect" +) + +func invokeCreateInitialRestriction(fn *funcx.Fn, elms *FullValue) (rest interface{}, err error) { + if fn == nil { + return nil, nil + } + + switch fnT := fn.Fn.(type) { + case reflectx.Func1x1: + return fnT.Call1x1(elms.Elm), nil + case reflectx.Func2x1: + return fnT.Call2x1(elms.Elm, elms.Elm2), nil + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 1: + inputs = []interface{}{elms.Elm} + case 2: + inputs = []interface{}{elms.Elm, elms.Elm2} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf CreateInitialRestriction invoker") + } + rets := fn.Fn.Call(inputs) + return rets[0], nil + } +} + +func invokeSplitRestriction(fn *funcx.Fn, elms *FullValue, rest interface{}) ([]interface{}, error) { + if fn == nil { + return nil, nil + } + + var ret interface{} + switch fnT := fn.Fn.(type) { + case reflectx.Func2x1: + ret = fnT.Call2x1(elms.Elm, rest) + case reflectx.Func3x1: + ret = fnT.Call3x1(elms.Elm, elms.Elm2, rest) + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 2: + inputs = []interface{}{elms.Elm, rest} + case 3: + inputs = []interface{}{elms.Elm, elms.Elm2, rest} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf SplitRestriction invoker") + } + ret = fn.Fn.Call(inputs)[0] + } + + // Return value is an interface{}, but we need to convert it to a []interface{}. + val := reflect.ValueOf(ret) + s := make([]interface{}, 0, val.Len()) + for i := 0; i < val.Len(); i++ { + s = append(s, val.Index(i).Interface()) + } Review comment: I won't worry about it until after optimization, where it might be clearer what we can re-use. This is presently nicely factored, so unless we can make it shorter/clearer with a reuse of DoFn invoker (possibly by adding other helpers/generated methods, etc or otherwise), this is fine. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 412495) Time Spent: 11h 40m (was: 11.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: http
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=412494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412494 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 30/Mar/20 18:17 Start Date: 30/Mar/20 18:17 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#discussion_r400394073 ## File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers.go ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package exec + +import ( + "github.com/apache/beam/sdks/go/pkg/beam/core/funcx" + "github.com/apache/beam/sdks/go/pkg/beam/core/sdf" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx" + "github.com/apache/beam/sdks/go/pkg/beam/internal/errors" + "reflect" +) + +func invokeCreateInitialRestriction(fn *funcx.Fn, elms *FullValue) (rest interface{}, err error) { + if fn == nil { + return nil, nil + } + + switch fnT := fn.Fn.(type) { + case reflectx.Func1x1: + return fnT.Call1x1(elms.Elm), nil + case reflectx.Func2x1: + return fnT.Call2x1(elms.Elm, elms.Elm2), nil + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 1: + inputs = []interface{}{elms.Elm} + case 2: + inputs = []interface{}{elms.Elm, elms.Elm2} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf CreateInitialRestriction invoker") + } + rets := fn.Fn.Call(inputs) + return rets[0], nil + } +} + +func invokeSplitRestriction(fn *funcx.Fn, elms *FullValue, rest interface{}) ([]interface{}, error) { Review comment: It occurs to me that we don't seem to have the abstract signatures documented for the "Provider" methods like SplitRestriction. IIRC we avoided doing that for now to avoid accidental usage? It might be good to document the expectation here at least, for this and the other SDF methods. eg. SplitRestriction(K?, V, RT) []RT This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 412494) Time Spent: 11.5h (was: 11h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 11.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=412467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412467 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 30/Mar/20 17:45 Start Date: 30/Mar/20 17:45 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#discussion_r400377621 ## File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers.go ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package exec + +import ( + "github.com/apache/beam/sdks/go/pkg/beam/core/funcx" + "github.com/apache/beam/sdks/go/pkg/beam/core/sdf" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx" + "github.com/apache/beam/sdks/go/pkg/beam/internal/errors" + "reflect" +) + +func invokeCreateInitialRestriction(fn *funcx.Fn, elms *FullValue) (rest interface{}, err error) { + if fn == nil { + return nil, nil + } + + switch fnT := fn.Fn.(type) { + case reflectx.Func1x1: + return fnT.Call1x1(elms.Elm), nil + case reflectx.Func2x1: + return fnT.Call2x1(elms.Elm, elms.Elm2), nil + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 1: + inputs = []interface{}{elms.Elm} + case 2: + inputs = []interface{}{elms.Elm, elms.Elm2} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf CreateInitialRestriction invoker") + } + rets := fn.Fn.Call(inputs) + return rets[0], nil + } +} + +func invokeSplitRestriction(fn *funcx.Fn, elms *FullValue, rest interface{}) ([]interface{}, error) { + if fn == nil { + return nil, nil + } + + var ret interface{} + switch fnT := fn.Fn.(type) { Review comment: Just the concrete note that the reason the existing invokers are written the way they are, is so every time they're called, they don't need to make these choices over again. This is a WIP PR, so no worries. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 412467) Time Spent: 11h 20m (was: 11h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 11h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=412436&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412436 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 30/Mar/20 17:01 Start Date: 30/Mar/20 17:01 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#discussion_r400350089 ## File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers.go ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package exec + +import ( + "github.com/apache/beam/sdks/go/pkg/beam/core/funcx" + "github.com/apache/beam/sdks/go/pkg/beam/core/sdf" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx" + "github.com/apache/beam/sdks/go/pkg/beam/internal/errors" + "reflect" +) + +func invokeCreateInitialRestriction(fn *funcx.Fn, elms *FullValue) (rest interface{}, err error) { + if fn == nil { + return nil, nil + } + + switch fnT := fn.Fn.(type) { + case reflectx.Func1x1: + return fnT.Call1x1(elms.Elm), nil + case reflectx.Func2x1: + return fnT.Call2x1(elms.Elm, elms.Elm2), nil + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 1: + inputs = []interface{}{elms.Elm} + case 2: + inputs = []interface{}{elms.Elm, elms.Elm2} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf CreateInitialRestriction invoker") + } + rets := fn.Fn.Call(inputs) + return rets[0], nil + } +} + +func invokeSplitRestriction(fn *funcx.Fn, elms *FullValue, rest interface{}) ([]interface{}, error) { + if fn == nil { + return nil, nil + } + + var ret interface{} + switch fnT := fn.Fn.(type) { + case reflectx.Func2x1: + ret = fnT.Call2x1(elms.Elm, rest) + case reflectx.Func3x1: + ret = fnT.Call3x1(elms.Elm, elms.Elm2, rest) + default: + var inputs []interface{} + switch fn.Fn.Type().NumIn() { + case 2: + inputs = []interface{}{elms.Elm, rest} + case 3: + inputs = []interface{}{elms.Elm, elms.Elm2, rest} + default: + err := errors.Errorf("fn %v has unexpected number of inputs: %v", fn.Fn.Name(), fn.Fn.Type().NumIn()) + return nil, errors.WithContext(err, "sdf SplitRestriction invoker") + } + ret = fn.Fn.Call(inputs)[0] + } + + // Return value is an interface{}, but we need to convert it to a []interface{}. + val := reflect.ValueOf(ret) + s := make([]interface{}, 0, val.Len()) + for i := 0; i < val.Len(); i++ { + s = append(s, val.Index(i).Interface()) + } Review comment: This feels like the only bit that doesn't fit into the existing invokers. It would just need an additional wrapper to handle this part (since it's equivalent to the single return value kind), though handling the *FullValue we get back might be less clean. Hmmm. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 412436) Time Spent: 11h 10m (was: 11h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=411651&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411651 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 28/Mar/20 07:40 Start Date: 28/Mar/20 07:40 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257#issuecomment-605409185 CC: @lostluck Like I mentioned in the commit message above, this is lacking runtime optimizations. I got the testing working and wanted to get this in a PR so you can give some general feedback so I can know if I'm on the right track. But as-is, the change isn't done yet. My current plan for optimization is to follow the regular invoker's example and make this a method we can call in Setup, have it output a CallFn, and then we can use that in ProcessElement to actually invoke the method, without the extra boilerplate that doesn't need to be done per-element. I avoided all that for the first go-round because I didn't understand why it was being done that way, and I just wanted the simplest implementation possible to get it working. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 411651) Time Spent: 11h (was: 10h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 11h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=411649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411649 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 28/Mar/20 07:35 Start Date: 28/Mar/20 07:35 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11257: [BEAM-3301] Create runtime invokers for SDF methods. URL: https://github.com/apache/beam/pull/11257 This is just a first pass, and I now realize that I can do a bunch more to optimize this for execution time, but that will come in a subsequent commit probably. For now, this commit adds SDF method invokers that have functioning unit tests. Next stage will be trying to get it to return a function, so we can avoid a lot of the unnecessary code that's currently getting executed once per element, like the switch statements. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://bui
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409970&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409970 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 25/Mar/20 23:39 Start Date: 25/Mar/20 23:39 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11225: [BEAM-3301] Fix another bug in DoFn validation, in exec. URL: https://github.com/apache/beam/pull/11225#issuecomment-604145810 Btw, made a Jira for adding a test: BEAM-9611 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409970) Time Spent: 10h 40m (was: 10.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 10h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409961 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 25/Mar/20 23:22 Start Date: 25/Mar/20 23:22 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11225: [BEAM-3301] Fix another bug in DoFn validation, in exec. URL: https://github.com/apache/beam/pull/11225 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409961) Time Spent: 10.5h (was: 10h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 10.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409842 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 25/Mar/20 21:53 Start Date: 25/Mar/20 21:53 Worklog Time Spent: 10m Work Description: lostluck commented on issue #11225: [BEAM-3301] Fix another bug in DoFn validation, in exec. URL: https://github.com/apache/beam/pull/11225#issuecomment-604107438 Run Go Postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409842) Time Spent: 10h 20m (was: 10h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 10h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409743 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 25/Mar/20 19:15 Start Date: 25/Mar/20 19:15 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11225: [BEAM-3301] Fix another bug in DoFn validation, in exec. URL: https://github.com/apache/beam/pull/11225#issuecomment-604033853 R: @lostluck As a note for the future, does it seem valuable to add an integration test that catches this case? It just needs to have a GBK followed by a DoFn that reads the values as an Iter or ReIter. I can make a Jira for that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409743) Time Spent: 10h 10m (was: 10h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 10h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409742 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 25/Mar/20 19:13 Start Date: 25/Mar/20 19:13 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11225: [BEAM-3301] Fix another bug in DoFn validation, in exec. URL: https://github.com/apache/beam/pull/11225 This was the same error from the previous bugfix, but I missed it in the exec package since I only checked locally. Instead of fixing it the same way as before, I'm loosening the strictness of the validation since we probably don't need fully strict validation in the exec package. At this point, every DoFn should already have been validated by the graph package. Previous PR is #11179 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostC
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409180&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409180 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 24/Mar/20 23:42 Start Date: 24/Mar/20 23:42 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#issuecomment-603561036 Squashed fixup commits This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409180) Time Spent: 9h 40m (was: 9.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409181&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409181 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 24/Mar/20 23:42 Start Date: 24/Mar/20 23:42 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409181) Time Spent: 9h 50m (was: 9h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409129 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 24/Mar/20 22:17 Start Date: 24/Mar/20 22:17 Worklog Time Spent: 10m Work Description: lostluck commented on issue #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#issuecomment-603533918 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409129) Time Spent: 9.5h (was: 9h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409005 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 24/Mar/20 18:30 Start Date: 24/Mar/20 18:30 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#discussion_r396917290 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -562,7 +595,13 @@ func (fn *GoodSdf) RestrictionSize(int, RestT) float64 { return 0 } -// TODO(BEAM-3301): Add ProcessElement impl. when restriction trackers are in. +func (fn *GoodSdf) CreateTracker(RestT) *RTrackerT { + return &RTrackerT{} +} + +func (fn *GoodSdf) ProcessElement(*RTrackerT, int) int { Review comment: That was one of the approaches I considered. I got some feedback on it both ways, but ultimately I didn't really like that approach because it's a bit unintuitive for users to get a different RTracker type than what they created. Documentation would have to do some extra legwork. Plus, it goes against the trend in Go to have users understand what's happening with concurrency. But anyway, I'm open to that approach and may pivot to it if it makes sense, but for now the plan is, when we add dynamic splitting, to provide a concurrency wrapper and have users wrap their RTrackers themselves, or just write their own concurrency. (This is apparently also the way python does it, so it's not completely unprecedented.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409005) Time Spent: 9h 20m (was: 9h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409003 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 24/Mar/20 18:30 Start Date: 24/Mar/20 18:30 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#discussion_r396912202 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -676,39 +737,77 @@ func (fn *BadSdfElementTRestSize) RestrictionSize(float32, RestT) float64 { type BadRestT struct{} type BadSdfRestTSplitRestParam struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestTSplitRestParam) SplitRestriction(int, BadRestT) []RestT { return []RestT{} } type BadSdfRestTSplitRestReturn struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestTSplitRestReturn) SplitRestriction(int, RestT) []BadRestT { return []BadRestT{} } type BadSdfRestTRestSize struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestTRestSize) RestrictionSize(int, BadRestT) float64 { return 0 } +type BadSdfRestTCreateTracker struct { + *GoodSdf +} + +func (fn *BadSdfRestTCreateTracker) CreateTracker(BadRestT) *RTrackerT { + return &RTrackerT{} +} + // Examples of other type validation that needs to be done. type BadSdfRestSizeReturn struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestSizeReturn) BadSdfRestSizeReturn(int, RestT) int { return 0 } +type BadRTrackerT struct{} Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409003) Time Spent: 9h 10m (was: 9h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=409004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409004 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 24/Mar/20 18:30 Start Date: 24/Mar/20 18:30 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#discussion_r396912979 ## File path: sdks/go/pkg/beam/core/sdf/sdf.go ## @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package sdf is experimental, incomplete, and not yet meant for general usage. +package sdf + +// RTracker is an interface used to interact with restrictions while processing elements in +// SplittableDoFns. Each implementation of RTracker is expected to be used for tracking a single +// restriction type, which is the type that should be used to create the RTracker, and output by +// TrySplit. +type RTracker interface { + // TryClaim attempts to claim the block of work in the current restriction located at a given + // position. This method must be used in the ProcessElement method of Splittable DoFns to claim + // work before performing it. If no work is claimed, the ProcessElement is not allowed to perform + // work or emit outputs. If the claim is successful, the DoFn must process the entire block. If + // the claim is unsuccessful the ProcessElement method of the DoFn must return without performing + // any additional work or emitting any outputs. + // + // TryClaim accepts an arbitrary value that can be interpreted as the position of a block, and + // returns a boolean indicating whether the claim succeeded. + // + // If the claim fails due to an error, that error can be retrieved with GetError. + // + // For SDFs to work properly, claims must always be monotonically increasing in reference to the + // restriction's start and end points, and every block of work in a restriction must be claimed. + // + // This pseudocode example illustrates the typical usage of TryClaim: + // + // pos = position of first block after restriction.start + // for TryClaim(pos) == true { + // // Do all work in the claimed block and emit outputs. + // pos = position of next block + // } + // return + TryClaim(pos interface{}) (ok bool) + + // GetError returns the error that made this RTracker stop executing, and it returns null if no Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 409004) Time Spent: 9h 20m (was: 9h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=408378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-408378 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 23/Mar/20 23:02 Start Date: 23/Mar/20 23:02 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#discussion_r396803249 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -562,7 +595,13 @@ func (fn *GoodSdf) RestrictionSize(int, RestT) float64 { return 0 } -// TODO(BEAM-3301): Add ProcessElement impl. when restriction trackers are in. +func (fn *GoodSdf) CreateTracker(RestT) *RTrackerT { + return &RTrackerT{} +} + +func (fn *GoodSdf) ProcessElement(*RTrackerT, int) int { Review comment: What do you think of having ProcessElement actually just have an sdf.RTracker value? Having it as the interface simplifies our wrapping approach for dynamic splitting, and means the framework can do it all the time, for safety etc. CreateTracker would still need the actual implementation type, and check that it implements sdf.RTracker of course. We can always extend things to allow a user to "unwrap" the interface if they need direct access to their RTracker implementation for whatever reason. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 408378) Time Spent: 8h 50m (was: 8h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=408379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-408379 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 23/Mar/20 23:02 Start Date: 23/Mar/20 23:02 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#discussion_r396795769 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -676,39 +737,77 @@ func (fn *BadSdfElementTRestSize) RestrictionSize(float32, RestT) float64 { type BadRestT struct{} type BadSdfRestTSplitRestParam struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestTSplitRestParam) SplitRestriction(int, BadRestT) []RestT { return []RestT{} } type BadSdfRestTSplitRestReturn struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestTSplitRestReturn) SplitRestriction(int, RestT) []BadRestT { return []BadRestT{} } type BadSdfRestTRestSize struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestTRestSize) RestrictionSize(int, BadRestT) float64 { return 0 } +type BadSdfRestTCreateTracker struct { + *GoodSdf +} + +func (fn *BadSdfRestTCreateTracker) CreateTracker(BadRestT) *RTrackerT { + return &RTrackerT{} +} + // Examples of other type validation that needs to be done. type BadSdfRestSizeReturn struct { - *GoodDoFn + *GoodSdf } func (fn *BadSdfRestSizeReturn) BadSdfRestSizeReturn(int, RestT) int { return 0 } +type BadRTrackerT struct{} Review comment: Consider commenting that this "RTracker" isn't implementing the RTracker interface. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 408379) Time Spent: 8h 50m (was: 8h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=408380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-408380 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 23/Mar/20 23:02 Start Date: 23/Mar/20 23:02 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#discussion_r396804143 ## File path: sdks/go/pkg/beam/core/sdf/sdf.go ## @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package sdf is experimental, incomplete, and not yet meant for general usage. +package sdf + +// RTracker is an interface used to interact with restrictions while processing elements in +// SplittableDoFns. Each implementation of RTracker is expected to be used for tracking a single +// restriction type, which is the type that should be used to create the RTracker, and output by +// TrySplit. +type RTracker interface { + // TryClaim attempts to claim the block of work in the current restriction located at a given + // position. This method must be used in the ProcessElement method of Splittable DoFns to claim + // work before performing it. If no work is claimed, the ProcessElement is not allowed to perform + // work or emit outputs. If the claim is successful, the DoFn must process the entire block. If + // the claim is unsuccessful the ProcessElement method of the DoFn must return without performing + // any additional work or emitting any outputs. + // + // TryClaim accepts an arbitrary value that can be interpreted as the position of a block, and + // returns a boolean indicating whether the claim succeeded. + // + // If the claim fails due to an error, that error can be retrieved with GetError. + // + // For SDFs to work properly, claims must always be monotonically increasing in reference to the + // restriction's start and end points, and every block of work in a restriction must be claimed. + // + // This pseudocode example illustrates the typical usage of TryClaim: + // + // pos = position of first block after restriction.start + // for TryClaim(pos) == true { + // // Do all work in the claimed block and emit outputs. + // pos = position of next block + // } + // return + TryClaim(pos interface{}) (ok bool) + + // GetError returns the error that made this RTracker stop executing, and it returns null if no Review comment: returns nil* This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 408380) Time Spent: 9h (was: 8h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=408256&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-408256 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 23/Mar/20 20:28 Start Date: 23/Mar/20 20:28 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#issuecomment-602838126 Whoops, forgot reviewers. R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 408256) Time Spent: 8h 40m (was: 8.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407330&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407330 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 23:48 Start Date: 20/Mar/20 23:48 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#issuecomment-601956825 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407330) Time Spent: 8.5h (was: 8h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407329 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 23:48 Start Date: 20/Mar/20 23:48 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188 Adding RTrackers as an interface, and adding them to the SDF validation. I think this is the last real code involved in SDF validation, assuming I'm not forgetting anything. I might do a second pass on the error messages because they seem inconsistent with the old error messages, but the next major task is going to be working on the SDF exec code and doing some testing with the Flink runner. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407302 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 22:25 Start Date: 20/Mar/20 22:25 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407302) Time Spent: 8h 10m (was: 8h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407195 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:32 Start Date: 20/Mar/20 19:32 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395850295 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -446,23 +444,16 @@ func validateMainInputs(fn *Fn, method *funcx.Fn, methodName string, numMainIn m return err } - // Check that the first numMainIn inputs are not side inputs (Iters or - // ReIters). We aren't able to catch singleton side inputs here since - // they're indistinguishable from main inputs. - mainInputs := method.Param[pos : pos+int(numMainIn)] - for i, p := range mainInputs { - if p.Kind != funcx.FnValue { - err := errors.Errorf("expected main input parameter but found "+ - "side input parameter in position %v", - pos+i) - err = errors.SetTopLevelMsgf(err, - "Method %v in DoFn %v should have all main inputs before side inputs, "+ - "but a side input (as Iter or ReIter) appears as parameter %v when a "+ - "main input was expected.", - methodName, fn.Name(), pos+i) - err = errors.WithContextf(err, "method %v", methodName) - return err - } + // Check that the first input is not an Iter or ReIter (those aren't valid + // as the first main input). + first := method.Param[pos].Kind + if first != funcx.FnValue { + err := errors.New("first main input parameter must be value type") Review comment: I'll just add it in real quick while squashing the commits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407195) Time Spent: 8h (was: 7h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407186&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407186 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:12 Start Date: 20/Mar/20 19:12 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395840270 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -446,23 +444,16 @@ func validateMainInputs(fn *Fn, method *funcx.Fn, methodName string, numMainIn m return err } - // Check that the first numMainIn inputs are not side inputs (Iters or - // ReIters). We aren't able to catch singleton side inputs here since - // they're indistinguishable from main inputs. - mainInputs := method.Param[pos : pos+int(numMainIn)] - for i, p := range mainInputs { - if p.Kind != funcx.FnValue { - err := errors.Errorf("expected main input parameter but found "+ - "side input parameter in position %v", - pos+i) - err = errors.SetTopLevelMsgf(err, - "Method %v in DoFn %v should have all main inputs before side inputs, "+ - "but a side input (as Iter or ReIter) appears as parameter %v when a "+ - "main input was expected.", - methodName, fn.Name(), pos+i) - err = errors.WithContextf(err, "method %v", methodName) - return err - } + // Check that the first input is not an Iter or ReIter (those aren't valid + // as the first main input). + first := method.Param[pos].Kind + if first != funcx.FnValue { + err := errors.New("first main input parameter must be value type") Review comment: ...must be a value type.. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407186) Time Spent: 7h 50m (was: 7h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407185&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407185 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:06 Start Date: 20/Mar/20 19:06 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395838378 ## File path: sdks/go/pkg/beam/pcollection.go ## @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType { return p.n.Type() } +// OutputsKV returns whether the output of this PCollection are single value +// elements or KV pairs. +func (p PCollection) OutputsKV() bool { Review comment: That's my usual guideline. If I use it once, keep it in place; twice, copy it; three times, helper function. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407185) Time Spent: 7h 40m (was: 7.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407180&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407180 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:01 Start Date: 20/Mar/20 19:01 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395836071 ## File path: sdks/go/pkg/beam/pcollection.go ## @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType { return p.n.Type() } +// OutputsKV returns whether the output of this PCollection are single value +// elements or KV pairs. +func (p PCollection) OutputsKV() bool { Review comment: I was originally picturing this as a helper function for callers of NewDoFn. It seems easy for future callers to make a mistake and only check if the PCollection is a KV and forget to check for CoGBK, so I thought a helper method would be useful in the future. 1. I missed that pardo.go is in the same package as pcollection.go. I'm also leaning to not expanding the user surface if it's not necessary. 2 & 3. Yeah I was unsure about the name, since it's not technically checking for KVs, I just couldn't think of anything better. IsKeyed sounds good though. 4. That's the other part I was debating. My goal was to make it easy to avoid the mistake in the future, but thinking about it... It seems unlikely that anyone else would even be using this code, and I would expect that if they were they were an advanced user doing something tricky. I think I'll go with just putting the conditional in pardo.go and adding a comment. We can always turn it into a helper later if it does get used in multiple places. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407180) Time Spent: 7.5h (was: 7h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=406810&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-406810 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 05:26 Start Date: 20/Mar/20 05:26 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395444326 ## File path: sdks/go/pkg/beam/pcollection.go ## @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType { return p.n.Type() } +// OutputsKV returns whether the output of this PCollection are single value +// elements or KV pairs. +func (p PCollection) OutputsKV() bool { Review comment: 1. No need to have this exported right now, since we can't make a breaking change later. It's only used in the same package. Let's not expand the user surface unless there are good user usages, as a broad API get confusing. 2. IsKV would be a more precise name, since PCollections are a logical representation of all their data, not actually a source or a sink. They can represent KV type or they aren't. 3. This isn't checking if it's a KV type, it's checking if it's a Keyed type or not, since it's also checking if it's a CoGBK. 4. Since this is only used in the one place, it's reasonable to move the conditional there instead of adding the one off helper method. I'm always willing to hear other opinions! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 406810) Time Spent: 7h 20m (was: 7h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=406806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-406806 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 05:16 Start Date: 20/Mar/20 05:16 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179 Accidentally left a bug in DoFn validation. It was treating main inputs as if they could only be FnValues, when they could also be FnIter or FnReIter as well, such as in the case of a GBK or CoGBK, which meant valid pipelines were failing validation. This fixes that. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Bu
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=406807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-406807 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 05:16 Start Date: 20/Mar/20 05:16 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#issuecomment-601541864 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 406807) Time Spent: 7h 10m (was: 7h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=40&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-40 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 00:19 Start Date: 20/Mar/20 00:19 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 40) Time Spent: 6h 50m (was: 6h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 6h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=406615&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-406615 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 21:49 Start Date: 19/Mar/20 21:49 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r395339195 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -470,6 +542,169 @@ func (fn *BadDoFnAmbiguousSideInput) StartBundle(bool) { func (fn *BadDoFnAmbiguousSideInput) FinishBundle(bool) { } +// Examples of correct SplittableDoFn signatures + +type RestT struct{} + +type GoodSdf struct { Review comment: Makes sense! Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 406615) Time Spent: 6h 40m (was: 6.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 6h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405941 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 04:23 Start Date: 19/Mar/20 04:23 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394783417 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -196,13 +217,31 @@ func (f *DoFn) Name() string { // IsSplittable returns whether the DoFn is a valid Splittable DoFn. func (f *DoFn) IsSplittable() bool { - return false // TODO(BEAM-3301): Implement this when we add SDFs. + isSdf, _ := validateSdfMethodsPresent((*Fn)(f)) Review comment: Good point, I'll keep it in mind for the future. For this one, though, I realized it's better just to check for the presence of a single SDF method anyway, rather than calling that validation method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405941) Time Spent: 6h (was: 5h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 6h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405943 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 04:23 Start Date: 19/Mar/20 04:23 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394783450 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -569,6 +622,188 @@ func validateSideInputsNumUnknown(processFnInputs []funcx.FnParam, method *funcx return nil } +// validateSdfMethods validates that all SDF methods are either present or +// missing in a Fn, and then returns true if they're present and false +// otherwise. If some are present and some are missing, it returns an error. +func validateSdfMethodsPresent(fn *Fn) (bool, error) { + // Check if first sdf method is present or not, and compare all subsequent + // methods to that result. If there's a mismatch, then we only fail after + // finishing the loop so we can output all the missing methods. + missing := make([]string, 0) + var first, fail bool + + for i, name := range sdfNames { + _, ok := fn.methods[name] + if !ok { + missing = append(missing, name) + } + + if i == 0 { + first = ok + } else if ok != first { + fail = true + } + } + + if fail { + err := errors.Errorf("not all SplittableDoFn methods are present. Missing methods: %v", missing) + return false, err + } + + return first, nil Review comment: That looks much more readable and easy to follow than the current one. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405943) Time Spent: 6h 20m (was: 6h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 6h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405944 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 04:23 Start Date: 19/Mar/20 04:23 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394783477 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -470,6 +542,169 @@ func (fn *BadDoFnAmbiguousSideInput) StartBundle(bool) { func (fn *BadDoFnAmbiguousSideInput) FinishBundle(bool) { } +// Examples of correct SplittableDoFn signatures + +type RestT struct{} + +type GoodSdf struct { Review comment: I was just skipping it because this commit hasn't added restriction trackers yet, but adding a TODO to remind me to change it later is a good idea. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405944) Time Spent: 6.5h (was: 6h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 6.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405942 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 04:23 Start Date: 19/Mar/20 04:23 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394783430 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -569,6 +622,188 @@ func validateSideInputsNumUnknown(processFnInputs []funcx.FnParam, method *funcx return nil } +// validateSdfMethods validates that all SDF methods are either present or +// missing in a Fn, and then returns true if they're present and false +// otherwise. If some are present and some are missing, it returns an error. +func validateSdfMethodsPresent(fn *Fn) (bool, error) { + // Check if first sdf method is present or not, and compare all subsequent + // methods to that result. If there's a mismatch, then we only fail after + // finishing the loop so we can output all the missing methods. + missing := make([]string, 0) Review comment: Done. I forgot that append works on nil slices. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405942) Time Spent: 6h 10m (was: 6h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405882 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 00:58 Start Date: 19/Mar/20 00:58 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394618692 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -196,13 +217,31 @@ func (f *DoFn) Name() string { // IsSplittable returns whether the DoFn is a valid Splittable DoFn. func (f *DoFn) IsSplittable() bool { - return false // TODO(BEAM-3301): Implement this when we add SDFs. + isSdf, _ := validateSdfMethodsPresent((*Fn)(f)) Review comment: A semantic note: the error shouldn't be ignored. Semantically if a function returns an error, the non error return values are not guaranteed to be valid. So conventionally, the way to handle IsSplittable here is to check for the error, and return false if not nil, and otherwise return what the boolean is. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405882) Time Spent: 5h 50m (was: 5h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405880&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405880 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 00:58 Start Date: 19/Mar/20 00:58 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394722487 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -569,6 +622,188 @@ func validateSideInputsNumUnknown(processFnInputs []funcx.FnParam, method *funcx return nil } +// validateSdfMethods validates that all SDF methods are either present or +// missing in a Fn, and then returns true if they're present and false +// otherwise. If some are present and some are missing, it returns an error. +func validateSdfMethodsPresent(fn *Fn) (bool, error) { + // Check if first sdf method is present or not, and compare all subsequent + // methods to that result. If there's a mismatch, then we only fail after + // finishing the loop so we can output all the missing methods. + missing := make([]string, 0) Review comment: Prefer declaring empty slices using the var syntax. var missing []string This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405880) Time Spent: 5h 40m (was: 5.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405881&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405881 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 00:58 Start Date: 19/Mar/20 00:58 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394725254 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -569,6 +622,188 @@ func validateSideInputsNumUnknown(processFnInputs []funcx.FnParam, method *funcx return nil } +// validateSdfMethods validates that all SDF methods are either present or +// missing in a Fn, and then returns true if they're present and false +// otherwise. If some are present and some are missing, it returns an error. +func validateSdfMethodsPresent(fn *Fn) (bool, error) { + // Check if first sdf method is present or not, and compare all subsequent + // methods to that result. If there's a mismatch, then we only fail after + // finishing the loop so we can output all the missing methods. + missing := make([]string, 0) + var first, fail bool + + for i, name := range sdfNames { + _, ok := fn.methods[name] + if !ok { + missing = append(missing, name) + } + + if i == 0 { + first = ok + } else if ok != first { + fail = true + } + } + + if fail { + err := errors.Errorf("not all SplittableDoFn methods are present. Missing methods: %v", missing) + return false, err + } + + return first, nil Review comment: I see that this function is trying to distinguish between partial coverage or complete coverage, but I think it could be simpler. Consider that the booleans could be removed by comparing whether len(missing) == len(sdfNames) to check if it's simply not an SDF at all (and thus, no error should be returned). ``` switch len(missing) { case 0: return true, nil case len(sdfNames): return false, nil default: err := errors.Errorf("not all SplittableDoFn methods are present. Missing methods: %v", missing) return false, err } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405881) Time Spent: 5h 40m (was: 5.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=405879&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405879 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 19/Mar/20 00:58 Start Date: 19/Mar/20 00:58 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#discussion_r394728377 ## File path: sdks/go/pkg/beam/core/graph/fn_test.go ## @@ -470,6 +542,169 @@ func (fn *BadDoFnAmbiguousSideInput) StartBundle(bool) { func (fn *BadDoFnAmbiguousSideInput) FinishBundle(bool) { } +// Examples of correct SplittableDoFn signatures + +type RestT struct{} + +type GoodSdf struct { Review comment: Shouldn't the ProcessElement method be adjusted here for the Restriction tracker parameter for a "GoodSdf" ? I guess technically if we don't care about dynamic splitting at all, it should be allowed. It's not unreasonable to prevent it until we have the right idea how to do that, but if so, lets put a TODO in here somewhere to make the intent explicit. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 405879) Time Spent: 5h 40m (was: 5.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=404390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404390 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 17/Mar/20 01:31 Start Date: 17/Mar/20 01:31 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#issuecomment-599832332 Btw, only major thing I was unsure of. Should I be hiding this functionality behind an experiment flag of some kind? This initial one doesn't do so, and I don't expect people would randomly use it before it's ready because there's no user-facing documentation indicating its usage in the "beam" package, but it's still technically possible to use it right now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 404390) Time Spent: 5.5h (was: 5h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=404388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404388 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 17/Mar/20 01:29 Start Date: 17/Mar/20 01:29 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144 Added SDF detection and method validation when creating DoFns. This adds all the validation except for anything to do with restriction trackers, which will be added in a later change. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=404389&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404389 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 17/Mar/20 01:29 Start Date: 17/Mar/20 01:29 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11144: [BEAM-3301] Perform SDF validation (missing RestrictionTrackers). URL: https://github.com/apache/beam/pull/11144#issuecomment-599831761 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 404389) Time Spent: 5h 20m (was: 5h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=404382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404382 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 17/Mar/20 01:09 Start Date: 17/Mar/20 01:09 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 404382) Time Spent: 5h (was: 4h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=404362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404362 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 17/Mar/20 00:29 Start Date: 17/Mar/20 00:29 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r393382209 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -239,52 +279,50 @@ func AsDoFn(fn *Fn) (*DoFn, error) { return nil, addContext(err, fn) } - // Start validating DoFn. First, check that ProcessElement has a main input. + // Validate ProcessElement has correct number of main inputs (as indicated by + // numMainIn), and that main inputs are before side inputs. processFn := fn.methods[processElementName] - pos, num, ok := processFn.Inputs() - if ok { - first := processFn.Param[pos].Kind - if first != funcx.FnValue { - err := errors.New("side input parameters must follow main input parameter") - err = errors.SetTopLevelMsgf(err, - "Method %v of DoFns should always have a main input before side inputs, "+ - "but it has side inputs (as Iters or ReIters) first in DoFn %v.", - processElementName, fn.Name()) - err = errors.WithContextf(err, "method %v", processElementName) - return nil, addContext(err, fn) - } + if err := validateMainInputs(fn, processFn, processElementName, numMainIn); err != nil { + return nil, addContext(err, fn) + } + + // If numMainIn is unknown, we can try inferring it from the second input in ProcessElement. + // If there is none, or it's not a FnValue type, then we can safely infer that there's only + // one main input. + pos, num, _ := processFn.Inputs() + if numMainIn == MainUnknown && (num == 1 || processFn.Param[pos+1].Kind != funcx.FnValue) { + numMainIn = MainSingle } // If the ProcessElement function includes side inputs or emit functions those must also be Review comment: At most relaxed we'd be able to either not require them at all if none are used, or isolate them by their types. All instances of a given side input or emit with the same type would need to be listed at once, since otherwise we have no way to distinguish them except by position. Permitting Nothing to be set would be the most convenient, or permitting only the Side Inputs and not requireing the Emits. For now though, it's better to be more strict now and relax later, since the inverse is impossible, and such variety is harder to maintain if unnecessary. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 404362) Time Spent: 4h 50m (was: 4h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=402558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-402558 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 12/Mar/20 23:44 Start Date: 12/Mar/20 23:44 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-598480073 I think doing this validation in the ParDo transform is something worth looking into, and I'd be up for it if it worked as well as you describe. I'm definitely not a fan of having to do validation without any info about the actual output/input involved. I've even entertained the idea of doing something similar, but it would be a decently large refactor (2-3 days?) and has the chance of hitting additional roadblocks, so I haven't really made time to look into it yet. Definitely something worth taking a day or two to look into after SDF is done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 402558) Time Spent: 4h 40m (was: 4.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=402556&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-402556 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 12/Mar/20 23:38 Start Date: 12/Mar/20 23:38 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r391956907 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -239,52 +279,50 @@ func AsDoFn(fn *Fn) (*DoFn, error) { return nil, addContext(err, fn) } - // Start validating DoFn. First, check that ProcessElement has a main input. + // Validate ProcessElement has correct number of main inputs (as indicated by + // numMainIn), and that main inputs are before side inputs. processFn := fn.methods[processElementName] - pos, num, ok := processFn.Inputs() - if ok { - first := processFn.Param[pos].Kind - if first != funcx.FnValue { - err := errors.New("side input parameters must follow main input parameter") - err = errors.SetTopLevelMsgf(err, - "Method %v of DoFns should always have a main input before side inputs, "+ - "but it has side inputs (as Iters or ReIters) first in DoFn %v.", - processElementName, fn.Name()) - err = errors.WithContextf(err, "method %v", processElementName) - return nil, addContext(err, fn) - } + if err := validateMainInputs(fn, processFn, processElementName, numMainIn); err != nil { + return nil, addContext(err, fn) + } + + // If numMainIn is unknown, we can try inferring it from the second input in ProcessElement. + // If there is none, or it's not a FnValue type, then we can safely infer that there's only + // one main input. + pos, num, _ := processFn.Inputs() + if numMainIn == MainUnknown && (num == 1 || processFn.Param[pos+1].Kind != funcx.FnValue) { Review comment: validateMainInputs performs error checks we need to do before we can infer # of main inputs (stuff like making sure we have at least 1 input present). So moving this before validateMainInputs would just mean moving those error checks back above the inferring and nothing really changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 402556) Time Spent: 4.5h (was: 4h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=402555&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-402555 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 12/Mar/20 23:38 Start Date: 12/Mar/20 23:38 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r391958266 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -239,52 +279,50 @@ func AsDoFn(fn *Fn) (*DoFn, error) { return nil, addContext(err, fn) } - // Start validating DoFn. First, check that ProcessElement has a main input. + // Validate ProcessElement has correct number of main inputs (as indicated by + // numMainIn), and that main inputs are before side inputs. processFn := fn.methods[processElementName] - pos, num, ok := processFn.Inputs() - if ok { - first := processFn.Param[pos].Kind - if first != funcx.FnValue { - err := errors.New("side input parameters must follow main input parameter") - err = errors.SetTopLevelMsgf(err, - "Method %v of DoFns should always have a main input before side inputs, "+ - "but it has side inputs (as Iters or ReIters) first in DoFn %v.", - processElementName, fn.Name()) - err = errors.WithContextf(err, "method %v", processElementName) - return nil, addContext(err, fn) - } + if err := validateMainInputs(fn, processFn, processElementName, numMainIn); err != nil { + return nil, addContext(err, fn) + } + + // If numMainIn is unknown, we can try inferring it from the second input in ProcessElement. + // If there is none, or it's not a FnValue type, then we can safely infer that there's only + // one main input. + pos, num, _ := processFn.Inputs() + if numMainIn == MainUnknown && (num == 1 || processFn.Param[pos+1].Kind != funcx.FnValue) { + numMainIn = MainSingle } // If the ProcessElement function includes side inputs or emit functions those must also be Review comment: It's part of the API for start/finishBundle. I don't remember why it's done that way though. lostluck@ might be able to answer why when he gets back. There might be room to make the side inputs/emits in start/finishBundle optional, but I believe right now it's mandatory (if we don't catch and throw an error here, it'll just break later on in translation or execution or something). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 402555) Time Spent: 4.5h (was: 4h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=402327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-402327 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 12/Mar/20 17:17 Start Date: 12/Mar/20 17:17 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r391766344 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -239,52 +279,50 @@ func AsDoFn(fn *Fn) (*DoFn, error) { return nil, addContext(err, fn) } - // Start validating DoFn. First, check that ProcessElement has a main input. + // Validate ProcessElement has correct number of main inputs (as indicated by + // numMainIn), and that main inputs are before side inputs. processFn := fn.methods[processElementName] - pos, num, ok := processFn.Inputs() - if ok { - first := processFn.Param[pos].Kind - if first != funcx.FnValue { - err := errors.New("side input parameters must follow main input parameter") - err = errors.SetTopLevelMsgf(err, - "Method %v of DoFns should always have a main input before side inputs, "+ - "but it has side inputs (as Iters or ReIters) first in DoFn %v.", - processElementName, fn.Name()) - err = errors.WithContextf(err, "method %v", processElementName) - return nil, addContext(err, fn) - } + if err := validateMainInputs(fn, processFn, processElementName, numMainIn); err != nil { + return nil, addContext(err, fn) + } + + // If numMainIn is unknown, we can try inferring it from the second input in ProcessElement. + // If there is none, or it's not a FnValue type, then we can safely infer that there's only + // one main input. + pos, num, _ := processFn.Inputs() + if numMainIn == MainUnknown && (num == 1 || processFn.Param[pos+1].Kind != funcx.FnValue) { + numMainIn = MainSingle } // If the ProcessElement function includes side inputs or emit functions those must also be Review comment: Not related to this PR but why? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 402327) Time Spent: 4h 20m (was: 4h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=402328&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-402328 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 12/Mar/20 17:17 Start Date: 12/Mar/20 17:17 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r391771719 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -239,52 +279,50 @@ func AsDoFn(fn *Fn) (*DoFn, error) { return nil, addContext(err, fn) } - // Start validating DoFn. First, check that ProcessElement has a main input. + // Validate ProcessElement has correct number of main inputs (as indicated by + // numMainIn), and that main inputs are before side inputs. processFn := fn.methods[processElementName] - pos, num, ok := processFn.Inputs() - if ok { - first := processFn.Param[pos].Kind - if first != funcx.FnValue { - err := errors.New("side input parameters must follow main input parameter") - err = errors.SetTopLevelMsgf(err, - "Method %v of DoFns should always have a main input before side inputs, "+ - "but it has side inputs (as Iters or ReIters) first in DoFn %v.", - processElementName, fn.Name()) - err = errors.WithContextf(err, "method %v", processElementName) - return nil, addContext(err, fn) - } + if err := validateMainInputs(fn, processFn, processElementName, numMainIn); err != nil { + return nil, addContext(err, fn) + } + + // If numMainIn is unknown, we can try inferring it from the second input in ProcessElement. + // If there is none, or it's not a FnValue type, then we can safely infer that there's only + // one main input. + pos, num, _ := processFn.Inputs() + if numMainIn == MainUnknown && (num == 1 || processFn.Param[pos+1].Kind != funcx.FnValue) { Review comment: Wouldn't it make sense to infer the number of inputs before validateMainInputs? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 402328) Time Spent: 4h 20m (was: 4h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=401801&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-401801 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 11/Mar/20 21:40 Start Date: 11/Mar/20 21:40 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-597891830 R: @lukecwik Adding Luke to finish up this review since Robert (lostluck@) is on vacation for a bit. Since this was already mostly reviewed, the main thing I'm looking for is someone to confirm that I addressed Robert's previous review comments with the latest commit. You don't need to review the full change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 401801) Time Spent: 4h 10m (was: 4h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=399972&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-399972 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 09/Mar/20 04:52 Start Date: 09/Mar/20 04:52 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-596330576 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 399972) Time Spent: 4h (was: 3h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398822 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Mar/20 01:03 Start Date: 06/Mar/20 01:03 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-595523525 Done: https://jira.apache.org/jira/browse/BEAM-9459 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398822) Time Spent: 3h 50m (was: 3h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398806 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Mar/20 00:50 Start Date: 06/Mar/20 00:50 Worklog Time Spent: 10m Work Description: lostluck commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-595520053 Could you file a JIRA with the trace and assign it to me please? I'm in the middle of packing. https://github.com/apache/beam/pull/11061 is the revert. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398806) Time Spent: 3h 40m (was: 3.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398800 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Mar/20 00:47 Start Date: 06/Mar/20 00:47 Worklog Time Spent: 10m Work Description: lostluck commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-595519138 No, but it looks like it's somehow related to mine. I'm going to roll it back. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398800) Time Spent: 3.5h (was: 3h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398784 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Mar/20 00:25 Start Date: 06/Mar/20 00:25 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-595512896 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398784) Time Spent: 3h 20m (was: 3h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398783 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Mar/20 00:24 Start Date: 06/Mar/20 00:24 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-595512813 The Postcommit error doesn't seem to be directly related to my change from what I can tell: > Error message from worker: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -488: process bundle failed for instruction -488 using plan -445 : panic: Unexpected coder: CoGBK goroutine 87 [running]: > runtime/debug.Stack(0xc00109d970, 0xd2c5e0, 0xc00113cb00) >/usr/lib/go-1.12/src/runtime/debug/stack.go:24 +0x9d > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec.callNoPanic.func1(0xc00109db90) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec/util.go:40 +0x60 > panic(0xd2c5e0, 0xc00113cb00) >/usr/lib/go-1.12/src/runtime/panic.go:522 +0x1b5 > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec.MakeElementEncoder(0xc9bdb0, 0xc00114b620, 0xc000822000) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec/coder.go:91 +0x479 > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec.(*PCollection).Up(0xc000c20fc0, 0x10018e0, 0xc000c40f00, 0x0, 0xc0010b7b50) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec/pcollection.go:59 +0xfe > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec.callNoPanic(0x10018e0, 0xc000c40f00, 0xc0010b7c28, 0x0, 0x0) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec/util.go:43 +0x6c > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec.(*Plan).Execute(0xc001222ee0, 0x10018e0, 0xc000c40f00, 0xc000d1a490, 0x4, 0xff0340, 0xc00114b440, 0xff0380, 0xc000c40f40, 0xc0010b7de0, ...) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec/plan.go:93 +0xdf > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness.(*control).handleInstruction(0xc0001f4480, 0x10017a0, 0xc0001bafc0, 0xc000c40d40, 0xc0001bafc0) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness/harness.go:211 +0xa34 > github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness.Main.func2(0x10017a0, 0xc0001bafc0, 0xc000c40d40) > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness/harness.go:118 +0x1cf > created by github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness.Main > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Go_PR/src/sdks/go/test/.gogradle/project_gopath/src/github.com/apache/beam/sdks/go/test/vendor/github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness/harness.go:131 +0x6e8 > > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > ... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 3
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398741 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 22:44 Start Date: 05/Mar/20 22:44 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-595484373 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398741) Time Spent: 3h (was: 2h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398740 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 22:43 Start Date: 05/Mar/20 22:43 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r388609914 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,74 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// The following constants prefixed with "Main" represent possible numbers of +// DoFn main inputs for DoFn construction and validation. Any value not defined +// here is an invalid number of main inputs. +const ( + MainUnknown = -1 // The number of main inputs is unknown for DoFn validation. Review comment: I'm leaving it exported only because AsDoFn is currently exported and takes one of these constants as an input. Making this unexported would make it impossible to call AsDoFn with the existing behavior (unknown num. of inputs). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398740) Time Spent: 2h 50m (was: 2h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398736 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 22:40 Start Date: 05/Mar/20 22:40 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r388608923 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,74 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// The following constants prefixed with "Main" represent possible numbers of Review comment: I definitely like those options better. Went with the unexported constant type, since it makes the code more self-documenting as opposed to raw numbers. Also removed the validation check on that parameter, like you suggested. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398736) Time Spent: 2h 40m (was: 2.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398520&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398520 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 17:20 Start Date: 05/Mar/20 17:20 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r388424329 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,74 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// The following constants prefixed with "Main" represent possible numbers of +// DoFn main inputs for DoFn construction and validation. Any value not defined +// here is an invalid number of main inputs. +const ( + MainUnknown = -1 // The number of main inputs is unknown for DoFn validation. Review comment: Consider if it is necessary to have an unknown constant exported at all? Even in the unexported type version of this code, Unknown a side effect of not passing the NumMainInput hint, rather than something a user should explicitly set. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398520) Time Spent: 2.5h (was: 2h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398519 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 17:20 Start Date: 05/Mar/20 17:20 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r388386045 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,74 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// The following constants prefixed with "Main" represent possible numbers of Review comment: I'm wary about exporting these constants. For one, they're untyped constants, so they're functionally the numbers themselves. Otherwise the "right" go way to expose them so they have meaning would be to have an unexported type so users can't define their own, and then define the constants. ``` type mainInputs int32 const ( MainUnknown mainInputs = -1 MainSingle mainInputs = 1 MainKV mainInputs = 2 ) ``` Then any functional option configuration method can accept them to have type safe, pre-validated input numbers. ``` func NumInputs(mi mainInputs) Option { return func(c *config) { c.numMainIn = mi } } ``` This then saves needing to have a validation error, since package users can't define their own mainInputs. Another alternative is to do away with the exported constants altogether, keep the validation, but simply document that valid inputs are 1 and 2 for singletons and KVs respectively. Either is preferable to the current approach. Lets not lose sight that the purpose here is to pass a hint down to make the DoFn parameters easier to analyse. Windows and EventTimes are propagated with the main input, but don't "count" since they are easily detectable by type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398519) Time Spent: 2.5h (was: 2h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398153 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 05:49 Start Date: 05/Mar/20 05:49 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r388089432 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,58 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// Constants so we can avoid magic numbers in validation. Represent number of +// DoFn main inputs based on what kind of input the DoFn has. +const ( + unknownInNum = -1 // Used when we don't know the number of main inputs. + singleInNum = 1 + kvInNum = 2 +) + // NewDoFn constructs a DoFn from the given value, if possible. func NewDoFn(fn interface{}) (*DoFn, error) { ret, err := NewFn(fn) if err != nil { return nil, errors.WithContext(errors.Wrapf(err, "invalid DoFn"), "constructing DoFn") } - return AsDoFn(ret) + return AsDoFn(ret, unknownInNum) } -// AsDoFn converts a Fn to a DoFn, if possible. -func AsDoFn(fn *Fn) (*DoFn, error) { +// NewDoFnKv constructs a DoFn from the given value, if possible, with +// improved validation from knowing whether the DoFn's main input is a KV or +// single element. +func NewDoFnKv(fn interface{}, mainKv bool) (*DoFn, error) { + ret, err := NewFn(fn) + if err != nil { + return nil, errors.WithContext(errors.Wrapf(err, "invalid DoFn"), "constructing DoFn") + } + + if mainKv { + return AsDoFn(ret, kvInNum) + } else { + return AsDoFn(ret, singleInNum) + } +} + +// AsDoFn converts a Fn to a DoFn, if possible. numMainIn specifies how many +// main inputs are expected in the DoFn's method signatures. Valid values are +// -1 (unknown), 1 (single elements), or 2 (KVs). If the value is unknown then +// validation is done by best effort and may miss some edge cases. +func AsDoFn(fn *Fn, numMainIn int) (*DoFn, error) { addContext := func(err error, fn *Fn) error { return errors.WithContextf(err, "graph.AsDoFn: for Fn named %v", fn.Name()) } + // Validate numMainIn. This check should match this method's comment. + if numMainIn != unknownInNum && + numMainIn != singleInNum && + numMainIn != kvInNum { + err := errors.Errorf("invalid number of main inputs given. "+ + "Got: %v, Want: One of the following: %v", + processElementName, []int{unknownInNum, singleInNum, kvInNum}) + return nil, addContext(err, fn) + } Review comment: I like that much better, done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398153) Time Spent: 2h 20m (was: 2h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=398152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-398152 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Mar/20 05:48 Start Date: 05/Mar/20 05:48 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r388089313 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,58 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// Constants so we can avoid magic numbers in validation. Represent number of +// DoFn main inputs based on what kind of input the DoFn has. +const ( + unknownInNum = -1 // Used when we don't know the number of main inputs. + singleInNum = 1 + kvInNum = 2 +) + // NewDoFn constructs a DoFn from the given value, if possible. func NewDoFn(fn interface{}) (*DoFn, error) { ret, err := NewFn(fn) if err != nil { return nil, errors.WithContext(errors.Wrapf(err, "invalid DoFn"), "constructing DoFn") } - return AsDoFn(ret) + return AsDoFn(ret, unknownInNum) } -// AsDoFn converts a Fn to a DoFn, if possible. -func AsDoFn(fn *Fn) (*DoFn, error) { +// NewDoFnKv constructs a DoFn from the given value, if possible, with +// improved validation from knowing whether the DoFn's main input is a KV or +// single element. +func NewDoFnKv(fn interface{}, mainKv bool) (*DoFn, error) { Review comment: Done, went with the variadic options made of functions approach. If anyone else is reading this, based it off this article: https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 398152) Time Spent: 2h 10m (was: 2h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=395344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-395344 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 29/Feb/20 00:03 Start Date: 29/Feb/20 00:03 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r385893620 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,58 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// Constants so we can avoid magic numbers in validation. Represent number of +// DoFn main inputs based on what kind of input the DoFn has. +const ( + unknownInNum = -1 // Used when we don't know the number of main inputs. + singleInNum = 1 + kvInNum = 2 +) + // NewDoFn constructs a DoFn from the given value, if possible. func NewDoFn(fn interface{}) (*DoFn, error) { ret, err := NewFn(fn) if err != nil { return nil, errors.WithContext(errors.Wrapf(err, "invalid DoFn"), "constructing DoFn") } - return AsDoFn(ret) + return AsDoFn(ret, unknownInNum) } -// AsDoFn converts a Fn to a DoFn, if possible. -func AsDoFn(fn *Fn) (*DoFn, error) { +// NewDoFnKv constructs a DoFn from the given value, if possible, with +// improved validation from knowing whether the DoFn's main input is a KV or +// single element. +func NewDoFnKv(fn interface{}, mainKv bool) (*DoFn, error) { + ret, err := NewFn(fn) + if err != nil { + return nil, errors.WithContext(errors.Wrapf(err, "invalid DoFn"), "constructing DoFn") + } + + if mainKv { + return AsDoFn(ret, kvInNum) + } else { + return AsDoFn(ret, singleInNum) + } +} + +// AsDoFn converts a Fn to a DoFn, if possible. numMainIn specifies how many +// main inputs are expected in the DoFn's method signatures. Valid values are +// -1 (unknown), 1 (single elements), or 2 (KVs). If the value is unknown then +// validation is done by best effort and may miss some edge cases. +func AsDoFn(fn *Fn, numMainIn int) (*DoFn, error) { addContext := func(err error, fn *Fn) error { return errors.WithContextf(err, "graph.AsDoFn: for Fn named %v", fn.Name()) } + // Validate numMainIn. This check should match this method's comment. + if numMainIn != unknownInNum && + numMainIn != singleInNum && + numMainIn != kvInNum { + err := errors.Errorf("invalid number of main inputs given. "+ + "Got: %v, Want: One of the following: %v", + processElementName, []int{unknownInNum, singleInNum, kvInNum}) + return nil, addContext(err, fn) + } Review comment: Consider a switch instead. ```suggestion switch numMainIn { case unknownInNum, singleInNum, kvInNum:// Valid default: // Invalid err := errors.Errorf("invalid number of main inputs given. "+ "Got: %v, Want: One of the following: %v", processElementName, []int{unknownInNum, singleInNum, kvInNum}) return nil, addContext(err, fn) } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 395344) Time Spent: 2h (was: 1h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=395343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-395343 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 29/Feb/20 00:03 Start Date: 29/Feb/20 00:03 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#discussion_r385468257 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -209,21 +209,58 @@ func (f *DoFn) RestrictionT() *reflect.Type { // a KV or not based on the other signatures (unless we're more loose about which // sideinputs are present). Bind should respect that. +// Constants so we can avoid magic numbers in validation. Represent number of +// DoFn main inputs based on what kind of input the DoFn has. +const ( + unknownInNum = -1 // Used when we don't know the number of main inputs. + singleInNum = 1 + kvInNum = 2 +) + // NewDoFn constructs a DoFn from the given value, if possible. func NewDoFn(fn interface{}) (*DoFn, error) { ret, err := NewFn(fn) if err != nil { return nil, errors.WithContext(errors.Wrapf(err, "invalid DoFn"), "constructing DoFn") } - return AsDoFn(ret) + return AsDoFn(ret, unknownInNum) } -// AsDoFn converts a Fn to a DoFn, if possible. -func AsDoFn(fn *Fn) (*DoFn, error) { +// NewDoFnKv constructs a DoFn from the given value, if possible, with +// improved validation from knowing whether the DoFn's main input is a KV or +// single element. +func NewDoFnKv(fn interface{}, mainKv bool) (*DoFn, error) { Review comment: With the name NewDoFnKv, it sounds like it's already assuming that a DoFn KV is being passed in. It's OK for there to be special purpose methods that only do one thing. Another option to consider instead of having two (or N) methods, consider extending the current NewDoFn with a variadic an Option type. (eg. opts ...Option), this lets existing callers keep things the same, but allow for expanding things in the future. Option should probably be either a function type, or an interface type with private methods, and the options are provided by other methods in the package. eg. graph.NewDoFn(fn, graph.HasKVInput(), graph.HasRestriction()). This is valuable if we think being able to expand things in the future, but also lets us mix and match more easily later on. This way we can keep the existing behavior when there are no options, but keep the documentation of all the various uses in one place on the NewDoFn method referring to the option returning methods. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 395343) Time Spent: 2h (was: 1h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=394449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394449 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 27/Feb/20 20:53 Start Date: 27/Feb/20 20:53 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-592171327 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 394449) Time Spent: 1h 50m (was: 1h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=394448&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394448 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 27/Feb/20 20:52 Start Date: 27/Feb/20 20:52 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991#issuecomment-592171160 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 394448) Time Spent: 1h 40m (was: 1.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=394447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394447 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 27/Feb/20 20:50 Start Date: 27/Feb/20 20:50 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10991: [BEAM-3301] Refactor DoFn validation & allow specifying main inputs. URL: https://github.com/apache/beam/pull/10991 The current version of this validation is a bit permissive because it doesn't require the number of main inputs to be specified. This change allows specifying the number of main inputs, while also preserving the existing code path of not specifying it. Along with that change, I made some refactors to existing validation code to try to improve readability and make it more organized. This is filed under BEAM-3301 (SDF) because it is intended to enable validation for SDFs which is difficult without a known number of main inputs. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/bad
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=383908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383908 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 08/Feb/20 01:08 Start Date: 08/Feb/20 01:08 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10801: [BEAM-3301] (Go SDK) Adding restriction plumbing to graph construction. URL: https://github.com/apache/beam/pull/10801 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 383908) Time Spent: 1h 20m (was: 1h 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=383895&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383895 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 08/Feb/20 00:25 Start Date: 08/Feb/20 00:25 Worklog Time Spent: 10m Work Description: lostluck commented on issue #10801: [BEAM-3301] (Go SDK) Adding restriction plumbing to graph construction. URL: https://github.com/apache/beam/pull/10801#issuecomment-583672891 LGTM. Please merge if the tests pass. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 383895) Time Spent: 1h 10m (was: 1h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=383891&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383891 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 08/Feb/20 00:07 Start Date: 08/Feb/20 00:07 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10801: [BEAM-3301] (Go SDK) Adding restriction plumbing to graph construction. URL: https://github.com/apache/beam/pull/10801#issuecomment-583668200 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 383891) Time Spent: 1h (was: 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=383890&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383890 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 08/Feb/20 00:06 Start Date: 08/Feb/20 00:06 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10801: [BEAM-3301] (Go SDK) Adding restriction plumbing to graph construction. URL: https://github.com/apache/beam/pull/10801 This adds some plumbing that retrieves the restriction type from DoFns, gets a coder for that type, and stores it with ParDos, and ultimately it adds the restriction coder to the translated proto. However, the code for finding the restriction type is unimplemented since we don't have SDFs yet, so this change doesn't do anything concrete yet. Left TODOs as reminders of future work that needs to be done. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=383054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383054 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Feb/20 19:00 Start Date: 06/Feb/20 19:00 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10778: [BEAM-3301] Small cleanup to KV Decoder. URL: https://github.com/apache/beam/pull/10778 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 383054) Time Spent: 40m (was: 0.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=383030&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-383030 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 06/Feb/20 18:48 Start Date: 06/Feb/20 18:48 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #10778: [BEAM-3301] Small cleanup to KV Decoder. URL: https://github.com/apache/beam/pull/10778#discussion_r376014767 ## File path: sdks/go/pkg/beam/core/runtime/exec/coder.go ## @@ -304,8 +311,20 @@ func (c *kvDecoder) Decode(r io.Reader) (*FullValue, error) { if err != nil { return nil, err } - return &FullValue{Elm: key.Elm, Elm2: value.Elm}, nil + return &FullValue{Elm: elideSingleElmFV(key), Elm2: elideSingleElmFV(value)}, nil +} +// elideSingleElmFV elides a FullValue if it has only one element, returning +// the contents of the first element, but returning the FullValue unchanged +// if it has two elements. +// +// Technically drops window and timestamp info, so only use when those are Review comment: Great warning to include! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 383030) Time Spent: 0.5h (was: 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=382597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382597 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Feb/20 23:26 Start Date: 05/Feb/20 23:26 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #10778: [BEAM-3301] Small cleanup to KV Decoder. URL: https://github.com/apache/beam/pull/10778 This enables nested KV coders to be decoded properly (the previous code would drop nested KV's value). Currently there are no nested KV coders present in the Go SDK, but it will be needed for SDF support. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apach
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=382598&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382598 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 05/Feb/20 23:26 Start Date: 05/Feb/20 23:26 Worklog Time Spent: 10m Work Description: youngoli commented on issue #10778: [BEAM-3301] Small cleanup to KV Decoder. URL: https://github.com/apache/beam/pull/10778#issuecomment-582662103 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 382598) Time Spent: 20m (was: 10m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)