[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444321#comment-16444321 ]
Joar Wandborg commented on BEAM-115:
[~kenn] There's a TODO left over from this issue at [https://github.com/apache/beam/blob/9c9f4ceceb87933da2b03304efd21edf55216937/sdks/python/apache_beam/pipeline.py#L835], is there another ticket tracking this?
> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
> Issue Type: Improvement
> Components: beam-model
> Reporter: Kenneth Knowles
> Assignee: Kenneth Knowles
> Priority: Major
> Fix For: Not applicable
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical
> vision.
> It has technical limitations:
> - The user's DAG (even including library expansions) is never explicitly
> represented, so it cannot be analyzed except incrementally, and cannot
> necessarily be reconstructed (for example, to display it!).
> - The flattened DAG of just primitive transforms isn't well-suited for
> display or transform override.
> - The TransformHierarchy isn't well-suited for optimizations.
> - The user must realistically pre-commit to a runner, and its configuration
> (batch vs streaming) prior to graph construction, since the runner will be
> modifying the graph as it is built.
> - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from
> actual cases of failure to use according to the design)
> - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner
> is confusing.
> - The TransformHierarchy, accessible only via visitor traversals, is
> cumbersome.
> - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus,
> and building an API that more simply and directly supports the technical
> vision.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
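The first technical limitation in the issue description (the user's DAG is never explicitly represented) can be illustrated with a small sketch. It assumes a hypothetical `Transform` record and a plain dict keyed by transform id; none of these names come from Beam. The point is only that one explicit representation can serve both views the description mentions: the full hierarchy for display and the flattened primitives for execution.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Transform:
    """Hypothetical node: a composite lists child ids, a primitive lists none."""
    name: str
    subtransforms: List[str] = field(default_factory=list)


def primitives(transforms: Dict[str, Transform], root_id: str) -> List[str]:
    """Return the ids of the primitive (leaf) transforms under root_id, depth first."""
    node = transforms[root_id]
    if not node.subtransforms:
        return [root_id]
    leaves: List[str] = []
    for child_id in node.subtransforms:
        leaves.extend(primitives(transforms, child_id))
    return leaves


def render(transforms: Dict[str, Transform], root_id: str, depth: int = 0) -> List[str]:
    """Return an indented outline of the hierarchy, one line per transform."""
    node = transforms[root_id]
    lines = ["  " * depth + node.name]
    for child_id in node.subtransforms:
        lines.extend(render(transforms, child_id, depth + 1))
    return lines


# A made-up composite "CountWords" that expands into two primitives.
GRAPH = {
    "count": Transform("CountWords", ["split", "group"]),
    "split": Transform("ParDo(Split)"),
    "group": Transform("GroupByKey"),
}
```

Because the graph is plain data, it can be inspected or displayed after construction without replaying the calls that built it.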
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056433#comment-16056433 ]
ASF GitHub Bot commented on BEAM-115:
Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/3361
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049723#comment-16049723 ]
ASF GitHub Bot commented on BEAM-115:
GitHub user robertwb opened a pull request: https://github.com/apache/beam/pull/3361
[BEAM-115] Port fn_api_runner to be able to use runner protos.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/robertwb/incubator-beam fn-api-runner-protos
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/3361.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3361
commit fa8dd580405970f3a18310fe0562c16b68e813ce Author: Robert Bradshaw Date: 2017-06-14T21:52:43Z Port fn_api_runner to be able to use runner protos.
commit 7131b231835f043398b135aeed660d599c9d643b Author: Robert Bradshaw Date: 2017-06-14T22:03:00Z fixup: cleanup
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034974#comment-16034974 ]
ASF GitHub Bot commented on BEAM-115:
Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/3222
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026574#comment-16026574 ]
ASF GitHub Bot commented on BEAM-115:
Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/3233
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025898#comment-16025898 ]
ASF GitHub Bot commented on BEAM-115:
GitHub user kennknowles opened a pull request: https://github.com/apache/beam/pull/3233
[BEAM-115] Runner API Translations for StateSpec and TimerSpec
R: @tgroh
The awkward bit is that a combining `StateSpec` is statically typed to take a `CombineFn`. So when we get around to issuing Fn State API requests, we'll need to work around that.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kennknowles/beam translate-StateSpec
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/3233.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3233
commit c87b72d5522e3369becd7fbe022824f4f223a9ae Author: Kenneth Knowles Date: 2017-05-25T14:12:08Z Flesh out TimerSpec and StateSpec in Runner API
commit 55af992a3a465b333bc3ae51262bfc43474d90e8 Author: Kenneth Knowles Date: 2017-05-25T14:25:08Z Mark CombineFnWithContext StateSpecs internal
commit 34eca25c5a6b6f3733f51fb6cb421cc755b7058c Author: Kenneth Knowles Date: 2017-05-25T14:27:52Z Add case dispatch to StateSpec
This is different than a StateBinder: for a binder, the id is needed and the StateSpec controls the return type. For case dispatch, the dispatcher controls the type and it should just be reading the spec, which does not require the id. Eventually, StateBinder could be removed in favor of StateSpec.Cases>.
commit 09aeab25a92ef961a8968a5a3e863786750dff46 Author: Kenneth Knowles Date: 2017-05-25T20:02:15Z Allow translation to throw IOException
commit 9eb1ef070506e7b419bd388f2a0f9407056a8bcb Author: Kenneth Knowles Date: 2017-05-26T05:51:18Z Make Java serialized CombineFn URN public
commit 5aa01d64ff50d9396137ada84b281700ce1d8d8d Author: Kenneth Knowles Date: 2017-05-25T14:12:29Z Implement TimerSpec and StateSpec translation
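The commit message for "Add case dispatch to StateSpec" in this pull request describes case dispatch as a scheme where the dispatcher, not the spec, decides the result type, and where no state id is needed because the spec is only being read. A minimal sketch of that idea follows. It is written in Python for brevity, and every name in it (`Cases`, `ValueSpec`, `BagSpec`, `match`) is hypothetical and not the Java API changed by the PR.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

R = TypeVar("R")


@dataclass
class Cases(Generic[R]):
    """Hypothetical dispatcher: one handler per kind of spec; it fixes the result type R."""
    value: Callable[[str], R]
    bag: Callable[[str], R]


@dataclass
class ValueSpec:
    """Hypothetical spec for a single stored value, described by a coder name."""
    coder: str

    def match(self, cases: Cases[R]) -> R:
        # The spec only exposes its own fields; no state id is involved.
        return cases.value(self.coder)


@dataclass
class BagSpec:
    """Hypothetical spec for a bag of elements, described by an element coder name."""
    element_coder: str

    def match(self, cases: Cases[R]) -> R:
        return cases.bag(self.element_coder)


# A dispatcher that turns any spec into a short description string.
describe = Cases(
    value=lambda coder: f"value<{coder}>",
    bag=lambda coder: f"bag<{coder}>",
)
```

A translation to a serialized form could be written as another `Cases` instance with a different result type, without changing the spec classes.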
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023946#comment-16023946 ]
ASF GitHub Bot commented on BEAM-115:
GitHub user robertwb opened a pull request: https://github.com/apache/beam/pull/3222
[BEAM-115] Unify Java and Python WindowingStragegy representations.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/robertwb/incubator-beam runner-windows
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/3222.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3222
commit 5fb2043e7f25871f4e9ad7de0e6c3380290b0c21 Author: Robert Bradshaw Date: 2017-05-25T00:23:31Z Unify Java and Python WindowingStragegy representations.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991903#comment-15991903 ]
ASF GitHub Bot commented on BEAM-115:
Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2644
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979640#comment-15979640 ]
ASF GitHub Bot commented on BEAM-115:
GitHub user robertwb opened a pull request: https://github.com/apache/beam/pull/2644
[BEAM-115] Fn API support for Python
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/robertwb/incubator-beam fn-api
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/2644.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2644
commit 4f5c5241949473d81b18cb034640189ea98c49df Author: Robert Bradshaw Date: 2017-04-21T16:59:52Z Add instructions to regenerate Python proto wrappers.
commit f4f94db5bb3b64c015fd95de64bafd34770aaa21 Author: Robert Bradshaw Date: 2017-04-21T17:00:20Z Generate python proto wrappers for runner and fn API.
commit 6f0e487749b777c5606a69865b7e47134d878a58 Author: Robert Bradshaw Date: 2017-04-21T18:04:03Z Ignore generated files for linter.
commit aa1425c86f5def43ff9032b8666352b747d20440 Author: Robert Bradshaw Date: 2017-04-21T18:09:03Z Ignore generated files in rat plugin.
commit 91428b06f8d4ced0bd3965801920cd94ee8298c6 Author: Robert Bradshaw Date: 2017-04-21T19:15:02Z Add apache licence to generated files.
commit 27f718e89d1829b5b217f263d4b4deb8832c947d Author: Robert Bradshaw Date: 2017-04-20T20:59:25Z Add fn api runner.
commit f987622d524fa4245089f82fbd150c0cf9ae380d Author: Robert Bradshaw Date: 2017-04-20T21:20:29Z Restore __init__.py
commit 45ccb571aed5e51ef893428c2a5350ba01864aac Author: Robert Bradshaw Date: 2017-04-20T20:44:33Z Add runner core files.
commit c6310ad8236578c05432244ced40fe572c2d71a7 Author: Robert Bradshaw Date: 2017-04-20T21:23:30Z move files around
commit 0f9de53c7007e200e5f04c9a30349313b2bdff39 Author: Robert Bradshaw Date: 2017-04-20T21:27:14Z more moving
commit 2f5c518bff028878b7d18a30141beea734cb36a0 Author: Robert Bradshaw Date: 2017-04-20T21:43:28Z Rename runners.
commit 85b2172c1d8d6a5797015c15cd5a301de4a252c6 Author: Robert Bradshaw Date: 2017-04-20T21:53:26Z cythonization works
commit 9f1177db48fa0230fab463cb7a5d784b96c9e084 Author: Robert Bradshaw Date: 2017-04-20T22:03:52Z test module import renames
commit a0e8d792bbfac4550e4399b29c9004f67b82cbd0 Author: Robert Bradshaw Date: 2017-04-20T22:06:30Z remove google3s
commit dfc0b2f3ca04594a2167eaadc7882ca257cdca5f Author: Robert Bradshaw Date: 2017-04-20T22:07:37Z move sdk_harness to sdk_worker
commit 8b08216c54e65fd0eb5a6c9588405cb4301267b2 Author: Robert Bradshaw Date: 2017-04-20T22:09:33Z Remove unused end_time from statesampler.
commit 011dec575a4066eb7e50c0bff2bad78a9cadd449 Author: Robert Bradshaw Date: 2017-04-21T16:59:52Z Add instructions to regenerate Python proto wrappers.
commit 4a74cdd528c6e9800c66152f9498592f51b59248 Author: Robert Bradshaw Date: 2017-04-21T17:00:20Z Generate python proto wrappers for runner and fn API.
commit 5dcd25d144f784a5b9bd6bd668c2ceb8bc2680f3 Author: Robert Bradshaw Date: 2017-04-21T18:04:03Z Ignore generated files for linter.
commit 5697ecdca1c14ce926f0be63399df77609c12aaf Author: Robert Bradshaw Date: 2017-04-21T18:09:03Z Ignore generated files in rat plugin.
commit 57a16aa810f5e4ef14c5f2ced901c3bb6fbc7b1a Author: Robert Bradshaw Date: 2017-04-21T19:15:02Z Add apache licence to generated files.
commit a8456be3830869e5b51e24970655819bc832bd79 Author: Robert Bradshaw Date: 2017-04-21T19:36:27Z implement portpicker, fix more imports
commit c4633f174d436304da142606145a4827c9da7f4e Author: Robert Bradshaw Date: 2017-04-21T20:04:22Z portpicker fixes
commit f349f14e49e03b675ad7f7ab06e2c871c9d5eae4 Author: Robert Bradshaw Date: 2017-04-21T20:04:35Z more import fixes
commit cc1dc88e83a5ed7ba0c04fbd10943ec98d920c73 Author: Robert Bradshaw Date: 2017-04-21T21:22:07Z Adapt to PR #2505 changes to protos.
commit cb09e464c3da6358b203fad681bba41fb3d4a837 Author: Robert Bradshaw D
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967809#comment-15967809 ]
ASF GitHub Bot commented on BEAM-115:
Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2505
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15966294#comment-15966294 ]
ASF GitHub Bot commented on BEAM-115:
Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2511
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15966173#comment-15966173 ]
ASF GitHub Bot commented on BEAM-115:
GitHub user lukecwik opened a pull request: https://github.com/apache/beam/pull/2511
[BEAM-115] Update timer/state fields on ParDoPayload to use a map field for consistent name usage
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/lukecwik/incubator-beam runner_api_proto
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/2511.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2511
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965150#comment-15965150 ] ASF GitHub Bot commented on BEAM-115: - GitHub user tgroh opened a pull request: https://github.com/apache/beam/pull/2505 [BEAM-115] Represent a Pipeline via a list of Top-level Transforms Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). --- The root node is a synthetic transform which does not appear within the graph, as it never has any components of note. Instead of referring to a single "root node" in the Pipeline message, refer to the top-level nodes which do not have an enclosing PTransform. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tgroh/beam top_level_transforms Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/2505.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2505 commit 31d5c272c4a6822ef35836be207224b557e122b4 Author: Thomas Groh Date: 2017-04-11T23:42:28Z Represent a Pipeline via a list of Top-level Transforms The root node is a synthetic transform which does not appear within the graph, as it never has any components of note. Instead of referring to a single "root node" in the Pipeline message, refer to the top-level nodes which do not have an enclosing PTransform. 
> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
> Issue Type: Improvement
> Components: beam-model-runner-api
> Reporter: Kenneth Knowles
> Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical
> vision.
> It has technical limitations:
> - The user's DAG (even including library expansions) is never explicitly
> represented, so it cannot be analyzed except incrementally, and cannot
> necessarily be reconstructed (for example, to display it!).
> - The flattened DAG of just primitive transforms isn't well-suited for
> display or transform override.
> - The TransformHierarchy isn't well-suited for optimizations.
> - The user must realistically pre-commit to a runner, and its configuration
> (batch vs streaming) prior to graph construction, since the runner will be
> modifying the graph as it is built.
> - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from
> actual cases of failure to use according to the design)
> - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner
> is confusing.
> - The TransformHierarchy, accessible only via visitor traversals, is
> cumbersome.
> - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus,
> and building an API that more simply and directly supports the technical
> vision.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
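The idea in pull request #2505 (listing the top-level transforms instead of pointing at a single synthetic root) can be sketched roughly as follows. This is only an illustration under stated assumptions: `TransformNode`, its `parent` field, and `top_level_transform_ids` are hypothetical names invented for this example and are not the Beam Runner API messages or any Beam SDK class.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TransformNode:
    """Hypothetical stand-in for a transform in a pipeline graph (not a Beam class)."""
    name: str
    parent: Optional[str] = None  # id of the enclosing transform, or None
    subtransforms: List[str] = field(default_factory=list)


def top_level_transform_ids(transforms: Dict[str, TransformNode]) -> List[str]:
    """Return the ids of transforms that have no enclosing transform.

    No synthetic root node is needed: the pipeline is described by this list.
    """
    return [tid for tid, node in transforms.items() if node.parent is None]


# A small graph: "count" is a composite with two children.
transforms = {
    "read": TransformNode("Read"),
    "count": TransformNode("Count", subtransforms=["count/pair", "count/sum"]),
    "count/pair": TransformNode("PairWithOne", parent="count"),
    "count/sum": TransformNode("SumPerKey", parent="count"),
    "write": TransformNode("Write"),
}

roots = top_level_transform_ids(transforms)
```

Here `roots` contains only the three transforms with no parent; the children of the composite are reached through its `subtransforms` list rather than through a root node.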
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959537#comment-15959537 ] ASF GitHub Bot commented on BEAM-115: Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2452
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959305#comment-15959305 ]

ASF GitHub Bot commented on BEAM-115:

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2452

    [BEAM-115] Rename FunctionSpec and UrnWithParameter to their (hopefully) final names

    R: @tgroh

    I believe this is consistent with dev discussions and in PRs, etc.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam runner-api-touch-ups

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2452
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951270#comment-15951270 ] ASF GitHub Bot commented on BEAM-115: Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2372
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949497#comment-15949497 ]

ASF GitHub Bot commented on BEAM-115:

GitHub user tgroh opened a pull request:

    https://github.com/apache/beam/pull/2372

    [BEAM-115] Include the creating PCollection in PCollectionView

    This is available on the client that created the view, but may not be
    available elsewhere.

    Update signatures and callers to match.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tgroh/beam view_contains_pcollection

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2372.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2372

commit e0887a011484368139700259eb5af0363c5cbc4d
Author: Thomas Groh
Date:   2017-03-30T17:47:12Z

    Include the creating PCollection in PCollectionView

    This is available on the client that created the view, but may not be
    available elsewhere.

    Update signatures and callers to match.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949273#comment-15949273 ] ASF GitHub Bot commented on BEAM-115: Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2296
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939528#comment-15939528 ] ASF GitHub Bot commented on BEAM-115: Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2309
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939334#comment-15939334 ]

ASF GitHub Bot commented on BEAM-115:

GitHub user tgroh opened a pull request:

    https://github.com/apache/beam/pull/2309

    [BEAM-115] Make Canonical ViewFns Public

    Mark all of the ViewFns as both deprecated and experimental. These fns
    will all be removed once Runners have multimap support, and are not
    suitable for users to use explicitly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tgroh/beam public_view_fns

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2309

commit b3ea5b22f090958f470fb2a9996ea96d94de2a3c
Author: Thomas Groh
Date:   2017-03-23T22:23:40Z

    Make Canonical ViewFns Public

    Mark all of the ViewFns as both deprecated and experimental. These fns
    will all be removed once Runners have multimap support, and are not
    suitable for users to use explicitly.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938907#comment-15938907 ] ASF GitHub Bot commented on BEAM-115: Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2299
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937655#comment-15937655 ]

ASF GitHub Bot commented on BEAM-115:

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2299

    [BEAM-115] Make distinguished URNs public

    These URNs are in flux and will be relocated to some final good
    location as the Runner API and Fn API develop. For now, this change
    just makes them public in the place where they currently are defined.

    R: @tgroh

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam custom-windowfn

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2299.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2299

commit 94814e88f6d6737f389c7b8e8727d3a4a78bba54
Author: Kenneth Knowles
Date:   2017-03-23T03:33:24Z

    Make distinguished URNs public

    These URNs are in flux and will be relocated to some final good
    location as the Runner API and Fn API develop. For now, this change
    just makes them public in the place where they currently are defined.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937428#comment-15937428 ]

ASF GitHub Bot commented on BEAM-115:

GitHub user robertwb opened a pull request:

    https://github.com/apache/beam/pull/2296

    [BEAM-115] Translate pipeline graph to and from Runner API protos.

    There are some caveats:
      * Specific known transforms, with their payloads, are not yet translated.
      * Side inputs are not yet supported.

    All pipelines without side inputs are passed through this translation
    by default before execution.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/robertwb/incubator-beam py-runner-api

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2296.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2296

commit ebc74e9678050c632574b11afebf250b8332c25e
Author: Robert Bradshaw
Date:   2017-03-20T23:08:37Z

    Translate pipeline graph to and from Runner API protos.

    There are some caveats:
      * Specific known transforms, with their payloads, are not yet translated.
      * Side inputs are not yet supported.

    All pipelines without side inputs are passed through this translation
    by default before execution.
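The round trip described in pull request #2296 (translate the graph to a portable form and back before execution, skipping pipelines that use side inputs) can be sketched as below. This is a minimal illustration, assuming plain dictionaries in place of the Runner API protos; `Transform`, `Pipeline`, `to_portable`, `from_portable`, and `prepare_for_execution` are hypothetical names for this example and do not correspond to the actual Beam Python SDK functions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Transform:
    """Hypothetical transform record (not a Beam class)."""
    name: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    side_inputs: List[str] = field(default_factory=list)


@dataclass
class Pipeline:
    """Hypothetical pipeline graph keyed by transform id."""
    transforms: Dict[str, Transform] = field(default_factory=dict)


def to_portable(pipeline: Pipeline) -> dict:
    """Serialize the graph into plain dicts (a stand-in for proto messages)."""
    return {
        "transforms": {
            tid: {"name": t.name, "inputs": list(t.inputs), "outputs": list(t.outputs)}
            for tid, t in pipeline.transforms.items()
        }
    }


def from_portable(message: dict) -> Pipeline:
    """Rebuild a pipeline graph from the portable form."""
    return Pipeline({
        tid: Transform(spec["name"], list(spec["inputs"]), list(spec["outputs"]))
        for tid, spec in message["transforms"].items()
    })


def prepare_for_execution(pipeline: Pipeline) -> Pipeline:
    """Round-trip the graph unless it uses side inputs, which are not translated."""
    if any(t.side_inputs for t in pipeline.transforms.values()):
        return pipeline  # assumption: untranslatable pipelines run unchanged
    return from_portable(to_portable(pipeline))


simple = Pipeline({
    "read": Transform("Read", outputs=["lines"]),
    "count": Transform("Count", inputs=["lines"], outputs=["counts"]),
})
with_side_input = Pipeline({
    "join": Transform("Join", inputs=["a"], outputs=["b"], side_inputs=["lookup"]),
})

round_tripped = prepare_for_execution(simple)
untouched = prepare_for_execution(with_side_input)
```

Running every eligible pipeline through the round trip by default exercises the translation continuously, so a transform that fails to survive serialization is likely to show up as a test failure rather than only when a remote runner consumes the portable form.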
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904419#comment-15904419 ] ASF GitHub Bot commented on BEAM-115: Github user asfgit closed the pull request at: https://github.com/apache/beam/pull/2190
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900522#comment-15900522 ]

ASF GitHub Bot commented on BEAM-115:

GitHub user robertwb opened a pull request:

    https://github.com/apache/beam/pull/2190

    [BEAM-115] Runner API representation of windowing strategies for Python

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/robertwb/incubator-beam py-runner-api

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2190.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2190

commit 356266b26684ad7e6846eaba33f6744f365890cf
Author: Robert Bradshaw
Date:   2017-03-07T20:02:08Z

    Auto-generated runner api proto bindings.

commit e18be9cb1dd665a10b7250209c28d10600614bdb
Author: Robert Bradshaw
Date:   2017-03-07T20:04:27Z

    Runner API context helper classes.

commit 0624235719bb9f813e620939fc0e11ac713708cb
Author: Robert Bradshaw
Date:   2017-03-07T20:21:02Z

    Runner API encoding of WindowFns.

commit 243ba920ee1682f8c7863c339b7d057c9fecb14c
Author: Robert Bradshaw
Date:   2017-03-08T00:18:02Z

    Runner API translation of triggers and windowing strategies.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890558#comment-15890558 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/2131
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889140#comment-15889140 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user robertwb opened a pull request:

    https://github.com/apache/beam/pull/2131

    [BEAM-115] Inline rather than reference FunctionSpecs.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/robertwb/incubator-beam runner-protos

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2131

commit cbfce74aae64f9b353c07ced8427dc77ab1c31f4
Author: Robert Bradshaw
Date:   2017-02-28T23:51:24Z

    Inline rather than reference FunctionSpecs.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887038#comment-15887038 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/2106
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883785#comment-15883785 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user robertwb opened a pull request:

    https://github.com/apache/beam/pull/2106

    [BEAM-115] More Runner API refinements.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/robertwb/incubator-beam runner-protos

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2106

commit c2f5351e3731f29d024240016c553cecd3be3143
Author: Robert Bradshaw
Date:   2017-02-24T23:53:39Z

    Make access_pattern a URN with params.
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883715#comment-15883715 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/2094
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882880#comment-15882880 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/2042
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881797#comment-15881797 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2094

    [BEAM-115] Concretize generic bits of the Runner API graph structure

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [x] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `` in the title with the actual Jira issue
           number, if there is one.
     - [x] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    R: @dhalperi @robertwb

    This gets rid of the excessively generic design of `GraphNode` and restores
    the original design, in which each node is a `PTransform`.

    I have also merged the `bytes` field for SDK-specific data and the `Any`
    field for SDK-independent data since, as has been pointed out, we won't
    need both.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam inline-runner-api

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2094.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2094

commit cc46d194cacbfb2244fde837f01b2ba0f2cedcdb
Author: Kenneth Knowles
Date:   2017-02-24T01:51:10Z

    Inline PTransform to GraphNode, removing generic design

    The GraphNode structure was made more generic to allow the
    Runner API and Fn API to share the graph data structure while carrying
    distinct payloads on nodes and edges. It seems that the Runner API
    was already sufficiently flexible for the Fn API to use its existing
    payload design.

commit 47289b5bc3a452e2866fb6515b55c7ef5d2835a8
Author: Kenneth Knowles
Date:   2017-02-24T02:06:56Z

    Condense FunctionSpec, merging data and params
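The graph shape described in pull request #2094 (every node is a `PTransform`, and a function spec carries one merged payload) can be pictured with a small sketch. This is an illustration only: the field names (`unique_name`, `subtransforms`, `payload`), the `leaf_transforms` helper and the `example:` URNs are assumptions made for this note, not the actual proto definitions from the pull request.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FunctionSpec:
    """Hypothetical stand-in: a URN plus a single merged payload field."""
    urn: str
    payload: bytes = b""  # one blob instead of separate SDK-specific/independent fields


@dataclass
class PTransform:
    """Hypothetical stand-in: every graph node is a transform."""
    unique_name: str
    spec: Optional[FunctionSpec] = None  # inlined, not referenced by id
    subtransforms: List[str] = field(default_factory=list)  # ids of child nodes


def leaf_transforms(transforms: Dict[str, PTransform], root_id: str) -> List[str]:
    """Return the ids of the leaf (non-composite) transforms under root_id, in order."""
    node = transforms[root_id]
    if not node.subtransforms:
        return [root_id]
    leaves: List[str] = []
    for child_id in node.subtransforms:
        leaves.extend(leaf_transforms(transforms, child_id))
    return leaves
```

Keeping both the composite nodes and their leaves in one explicit structure is what lets the same graph serve display (walk the hierarchy) and optimization (walk only the leaves), which are two of the limitations listed in the issue description.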
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881275#comment-15881275 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user kennknowles closed the pull request at:

    https://github.com/apache/beam/pull/2065
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877407#comment-15877407 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user kennknowles closed the pull request at:

    https://github.com/apache/beam/pull/2011
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877400#comment-15877400 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2065

    [BEAM-115,BEAM-1348] Unify Runner API and Fn API coders

    Just shares `Coder`.

    It's a hack: it is really crufty to have side-by-side int and string
    registries. But coders need to be by-reference. Options, off the top of
    my head:

    1. Make everything keyed on int64 (distasteful for debugging, but not horrid).
    2. Make everything keyed on string.
    3. Make Fn API and harness really embrace separate registries by type.
    4. Write 3 lines of conversion code, since sharing this one trivial proto
       has only unclear benefit, while avoiding a hard dependency is almost
       always a win.

    R: @dhalperi

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam fn-api-coders-only

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2065.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2065

commit 3d4aed8fa4dfc9f1c69bb39692e92c16b3002df6
Author: Kenneth Knowles
Date:   2017-02-22T03:11:19Z

    Unify Runner API and Fn API coders
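To make the registry options in pull request #2065 concrete, here is a rough sketch of what re-keying a by-reference coder registry from integer ids to string ids might look like. Everything in it is assumed for illustration: the tuple layout `(urn, component_ids)`, the function name `rekey_to_strings` and the `example:` URNs do not come from either API.

```python
from typing import Dict, List, Tuple

# Hypothetical registry entry: (coder URN, ids of the component coders it refers to).
IntRegistry = Dict[int, Tuple[str, List[int]]]
StrRegistry = Dict[str, Tuple[str, List[str]]]


def rekey_to_strings(int_registry: IntRegistry) -> StrRegistry:
    """Convert an int-keyed registry to a string-keyed one.

    Both the keys and the by-reference component ids are rewritten, so
    references between coders stay consistent after the conversion.
    """
    return {
        str(coder_id): (urn, [str(component) for component in components])
        for coder_id, (urn, components) in int_registry.items()
    }
```

A conversion of this size is the kind of thing option 4 has in mind: each side keeps its own message type and a few lines translate between them, so neither API takes a hard dependency on the other.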
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876299#comment-15876299 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user kennknowles closed the pull request at:

    https://github.com/apache/beam/pull/662
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872932#comment-15872932 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2042

    [BEAM-115,BEAM-1195] Build trigger state machine from Runner API Trigger proto directly

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [x] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `` in the title with the actual Jira issue
           number, if there is one.
     - [x] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    This replaces the case dispatch on the Java SDK's `Trigger` type with a
    case dispatch on the Runner API's `Trigger` type. This is a step towards
    `runners/core-java` executing triggers from any language.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam trigger-proto-state-machines

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2042.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2042

commit 7e4e37d0d2cb2fca1c12f67dedbf7b96cce3c6e5
Author: Kenneth Knowles
Date:   2017-02-18T00:05:13Z

    Build trigger state machine from Runner API Trigger proto directly
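The idea in pull request #2042, dispatching on a language-neutral trigger description rather than on an SDK class, can be sketched roughly as follows. This is a deliberately simplified illustration, not the code from the pull request: `TriggerSpec`, its `kind` strings and the stateless `should_fire` method are all assumptions, and real trigger state machines also track per-window state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TriggerSpec:
    """Hypothetical stand-in for a serialized, SDK-independent trigger message."""
    kind: str  # "after_count", "after_any" or "after_all" in this sketch
    element_count: int = 0
    subtriggers: List["TriggerSpec"] = field(default_factory=list)


class AfterCount:
    def __init__(self, threshold: int) -> None:
        self.threshold = threshold

    def should_fire(self, elements_in_pane: int) -> bool:
        return elements_in_pane >= self.threshold


class AfterAny:
    def __init__(self, machines: list) -> None:
        self.machines = machines

    def should_fire(self, elements_in_pane: int) -> bool:
        return any(m.should_fire(elements_in_pane) for m in self.machines)


class AfterAll:
    def __init__(self, machines: list) -> None:
        self.machines = machines

    def should_fire(self, elements_in_pane: int) -> bool:
        return all(m.should_fire(elements_in_pane) for m in self.machines)


def build_state_machine(spec: TriggerSpec):
    """Case dispatch on the spec's kind, recursing into composite triggers."""
    if spec.kind == "after_count":
        return AfterCount(spec.element_count)
    if spec.kind == "after_any":
        return AfterAny([build_state_machine(s) for s in spec.subtriggers])
    if spec.kind == "after_all":
        return AfterAll([build_state_machine(s) for s in spec.subtriggers])
    raise ValueError(f"unsupported trigger kind: {spec.kind!r}")
```

Because the dispatch only looks at the serialized description, a runner written this way does not care which SDK produced the trigger, which is the step towards cross-language execution that the pull request describes.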
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871158#comment-15871158 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2030

    [BEAM-115,BEAM-1328] Convert to/from WindowingStrategy proto in Java SDK

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [x] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `` in the title with the actual Jira issue number, if there is one.
     - [x] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    Some aspects of this PR are sort of hacks, but the kind that might be
    forgiven in order to make rapid progress. I am opening the PR for comment
    anyhow, but there are a few changes that I might now make underneath this PR:

    1. Move Java SDK `ClosingBehavior` to top level and rename. Incidentally
       related to [BEAM-210](https://issues.apache.org/jira/browse/BEAM-210)
       only because that is another reason this should just be a top-level
       concept. Not sure there's much benefit to putting public enums inside
       other misc classes when there's no real natural home for them.
    2. Move Java SDK `AccumulationMode` to top level and put its to/from proto
       there. In particular, it should also come out of `util`.
    3. Maybe actually try to move from `OutputTimeFn` to a Java SDK `OutputTime`
       prior to this PR. But actually converting this to proto, then converting
       the runners to use that, will make the latter migration easier. Some
       cruft is introduced in the meantime, though.
    4. If `WindowingStrategy` is going to continue to be a thing that is
       prominent all over our public SDK surface and in the runner API,
       reconsider whether it really belongs in `util`.

    And I'll want to flesh out the list of test cases. Given how generic the
    logic is, I don't expect many surprises.

    R: @tgroh

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam WindowingStrategy-from-proto

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2030.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2030

commit e7171b8fcd8068e5b0ec0c660d16d104bb1c334a
Author: Kenneth Knowles
Date:   2017-02-16T22:45:05Z

    Make SDK-specific serialized blob really a blob

commit b3b3ba5cdd5559685208b0cbbd45071bc862a7b8
Author: Kenneth Knowles
Date:   2017-02-17T04:26:39Z

    Add closing behavior to Runner API proto

commit fd995928091582487e14cb1af128354b5c9fadbe
Author: Kenneth Knowles
Date:   2017-02-17T04:26:45Z

    Add conversion to/from Runner API proto for WindowingStrategy

> Beam Runner API
> ---------------
>
>                 Key: BEAM-115
>                 URL: https://issues.apache.org/jira/browse/BEAM-115
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model-runner-api
>            Reporter: Kenneth Knowles
>            Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly
> represented, so it cannot be analyzed except incrementally, and cannot
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration
> (batch vs streaming) prior to graph construction, since the runner will be
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus,
> and building an API that more simply and directly supports the technical
> vision.
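The pull request above describes converting a `WindowingStrategy` to and from its Runner API proto form. As a rough illustration of what such a round trip involves, here is a minimal Python sketch. It is not the Java SDK code from the PR: the dataclass, the enum members, the dictionary layout standing in for the proto message, and the `urn:example:...` value are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class AccumulationMode(Enum):
    # Member names are illustrative, not taken from the PR.
    DISCARDING = "DISCARDING"
    ACCUMULATING = "ACCUMULATING"


class ClosingBehavior(Enum):
    # Member names are illustrative, not taken from the PR.
    EMIT_ALWAYS = "EMIT_ALWAYS"
    EMIT_IF_NONEMPTY = "EMIT_IF_NONEMPTY"


@dataclass(frozen=True)
class WindowingStrategy:
    """Hypothetical, heavily simplified SDK-side windowing strategy."""
    window_fn_urn: str
    accumulation_mode: AccumulationMode
    closing_behavior: ClosingBehavior
    allowed_lateness_ms: int = 0


def to_message(strategy: WindowingStrategy) -> dict:
    """Flatten the SDK object into a plain dict standing in for a proto message."""
    return {
        "window_fn": {"urn": strategy.window_fn_urn},
        "accumulation_mode": strategy.accumulation_mode.value,
        "closing_behavior": strategy.closing_behavior.value,
        "allowed_lateness": strategy.allowed_lateness_ms,
    }


def from_message(message: dict) -> WindowingStrategy:
    """Rebuild the SDK object; unknown enum values raise ValueError."""
    return WindowingStrategy(
        window_fn_urn=message["window_fn"]["urn"],
        accumulation_mode=AccumulationMode(message["accumulation_mode"]),
        closing_behavior=ClosingBehavior(message["closing_behavior"]),
        allowed_lateness_ms=message["allowed_lateness"],
    )


strategy = WindowingStrategy(
    window_fn_urn="urn:example:fixed_windows",  # made-up URN
    accumulation_mode=AccumulationMode.DISCARDING,
    closing_behavior=ClosingBehavior.EMIT_IF_NONEMPTY,
    allowed_lateness_ms=60_000,
)
restored = from_message(to_message(strategy))
```

A round-trip check such as `restored == strategy` is the kind of generic test case the PR description alludes to when it says the logic is generic.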
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870858#comment-15870858 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/2023
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870807#comment-15870807 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2023

    [BEAM-115] Add convenience for wrapping up a self contained message that has references in it

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [ ] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `` in the title with the actual Jira issue number, if there is one.
     - [ ] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam Pipeline-components

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2023.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2023

commit 30897bedaabd3a30200d62d82d269ac27fa79f9b
Author: Kenneth Knowles
Date:   2017-02-16T22:07:40Z

    Add optional components field pervasively in Runner API
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867257#comment-15867257 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user kennknowles closed the pull request at:

    https://github.com/apache/beam/pull/2000
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867182#comment-15867182 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2011

    [BEAM-115,BEAM-1348] Unify Fn API and Runner API coder specs

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [x] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `` in the title with the actual Jira issue number, if there is one.
     - [x] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    This includes the entirety of #2000, which unified `FunctionSpec`. Feel
    free to review that first or, if you prefer, review just this since the
    diff is small. The unification here was uninteresting on top of #2000.

    Summary of changes:

     - Moved a coder's local id out of the `Coder` itself and into the key of
       a map in `ProcessBundleDescriptor`. Philosophically, the id is an
       essential aspect of `ProcessBundleDescriptor` (or `Pipeline`) but not
       an essential aspect of a coder. Pragmatically, this allows the Runner
       API and the Fn API to key the map on different types (`string` and
       `int64` respectively). Prospectively, it makes it easy to construct
       instances of the message that are "just values" without any id, which
       is aesthetically pleasing and more flexible to more uses.
     - Inlined the `SdkFunctionSpec` in `Coder` in the Runner API. Having it
       by reference introduces a needless sharing of key type and adds
       needless overhead, since coders are already stored by reference, as
       are environments.

    R: @dhalperi

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam fn-api-coders

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2011.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2011

commit 2b55a7f303ab0fea58ad279dd214253b4fe69565
Author: Kenneth Knowles
Date:   2017-02-14T20:33:43Z

    Remove underscore from Runner API proto Java package

commit 4e7865b828eae962532f1759833eed8b0e769cc9
Author: Kenneth Knowles
Date:   2017-02-13T16:38:40Z

    Unify Fn API and Runner API FunctionSpec

commit 5b5e6290e893385c47799cf5523c29be64c102fd
Author: Kenneth Knowles
Date:   2017-02-15T03:51:58Z

    Unify Fn API and Runner API coder spec
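The first change summarized in the pull request above (moving a coder's id out of the coder and into the key of the enclosing map) can be pictured with a small Python sketch. This is only an illustration under assumptions: the class names `RunnerApiComponents` and `FnApiBundleDescriptor`, their fields, and the example URN are invented here and are not the proto definitions from the PR.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Coder:
    """A coder as a plain value: it carries no id of its own."""
    urn: str
    payload: bytes = b""


@dataclass
class RunnerApiComponents:
    """Hypothetical Runner API container: coders keyed by string id."""
    coders: Dict[str, Coder] = field(default_factory=dict)


@dataclass
class FnApiBundleDescriptor:
    """Hypothetical Fn API container: coders keyed by integer id."""
    coders: Dict[int, Coder] = field(default_factory=dict)


# The same id-free value can be registered under either key type.
varint = Coder(urn="urn:example:coder:varint")  # made-up URN

runner_side = RunnerApiComponents()
runner_side.coders["varint-coder"] = varint

fn_side = FnApiBundleDescriptor()
fn_side.coders[7] = varint

same_value = runner_side.coders["varint-coder"] == fn_side.coders[7]
```

Because the id lives only in the map key, `Coder` instances compare equal regardless of which container holds them, which is the "just values" property the summary mentions.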
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864586#comment-15864586 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/2000

    [BEAM-115] Unify Fn API and Runner API FunctionSpec

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [ ] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `` in the title with the actual Jira issue number, if there is one.
     - [ ] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    Do not review yet - utilizing Jenkins

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam fn-api-functionspec

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2000.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2000

commit 90ed46256915d726b537454b1239cb70767b2e2d
Author: Kenneth Knowles
Date:   2017-02-13T16:38:40Z

    Unify Fn API and Runner API FunctionSpec
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860380#comment-15860380 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/1946
[jira] [Commented] (BEAM-115) Beam Runner API
[ https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857474#comment-15857474 ]

ASF GitHub Bot commented on BEAM-115:
-------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/1946

    [BEAM-115] Add proto definition for Runner API

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:

     - [x] Make sure the PR title is formatted like:
       `[BEAM-] Description of pull request`
     - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `` in the title with the actual Jira issue number, if there is one.
     - [x] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

    ---

    These are the protocol buffers definitions corresponding to the Avro
    schema proposed in https://s.apache.org/beam-runner-api.

    Differences from the schema there:

     - Graph structure is decoupled from the data that annotates the nodes
     - Adopted names from the Fn API's proto for things that overlap
     - Added explicit URNs and payloads for primitives
     - Added outline of state and timers
     - Added Environment URLs

    I would like to merge this and separately port the Fn API to use shared
    structures as appropriate, since that may involve more extensive coding
    work.

    Unlike #662, this is _not_ a WIP just for discussion. This is intended
    for immediate use as the serialization format for various pieces of the
    pipeline to unblock work on the Fn API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam pipeline-proto

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1946.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1946

commit 7bdd2869053bd3af7afdf68e36e1d54f868419ae
Author: Kenneth Knowles
Date:   2017-02-07T23:25:32Z

    Add proto definition for Runner API
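Two of the differences listed in the pull request above, graph structure kept separate from the data attached to nodes and explicit URNs with payloads for primitives, can be pictured with a short Python sketch. It is an assumption-laden illustration, not the proto definitions from the PR: the class layout, field names, the `consumers` helper, and the `urn:example:...` identifiers are all invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class FunctionSpec:
    """What a node does: a URN naming the operation plus an opaque payload."""
    urn: str
    payload: bytes = b""


@dataclass
class Transform:
    """Graph structure only refers to PCollection ids; the spec is an annotation."""
    inputs: Dict[str, str] = field(default_factory=dict)   # local name -> pcollection id
    outputs: Dict[str, str] = field(default_factory=dict)  # local name -> pcollection id
    spec: FunctionSpec = FunctionSpec(urn="urn:example:unknown")


def consumers(transforms: Dict[str, Transform], pcollection_id: str) -> List[str]:
    """Walk the graph without ever decoding a payload."""
    return sorted(
        transform_id
        for transform_id, transform in transforms.items()
        if pcollection_id in transform.inputs.values()
    )


# A tiny made-up pipeline: read -> (parse, count)
transforms = {
    "read": Transform(outputs={"out": "lines"},
                      spec=FunctionSpec("urn:example:read", b"opaque-source-config")),
    "parse": Transform(inputs={"in": "lines"}, outputs={"out": "records"},
                       spec=FunctionSpec("urn:example:pardo", b"opaque-fn-bytes")),
    "count": Transform(inputs={"in": "lines"}, outputs={"out": "totals"},
                       spec=FunctionSpec("urn:example:combine")),
}

line_consumers = consumers(transforms, "lines")
```

In this sketch a tool that only needs the shape of the graph (for display, say) can call `consumers` and ignore every `payload`, while a runner that recognizes a given URN can interpret the payload for that node alone.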