[GitHub] beam pull request #1782: [BEAM-886] Implement annotation based NewDoFn

2017-01-16 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/1782

[BEAM-886] Implement annotation based NewDoFn

- Implement the new annotation-based DoFn in Python; see 
https://s.apache.org/a-new-dofn 
- Migrate all the cases of DoFn to NewDoFn
- CallableWrapperDoFn now extends NewDoFn
- Typechecking is also handled using the NewDoFn

Future Work:
- Remove OldDoFn completely from the code
- Deprecate the use of Context and pass state directly to the process method

Open question:
- This changes the semantics of `process`: it is now called with a single 
window instead of a windowSet, which is why 
`apache_beam.transforms.sideinputs_test:SideInputsTest.test_sliding_windows` is 
failing. 
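
To make the open question concrete, here is a toy sketch in plain Python. It does not use Beam and is not the code in this PR. It assumes a hypothetical marker `WINDOW_PARAM` that a `process` method uses as a default argument value to request the window, and a hypothetical helper `invoke_per_window` that then calls `process` once per window rather than once with the whole set.

```python
import inspect


class _WindowParam(object):
    """Hypothetical marker type; not a Beam class."""


# Hypothetical sentinel a process() method uses to request the window.
WINDOW_PARAM = _WindowParam()


class TagWithWindow(object):
    """Toy DoFn-like class that asks for the window via a default value."""

    def process(self, element, window=WINDOW_PARAM):
        yield (element, window)


class Upper(object):
    """Toy DoFn-like class that does not ask for the window."""

    def process(self, element):
        yield element.upper()


def invoke_per_window(fn, element, windows):
    """Call fn.process once per window if it requests the window, else once."""
    params = inspect.signature(fn.process).parameters
    window_args = [name for name, p in params.items()
                   if p.default is WINDOW_PARAM]
    if not window_args:
        return list(fn.process(element))
    outputs = []
    for window in windows:
        # Inject exactly one window per call, never the whole set.
        kwargs = {name: window for name in window_args}
        outputs.extend(fn.process(element, **kwargs))
    return outputs
```

Under this model an element that falls into several sliding windows triggers several `process` calls, each seeing one window. That is a visible change for any code that expected to receive the whole window set in a single call.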

R: @robertwb PTAL

---

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/incubator-beam BEAM-prototype-new-dofn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1782.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1782


commit 6d4e2539f9f2f42bff45b1861f9d3f67d915bd30
Author: Sourabh Bajaj 
Date:   2017-01-16T23:50:17Z

Implement annotation based NewDoFn




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-886) Support new DoFn in Python SDK

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824738#comment-15824738
 ] 

ASF GitHub Bot commented on BEAM-886:
-

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/1782

[BEAM-886] Implement annotation based NewDoFn

- Implement the new annotation-based DoFn in Python; see 
https://s.apache.org/a-new-dofn 
- Migrate all the cases of DoFn to NewDoFn
- CallableWrapperDoFn now extends NewDoFn
- Typechecking is also handled using the NewDoFn

Future Work:
- Remove OldDoFn completely from the code
- Deprecate the use of Context and pass state directly to the process method

Open question:
- This changes the semantics of `process`: it is now called with a single 
window instead of a windowSet, which is why 
`apache_beam.transforms.sideinputs_test:SideInputsTest.test_sliding_windows` is 
failing. 

R: @robertwb PTAL



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/incubator-beam BEAM-prototype-new-dofn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1782.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1782


commit 6d4e2539f9f2f42bff45b1861f9d3f67d915bd30
Author: Sourabh Bajaj 
Date:   2017-01-16T23:50:17Z

Implement annotation based NewDoFn




> Support new DoFn in Python SDK
> --
>
> Key: BEAM-886
> URL: https://issues.apache.org/jira/browse/BEAM-886
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Sourabh Bajaj
>  Labels: backward-incompatible, sdk-consistency
>
> Figure out what is needed to support new DoFns, add that support, and remove 
> old DoFns.
> Related Docs from Java:
> Original Proposal email:
> https://lists.apache.org/thread.html/2abf32d528dbb64b79853552c5d10c217e2194f0685af21aeb4635dd@%3Cdev.beam.apache.org%3E
> Presentation & Doc (with short Python sections):
> https://s.apache.org/presenting-a-new-dofn
> https://s.apache.org/a-new-dofn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-475) High-quality javadoc for Beam

2017-01-16 Thread Benson Margulies (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824620#comment-15824620
 ] 

Benson Margulies commented on BEAM-475:
---

I could replace all those settings of beam.top with the set-rootdir goal of 
https://github.com/scijava/scijava-maven-plugin

> High-quality javadoc for Beam
> -
>
> Key: BEAM-475
> URL: https://issues.apache.org/jira/browse/BEAM-475
> Project: Beam
>  Issue Type: Improvement
>  Components: project-management
>Reporter: Daniel Halperin
>Assignee: Benson Margulies
> Fix For: Not applicable
>
>
> We should have good Javadoc for Beam!
> Current snapshot: http://beam.incubator.apache.org/javadoc/0.1.0-incubating/





[jira] [Commented] (BEAM-475) High-quality javadoc for Beam

2017-01-16 Thread Benson Margulies (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824458#comment-15824458
 ] 

Benson Margulies commented on BEAM-475:
---

https://github.com/apache/beam/pull/1781 

This exercise in duplication of information results from two apparent 
maven-javadoc-plugin bugs.

First, the aggregate goal seems completely broken when used in the lifecycle. 
The plugin has no integration tests, and GitHub seems to have no examples of 
anyone using it. Every variation I tried based on the official documentation 
failed because no sources were passed to the javadoc command.

I tried to fall back to making {{mvn javadoc:aggregate}} work from the command 
line, to spare typing all those options. That didn't help either, 
because excludepackagenames does not work. The reason is a rather old 
and deep bug in the plugin: it always calls javadoc with a long list of 
individual Java source files, and the javadoc option for excluding packages 
does not work in that case.

So, as you will see, there is a bunch of configuration in the top pom to 
improve the javadoc jar files that are attached for each component, and then 
some of the same information is in the new directory, javadoc-aggregate, which 
uses ant to do the actual javadocing.

Normally, I'd offer the idea of fixing the javadoc plugin and then using the 
results, but my personal history in trying to repair that thing was not very 
successful.


> High-quality javadoc for Beam
> -
>
> Key: BEAM-475
> URL: https://issues.apache.org/jira/browse/BEAM-475
> Project: Beam
>  Issue Type: Improvement
>  Components: project-management
>Reporter: Daniel Halperin
>Assignee: Benson Margulies
> Fix For: Not applicable
>
>
> We should have good Javadoc for Beam!
> Current snapshot: http://beam.incubator.apache.org/javadoc/0.1.0-incubating/





[GitHub] beam pull request #1781: [BEAM-475] Improve Javadoc Aggregation

2017-01-16 Thread bimargulies-google
GitHub user bimargulies-google opened a pull request:

https://github.com/apache/beam/pull/1781

[BEAM-475] Improve Javadoc Aggregation

* Not ready for merging *

The changes here make it possible to generate the desired aggregate 
javadoc. See JIRA for discussion.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bimargulies-google/beam beam-475-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1781.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1781


commit 6b6286114f86bbc8c26d147db9865337b8fe7370
Author: bimargulies-google 
Date:   2017-01-13T23:24:30Z

First pass complete.

commit 1157217c732cf8f0c0cdb9040b239bdf43fd3e36
Author: Benson Margulies 
Date:   2017-01-16T17:12:28Z

Make this work as mvn javadoc:aggregate
from command line.






[jira] [Assigned] (BEAM-883) Make ApiSurfaceTest fail if something whitelisted is _not_ exposed

2017-01-16 Thread Stas Levin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stas Levin reassigned BEAM-883:
---

Assignee: Stas Levin

> Make ApiSurfaceTest fail if something whitelisted is _not_ exposed
> --
>
> Key: BEAM-883
> URL: https://issues.apache.org/jira/browse/BEAM-883
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Affects Versions: Not applicable
>Reporter: Kenneth Knowles
>Assignee: Stas Levin
>Priority: Minor
>  Labels: easy, easyfix, newbie, starter
> Fix For: Not applicable
>
>
> {{ApiSurfaceTest}} in the {{sdks/java/core}} is the class responsible for 
> protecting our public API surface.
> This test walks the public signatures of all modules and explicitly verifies 
> that everything is on a whitelist. This is how we control what dependencies 
> we expose to our users, so that Beam can keep a tight, stable API surface.
> It fails if anything not whitelisted is exposed. It would be nice if it also 
> failed when something whitelisted is _not_ exposed, to make sure the test stays 
> informative.
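
The two-way check described in this issue can be sketched as follows. This is an illustration in Python, not the Java `ApiSurfaceTest`: the function name `check_surface`, the use of glob-style patterns, and the sample names in the usage note are all assumptions made for the example.

```python
import fnmatch


def check_surface(exposed, whitelist):
    """Return (not_whitelisted, unused_patterns) for a set of exposed names.

    exposed:   fully qualified names found on the public API surface.
    whitelist: glob-style patterns for names that may be exposed.
    """
    # Direction 1: something is exposed that no whitelist pattern allows.
    not_whitelisted = sorted(
        name for name in exposed
        if not any(fnmatch.fnmatchcase(name, pat) for pat in whitelist))
    # Direction 2: a whitelist pattern matches nothing that is exposed,
    # so the entry is stale and the test has become less informative.
    unused_patterns = sorted(
        pat for pat in whitelist
        if not any(fnmatch.fnmatchcase(name, pat) for name in exposed))
    return not_whitelisted, unused_patterns
```

A test built on this would fail when either list is non-empty. For example, with `org.apache.beam.sdk.Pipeline` and `com.google.common.base.Optional` exposed and a whitelist of `org.apache.beam.*` and `org.joda.time.*`, the first list reports the Guava class and the second reports the unused Joda pattern.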





[jira] [Commented] (BEAM-882) Make ApiSurfaceTest detect the java package/module under test

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824029#comment-15824029
 ] 

ASF GitHub Bot commented on BEAM-882:
-

GitHub user staslev opened a pull request:

https://github.com/apache/beam/pull/1780

[BEAM-882,BEAM-883,BEAM-878] Simplified the creation of new API surface 
tests.

R: @dhalperi @kennknowles 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/staslev/beam 
BEAM-882-878-883-fixing-ApiSurfaceTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1780.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1780


commit 31a7864ba5d7636f3d120540d201441498e6c4b8
Author: Stas Levin 
Date:   2017-01-16T14:20:25Z

[BEAM-882,BEAM-883,BEAM-878] Simplified creating new API surface 
verifications.




> Make ApiSurfaceTest detect the java package/module under test
> -
>
> Key: BEAM-882
> URL: https://issues.apache.org/jira/browse/BEAM-882
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Affects Versions: Not applicable
>Reporter: Kenneth Knowles
>Priority: Trivial
>  Labels: easy, easyfix, newbie, starter
> Fix For: Not applicable
>
>
> {{ApiSurfaceTest}} in the {{sdks/java/core}} is the class responsible for 
> protecting our public API surface.
> This test walks the public signatures of all modules and explicitly verifies 
> that everything is on a whitelist. This is how we control what dependencies 
> we expose to our users, so that Beam can keep a tight, stable API surface.
> Today you must indicate what Java package to scan, such as 
> `org.apache.beam.sdk`. It would be nice for this to be automatically 
> determined.





[jira] [Assigned] (BEAM-878) Allow usage of ApiSurfaceTest providing nothing but a whitelist

2017-01-16 Thread Stas Levin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stas Levin reassigned BEAM-878:
---

Assignee: Stas Levin

> Allow usage of ApiSurfaceTest providing nothing but a whitelist
> ---
>
> Key: BEAM-878
> URL: https://issues.apache.org/jira/browse/BEAM-878
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Stas Levin
>Priority: Minor
>  Labels: easy, easyfix, newbie, starter
> Fix For: Not applicable
>
>
> {{ApiSurfaceTest}} in the {{sdks/java/core}} is the class responsible for 
> protecting our public API surface.
> This test walks the public signatures of all modules and explicitly verifies 
> that everything is on a whitelist. This is how we control what dependencies 
> we expose to our users, so that Beam can keep a tight, stable API surface.
> We should improve this functionality to be reusable across modules. Ideally, 
> there would be only 2 things in the file: a whitelist and a ~1-line test that 
> passes the whitelist as a parameter to {{ApiSurfaceTest}}.
> As an example of what you have to do without this functionality, see 
> https://github.com/apache/incubator-beam/pull/1183
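
As a rough picture of the "whitelist plus a one-line test" shape requested here, consider the Python sketch below. It is only an analogy: the real `ApiSurfaceTest` is Java and walks method signatures, whereas this toy treats every non-underscore attribute of a module as exposed. The helper names `public_names` and `make_surface_check` are hypothetical.

```python
import types


def public_names(module):
    """Toy notion of API surface: every attribute not starting with '_'."""
    return {name for name in vars(module) if not name.startswith("_")}


def make_surface_check(module, whitelist):
    """Build a zero-argument check from nothing but a module and a whitelist."""
    def check():
        unexpected = public_names(module) - set(whitelist)
        if unexpected:
            raise AssertionError("not whitelisted: %s" % sorted(unexpected))
    return check
```

With a helper like this, a per-module test file reduces to a whitelist and a single call such as `make_surface_check(some_module, WHITELIST)()`.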





[jira] [Assigned] (BEAM-882) Make ApiSurfaceTest detect the java package/module under test

2017-01-16 Thread Stas Levin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stas Levin reassigned BEAM-882:
---

Assignee: Stas Levin

> Make ApiSurfaceTest detect the java package/module under test
> -
>
> Key: BEAM-882
> URL: https://issues.apache.org/jira/browse/BEAM-882
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Affects Versions: Not applicable
>Reporter: Kenneth Knowles
>Assignee: Stas Levin
>Priority: Trivial
>  Labels: easy, easyfix, newbie, starter
> Fix For: Not applicable
>
>
> {{ApiSurfaceTest}} in the {{sdks/java/core}} is the class responsible for 
> protecting our public API surface.
> This test walks the public signatures of all modules and explicitly verifies 
> that everything is on a whitelist. This is how we control what dependencies 
> we expose to our users, so that Beam can keep a tight, stable API surface.
> Today you must indicate what Java package to scan, such as 
> `org.apache.beam.sdk`. It would be nice for this to be automatically 
> determined.





[GitHub] beam pull request #1780: [BEAM-882,BEAM-883,BEAM-878] Simplified the creatio...

2017-01-16 Thread staslev
GitHub user staslev opened a pull request:

https://github.com/apache/beam/pull/1780

[BEAM-882,BEAM-883,BEAM-878] Simplified the creation of new API surface 
tests.

R: @dhalperi @kennknowles 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/staslev/beam 
BEAM-882-878-883-fixing-ApiSurfaceTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1780.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1780


commit 31a7864ba5d7636f3d120540d201441498e6c4b8
Author: Stas Levin 
Date:   2017-01-16T14:20:25Z

[BEAM-882,BEAM-883,BEAM-878] Simplified creating new API surface 
verifications.






[jira] [Commented] (BEAM-1273) Error with FlinkPipelineOptions serialization after setStateBackend

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823769#comment-15823769
 ] 

ASF GitHub Bot commented on BEAM-1273:
--

GitHub user xhumanoid opened a pull request:

https://github.com/apache/beam/pull/1779

[BEAM-1273] Error with FlinkPipelineOptions serialization after 
setStateBackend

Because the value of StateBackend is required only when setting up the stream 
environment, we don't need to serialize it. 

Another problem: implementations of AbstractStateBackend can't be serialized 
and deserialized because they can't pass Serializer.ensureSerializable. In the 
deserialization step of validation we try to read the serialized value as the 
type declared in the interface.

ProxyInvocationHandler.java:706
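
The first point above, leaving a setup-only option out of the serialized form, can be illustrated with a small Python sketch. This is not the Java change in this pull request, and it does not use Jackson or Flink. The classes `ToyStateBackend` and `ToyPipelineOptions` and the `TRANSIENT` set are hypothetical names chosen for the example.

```python
import json


class ToyStateBackend(object):
    """Stands in for an object that cannot be converted to JSON."""


class ToyPipelineOptions(object):
    # Hypothetical: options needed only locally, when the environment is set up.
    TRANSIENT = frozenset(["state_backend"])

    def __init__(self, **values):
        self._values = dict(values)

    def get(self, name, default=None):
        return self._values.get(name, default)

    def to_json(self):
        # Skip transient options so they never reach the JSON encoder.
        kept = {k: v for k, v in self._values.items()
                if k not in self.TRANSIENT}
        return json.dumps(kept, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))
```

Options restored with `from_json` simply have no state backend, which is acceptable only under the assumption stated in the description: the value is read during setup and never needed after the options have been serialized.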



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xhumanoid/incubator-beam 
BEAM-1273-FlinkPipelineOptions-serialization-problem

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1779


commit 360f58d8bf00d35c57abea902d86f5a48e965ac2
Author: Alexey Diomin 
Date:   2017-01-16T10:46:08Z

[BEAM-1273] Error with FlinkPipelineOptions serialization after 
setStateBackend




> Error with FlinkPipelineOptions serialization after setStateBackend
> ---
>
> Key: BEAM-1273
> URL: https://issues.apache.org/jira/browse/BEAM-1273
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Alexey Diomin
>Assignee: Alexey Diomin
>
> Calling FlinkPipelineOptions.setStateBackend causes an error:
> {code}
> Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not 
> construct instance of org.apache.flink.runtime.state.AbstractStateBackend: 
> abstract types either need to be mapped to concrete types, have custom 
> deserializer, or contain additional type information.
> {code}
> The exception is thrown in SerializedPipelineOptions.
> The main problem is that AbstractStateBackend and its implementations can't be 
> mapped to a JSON schema for serialization.
> The error started appearing after:
> [BEAM-617][flink] introduce option to set state backend





[jira] [Assigned] (BEAM-1273) Error with FlinkPipelineOptions serialization after setStateBackend

2017-01-16 Thread Alexey Diomin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Diomin reassigned BEAM-1273:
---

Assignee: Alexey Diomin  (was: Aljoscha Krettek)

> Error with FlinkPipelineOptions serialization after setStateBackend
> ---
>
> Key: BEAM-1273
> URL: https://issues.apache.org/jira/browse/BEAM-1273
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Alexey Diomin
>Assignee: Alexey Diomin
>
> Calling FlinkPipelineOptions.setStateBackend causes an error:
> {code}
> Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not 
> construct instance of org.apache.flink.runtime.state.AbstractStateBackend: 
> abstract types either need to be mapped to concrete types, have custom 
> deserializer, or contain additional type information.
> {code}
> The exception is thrown in SerializedPipelineOptions.
> The main problem is that AbstractStateBackend and its implementations can't be 
> mapped to a JSON schema for serialization.
> The error started appearing after:
> [BEAM-617][flink] introduce option to set state backend





[jira] [Commented] (BEAM-1268) Add integration tests for CassandraIO

2017-01-16 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823707#comment-15823707
 ] 

Etienne Chauchot commented on BEAM-1268:


I just needed a ticket number to move the already started IT material to a 
branch other than the PR branch :)

> Add integration tests for CassandraIO
> -
>
> Key: BEAM-1268
> URL: https://issues.apache.org/jira/browse/BEAM-1268
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-1184, we should add 
> integration tests to CassandraIO (https://github.com/apache/beam/pull/592) 
> and other IOs. We are currently working on infrastructure and a dedicated Java 
> module to support all integration tests. In the meantime, this JIRA issue 
> tracks progress on integration tests for CassandraIO.


