[jira] [Created] (BEAM-5097) Increment counter for "small words" in go SDK example

2018-08-06 Thread holdenk (JIRA)
holdenk created BEAM-5097:
-

 Summary: Increment counter for "small words" in go SDK example
 Key: BEAM-5097
 URL: https://issues.apache.org/jira/browse/BEAM-5097
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: holdenk
Assignee: holdenk


Increment counter for "small words" in go SDK example



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-5094) go generate specialize isn't super clear where to find specialize

2018-08-06 Thread holdenk (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk closed BEAM-5094.
-
   Resolution: Not A Problem
Fix Version/s: Not applicable

nvm, was in sdk/go/cmd but I was looking in sdk/go/pkg

> go generate specialize isn't super clear where to find specialize
> -
>
> Key: BEAM-5094
> URL: https://issues.apache.org/jira/browse/BEAM-5094
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: Henning Rohde
>Priority: Trivial
> Fix For: Not applicable
>
>
> If I try and update the templates go generate tells me the specialize command 
> is not found. Searching online we seem to be the only project which uses it ( 
> [https://www.google.com/search?ei=m7xoW9TpJY6O_wS1k4WoCQ&q=%22go%3Agenerate+specialize%22&oq=%22go%3Agenerate+specialize%22&gs_l=psy-ab.3...703840.705307.0.705535.3.3.0.0.0.0.50.139.3.3.00...1c.1.64.psy-ab..0.0.00.oe1VY3mk99c]
>  ) so I'm not entirely sure where to get specialize from. Is it perhaps a 
> tool that we forgot to check in or is it a build artifact that I need to put 
> in my path?
>  





[jira] [Commented] (BEAM-3878) Improve error reporting in calls.go

2018-08-06 Thread holdenk (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570815#comment-16570815
 ] 

holdenk commented on BEAM-3878:
---

I started working on this, but then I ran into some confusion with the go 
generate command we are using (see 
[https://jira.apache.org/jira/browse/BEAM-5094] ).

> Improve error reporting in calls.go
> ---
>
> Key: BEAM-3878
> URL: https://issues.apache.org/jira/browse/BEAM-3878
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Priority: Minor
>
> The error messages generated in calls.go are not as helpful as they could be.
> Instead of simply reporting "incompatible func type" it would be great if 
> they reported the topology of the actual function supplied versus what is 
> expected. That would make debugging a lot easier.





[jira] [Created] (BEAM-5094) go generate specialize isn't super clear where to find specialize

2018-08-06 Thread holdenk (JIRA)
holdenk created BEAM-5094:
-

 Summary: go generate specialize isn't super clear where to find 
specialize
 Key: BEAM-5094
 URL: https://issues.apache.org/jira/browse/BEAM-5094
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: holdenk
Assignee: Henning Rohde


If I try and update the templates go generate tells me the specialize command 
is not found. Searching online we seem to be the only project which uses it ( 
[https://www.google.com/search?ei=m7xoW9TpJY6O_wS1k4WoCQ&q=%22go%3Agenerate+specialize%22&oq=%22go%3Agenerate+specialize%22&gs_l=psy-ab.3...703840.705307.0.705535.3.3.0.0.0.0.50.139.3.3.00...1c.1.64.psy-ab..0.0.00.oe1VY3mk99c]
 ) so I'm not entirely sure where to get specialize from. Is it perhaps a 
tool that we forgot to check in or is it a build artifact that I need to put in 
my path?

 





[jira] [Created] (BEAM-5076) In Python worker docker file use a requirements.txt file

2018-08-05 Thread holdenk (JIRA)
holdenk created BEAM-5076:
-

 Summary: In Python worker docker file use a requirements.txt file
 Key: BEAM-5076
 URL: https://issues.apache.org/jira/browse/BEAM-5076
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-harness
Reporter: holdenk
Assignee: holdenk


See the comment in the Dockerfile for more details.





[jira] [Closed] (BEAM-5047) Make clean cleanup go vendor directories for Beam

2018-07-30 Thread holdenk (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk closed BEAM-5047.
-
   Resolution: Cannot Reproduce
Fix Version/s: Not applicable

> Make clean cleanup go vendor directories for Beam
> -
>
> Key: BEAM-5047
> URL: https://issues.apache.org/jira/browse/BEAM-5047
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: Henning Rohde
>Priority: Trivial
> Fix For: Not applicable
>
>
> I got into a state when building that it cached an old version of the file I 
> was working on and this was super confusing since clean didn't clean that up. 
> The gradle go plugin seems to have an idea of clean we could try?





[jira] [Commented] (BEAM-5047) Make clean cleanup go vendor directories for Beam

2018-07-30 Thread holdenk (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562607#comment-16562607
 ] 

holdenk commented on BEAM-5047:
---

So it seems to be working now, but it wasn't before. I'm going to close this 
issue for now, and if I run into it again I'll re-open it and take a look.

> Make clean cleanup go vendor directories for Beam
> -
>
> Key: BEAM-5047
> URL: https://issues.apache.org/jira/browse/BEAM-5047
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: Henning Rohde
>Priority: Trivial
> Fix For: Not applicable
>
>
> I got into a state when building that it cached an old version of the file I 
> was working on and this was super confusing since clean didn't clean that up. 
> The gradle go plugin seems to have an idea of clean we could try?





[jira] [Created] (BEAM-5048) Document common build pattern for SDKs

2018-07-30 Thread holdenk (JIRA)
holdenk created BEAM-5048:
-

 Summary: Document common build pattern for SDKs
 Key: BEAM-5048
 URL: https://issues.apache.org/jira/browse/BEAM-5048
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go, sdk-py-core
Reporter: holdenk
Assignee: Henning Rohde


It appears some of the devs don't build their SDKs with gradle during their 
normal development cycle. It would be good to have the customary dev build 
instructions in the respective README.md files in the SDK directory.





[jira] [Created] (BEAM-5047) Make clean cleanup go vendor directories for Beam

2018-07-30 Thread holdenk (JIRA)
holdenk created BEAM-5047:
-

 Summary: Make clean cleanup go vendor directories for Beam
 Key: BEAM-5047
 URL: https://issues.apache.org/jira/browse/BEAM-5047
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: holdenk
Assignee: Henning Rohde


I got into a state when building that it cached an old version of the file I 
was working on and this was super confusing since clean didn't clean that up. 
The gradle go plugin seems to have an idea of clean we could try?





[jira] [Closed] (BEAM-4865) run_validatescontainer.sh in Python sdk is likely not run in jenkins

2018-07-25 Thread holdenk (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk closed BEAM-4865.
-
   Resolution: Not A Problem
Fix Version/s: Not applicable

> run_validatescontainer.sh in Python sdk is likely not run in jenkins
> 
>
> Key: BEAM-4865
> URL: https://issues.apache.org/jira/browse/BEAM-4865
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: holdenk
>Assignee: Robert Bradshaw
>Priority: Trivial
> Fix For: Not applicable
>
>
> We have a nice script to validate the Python container, however it's hard 
> coded to a bucket (sadness to be fixed separately in 
> https://issues.apache.org/jira/browse/BEAM-4864)





[jira] [Commented] (BEAM-4865) run_validatescontainer.sh in Python sdk is likely not run in jenkins

2018-07-25 Thread holdenk (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556037#comment-16556037
 ] 

holdenk commented on BEAM-4865:
---

nvm, we just have the credentials hard coded in jenkins.

> run_validatescontainer.sh in Python sdk is likely not run in jenkins
> 
>
> Key: BEAM-4865
> URL: https://issues.apache.org/jira/browse/BEAM-4865
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: holdenk
>Assignee: Robert Bradshaw
>Priority: Trivial
> Fix For: Not applicable
>
>
> We have a nice script to validate the Python container, however it's hard 
> coded to a bucket (sadness to be fixed separately in 
> https://issues.apache.org/jira/browse/BEAM-4864)





[jira] [Created] (BEAM-4865) run_validatescontainer.sh in Python sdk is likely not run in jenkins

2018-07-25 Thread holdenk (JIRA)
holdenk created BEAM-4865:
-

 Summary: run_validatescontainer.sh in Python sdk is likely not run 
in jenkins
 Key: BEAM-4865
 URL: https://issues.apache.org/jira/browse/BEAM-4865
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-harness
Reporter: holdenk
Assignee: Robert Bradshaw


We have a nice script to validate the Python container, however it's hard coded 
to a bucket (sadness to be fixed separately in 
https://issues.apache.org/jira/browse/BEAM-4864)





[jira] [Created] (BEAM-4864) run_validatescontainer.sh in Python sdk has hard-coded bucket of sadness

2018-07-25 Thread holdenk (JIRA)
holdenk created BEAM-4864:
-

 Summary: run_validatescontainer.sh in Python sdk has hard-coded 
bucket of sadness
 Key: BEAM-4864
 URL: https://issues.apache.org/jira/browse/BEAM-4864
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: holdenk
Assignee: holdenk


The run_validatescontainer.sh script looks amazing! However I could not 
validate my container, and this made me sad. We can make it configurable and 
then people can validate their container changes more easily :)





[jira] [Created] (BEAM-4833) Add support for users specifying a requirements.txt for their Python portable container

2018-07-19 Thread holdenk (JIRA)
holdenk created BEAM-4833:
-

 Summary: Add support for users specifying a requirements.txt for 
their Python portable container
 Key: BEAM-4833
 URL: https://issues.apache.org/jira/browse/BEAM-4833
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: holdenk
Assignee: holdenk


It's pretty common that Python scripts require extra dependencies, even the 
tensorflow model analysis TFMA example requires a different version of TF than 
the one we install by default. While users can roll their own container or edit 
the Dockerfile, it would probably be useful to provide an easier path to 
integrating their dependencies.

While we support automatically installing the dependencies at runtime on the 
workers, this can be very slow, especially for things like tensorflow, arrow, 
or other numeric heavy code.

Another alternative could be a simple script to augment the existing base image.





[jira] [Commented] (BEAM-3142) Fix proto generation in Python 3

2018-07-02 Thread holdenk (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530430#comment-16530430
 ] 

holdenk commented on BEAM-3142:
---

It should be, but I can't assign it to myself.

> Fix proto generation in Python 3
> 
>
> Key: BEAM-3142
> URL: https://issues.apache.org/jira/browse/BEAM-3142
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: holdenk
>Priority: Major
>
> The generated Python code uses relative imports, fix this to be usable in 
> Python 3.





[jira] [Created] (BEAM-3985) Update developer guide to reference new Python linting environments

2018-04-02 Thread holdenk (JIRA)
holdenk created BEAM-3985:
-

 Summary: Update developer guide to reference new Python linting 
environments
 Key: BEAM-3985
 URL: https://issues.apache.org/jira/browse/BEAM-3985
 Project: Beam
  Issue Type: Task
  Components: sdk-py-core, website
Reporter: holdenk
Assignee: Ahmet Altay


tox.ini changed, but [https://beam.apache.org/contribute/contribution-guide/] 
still references the old envs.





[jira] [Created] (BEAM-3984) Add dependencies to tox.ini file

2018-04-02 Thread holdenk (JIRA)
holdenk created BEAM-3984:
-

 Summary: Add dependencies to tox.ini file
 Key: BEAM-3984
 URL: https://issues.apache.org/jira/browse/BEAM-3984
 Project: Beam
  Issue Type: Task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


Right now if someone outside of Jenkins wants to run the tests our dev guide 
has them run the tests in their current Python env ( 
[https://beam.apache.org/contribute/contribution-guide/] ). However as we move 
to supporting Python 3 & 2 it would be good if they could use tox in the same 
way we do for linting. On Jenkins the requirements seem to already be 
installed, but for other folks this may not be the case so listing the 
requirements in the deps= would be good.





[jira] [Commented] (BEAM-3781) Figure out min supported Python 3 version

2018-03-23 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411920#comment-16411920
 ] 

holdenk commented on BEAM-3781:
---

So I'd like to encourage 3.4 since PySpark currently supports 3.4 and if we 
want folks to be able to (more) easily port pipelines it would be good to 
support 3.4. (See [https://github.com/apache/spark/blob/master/python/setup.py] 
)

> Figure out min supported Python 3 version
> -
>
> Key: BEAM-3781
> URL: https://issues.apache.org/jira/browse/BEAM-3781
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>Priority: Minor
>
> We have 3.4.3 installed on Jenkins workers. We could target that as we add 
> support, but in the long run we will need to figure out the supported version 
> story for python 3.





[jira] [Commented] (BEAM-3850) I/O transform for HDF5 files with extension H5

2018-03-14 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399026#comment-16399026
 ] 

holdenk commented on BEAM-3850:
---

Thanks for the issue!

 

So for some additional context for folks thinking about this, there is Java 
support in Spark [https://github.com/LLNL/spark-hdf5] and Python support in 
Dask [https://github.com/dask/dask/issues/963] 

> I/O transform for HDF5 files with extension H5
> --
>
> Key: BEAM-3850
> URL: https://issues.apache.org/jira/browse/BEAM-3850
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas, sdk-py-core
> Environment: Python (could be Java) if it does not work
>Reporter: Eila Arich-Landkof
>Assignee: Eugene Kirpichov
>Priority: Major
>
> Following the great summit today, I would like to ask for I/O transform for 
> H5 file in python. If impossible, Java will work as well.
> The HDF5 group is very accessible: [https://support.hdfgroup.org/HDF5/]
> Example for H5 file can be found @ Mount Sinai Maayan lab: 
> [https://amp.pharm.mssm.edu/archs4/download.html]
> I am using it as part of Oriel Research genomic and clinical processing 
> workflow.
> I am available for any question.
>  





[jira] [Updated] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Description: 
Various functions don't exist in Python 3 that did in python 2. This Jira is to 
fix the use of cmp (which often will involve rewriting __cmp__ as well).

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )

 

Note: once all of the missing names/functions are fixed we can enable F821 in 
flake8 for Python 3.

  was:
Various functions don't exist in Python 3 that did in python 2. This Jira is to 
fix the use of cmp (which often will involve rewriting __cmp__ as well).

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )


> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> Various functions don't exist in Python 3 that did in python 2. This Jira is 
> to fix the use of cmp (which often will involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )
>  
> Note: once all of the missing names/functions are fixed we can enable F821 in 
> flake8 for Python 3.
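
As a rough illustration of the kind of rewrite this issue describes (a sketch only, not code from the Beam code base; `Version` and `compare` are hypothetical names), a Python 2 `__cmp__` method can usually be replaced by `__eq__` and `__lt__` plus `functools.total_ordering`, and call sites that rely on a `cmp`-style function can wrap it with `functools.cmp_to_key`:

```python
import functools


@functools.total_ordering
class Version(object):
    """Hypothetical class, used only to illustrate replacing __cmp__."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    # Python 2 style, silently ignored by Python 3:
    #   def __cmp__(self, other):
    #       return cmp(self._key(), other._key())

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ removes the inherited __hash__ in Python 3.
        return hash(self._key())


def compare(a, b):
    """Stand-in for the removed cmp() builtin: returns -1, 0, or 1."""
    return (a > b) - (a < b)


versions = [Version(2, 1), Version(1, 9), Version(2, 0)]
by_operators = sorted(versions)
by_cmp_function = sorted(versions, key=functools.cmp_to_key(compare))
```

Both sorted lists end up in the same order; the second form is mostly useful as a transitional step when an existing comparison function is hard to turn into a key function.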





[jira] [Updated] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Summary: Fix Python 3 cmp function  (was: Fix Python 3 missing functions)

> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> cmp & file are no longer defined in Python 3. We can catch regressions of this 
> using flake8 f821 (although this catches some additional things as well)
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )





[jira] [Updated] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Description: 
Various functions don't exist in Python 3 that did in python 2. This Jira is to 
fix the use of cmp (which often will involve rewriting __cmp__ as well).

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )

  was:
cmp & file are no longer defined in Python 3. We can catch regressions of this 
using flake8 f821 (although this catches some additional things as well)

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )


> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> Various functions don't exist in Python 3 that did in python 2. This Jira is 
> to fix the use of cmp (which often will involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )





[jira] [Updated] (BEAM-3761) Fix Python 3 missing functions

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Issue Type: Improvement  (was: Bug)

> Fix Python 3 missing functions
> --
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> cmp & file are no longer defined in Python 3. We can catch regressions of this 
> using flake8 f821 (although this catches some additional things as well)
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )





[jira] [Updated] (BEAM-3761) Fix Python 3 missing functions

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: BEAM-1251)

> Fix Python 3 missing functions
> --
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> cmp & file are no longer defined in Python 3. We can catch regressions of this 
> using flake8 f821 (although this catches some additional things as well)
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )





[jira] [Created] (BEAM-3761) Fix Python 3 missing functions

2018-02-27 Thread holdenk (JIRA)
holdenk created BEAM-3761:
-

 Summary: Fix Python 3 missing functions
 Key: BEAM-3761
 URL: https://issues.apache.org/jira/browse/BEAM-3761
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


cmp & file are no longer defined in Python 3. We can catch regressions of this 
using flake8 f821 (although this catches some additional things as well)

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )





[jira] [Created] (BEAM-3738) Enable Py3 linting in Jenkins

2018-02-22 Thread holdenk (JIRA)
holdenk created BEAM-3738:
-

 Summary: Enable Py3 linting in Jenkins
 Key: BEAM-3738
 URL: https://issues.apache.org/jira/browse/BEAM-3738
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, testing
Reporter: holdenk
Assignee: Ahmet Altay


After BEAM-3671 is finished enable linting.





[jira] [Created] (BEAM-3444) Fix flake8 detected errors E999 (AST compile error)

2018-01-09 Thread holdenk (JIRA)
holdenk created BEAM-3444:
-

 Summary: Fix flake8 detected errors E999 (AST compile error)
 Key: BEAM-3444
 URL: https://issues.apache.org/jira/browse/BEAM-3444
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


Fix flake8 detected errors E999 (AST compile error) so that we can run flake8 
to catch potential python3 breaking issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3143) Fix type inference in Python 3 for generators

2018-01-08 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315893#comment-16315893
 ] 

holdenk commented on BEAM-3143:
---

So I don't have permission to assign this issue, maybe [~altay] can do this and 
resolve it?

> Fix type inference in Python 3 for generators
> -
>
> Key: BEAM-3143
> URL: https://issues.apache.org/jira/browse/BEAM-3143
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: holdenk
>






[jira] [Created] (BEAM-3290) Construct iterators directly if possible to allow spilling to disk

2017-12-05 Thread holdenk (JIRA)
holdenk created BEAM-3290:
-

 Summary: Construct iterators directly if possible to allow 
spilling to disk
 Key: BEAM-3290
 URL: https://issues.apache.org/jira/browse/BEAM-3290
 Project: Beam
  Issue Type: Improvement
  Components: runner-spark
Reporter: holdenk
Assignee: Amit Sela


When you construct a collection first and convert it to an iterator you force 
Spark to evaluate the entire input partition before it can get the first 
element off the output. This breaks some of the spilling to disk Spark can do 
otherwise. Instead chain operations on Iterators.

This is only possible in the Java API for Spark 2 and above (and that's my 
fault from back in my work in the Spark project).
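
The difference can be sketched in plain Python generators (this is an analogy, not the Spark Java API; `expand_eager`, `expand_lazy`, and `counting_source` are hypothetical names). The eager version has to drain the whole input before the first output element is available, while the lazy version pulls one record at a time:

```python
def expand_eager(partition):
    """Builds the full output list before returning an iterator over it."""
    out = []
    for record in partition:
        out.append(record * 2)
    return iter(out)  # the whole input has been consumed by this point


def expand_lazy(partition):
    """Chains onto the input iterator; records are pulled one at a time."""
    for record in partition:
        yield record * 2


def counting_source(n, consumed):
    """Yields 0..n-1 and records each element as it is pulled."""
    for i in range(n):
        consumed.append(i)
        yield i


eager_consumed = []
first_eager = next(expand_eager(counting_source(1000, eager_consumed)))

lazy_consumed = []
first_lazy = next(expand_lazy(counting_source(1000, lazy_consumed)))
```

After taking only the first output element, `eager_consumed` holds all 1000 inputs while `lazy_consumed` holds one. Keeping the chain lazy is what leaves the engine free to spill or stream the rest.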





[jira] [Created] (BEAM-3233) Fix Py3 list comprehension type inference

2017-11-21 Thread holdenk (JIRA)
holdenk created BEAM-3233:
-

 Summary: Fix Py3 list comprehension type inference
 Key: BEAM-3233
 URL: https://issues.apache.org/jira/browse/BEAM-3233
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay








[jira] [Created] (BEAM-3185) Build blocks on parsing long as int from github status json

2017-11-13 Thread holdenk (JIRA)
holdenk created BEAM-3185:
-

 Summary: Build blocks on parsing long as int from github status 
json
 Key: BEAM-3185
 URL: https://issues.apache.org/jira/browse/BEAM-3185
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: holdenk
Assignee: Davor Bonaci
Priority: Blocker


(e.g. see 
https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/818/console )
`Caused by: com.fasterxml.jackson.databind.JsonMappingException: Numeric value 
(4313677368) out of range of int`

Assuming IDs are monotonically increasing this might impact all new PRs.
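
For context, a Java `int` is a 32-bit signed integer, so the value in the error message cannot fit, while a 64-bit `long` holds it easily. A quick check of the arithmetic (the variable names here are illustrative only):

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # range of a Java int
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1  # range of a Java long

status_id = 4313677368  # value taken from the error message above

fits_in_int = INT32_MIN <= status_id <= INT32_MAX
fits_in_long = INT64_MIN <= status_id <= INT64_MAX
overflow_by = status_id - INT32_MAX
```

Here `fits_in_int` is false and `fits_in_long` is true, which is consistent with the mapping exception: the field would need to be parsed as a long.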





[jira] [Created] (BEAM-3174) Master python sdk seems broken with test_harness_override_present_in_dataflow_distributions on Py 2.7.6

2017-11-12 Thread holdenk (JIRA)
holdenk created BEAM-3174:
-

 Summary: Master python sdk seems broken with 
test_harness_override_present_in_dataflow_distributions on Py 2.7.6
 Key: BEAM-3174
 URL: https://issues.apache.org/jira/browse/BEAM-3174
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay








[jira] [Created] (BEAM-3173) Proto code generation fails in Python on Jenkins

2017-11-10 Thread holdenk (JIRA)
holdenk created BEAM-3173:
-

 Summary: Proto code generation fails in Python on Jenkins
 Key: BEAM-3173
 URL: https://issues.apache.org/jira/browse/BEAM-3173
 Project: Beam
  Issue Type: Bug
  Components: build-system, sdk-py-core
Reporter: holdenk
Assignee: Davor Bonaci


Related to BEAM-3164, proto code generation has been failing in Jenkins :(





[jira] [Created] (BEAM-3164) Capture stderr logs during gen proto

2017-11-08 Thread holdenk (JIRA)
holdenk created BEAM-3164:
-

 Summary: Capture stderr logs during gen proto
 Key: BEAM-3164
 URL: https://issues.apache.org/jira/browse/BEAM-3164
 Project: Beam
  Issue Type: Bug
  Components: build-system, sdk-py-core
Reporter: holdenk
Assignee: Davor Bonaci


Currently python PRs are failing with gen-proto failures, but these are 
difficult to debug because we don't capture the information (see 
https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/727/console ).
cc [~altay] [~robertwb]





[jira] [Created] (BEAM-3143) Fix type inference in Python 3 for generators

2017-11-04 Thread holdenk (JIRA)
holdenk created BEAM-3143:
-

 Summary: Fix type inference in Python 3 for generators
 Key: BEAM-3143
 URL: https://issues.apache.org/jira/browse/BEAM-3143
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay








[jira] [Created] (BEAM-3142) Fix proto generation in Python 3

2017-11-04 Thread holdenk (JIRA)
holdenk created BEAM-3142:
-

 Summary: Fix proto generation in Python 3
 Key: BEAM-3142
 URL: https://issues.apache.org/jira/browse/BEAM-3142
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


The generated Python code uses relative imports, fix this to be usable in 
Python 3.
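
One possible approach, assuming the problem is Python 2 style implicit relative imports such as `import example_pb2` inside a package, is to post-process the generated files so the imports become explicit. The sketch below is hypothetical (the helper name, the `_pb2` naming pattern, and the sample input are assumptions, and it is not the fix that was applied in Beam):

```python
import re

# Matches whole lines such as "import example_pb2 as example__pb2".
# The second group always participates, matching "" when there is no alias.
_IMPLICIT_IMPORT = re.compile(
    r"^import (\w+_pb2)((?: as \w+)?)$", re.MULTILINE)


def make_imports_explicit(source):
    """Rewrites 'import x_pb2' lines to 'from . import x_pb2'."""
    return _IMPLICIT_IMPORT.sub(r"from . import \1\2", source)


generated = "import sys\nimport example_pb2 as example__pb2\n"
fixed = make_imports_explicit(generated)
```

Explicit relative imports of this form work in both Python 2.7 and Python 3, so the rewritten files stay usable while both versions are supported.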





[jira] [Created] (BEAM-3141) Make coders & streams work in Python 3

2017-11-04 Thread holdenk (JIRA)
holdenk created BEAM-3141:
-

 Summary: Make coders & streams work in Python 3
 Key: BEAM-3141
 URL: https://issues.apache.org/jira/browse/BEAM-3141
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


Fix coders & streams support in Python 3.





[jira] [Created] (BEAM-3058) Python futurize stage 2

2017-10-13 Thread holdenk (JIRA)
holdenk created BEAM-3058:
-

 Summary: Python futurize stage 2
 Key: BEAM-3058
 URL: https://issues.apache.org/jira/browse/BEAM-3058
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


Apply futurize stage 2 and fix the issues so it continues to run in Python 2.X





[jira] [Created] (BEAM-2836) Apply futurize stage 1 ("safe")

2017-09-01 Thread holdenk (JIRA)
holdenk created BEAM-2836:
-

 Summary: Apply futurize stage 1 ("safe")
 Key: BEAM-2836
 URL: https://issues.apache.org/jira/browse/BEAM-2836
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py
Reporter: holdenk
Assignee: Ahmet Altay


Futurize has two stages: stage 1 & stage 2. In theory futurize stage 1 should 
be safe, try and apply stage 1 on its own.





[jira] [Commented] (BEAM-2821) isort and autopep8 the current Python code base

2017-09-01 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151380#comment-16151380
 ] 

holdenk commented on BEAM-2821:
---

Would it be ok to close this and assign to me then?

> isort and autopep8 the current Python code base
> ---
>
> Key: BEAM-2821
> URL: https://issues.apache.org/jira/browse/BEAM-2821
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py
>Reporter: holdenk
>Assignee: Ahmet Altay
>
> As part of preparing for automated code conversion of the Apache BEAM code 
> base we should pre-sort the imports and apply some basic autopep8 changes. 
> These are useful since we will want to apply them again after futurize.





[jira] [Created] (BEAM-2821) isort and autopep8 the current Python code base

2017-08-29 Thread holdenk (JIRA)
holdenk created BEAM-2821:
-

 Summary: isort and autopep8 the current Python code base
 Key: BEAM-2821
 URL: https://issues.apache.org/jira/browse/BEAM-2821
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py
Reporter: holdenk
Assignee: Ahmet Altay


As part of preparing for automated code conversion of the Apache BEAM code base 
we should pre-sort the imports and apply some basic autopep8 changes. These are 
useful since we will want to apply them again after futurize.





[jira] [Commented] (BEAM-2784) Fix issues from automated conversion to allow Python 2 functionality

2017-08-22 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136481#comment-16136481
 ] 

holdenk commented on BEAM-2784:
---

I've started working on this (it's down to under 100 errors :p)  
(https://github.com/holdenk/beam/tree/py2t3-plus-minal-fixes has the WIP 
branch).

> Fix issues from automated conversion to allow Python 2 functionality
> 
>
> Key: BEAM-2784
> URL: https://issues.apache.org/jira/browse/BEAM-2784
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: holdenk
>Assignee: Ahmet Altay
>
> As part of BEAM-1251 we want to move to support a Python2/3 code base. To do 
> this we can use futurize, but futurize will break some Python2 elements. A 
> good intermediate checkpoint is continuing to support Python 2 after 
> futurization, on top of which we can build Python 3 support.





[jira] [Created] (BEAM-2786) Update jenkins test scripts to test with Py2 & Py3

2017-08-21 Thread holdenk (JIRA)
holdenk created BEAM-2786:
-

 Summary: Update jenkins test scripts to test with Py2 & Py3
 Key: BEAM-2786
 URL: https://issues.apache.org/jira/browse/BEAM-2786
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py, testing
Reporter: holdenk
Assignee: Ahmet Altay


After BEAM-1373 and as part of BEAM-1251 we should make sure the automated 
tests also run against Py3.





[jira] [Updated] (BEAM-2784) Fix issues from automated conversion to allow Python 2 functionality

2017-08-21 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-2784:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: BEAM-1251)

> Fix issues from automated conversion to allow Python 2 functionality
> 
>
> Key: BEAM-2784
> URL: https://issues.apache.org/jira/browse/BEAM-2784
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: holdenk
>Assignee: Ahmet Altay
>
> As part of BEAM-1251 we want to move to support a Python2/3 code base. To do 
> this we can use futurize, but futurize will break some Python2 elements. A 
> good intermediate checkpoint is continuing to support Python 2 after 
> futurization, on top of which we can build Python 3 support.





[jira] [Created] (BEAM-2784) Fix issues from automated conversion to allow Python 2 functionality

2017-08-21 Thread holdenk (JIRA)
holdenk created BEAM-2784:
-

 Summary: Fix issues from automated conversion to allow Python 2 
functionality
 Key: BEAM-2784
 URL: https://issues.apache.org/jira/browse/BEAM-2784
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py
Reporter: holdenk
Assignee: Ahmet Altay


As part of BEAM-1251 we want to move to support a Python2/3 code base. To do 
this we can use futurize, but futurize will break some Python2 elements. A good 
intermediate checkpoint is continuing to support Python 2 after futurization, 
on top of which we can build Python 3 support.





[jira] [Commented] (BEAM-1373) Update Python SDK code to support both Python 2 and 3

2017-08-20 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134582#comment-16134582
 ] 

holdenk commented on BEAM-1373:
---

I've started looking at the conversion again. It looks like all of our explicit 
dependencies should work in Python 3 (yay). That said, the scope of the 
errors left after using futurize is pretty large (this is the approach of 
maintaining 2/3 support in the same codebase). Currently I've gotten down to Test 
failed: `` with 
hopefully some common roots, but it's probably going to take some time to move 
this forward.

My branch is at https://github.com/holdenk/beam/tree/BEAM-1373-py2t3-support if 
anyone wants to take a look.

> Update Python SDK code to support both Python 2 and 3
> -
>
> Key: BEAM-1373
> URL: https://issues.apache.org/jira/browse/BEAM-1373
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Holden Karau
>
> This can be performed by following standard Py2 -> Py2/3 conversion process 
> defined in the following document.
> http://python-future.org/automatic_conversion.html





[jira] [Commented] (BEAM-981) Not possible to directly submit a pipeline on spark cluster

2017-05-06 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999608#comment-15999608
 ] 

holdenk commented on BEAM-981:
--

Sure thing, I don't mean to pressure, but let me know if I can help or answer any 
questions on the Spark side of things :)

> Not possible to directly submit a pipeline on spark cluster
> ---
>
> Key: BEAM-981
> URL: https://issues.apache.org/jira/browse/BEAM-981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.6.0
>Reporter: Jean-Baptiste Onofré
>Assignee: Kobi Salant
> Fix For: 2.0.0
>
>
> It's not possible to directly run a pipeline on the spark runner (for 
> instance using {{mvn exec:java}}). It fails with:
> {code}
> [appclient-register-master-threadpool-0] INFO 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to 
> master spark://10.200.118.197:7077...
> [shuffle-client-0] ERROR org.apache.spark.network.client.TransportClient - 
> Failed to send RPC 6813731522650020739 to /10.200.118.197:7077: 
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
> at 
> io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:820)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1101)
> at 
> io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1148)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1090)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.safeExecute(SingleThreadEventExecutor.java:451)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:401)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
> at java.lang.Thread.run(Thread.java:745)
> [appclient-register-master-threadpool-0] WARN 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect 
> to master 10.200.118.197:7077
> java.io.IOException: Failed to send RPC 6813731522650020739 to 
> /10.200.118.197:7077: java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:507)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:427)
> at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
> at 
> io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext

[jira] [Commented] (BEAM-981) Not possible to directly submit a pipeline on spark cluster

2017-05-01 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992305#comment-15992305
 ] 

holdenk commented on BEAM-981:
--

I can take a look at this later on this week if no one else is.

> Not possible to directly submit a pipeline on spark cluster
> ---
>
> Key: BEAM-981
> URL: https://issues.apache.org/jira/browse/BEAM-981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.6.0
>Reporter: Jean-Baptiste Onofré
>Assignee: Kobi Salant
> Fix For: First stable release
>
>
> It's not possible to directly run a pipeline on the spark runner (for 
> instance using {{mvn exec:java}}). It fails with:
> {code}
> [appclient-register-master-threadpool-0] INFO 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to 
> master spark://10.200.118.197:7077...
> [shuffle-client-0] ERROR org.apache.spark.network.client.TransportClient - 
> Failed to send RPC 6813731522650020739 to /10.200.118.197:7077: 
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
> at 
> io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:820)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1101)
> at 
> io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1148)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1090)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.safeExecute(SingleThreadEventExecutor.java:451)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:401)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
> at java.lang.Thread.run(Thread.java:745)
> [appclient-register-master-threadpool-0] WARN 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect 
> to master 10.200.118.197:7077
> java.io.IOException: Failed to send RPC 6813731522650020739 to 
> /10.200.118.197:7077: java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:507)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:427)
> at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
> at 
> io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:845)
> at 
> io.netty.chann

[jira] [Commented] (BEAM-2087) conda python breaks the python tests

2017-04-28 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989396#comment-15989396
 ] 

holdenk commented on BEAM-2087:
---

So is [~altay] planning on working on this, or would it be OK for me to take a 
crack at it? (I'd like to get more familiar with the developer getting-started 
docs more generally and see if there are other ways to improve them, but if 
it's already been started, no stress) :)

> conda python breaks the python tests
> 
>
> Key: BEAM-2087
> URL: https://issues.apache.org/jira/browse/BEAM-2087
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Trivial
>
> For a user running through 
> https://beam.apache.org/contribute/contribution-guide/#one-time-setup, the 
> first time they run mvn verify the Python tests will fail if they have a 
> conda python instance on their path.
> To make the getting started experience easier for new devs we should call out 
> that conda is not supported given its popularity.
> cc [~altay]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1451) Java Objects have bad toStrings

2017-04-28 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989394#comment-15989394
 ] 

holdenk commented on BEAM-1451:
---

It seems like this issue should be marked as resolved? What do you think 
[~tgroh]?

> Java Objects have bad toStrings
> ---
>
> Key: BEAM-1451
> URL: https://issues.apache.org/jira/browse/BEAM-1451
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Priority: Trivial
>
> This can complicate debugging





[jira] [Created] (BEAM-2087) conda python breaks the python tests

2017-04-26 Thread holdenk (JIRA)
holdenk created BEAM-2087:
-

 Summary: conda python breaks the python tests
 Key: BEAM-2087
 URL: https://issues.apache.org/jira/browse/BEAM-2087
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: holdenk
Assignee: Davor Bonaci
Priority: Trivial


For a user running through 
https://beam.apache.org/contribute/contribution-guide/#one-time-setup, the 
first time they run mvn verify the Python tests will fail if they have a conda 
python instance on their path.

To make the getting started experience easier for new devs we should call out 
that conda is not supported given its popularity.
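
One way a new contributor could check this before running the build is sketched below. It is only a heuristic, not something the Beam build does: the helper name `looks_like_conda` is hypothetical, and it relies on two common conventions (a `conda-meta` directory under the interpreter prefix, and "conda" appearing in the version banner of Anaconda builds).

```python
import os
import sys


def looks_like_conda(prefix=None, version=None):
    """Heuristically report whether an interpreter comes from conda."""
    prefix = sys.prefix if prefix is None else prefix
    version = sys.version if version is None else version
    # Conda environments normally contain a "conda-meta" directory.
    has_conda_meta = os.path.isdir(os.path.join(prefix, "conda-meta"))
    # Anaconda builds often put their name in the version banner.
    mentions_conda = "conda" in version.lower()
    return has_conda_meta or mentions_conda


if __name__ == "__main__":
    if looks_like_conda():
        print("The python on your path looks like a conda install.")
    else:
        print("The python on your path does not look like conda.")
```

Running this with the same `python` that `mvn verify` would pick up gives an early hint that the Python tests may fail for this reason.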

cc [~altay]





[jira] [Commented] (BEAM-2057) Test metrics are reported to Spark Metrics sink.

2017-04-26 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985401#comment-15985401
 ] 

holdenk commented on BEAM-2057:
---

Giving this a quick attempt for the hackathon :)

> Test metrics are reported to Spark Metrics sink.
> 
>
> Key: BEAM-2057
> URL: https://issues.apache.org/jira/browse/BEAM-2057
> Project: Beam
>  Issue Type: Test
>  Components: runner-spark
>Reporter: Aviem Zur
>Assignee: Holden Karau
>  Labels: newbie, starter
>
> Test that metrics are reported to Spark's metric sink.
> Use {{InMemoryMetrics}} and {{InMemoryMetricsSinkRule}} similarly to the 
> {{NamedAggregatorsTest}} which tests that aggregators are reported to Spark's 
> metrics sink (Aggregators are being removed so this test should be in a 
> separate class).
> For an example on how to create a pipeline with metrics take a look at 
> {{MetricsTest}}.


