[jira] [Work logged] (BEAM-8350) Upgrade to pylint 2.4

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8350?focusedWorklogId=326679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326679
 ]

ASF GitHub Bot logged work on BEAM-8350:


Author: ASF GitHub Bot
Created on: 11/Oct/19 02:58
Start Date: 11/Oct/19 02:58
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9725: [BEAM-8350] Upgrade 
to Pylint 2.4
URL: https://github.com/apache/beam/pull/9725#issuecomment-538169256
 
 
   Here's a breakdown of the changes required to get to pylint 2.4:

   - Fix a bunch of warnings about deprecated methods, mostly `logger.warn`
     and various unittest methods.
   - Update the names of a few error codes: `unused-import` and
     `possibly-unused-variable`.
   - Ignore a bunch of newly introduced style warnings that did not seem
     important.
   - Run the lint using Python 3.7: this ensures that it can run on test files
     that only work on Python 3.7 due to syntax features.
   - Merge the lint tests into one test:
     - `run_pylint_2to3.sh` was just testing futurization. It seems fine to do
       this all the time now that our code is Python 3 compliant.
     - There was a "mini" test just for Python 3 compatibility. It is not
       needed anymore now that everything is running on Python 3.
     - Stop running `pycodestyle`: it's run as part of `flake8`.
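
As a rough illustration of the first bullet, here is a minimal sketch of the kind of rename involved. It is not code taken from the PR: the logger name `beam_lint_example`, the helper `report_slow_step`, and the test class are all hypothetical, and only the standard-library names (`Logger.warning`, `assertEqual`, `assertNotEqual`) are real.

```python
import io
import logging
import unittest

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("beam_lint_example")


def report_slow_step(step_name):
    """Log a warning using the non-deprecated spelling."""
    # `logger.warn(...)` is a deprecated alias; `logger.warning(...)` is the
    # supported method.
    logger.warning("step %s is slow", step_name)


class ExampleTest(unittest.TestCase):
    def test_sum(self):
        # The deprecated alias `assertEquals` becomes `assertEqual`.
        self.assertEqual(1 + 1, 2)

    def test_difference(self):
        # The deprecated alias `assertNotEquals` becomes `assertNotEqual`.
        self.assertNotEqual(1 + 1, 3)


def run_example():
    """Run the example tests quietly and return True if they all passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()
```

Both replacement spellings exist in Python 2.7 as well as Python 3, so a rename of this kind should not affect Python 2 compatibility.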
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 326679)
Time Spent: 2h 50m  (was: 2h 40m)

> Upgrade to pylint 2.4
> -
>
> Key: BEAM-8350
> URL: https://issues.apache.org/jira/browse/BEAM-8350
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> pylint 2.4 provides a number of new features and fixes, but the most 
> important/pressing one for me is that 2.4 adds support for understanding 
> python type annotations, which fixes a bunch of spurious unused import errors 
> in the PR I'm working on for BEAM-7746.
> As of 2.0, pylint dropped support for running tests in python2, so to make 
> the upgrade we have to move our lint jobs to python3.  Doing so will put 
> pylint into "python3-mode" and there is not an option to run in 
> python2-compatible mode.  That said, the beam code is intended to be python3 
> compatible, so in practice, performing a python3 lint on the Beam code-base 
> is perfectly safe.  The primary risk of doing this is that someone introduces 
> a python-3 only change that breaks python2, but these would largely be syntax 
> errors that would be immediately caught by the unit and integration tests.
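
The following is a minimal sketch of the pattern the description seems to refer to: a name imported only so that it can appear in a type comment. The function `total_length` is hypothetical and is not taken from the Beam code base; whether a given pylint release flags the import should be confirmed against the pylint release notes.

```python
from typing import List  # referenced only inside the type comment below


def total_length(words):
    # type: (List[str]) -> int
    """Return the combined length of all strings in `words`."""
    return sum(len(word) for word in words)
```

A linter that does not read type comments sees `List` as unused here, which is the kind of spurious warning the description mentions.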



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8350) Upgrade to pylint 2.4

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8350?focusedWorklogId=326678&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326678
 ]

ASF GitHub Bot logged work on BEAM-8350:


Author: ASF GitHub Bot
Created on: 11/Oct/19 02:57
Start Date: 11/Oct/19 02:57
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9725: [BEAM-8350] Upgrade 
to Pylint 2.4
URL: https://github.com/apache/beam/pull/9725#issuecomment-538169256
 
 
   Here's a breakdown of the changes required to get to pylint 2.4:

   - Fix a bunch of warnings about deprecated methods, mostly `logger.warn`
     and various unittest methods.
   - Update the names of a few error codes: `disable=unused-import` and
     `possibly-unused-variable`.
   - Ignore a bunch of newly introduced style warnings that did not seem
     important.
   - Run the lint using Python 3.7: this ensures that it can run on test files
     that only work on Python 3.7 due to syntax features.
   - Merge the lint tests into one test:
     - `run_pylint_2to3.sh` was just testing futurization. It seems fine to do
       this all the time now that our code is Python 3 compliant.
     - There was a "mini" test just for Python 3 compatibility. It is not
       needed anymore now that everything is running on Python 3.
     - Stop running `pycodestyle`: it's run as part of `flake8`.
   
 



Issue Time Tracking
---

Worklog Id: (was: 326678)
Time Spent: 2h 40m  (was: 2.5h)

> Upgrade to pylint 2.4
> -
>
> Key: BEAM-8350
> URL: https://issues.apache.org/jira/browse/BEAM-8350
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-8350) Upgrade to pylint 2.4

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8350?focusedWorklogId=326677&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326677
 ]

ASF GitHub Bot logged work on BEAM-8350:


Author: ASF GitHub Bot
Created on: 11/Oct/19 02:54
Start Date: 11/Oct/19 02:54
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9725: [BEAM-8350] Upgrade 
to Pylint 2.4
URL: https://github.com/apache/beam/pull/9725#issuecomment-540878722
 
 
   Run Python2_PVR_Flink PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326677)
Time Spent: 2.5h  (was: 2h 20m)

> Upgrade to pylint 2.4
> -
>
> Key: BEAM-8350
> URL: https://issues.apache.org/jira/browse/BEAM-8350
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-7939) Document ZetaSQL dialect in Beam SQL

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7939?focusedWorklogId=326668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326668
 ]

ASF GitHub Bot logged work on BEAM-7939:


Author: ASF GitHub Bot
Created on: 11/Oct/19 02:15
Start Date: 11/Oct/19 02:15
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9306: 
[BEAM-7939] ZetaSQL dialect documentation
URL: https://github.com/apache/beam/pull/9306
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 326668)
Time Spent: 2h 50m  (was: 2h 40m)

> Document ZetaSQL dialect in Beam SQL
> 
>
> Key: BEAM-7939
> URL: https://issues.apache.org/jira/browse/BEAM-7939
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Cyrus Maden
>Assignee: Cyrus Maden
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Blocked by BEAM-7832. ZetaSQL dialect source will be merged from #9210.





[jira] [Work logged] (BEAM-7939) Document ZetaSQL dialect in Beam SQL

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7939?focusedWorklogId=326667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326667
 ]

ASF GitHub Bot logged work on BEAM-7939:


Author: ASF GitHub Bot
Created on: 11/Oct/19 02:14
Start Date: 11/Oct/19 02:14
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9306: 
[BEAM-7939] ZetaSQL dialect documentation
URL: https://github.com/apache/beam/pull/9306#discussion_r333802917
 
 

 ##
 File path: website/src/documentation/dsls/sql/zetasql/overview.md
 ##
 @@ -0,0 +1,67 @@
+---
+layout: section
+title: "Beam ZetaSQL overview"
+section_menu: section-menu/sdks.html
+permalink: /documentation/dsls/sql/zetasql/overview/
+---
+
+# Beam ZetaSQL overview
+Beam SQL supports a variant of the 
[ZetaSQL](https://github.com/google/zetasql) language. ZetaSQL is similar to 
the language in BigQuery's SQL framework. This Beam SQL dialect is especially 
useful in pipelines that [write to or read from BigQuery 
tables](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.html).
+
+Beam SQL has additional extensions leveraging Beam’s unified batch/streaming 
model and processing complex data types. You can use these extensions with all 
Beam SQL dialects, including Beam ZetaSQL.
+
+## Query syntax
+Query statements scan tables or expressions and return the computed result 
rows. For more information about query statements in Beam ZetaSQL, see the 
[Query syntax]({{ site.baseurl
+}}/documentation/dsls/sql/zetasql/query-syntax) reference and [Function call 
rules]({{ site.baseurl
+}}/documentation/dsls/sql/zetasql/syntax).
+
+## Lexical structure 
+A Beam SQL statement comprises a series of tokens. For more information about 
tokens in Beam ZetaSQL, see the [Lexical structure]({{ site.baseurl
+}}/documentation/dsls/sql/zetasql/lexical) reference.
+
+## Data types
+Beam SQL supports standard SQL scalar data types as well as extensions 
including arrays, maps, and nested rows. For more information about scalar data 
in Beam ZetaSQL, see the [Data types]({{ site.baseurl 
}}/documentation/dsls/sql/zetasql/data-types) reference.
+
+## Functions and operators
+The following table summarizes the [ZetaSQL functions and 
operators](https://github.com/google/zetasql/blob/master/docs/functions-and-operators.md)
 supported by Beam ZetaSQL.
+
+  <tr><th>Operators and functions</th><th>Beam ZetaSQL support</th></tr>
+  <tr><td><a href="https://github.com/google/zetasql/blob/master/docs/conversion_rules.md">Type conversion</a></td><td>Yes</td></tr>
+  <tr><td><a href="https://github.com/google/zetasql/blob/master/docs/aggregate_functions.md">Aggregate functions</a></td><td>See Beam SQL aggregate functions</td></tr>
 
 Review comment:
   Since "Beam SQL" is ambiguous and maybe also redundant, maybe just "See 
aggregate functions", etc. The context makes clear that it is ZetaSQL.
 



Issue Time Tracking
---

Worklog Id: (was: 326667)
Time Spent: 2h 40m  (was: 2.5h)

> Document ZetaSQL dialect in Beam SQL
> 
>
> Key: BEAM-7939
> URL: https://issues.apache.org/jira/browse/BEAM-7939
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Cyrus Maden
>Assignee: Cyrus Maden
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Blocked by BEAM-7832. ZetaSQL dialect source will be merged from #9210.





[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=326662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326662
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 11/Oct/19 02:08
Start Date: 11/Oct/19 02:08
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9756: [BEAM-3713] 
Add pytest for unit tests
URL: https://github.com/apache/beam/pull/9756#discussion_r333802078
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
 ##
 @@ -228,6 +229,8 @@ def test_biqquery_read_streaming_fail(self):
  r'source is not currently available'):
   p.run()
 
+  @pytest.mark.skipif(sys.version_info >= (3, 7),
+  reason='TODO(BEAM-8095): Segfaults in Python 3.7')
 
 Review comment:
   Hmm, this only segfaults when using pytest? What happens if you disable
xdist?
 



Issue Time Tracking
---

Worklog Id: (was: 326662)
Time Spent: 8h 10m  (was: 8h)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Per [https://nose.readthedocs.io/en/latest/], nose is in maintenance mode.





[jira] [Work logged] (BEAM-8382) Add polling interval to KinesisIO.Read

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8382?focusedWorklogId=326657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326657
 ]

ASF GitHub Bot logged work on BEAM-8382:


Author: ASF GitHub Bot
Created on: 11/Oct/19 01:24
Start Date: 11/Oct/19 01:24
Worklog Time Spent: 10m 
  Work Description: jfarr commented on pull request #9765: [BEAM-8382] Add 
polling interval to KinesisIO.Read
URL: https://github.com/apache/beam/pull/9765
 
 
   This PR adds an optional polling interval for KinesisIO.Read. If the polling 
interval is not set then the existing behavior of polling getRecords() as fast 
as possible is preserved.
R: @rtshadow @iemejia
   
   
   

[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949039#comment-16949039
 ] 

Brian Hulette commented on BEAM-8368:
-

I sent a message to the Arrow mailing list to see if anyone has used pyarrow 
0.15 on macOS 10.15, but so far there has been no response: 
https://lists.apache.org/thread.html/ead3bc069742437695465f9c8c1631f98dd8a5a6de056c47fc093c7d@%3Cdev.arrow.apache.org%3E

Yes, adding a known issue with the 0.13 workaround makes sense.

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Assignee: Ahmet Altay
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs but still can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}





[jira] [Created] (BEAM-8382) Add polling interval to KinesisIO.Read

2019-10-10 Thread Jonothan Farr (Jira)
Jonothan Farr created BEAM-8382:
---

 Summary: Add polling interval to KinesisIO.Read
 Key: BEAM-8382
 URL: https://issues.apache.org/jira/browse/BEAM-8382
 Project: Beam
  Issue Type: Improvement
  Components: io-java-kinesis
Affects Versions: 2.15.0, 2.14.0, 2.13.0
Reporter: Jonothan Farr
Assignee: Jonothan Farr


With the current implementation we are observing Kinesis throttling due to 
ReadProvisionedThroughputExceeded on the order of hundreds of times per second, 
regardless of the actual Kinesis throughput. This is because the 
ShardReadersPool readLoop() method is polling getRecords() as fast as possible.

From the KDS documentation:
{quote}Each shard can support up to five read transactions per second.
{quote}
and
{quote}For best results, sleep for at least 1 second (1,000 milliseconds) 
between calls to getRecords to avoid exceeding the limit on getRecords 
frequency.
{quote}
[https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html]

[https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-sdk.html]
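
To make the proposal concrete, here is a minimal sketch of a read loop with an optional polling interval. It is written in Python for brevity and is not the KinesisIO implementation (which is Java): `poll_with_interval`, its parameters, and the `get_records` callable are hypothetical stand-ins for the ShardReadersPool loop and the Kinesis getRecords() call.

```python
import time


def poll_with_interval(get_records, interval_seconds, max_polls,
                       clock=time.monotonic, sleep=time.sleep):
    """Call `get_records` up to `max_polls` times, spacing the calls apart.

    `get_records` is a hypothetical zero-argument callable standing in for a
    Kinesis getRecords() call. With interval_seconds == 0 the loop polls as
    fast as possible, which mirrors the current behaviour described above.
    """
    batches = []
    for _ in range(max_polls):
        started = clock()
        batches.append(get_records())
        # Sleep only for the part of the interval the call itself did not use.
        remaining = interval_seconds - (clock() - started)
        if remaining > 0:
            sleep(remaining)
    return batches
```

Passing `interval_seconds=1.0` would follow the quoted recommendation of sleeping for at least one second between calls; `clock` and `sleep` are injectable only so the loop can be exercised without real waiting.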





[jira] [Updated] (BEAM-8356) Update Java Katas to Gradle 5.0

2019-10-10 Thread Leonardo Miguel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonardo Miguel updated BEAM-8356:
--
Component/s: (was: build-system)

> Update Java Katas to Gradle 5.0
> ---
>
> Key: BEAM-8356
> URL: https://issues.apache.org/jira/browse/BEAM-8356
> Project: Beam
>  Issue Type: Task
>  Components: katas
> Environment: I'm using Gradle 5.5.1-20190724234647+
>Reporter: Leonardo Miguel
>Assignee: Leonardo Miguel
>Priority: Minor
>
> Running gradle build on learning/katas/java/ using gradle 5 gives the 
> following error:
> {code:java}
> FAILURE: Build failed with an exception.
> * Where:
> Build file 
> '/home/leonardo/IdeaProjects/beam/learning/katas/java/build.gradle' line: 116
> * What went wrong:
> A problem occurred evaluating root project 'Beam_Kata'.
> > Cannot add task 'wrapper' as a task with that name already exists.{code}
> I found out that it's related to "Overriding built-in tasks [deprecated in 
> 4.8|https://docs.gradle.org/5.2.1/userguide/upgrading_version_4.html#deprecations_4.8]
>  now produces an error" from gradle 4.10.





[jira] [Assigned] (BEAM-8356) Update Java Katas to Gradle 5.0

2019-10-10 Thread Leonardo Miguel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonardo Miguel reassigned BEAM-8356:
-

Assignee: Leonardo Miguel

> Update Java Katas to Gradle 5.0
> ---
>
> Key: BEAM-8356
> URL: https://issues.apache.org/jira/browse/BEAM-8356
> Project: Beam
>  Issue Type: Task
>  Components: build-system, katas
> Environment: I'm using Gradle 5.5.1-20190724234647+
>Reporter: Leonardo Miguel
>Assignee: Leonardo Miguel
>Priority: Minor
>





[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949013#comment-16949013
 ] 

Ahmet Altay commented on BEAM-8368:
---

/cc [~kamilwu]

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Assignee: Ahmet Altay
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>





[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949012#comment-16949012
 ] 

Ahmet Altay commented on BEAM-8368:
---

- Do we need to document this as a known issue? Starting with Beam 2.14, Beam 
depends on pyarrow versions <0.15.0. Releases 2.14.0, 2.15.0, and 2.16.0 will 
be affected by this problem. [~rtnguyen] could you add this known issue to the 
docs?
- Both of [~bhulette]'s suggestions are reasonable to me. I do not have a 
device with macOS 10.15 to test this. Could someone test it with pyarrow 
0.15.0? After that we can decide what to do (downgrade or upgrade).

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>





[jira] [Assigned] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-8368:
-

Assignee: Ahmet Altay

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Assignee: Ahmet Altay
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>





[jira] [Updated] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8368:
--
Fix Version/s: 2.17.0

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Priority: Major
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>





[jira] [Updated] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8368:
--
Priority: Blocker  (was: Major)

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs but still can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}





[jira] [Updated] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8368:
--
Affects Version/s: 2.17.0

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Priority: Major
> Attachments: error_log.txt
>
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs but still can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}





[jira] [Updated] (BEAM-6816) More PR information in the code velocity dashboard?

2019-10-10 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-6816:

Component/s: community-metrics

> More PR information in the code velocity dashboard?
> ---
>
> Key: BEAM-6816
> URL: https://issues.apache.org/jira/browse/BEAM-6816
> Project: Beam
>  Issue Type: Improvement
>  Components: community-metrics, website
>Reporter: Pablo Estrada
>Priority: Major
>
> I've been using this dashboard: 
> http://104.154.241.245/d/code_velocity/code-velocity?orgId=1
> To triage and get PRs reviewed. Some of them have DO NOT REVIEW / or special 
> instructions in the title.
> Perhaps it would be nice to have more information displayed on the dashboard, 
> so we can skip PRs that don't need to be reviewed.
> As you rightly mentioned, another idea is to exclude PRs that have a specific 
> tag - and I think that's a good idea in fact.





[jira] [Updated] (BEAM-6710) Add Landing page to community metrics dashboard

2019-10-10 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-6710:

Component/s: community-metrics

> Add Landing page to  community metrics dashboard
> 
>
> Key: BEAM-6710
> URL: https://issues.apache.org/jira/browse/BEAM-6710
> Project: Beam
>  Issue Type: New Feature
>  Components: community-metrics, project-management
>Reporter: Mikhail Gryzykhin
>Priority: Major
>
> The community metrics dashboard sends users to a list of recently opened 
> dashboards, which is empty. This confuses new users. 
> We want to add a landing page with links to the relevant dashboards.
> Link: http://104.154.241.245/
>  





[jira] [Updated] (BEAM-5923) Utilize common .pylintrc with python sdk for .test-infra/metrics

2019-10-10 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-5923:

Component/s: community-metrics

> Utilize common .pylintrc with python sdk for .test-infra/metrics
> 
>
> Key: BEAM-5923
> URL: https://issues.apache.org/jira/browse/BEAM-5923
> Project: Beam
>  Issue Type: Sub-task
>  Components: community-metrics, project-management
>Reporter: Mikhail Gryzykhin
>Priority: Major
>
> Add common linter and formatter to metrics code.





[jira] [Updated] (BEAM-5863) Automate Community Metrics infrastructure deployment

2019-10-10 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-5863:

Component/s: community-metrics

> Automate Community Metrics infrastructure deployment
> 
>
> Key: BEAM-5863
> URL: https://issues.apache.org/jira/browse/BEAM-5863
> Project: Beam
>  Issue Type: Sub-task
>  Components: community-metrics, project-management
>Reporter: Scott Wegner
>Priority: Minor
>  Labels: community-metrics
>
> Currently the deployment process for the production Community Metrics stack 
> is manual (documented 
> [here|https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics]). 
> If we end up having to deploy more than a few times a year, it would be nice 
> to automate these steps.





[jira] [Updated] (BEAM-8328) remove :beam-test-infra-metrics:test from build target.

2019-10-10 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-8328:

Component/s: community-metrics

> remove :beam-test-infra-metrics:test from build target.
> ---
>
> Key: BEAM-8328
> URL: https://issues.apache.org/jira/browse/BEAM-8328
> Project: Beam
>  Issue Type: Bug
>  Components: community-metrics, project-management
>Reporter: Mikhail Gryzykhin
>Priority: Major
>






[jira] [Updated] (BEAM-8327) beam_Prober_CommunityMetrics hits cache giving wrong results

2019-10-10 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-8327:

Labels: Triaged  (was: )

> beam_Prober_CommunityMetrics hits cache giving wrong results
> 
>
> Key: BEAM-8327
> URL: https://issues.apache.org/jira/browse/BEAM-8327
> Project: Beam
>  Issue Type: Bug
>  Components: community-metrics, project-management, testing
>Reporter: Mikhail Gryzykhin
>Priority: Major
>  Labels: Triaged
>
> We need to fix the beam_Prober_CommunityMetrics target so that it does not hit 
> the cache. It always fetches fresh data from the website, even when the 
> binaries are unchanged, so a cached result can be wrong.





[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326640=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326640
 ]

ASF GitHub Bot logged work on BEAM-8365:


Author: ASF GitHub Bot
Created on: 10/Oct/19 23:48
Start Date: 10/Oct/19 23:48
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] 
[WIP] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764
 
 
   1. Added an option to `InMemoryTable` that selects what to push down: 
none (default), projects, filters, or both. The purpose is to simplify unit 
testing.
   2. Created a rule to push the fields used by a `Calc` (projects and a 
condition) down into the `InMemoryTable` IO.
   3. Updated that same `Calc` (from the previous step) to have proper input 
and output schemas and removed unused fields.
   - Removed the `Calc` completely when it only renames fields, updating the 
`RowType` of the `IOSourceRel` instead.
   4. Updated the cost model to favor IO with projects pushed down.
   - Right now this is accomplished by multiplying the row count by the number 
of projected fields.
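
As a rough illustration of the cost heuristic in item 4, the sketch below scales the row count by the number of projected fields, so a scan that reads fewer columns looks cheaper to a planner. It is written in Python rather than the PR's Java, the function and variable names are hypothetical, and it is not the PR's actual cost model code.

```python
def estimated_scan_cost(row_count: float, projected_fields: int) -> float:
    """Hypothetical cost heuristic: row count scaled by projected field count."""
    if projected_fields < 0:
        raise ValueError("projected_fields must be non-negative")
    return row_count * projected_fields


# A table with 5 columns and 1000 rows; the query only needs 2 of the columns.
full_scan = estimated_scan_cost(1000, 5)
pushed_down = estimated_scan_cost(1000, 2)

# A planner that minimizes cost would pick the plan with the projection
# pushed down into the IO.
cheapest = min(
    [("full_scan", full_scan), ("pushed_down", pushed_down)],
    key=lambda plan: plan[1],
)
```

Under this heuristic, any plan that reads fewer fields from the same table gets a strictly lower cost, which is what makes the optimizer prefer the pushed-down variant.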
   
   Still needs to be done:
   1. Refactoring (currently in progress).
   2. Add JavaDoc comments.
   3. Potentially add more tests (e.g. `select id+1`).
   4. Break this PR into 2 or more (currently it is pretty large).
   
   Based on top of #9743 
   
   Design doc 
[link](https://docs.google.com/document/d/1-ysD7U7qF3MAmSfkbXZO_5PLJBevAL9bktlLCerd_jE/edit?usp=sharing).
   
   
   

[jira] [Assigned] (BEAM-4032) Support staging binary distributions of dependency packages.

2019-10-10 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-4032:
-

Assignee: (was: Valentyn Tymofieiev)

> Support staging binary distributions of dependency packages.
> 
>
> Key: BEAM-4032
> URL: https://issues.apache.org/jira/browse/BEAM-4032
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> requirements.txt only supports source-distribution dependencies [1].
> --extra_packages does not officially support wheel files [2].
> It is possible to expand this to support binary distributions as long as we 
> know the target platform.
> We should take into consideration the mechanisms for staging dependencies 
> through the portability framework, and perhaps consolidate some of the 
> existing options.
> [1] [https://github.com/apache/beam/blob/a79d1b4fc27eb81db0d9a773047820a206f3d238/sdks/python/apache_beam/runners/dataflow/internal/dependency.py#L260]
> [2] [https://github.com/apache/beam/blob/a79d1b4fc27eb81db0d9a773047820a206f3d238/sdks/python/apache_beam/runners/dataflow/internal/dependency.py#L188]
>  





[jira] [Commented] (BEAM-4032) Support staging binary distributions of dependency packages.

2019-10-10 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949007#comment-16949007
 ] 

Valentyn Tymofieiev commented on BEAM-4032:
---

Unassigning for now since I am not actively working on this issue. 

> Support staging binary distributions of dependency packages.
> 
>
> Key: BEAM-4032
> URL: https://issues.apache.org/jira/browse/BEAM-4032
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> requirements.txt only supports source-distribution dependencies [1].
> --extra_packages does not officially support wheel files [2].
> It is possible to expand this to support binary distributions as long as we 
> know the target platform.
> We should take into consideration the mechanisms for staging dependencies 
> through the portability framework, and perhaps consolidate some of the 
> existing options.
> [1] [https://github.com/apache/beam/blob/a79d1b4fc27eb81db0d9a773047820a206f3d238/sdks/python/apache_beam/runners/dataflow/internal/dependency.py#L260]
> [2] [https://github.com/apache/beam/blob/a79d1b4fc27eb81db0d9a773047820a206f3d238/sdks/python/apache_beam/runners/dataflow/internal/dependency.py#L188]
>  





[jira] [Created] (BEAM-8381) Typehints on DoFn subclass are incorrectly set on superclass

2019-10-10 Thread Yifan Mai (Jira)
Yifan Mai created BEAM-8381:
---

 Summary: Typehints on DoFn subclass are incorrectly set on superclass
 Key: BEAM-8381
 URL: https://issues.apache.org/jira/browse/BEAM-8381
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Yifan Mai


Suppose a parent DoFn with typehints is subclassed into a child DoFn with 
different typehints. The typehints will be incorrectly set on the parent DoFn 
class instead of the child DoFn class, causing type checking errors. This only 
happens if the parent already has typehints; if the parent has no typehints, 
then the typehints are correctly set on the child.

Here's an example test case. I would expect this example to run successfully, 
but instead it fails with {{apache_beam.typehints.decorators.TypeCheckError: 
Type hint violation for 'AddOneAndStringify': requires  but got 
 for element}}.

{code}
def test_do_fn_pipeline_pipeline_type_check_satisfied_with_subclassing(self):
  # Parent DoFn: int in, int out.
  @with_input_types(int)
  @with_output_types(int)
  class AddOne(beam.DoFn):
    def add_one(self, element):
      return element + 1

    def process(self, element):
      return [self.add_one(element)]

  # Child DoFn overrides the output typehint: int in, str out.
  @with_input_types(int)
  @with_output_types(str)
  class AddOneAndStringify(AddOne):
    def process(self, element):
      return [str(self.add_one(element))]

  d = (self.p
       | 'T' >> beam.Create([1, 2, 3]).with_output_types(int)
       | 'AddOne' >> beam.ParDo(AddOne())
       | 'AddOneAndStringify' >> beam.ParDo(AddOneAndStringify()))

  # Each element is incremented twice, then stringified.
  assert_that(d, equal_to(['3', '4', '5']))
  self.p.run()
{code}
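
One way this kind of behavior can arise, shown here as a standalone sketch and not as Beam's actual decorator code, is a class decorator that looks up the hint container with `getattr`. Attribute lookup follows inheritance, so a child class finds the parent's container and mutates it. The `Hints` class, the `_hints` attribute, and both decorators below are hypothetical names used only for illustration.

```python
class Hints:
    """Hypothetical container for a class's declared typehints."""

    def __init__(self):
        self.output_type = None


def buggy_with_output_type(output_type):
    def decorate(cls):
        # getattr follows inheritance, so a child class finds the parent's
        # Hints object and mutates it instead of getting its own.
        hints = getattr(cls, "_hints", None)
        if hints is None:
            hints = Hints()
            cls._hints = hints
        hints.output_type = output_type
        return cls
    return decorate


def fixed_with_output_type(output_type):
    def decorate(cls):
        # Always attach a fresh Hints object to this exact class.
        hints = Hints()
        hints.output_type = output_type
        cls._hints = hints
        return cls
    return decorate


@buggy_with_output_type(int)
class BuggyParent:
    pass


@buggy_with_output_type(str)
class BuggyChild(BuggyParent):
    pass


@fixed_with_output_type(int)
class FixedParent:
    pass


@fixed_with_output_type(str)
class FixedChild(FixedParent):
    pass
```

With the buggy decorator, `BuggyParent._hints.output_type` ends up as `str` because the child's decoration overwrote the shared object. If the parent had no hints, `getattr` would return `None` and the child would get its own container, which matches the observation that the problem only appears when the parent already has typehints.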





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326628=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326628
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 23:17
Start Date: 10/Oct/19 23:17
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #9752: [BEAM-8183] 
restructure Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540836704
 
 
   @ibzib I don't think with the previous code there was a guarantee that you 
would get the manifest from the same jar: 
`PortablePipelineJarUtils.class.getClassLoader().getResourceAsStream(resourcePath);`
   
   You would probably have to loop through the manifests available on the 
classpath until you find the attribute:
   
   
https://stackoverflow.com/questions/3777055/reading-manifest-mf-file-from-jar-file-using-java/18741685
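
The "loop through the manifests until you find the attribute" idea can be sketched outside Java as well. The Python analog below treats the classpath as a list of (name, jar bytes) pairs and returns the first jar whose `META-INF/MANIFEST.MF` main section contains a given attribute. The attribute name `Example-Job-Name` and all helper names are hypothetical; this is not the code from the PR and does not use Java's class loader API.

```python
import io
import zipfile

MANIFEST_PATH = "META-INF/MANIFEST.MF"


def parse_main_attributes(manifest_text):
    """Parse the main section of a MANIFEST.MF into a dict."""
    attributes = {}
    last_key = None
    for line in manifest_text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            attributes[last_key] += line[1:]  # continuation line
            continue
        key, _, value = line.partition(":")
        last_key = key.strip()
        attributes[last_key] = value.strip()
    return attributes


def find_manifest_attribute(jar_files, attribute):
    """Return (jar_name, value) for the first jar whose manifest has the attribute."""
    for name, jar_bytes in jar_files:
        with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
            if MANIFEST_PATH not in jar.namelist():
                continue
            text = jar.read(MANIFEST_PATH).decode("utf-8")
        value = parse_main_attributes(text).get(attribute)
        if value is not None:
            return name, value
    return None


def make_jar(manifest_text):
    """Build an in-memory jar containing only a manifest (for the demo)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr(MANIFEST_PATH, manifest_text)
    return buffer.getvalue()


jars = [
    ("dependency.jar", make_jar("Manifest-Version: 1.0\n")),
    ("pipeline.jar", make_jar("Manifest-Version: 1.0\nExample-Job-Name: my-job\n")),
]
result = find_manifest_attribute(jars, "Example-Job-Name")
```

The point of the loop is that the first manifest found is not necessarily the one that was written for the pipeline, so the search has to key on the attribute and not on the position in the classpath.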
 
 



Issue Time Tracking
---

Worklog Id: (was: 326628)
Time Spent: 2h 40m  (was: 2.5h)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Commented] (BEAM-8168) Python GCSFileSystem failing with gzip content encoding

2019-10-10 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948993#comment-16948993
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-8168:
-

Yeah, this is a known issue. See https://issues.apache.org/jira/browse/BEAM-1874

 

BTW, why do you need to set content-encoding for these files? GCS tries to 
automatically decompress these files and Beam Python does not properly handle 
this (and we've run into data loss issues in Java previously for such files). 
It's much safer (and most probably enough) to just store files as gzip 
(content-type: gzip) and let Beam unzip when reading.

 

[1] [https://cloud.google.com/storage/docs/transcoding]
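
The mismatch can be illustrated locally without GCS. The sketch below uses only the Python standard library: it shows that a reader which blindly assumes gzip bytes fails when it is handed content that was already decompressed, and that checking the gzip magic bytes first avoids the failure. Whether this is the exact failure path inside Beam is an assumption, and `read_payload` is a hypothetical helper, not a Beam API.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_payload(data: bytes) -> bytes:
    """Decompress only when the bytes really are a gzip stream."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


# What a plain .gz object holds: real gzip bytes.
stored_gzip = gzip.compress(b"my content\n")
# What a reader may receive when the server has already decompressed the object.
transcoded = b"my content\n"

# Blindly decompressing already-decompressed content raises an OSError subclass.
try:
    gzip.decompress(transcoded)
    blind_decompress_failed = False
except OSError:
    blind_decompress_failed = True

from_stored = read_payload(stored_gzip)
from_transcoded = read_payload(transcoded)
```

Storing the file as ordinary gzip bytes without a content encoding keeps the reader on the first path, where the bytes on the wire always match what the file extension promises.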

> Python GCSFileSystem failing with gzip content encoding
> ---
>
> Key: BEAM-8168
> URL: https://issues.apache.org/jira/browse/BEAM-8168
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Affects Versions: 2.15.0
>Reporter: Daniel Ecer
>Priority: Major
>
> Google Storage supports gzip content encoding.
>  
> Apache Beam (Python) works correctly with .gz files that have no content 
> encoding set.
> However, it fails to handle .gz files that have content encoding applied.
> e.g. (the following can be run in a Jupyter notebook)
> {code:python}
> file_url_1 = 'gs://some-bucket/test1.gz'
> file_url_2 = 'gs://some-bucket/test2.gz'
> !echo 'my content' > /tmp/test
> # file 1 without content encoding
> !cat /tmp/test | gzip | gsutil cp - "{file_url_1}"
> # file 2 with content encoding
> !gsutil cp -Z /tmp/test "{file_url_2}"
> !gsutil cat "{file_url_1}" | zcat -
> # output: my content
> !gsutil cat "{file_url_2}" | zcat -
> # output: my content
> import apache_beam as beam
> from apache_beam.io.filesystem import CompressionTypes
> from apache_beam.io.filesystems import FileSystems
> print(beam.__version__)
> # output: 2.15.0
> with FileSystems.open(file_url_1, compression_type=CompressionTypes.UNCOMPRESSED) as fp:
>     print(fp.read(10))
> # output: b'\x1f\x8b\x08\x00\x10\xd6r]\x00\x03'
> with FileSystems.open(file_url_1) as fp:
>     print(fp.read(10))
> # output: b'my content'
> with FileSystems.open(file_url_2, compression_type=CompressionTypes.UNCOMPRESSED) as fp:
>     print(fp.read(10))
> # output: b'my content'
> # (here I would expect the gzipped byte code)
> with FileSystems.open(file_url_2) as fp:
>     print(fp.read(10))
> # exception: FailedToDecompressContent: Content purported to be compressed
> # with gzip but failed to decompress.
> {code}
>  





[jira] [Updated] (BEAM-8380) Use persistent caching with pip

2019-10-10 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8380:
--
Parent: BEAM-8193
Issue Type: Sub-task  (was: Bug)

> Use persistent caching with pip
> ---
>
> Key: BEAM-8380
> URL: https://issues.apache.org/jira/browse/BEAM-8380
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Ahmet Altay
>Assignee: Kyle Weaver
>Priority: Major
>
> Use a persistent cache directory for pip install calls (at 
> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1771
>  and some more in that file.)
> Pip does support caching 
> (https://pip.pypa.io/en/stable/reference/pip_install/#caching) but the 
> default directory may not be persistent across jobs.
> [~ibzib] you mentioned that this might help with container build times. 
> Containers are built by running pip inside the container, so I am not sure 
> whether it will be possible to use the same shared cache for that process.





[jira] [Created] (BEAM-8380) Use persistent caching with pip

2019-10-10 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-8380:
-

 Summary: Use persistent caching with pip
 Key: BEAM-8380
 URL: https://issues.apache.org/jira/browse/BEAM-8380
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Ahmet Altay
Assignee: Kyle Weaver


Use a persistent cache directory for pip install calls (at 
https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1771
 and some more in that file.)

Pip does support caching 
(https://pip.pypa.io/en/stable/reference/pip_install/#caching) but the default 
directory may not be persistent across jobs.

[~ibzib] you mentioned that this might help with container build times. 
Containers are built by running pip inside the container, so I am not sure 
whether it will be possible to use the same shared cache for that process.
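
As a minimal sketch of the idea, pip accepts a `--cache-dir` option (and the equivalent `PIP_CACHE_DIR` environment variable), so a build step can point every install at a directory that survives between jobs. The helper below only builds the command line; the cache path and package name are hypothetical, and this is Python for illustration, not the Groovy used in `BeamModulePlugin.groovy`.

```python
import os
import sys


def pip_install_command(packages, cache_dir):
    """Build a pip install command that uses an explicit cache directory."""
    return [
        sys.executable, "-m", "pip", "install",
        "--cache-dir", os.path.abspath(cache_dir),
        *packages,
    ]


# Hypothetical persistent location shared across CI jobs.
command = pip_install_command(["example-package"], "/tmp/beam-pip-cache")

# To actually run it:
#   import subprocess
#   subprocess.run(command, check=True)
```

For the container case, the same directory would have to be made visible inside the build (for example through a mount), which is the open question raised above.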





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326618
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 22:47
Start Date: 10/Oct/19 22:47
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540765396
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326618)
Time Spent: 2.5h  (was: 2h 20m)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326617=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326617
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 22:47
Start Date: 10/Oct/19 22:47
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540702363
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326617)
Time Spent: 2h 20m  (was: 2h 10m)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326616=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326616
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 22:46
Start Date: 10/Oct/19 22:46
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540664171
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326616)
Time Spent: 2h 10m  (was: 2h)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326615
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 22:46
Start Date: 10/Oct/19 22:46
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540829657
 
 
   @tweise I couldn't figure out what was going on with the jar manifests (the 
manifest it was fetching from the classpath was somehow different than the 
manifest I was writing) so I gave up and decided to add my own 
`pipeline-manifest.json` file that just contains the default job name as its 
only field.
 



Issue Time Tracking
---

Worklog Id: (was: 326615)
Time Spent: 2h  (was: 1h 50m)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Commented] (BEAM-8376) Add FirestoreIO connector to Java SDK

2019-10-10 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948973#comment-16948973
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-8376:
-

Thanks for filing the JIRA. This has been discussed a few times, but we don't 
have an ETA yet.

> Add FirestoreIO connector to Java SDK
> -
>
> Key: BEAM-8376
> URL: https://issues.apache.org/jira/browse/BEAM-8376
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Stefan Djelekar
>Priority: Major
>
> Motivation:
> There is no Firestore connector for Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google 
> Cloud Platform.





[jira] [Created] (BEAM-8379) Cache Eviction for Interactive Beam

2019-10-10 Thread Ning Kang (Jira)
Ning Kang created BEAM-8379:
---

 Summary: Cache Eviction for Interactive Beam
 Key: BEAM-8379
 URL: https://issues.apache.org/jira/browse/BEAM-8379
 Project: Beam
  Issue Type: New Feature
  Components: runner-py-interactive
Reporter: Ning Kang
Assignee: Ning Kang


Evict the cache created by Interactive Beam when an IPython kernel is restarted 
or terminated, releasing resources that are no longer needed.





[jira] [Work logged] (BEAM-8151) Allow the Python SDK to use many many threads

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8151?focusedWorklogId=326587=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326587
 ]

ASF GitHub Bot logged work on BEAM-8151:


Author: ASF GitHub Bot
Created on: 10/Oct/19 21:23
Start Date: 10/Oct/19 21:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9477: [BEAM-8151, 
BEAM-7848] Up the max number of threads inside the SDK harness to a default of 
10k
URL: https://github.com/apache/beam/pull/9477#issuecomment-540801889
 
 
   The issue is that the collapsing-thread-pool-executor has a bug where during 
shutdown it tries to join worker threads but the worker threads have a default 
30 second timeout waiting for work.
   
   Filed 
https://github.com/ftpsolutions/collapsing-thread-pool-executor/issues/3
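
The shutdown behaviour described in this comment can be illustrated with a toy pool. This is a minimal sketch under stated assumptions, not the code of collapsing-thread-pool-executor: `TinyCollapsingPool`, the `_STOP` sentinel, and the `wake_workers` flag are hypothetical names introduced only for illustration. Each worker blocks on the task queue with an idle timeout, so a shutdown that merely joins the workers has to wait for that timeout to expire, while a shutdown that first enqueues one sentinel per worker returns promptly.

```python
import queue
import threading
import time

_STOP = object()  # hypothetical sentinel used to wake idle workers


class TinyCollapsingPool:
    """Toy pool: each worker exits after `idle_timeout` seconds without work."""

    def __init__(self, workers, idle_timeout):
        self._tasks = queue.Queue()
        self._idle_timeout = idle_timeout
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(workers)
        ]
        for thread in self._threads:
            thread.start()

    def _run(self):
        while True:
            try:
                # The worker blocks here for up to idle_timeout seconds.
                task = self._tasks.get(timeout=self._idle_timeout)
            except queue.Empty:
                return  # idle for too long: the thread "collapses"
            if task is _STOP:
                return
            task()

    def submit(self, fn):
        self._tasks.put(fn)

    def shutdown(self, wake_workers):
        if wake_workers:
            # One sentinel per worker so that every blocked get() returns.
            for _ in self._threads:
                self._tasks.put(_STOP)
        for thread in self._threads:
            thread.join()


def timed_shutdown(wake_workers, idle_timeout=1.0):
    """Return how long shutdown() takes for an idle pool."""
    pool = TinyCollapsingPool(workers=2, idle_timeout=idle_timeout)
    start = time.monotonic()
    pool.shutdown(wake_workers)
    return time.monotonic() - start
```

With a 30 second idle timeout, as in the comment, the join-only path would block for roughly that long; the sentinel is one possible way to avoid the wait, not necessarily the fix the library adopted.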
 



Issue Time Tracking
---

Worklog Id: (was: 326587)
Time Spent: 7.5h  (was: 7h 20m)

> Allow the Python SDK to use many many threads
> -
>
> Key: BEAM-8151
> URL: https://issues.apache.org/jira/browse/BEAM-8151
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> We need to use a thread pool which shrinks the number of active threads when 
> they are not being used.
>  
> This is to prevent any stuckness issues related to a runner scheduling more 
> work items than there are "work" threads inside the SDK harness.
>  
> By default the control plane should have all "requests" being processed in 
> parallel and the runner is responsible for not overloading the SDK with too 
> much work.





[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=326586=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326586
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 10/Oct/19 21:22
Start Date: 10/Oct/19 21:22
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9756: [BEAM-3713] Add 
pytest for unit tests
URL: https://github.com/apache/beam/pull/9756#issuecomment-540804507
 
 
   Sounds good.
   
   On Thu, Oct 10, 2019 at 2:26 PM Udi Meiri wrote:
   
   > I will first merge the PR as is with separate targets for nose and pytest.
   > Once we are satisfied that we have no missing tests and people have had the
   > chance to try it, we'll switch over to pytest by default.
   >
   > On Thu, Oct 10, 2019, 09:16 Chad Dombrova wrote:
   >
   > > This looks great. What's the plan for deploying it? Do you plan to merge
   > > it as is to get it in front of more users for testing, or will you replace
   > > nose with pytest before the merge?
   
 



Issue Time Tracking
---

Worklog Id: (was: 326586)
Time Spent: 8h  (was: 7h 50m)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Per [https://nose.readthedocs.io/en/latest/], nose is in maintenance mode.





[jira] [Work logged] (BEAM-8287) Update documentation for Python 3 support after Beam 2.16.0.

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8287?focusedWorklogId=326585=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326585
 ]

ASF GitHub Bot logged work on BEAM-8287:


Author: ASF GitHub Bot
Created on: 10/Oct/19 21:22
Start Date: 10/Oct/19 21:22
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9700: [BEAM-8287] 
Python 3 docs updates for 2.16.0
URL: https://github.com/apache/beam/pull/9700
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 326585)
Time Spent: 5h 10m  (was: 5h)

> Update documentation for Python 3 support after Beam 2.16.0.
> 
>
> Key: BEAM-8287
> URL: https://issues.apache.org/jira/browse/BEAM-8287
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Valentyn Tymofieiev
>Assignee: Cyrus Maden
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326584
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 21:22
Start Date: 10/Oct/19 21:22
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540804176
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326584)
Time Spent: 1h 50m  (was: 1h 40m)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Work logged] (BEAM-8151) Allow the Python SDK to use many many threads

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8151?focusedWorklogId=326583=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326583
 ]

ASF GitHub Bot logged work on BEAM-8151:


Author: ASF GitHub Bot
Created on: 10/Oct/19 21:17
Start Date: 10/Oct/19 21:17
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9477: [BEAM-8151, 
BEAM-7848] Up the max number of threads inside the SDK harness to a default of 
10k
URL: https://github.com/apache/beam/pull/9477#issuecomment-540801889
 
 
   The issue is that the collapsing-thread-pool-executor has a bug where during 
shutdown it tries to join worker threads but the worker threads have a default 
30 second timeout waiting for work.
 



Issue Time Tracking
---

Worklog Id: (was: 326583)
Time Spent: 7h 20m  (was: 7h 10m)

> Allow the Python SDK to use many many threads
> -
>
> Key: BEAM-8151
> URL: https://issues.apache.org/jira/browse/BEAM-8151
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> We need to use a thread pool which shrinks the number of active threads when 
> they are not being used.
>  
> This is to prevent any stuckness issues related to a runner scheduling more 
> work items than there are "work" threads inside the SDK harness.
>  
> By default the control plane should have all "requests" being processed in 
> parallel and the runner is responsible for not overloading the SDK with too 
> much work.





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326553=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326553
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 20:08
Start Date: 10/Oct/19 20:08
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540765396
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326553)
Time Spent: 1h 40m  (was: 1.5h)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Commented] (BEAM-6646) Beam Dependency Update Request: com.gradle.build-scan

2019-10-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948909#comment-16948909
 ] 

Luke Cwik commented on BEAM-6646:
-

2.4 is broken, thread with details:

[https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]

> Beam Dependency Update Request: com.gradle.build-scan
> -
>
> Key: BEAM-6646
> URL: https://issues.apache.org/jira/browse/BEAM-6646
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>
>  - 2019-02-11 12:12:25.062529 
> -
> Please consider upgrading the dependency com.gradle.build-scan. 
> The current version is None. The latest version is None 
> cc: 
>  Please refer to the [Beam Dependency Guide|https://beam.apache.org/contribute/dependencies/] 
> for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8378?focusedWorklogId=326529=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326529
 ]

ASF GitHub Bot logged work on BEAM-8378:


Author: ASF GitHub Bot
Created on: 10/Oct/19 19:49
Start Date: 10/Oct/19 19:49
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9762: [BEAM-8378] 
Downgrade build-scan plugin to 2.3
URL: https://github.com/apache/beam/pull/9762
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 326529)
Time Spent: 0.5h  (was: 20m)

> Downgrade build-scan plugin to 2.3 so that build-scans can appear on 
> scan.gradle.org
> 
>
> Key: BEAM-8378
> URL: https://issues.apache.org/jira/browse/BEAM-8378
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]





[jira] [Resolved] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-8378.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> Downgrade build-scan plugin to 2.3 so that build-scans can appear on 
> scan.gradle.org
> 
>
> Key: BEAM-8378
> URL: https://issues.apache.org/jira/browse/BEAM-8378
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-8372) Allow submission of Flink UberJar directly to flink cluster.

2019-10-10 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948884#comment-16948884
 ] 

Robert Bradshaw commented on BEAM-8372:
---

This is a feature for Python's FlinkRunner. The Java/Flink side of this already 
exists. 

> Allow submission of Flink UberJar directly to flink cluster.
> 
>
> Key: BEAM-8372
> URL: https://issues.apache.org/jira/browse/BEAM-8372
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>






[jira] [Commented] (BEAM-8140) Python API: PTransform should be immutable

2019-10-10 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948881#comment-16948881
 ] 

Robert Bradshaw commented on BEAM-8140:
---

I know we made it possible to apply the same PTransform as many times as you 
want within a pipeline, but I don't recall why it cares about (or stores a 
reference to) the pipeline itself. This code was intended to prohibit mixing 
values across pipelines (e.g. flattening a PCollection from one pipeline with a 
PCollection from another). This should be fixed. 

> Python API: PTransform should be immutable
> --
>
> Key: BEAM-8140
> URL: https://issues.apache.org/jira/browse/BEAM-8140
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chris Suchandk
>Assignee: Robert Bradshaw
>Priority: Major
>
> While the Java API seems fine, the Python API is (at least) counterintuitive.
> Consider the following example:
> {code:python}
> import apache_beam as beam
>
> p1 = beam.Pipeline()
> p2 = beam.Pipeline()
> node = 'ReadTrainData' >> beam.io.ReadFromText("/tmp/aaa.txt")
> p1 | node
> p2 | node  # fails here
> {code}
> The code above fails because _node_ somehow remembers that it was 
> already attached to _p1_. In fact, unlike in Java, the | (apply) method is 
> defined on the _PTransform_.
> If anything, only the pipeline object should be mutable here.
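
The design point raised in this description can be shown with a toy model. This is a sketch under stated assumptions and does not reproduce Beam's implementation: `ToyPipeline`, `StatefulTransform`, and `StatelessTransform` are hypothetical classes. The stateful variant stores a reference to the first pipeline it is applied to and rejects any other pipeline, while the stateless variant keeps all mutation on the pipeline object and can be reused freely.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline: it records applied labels."""

    def __init__(self, name):
        self.name = name
        self.applied = []

    def __or__(self, transform):
        return transform.apply_to(self)


class StatefulTransform:
    """Remembers the first pipeline it was applied to (the surprising design)."""

    def __init__(self, label):
        self.label = label
        self.pipeline = None

    def apply_to(self, pipeline):
        if self.pipeline is not None and self.pipeline is not pipeline:
            raise ValueError(
                "%s is already attached to %s" % (self.label, self.pipeline.name))
        self.pipeline = pipeline
        pipeline.applied.append(self.label)
        return pipeline


class StatelessTransform:
    """Holds only its own configuration; all mutation happens on the pipeline."""

    def __init__(self, label):
        self.label = label

    def apply_to(self, pipeline):
        pipeline.applied.append(self.label)
        return pipeline


def can_reuse(transform_cls):
    """Return True if one transform instance can be applied to two pipelines."""
    p1, p2 = ToyPipeline("p1"), ToyPipeline("p2")
    node = transform_cls("ReadTrainData")
    p1 | node
    try:
        p2 | node
    except ValueError:
        return False
    return p2.applied == ["ReadTrainData"]
```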





[jira] [Assigned] (BEAM-8140) Python API: PTransform should be immutable

2019-10-10 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw reassigned BEAM-8140:
-

Assignee: Robert Bradshaw

> Python API: PTransform should be immutable
> --
>
> Key: BEAM-8140
> URL: https://issues.apache.org/jira/browse/BEAM-8140
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chris Suchandk
>Assignee: Robert Bradshaw
>Priority: Major
>
> While the Java API seems fine, the Python API is (at least) counterintuitive.
> Consider the following example:
> {code:python}
> import apache_beam as beam
>
> p1 = beam.Pipeline()
> p2 = beam.Pipeline()
> node = 'ReadTrainData' >> beam.io.ReadFromText("/tmp/aaa.txt")
> p1 | node
> p2 | node  # fails here
> {code}
> The code above fails because _node_ somehow remembers that it was 
> already attached to _p1_. In fact, unlike in Java, the | (apply) method is 
> defined on the _PTransform_.
> If anything, only the pipeline object should be mutable here.





[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326498=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326498
 ]

ASF GitHub Bot logged work on BEAM-8365:


Author: ASF GitHub Bot
Created on: 10/Oct/19 18:41
Start Date: 10/Oct/19 18:41
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on issue #9743: [BEAM-8365] Project 
push-down for test table provider
URL: https://github.com/apache/beam/pull/9743#issuecomment-540719113
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326498)
Time Spent: 1.5h  (was: 1h 20m)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection<Row> buildIOReader(
>     PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code}
> It should return a `PCollection` containing only the fields specified in the 
> `fieldNames` list.
>  * Create a rule to push fields used by a Calc (in projects and in a 
> condition) down into TestTable IO.
>  * Update that same Calc (from the previous step) to have proper input and 
> output schemas, removing unused fields.
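
The idea behind the steps above can be sketched in a few lines of Python. This is only an illustration of project push-down in general, not Beam SQL code: `fields_used` and `read_projected` are hypothetical helpers, and rows are modelled as plain dictionaries. The first function collects the fields a Calc needs from its projections and its condition; the second plays the role of a reader that returns only those fields.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]  # a row is modelled as a plain dict in this sketch


def fields_used(projects: List[str], condition_fields: List[str]) -> List[str]:
    """Fields needed by a Calc: projections plus condition, first-seen order."""
    needed: List[str] = []
    for name in projects + condition_fields:
        if name not in needed:
            needed.append(name)
    return needed


def read_projected(rows: Iterable[Row], field_names: List[str]) -> Iterator[Row]:
    """Yield each row restricted to field_names, dropping unused fields."""
    for row in rows:
        yield {name: row[name] for name in field_names}


table = [
    {"id": 1, "name": "a", "unused": True},
    {"id": 2, "name": "b", "unused": False},
]
needed = fields_used(projects=["name"], condition_fields=["id"])
projected = list(read_projected(table, needed))
```

Here `needed` is `["name", "id"]`, so the `unused` column never leaves the reader.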





[jira] [Resolved] (BEAM-2960) Missing type parameter in some AvroIO.Write API

2019-10-10 Thread Neville Li (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neville Li resolved BEAM-2960.
--
Fix Version/s: 2.2.0
   Resolution: Fixed

> Missing type parameter in some AvroIO.Write API
> ---
>
> Key: BEAM-2960
> URL: https://issues.apache.org/jira/browse/BEAM-2960
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-avro
>Affects Versions: 2.1.0
>Reporter: Neville Li
>Assignee: Neville Li
>Priority: Minor
> Fix For: 2.2.0
>
>
> Like
> {{public Write to(DynamicAvroDestinations dynamicDestinations)}}
> {{public Write withSchema(Schema schema)}}
> {{public Write withWindowedWrites()}}
> {{public Write withMetadata(Map metadata)}}





[jira] [Commented] (BEAM-2960) Missing type parameter in some AvroIO.Write API

2019-10-10 Thread Neville Li (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948854#comment-16948854
 ] 

Neville Li commented on BEAM-2960:
--

Looks fixed as of master today, {{0cb56a2c94}}. Closing.

> Missing type parameter in some AvroIO.Write API
> ---
>
> Key: BEAM-2960
> URL: https://issues.apache.org/jira/browse/BEAM-2960
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-avro
>Affects Versions: 2.1.0
>Reporter: Neville Li
>Assignee: Neville Li
>Priority: Minor
>
> Like
> {{public Write to(DynamicAvroDestinations dynamicDestinations)}}
> {{public Write withSchema(Schema schema)}}
> {{public Write withWindowedWrites()}}
> {{public Write withMetadata(Map metadata)}}





[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=326492=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326492
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 10/Oct/19 18:29
Start Date: 10/Oct/19 18:29
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #9741: [BEAM-7926] Visualize 
PCollection
URL: https://github.com/apache/beam/pull/9741#issuecomment-540713909
 
 
   > I will wait for @rohdesamuel to make the first review pass. Could you also 
check the failing tests please?
   
   Sure! I'll fix the import errors from Python2 tests.
 



Issue Time Tracking
---

Worklog Id: (was: 326492)
Time Spent: 1h  (was: 50m)

> Visualize PCollection with Interactive Beam
> ---
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline is defined as
> p = create_pipeline()
> pcoll = p | 'Transform' >> transform()
> The user can call a single function and get auto-magical charting of the data 
> in the materialized pcoll.
> e.g., visualize(pcoll)





[jira] [Commented] (BEAM-8364) SchemaCoder inconsistent equality behavior for POJO

2019-10-10 Thread Neville Li (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948851#comment-16948851
 ] 

Neville Li commented on BEAM-8364:
--

Yeah, I agree it's hard to guarantee that {{consistentWithEquals}} reports 
correctly. Always returning {{false}} seems reasonable. What about making 
{{structuralValue}} always return a {{Row}} as well, converting with 
{{toRowFunction(T)}} if necessary?
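
The suggestion above can be illustrated with a small Python analogy. It is a sketch under stated assumptions rather than Beam code: `Pojo`, `to_row`, and `structural_value` are hypothetical names, and a tuple stands in for a `Row`. The class defines no `__eq__`, so two instances with the same contents compare unequal, while their row forms compare by value.

```python
class Pojo:
    """Plain object without __eq__, so equality falls back to identity."""

    def __init__(self, data: bytes, id_: str):
        self.data = data
        self.id = id_


def to_row(pojo: Pojo) -> tuple:
    """Hypothetical counterpart of toRowFunction: a tuple stands in for a Row."""
    return (pojo.data, pojo.id)


def structural_value(value: Pojo, to_row_fn=to_row) -> tuple:
    """Always compare through the row form, never through the object itself."""
    return to_row_fn(value)


# The coder cannot know whether the user type defines equality, so the safe
# answer in this sketch is a constant False.
CONSISTENT_WITH_EQUALS = False

p1 = Pojo("hello".encode("utf-8"), "world")
p2 = Pojo("hello".encode("utf-8"), "world")
```

The objects themselves differ (`p1 != p2`), but `structural_value(p1)` and `structural_value(p2)` are equal, which is the property the comment asks for.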

> SchemaCoder inconsistent equality behavior for POJO
> ---
>
> Key: BEAM-8364
> URL: https://issues.apache.org/jira/browse/BEAM-8364
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql, sdk-java-core
>Affects Versions: 2.16.0
>Reporter: Neville Li
>Assignee: Brian Hulette
>Priority: Minor
>
> One can create a {{SchemaCoder}} for an arbitrary type {{T}} with 
> {{SchemaCoder.of(schema, toRowFunction, fromRowFunction)}}. However, in cases 
> where {{T}} lacks proper equality behavior, e.g. a POJO, the resulting coder 
> still returns true for {{consistentWithEquals}} and produces 
> {{structuralValue}}s that fail the equality check.
> This test reproduces the issue.
> {code:java}
> import org.apache.beam.sdk.schemas.Schema;
> import org.apache.beam.sdk.schemas.SchemaCoder;
> import org.apache.beam.sdk.values.Row;
> import org.junit.Test;
> import org.junit.runner.RunWith;
> import org.junit.runners.JUnit4;
> import java.nio.charset.Charset;
> import static org.junit.Assert.*;
>
> @RunWith(JUnit4.class)
> public class SchemaCoderTest {
>   public static class Pojo {
>     private final byte[] bytes;
>     private final String id;
>
>     public Pojo(byte[] bytes, String id) {
>       this.bytes = bytes;
>       this.id = id;
>     }
>
>     public byte[] getBytes() {
>       return bytes;
>     }
>
>     public String getId() {
>       return id;
>     }
>   }
>
>   @Test
>   public void testCoder() {
>     Schema schema =
>         Schema.builder().addByteArrayField("bytes").addStringField("id").build();
>     SchemaCoder<Pojo> coder = SchemaCoder.of(
>         schema,
>         t -> Row.withSchema(schema).addValues(t.getBytes(), t.getId()).build(),
>         r -> new Pojo(r.getBytes("bytes"), r.getString("id")));
>     Pojo p1 = new Pojo("hello".getBytes(Charset.forName("UTF-8")), "world");
>     Pojo p2 = new Pojo("hello".getBytes(Charset.forName("UTF-8")), "world");
>     assertNotEquals(p1, p2); // EXPECTED, p1.equals(p2) == false
>     assertFalse(coder.consistentWithEquals()); // FAIL, returns true
>     assertEquals(coder.structuralValue(p1), coder.structuralValue(p2)); // FAIL
>   }
> }
> {code}





[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=326491=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326491
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 10/Oct/19 18:25
Start Date: 10/Oct/19 18:25
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9756: [BEAM-3713] Add pytest 
for unit tests
URL: https://github.com/apache/beam/pull/9756#issuecomment-540712434
 
 
   I will first merge the PR as is with separate targets for nose and pytest.
   Once we are satisfied that we have no missing tests and people have had the
   chance to try it, we'll switch over to pytest by default.
   
   On Thu, Oct 10, 2019, 09:16 Chad Dombrova wrote:
   
   > This looks great. What's the plan for deploying it? Do you plan to merge
   > it as is to get it in front of more users for testing, or will you replace
   > nose with pytest before the merge?
   
 



Issue Time Tracking
---

Worklog Id: (was: 326491)
Time Spent: 7h 50m  (was: 7h 40m)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Per [https://nose.readthedocs.io/en/latest/], nose is in maintenance mode.





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326485
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 18:00
Start Date: 10/Oct/19 18:00
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540702363
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 326485)
Time Spent: 1.5h  (was: 1h 20m)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Work logged] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5967?focusedWorklogId=326483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326483
 ]

ASF GitHub Bot logged work on BEAM-5967:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:59
Start Date: 10/Oct/19 17:59
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #8496: [BEAM-5967] Add 
handling of DynamicMessage in ProtoCoder
URL: https://github.com/apache/beam/pull/8496#issuecomment-540701831
 
 
   I'll fix the comments tomorrow (it's evening now)
 



Issue Time Tracking
---

Worklog Id: (was: 326483)
Time Spent: 4h 40m  (was: 4.5h)

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder?
> Here is an example of the assumption being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser<T> tParser = (Parser<T>) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}





[jira] [Commented] (BEAM-8364) SchemaCoder inconsistent equality behavior for POJO

2019-10-10 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948826#comment-16948826
 ] 

Brian Hulette commented on BEAM-8364:
-

I'm not sure what the appropriate fix is here. Should we always return 
{{false}} from {{consistentWithEquals}}, unless we can be 100% sure that the 
encoded type has a good equals that we can be consistent with (e.g. 
{{SchemaCoder}} generated with {{AutoValueSchema}})?

That seems like a reasonable approach, but I'm not familiar enough with 
consistentWithEquals/structuralValue to know what sort of impact that would 
have.
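
The proposal can be pictured with a rough Python analogy. This is only a sketch, not Beam code: `has_value_equality`, `SketchCoder`, `PlainPojo`, and `ValuePojo` are hypothetical names, and the frozen dataclass merely stands in for a type with generated equality (such as one built with AutoValue). The coder claims consistency with equals only when the encoded type overrides the default identity-based `__eq__`.

```python
from dataclasses import dataclass


def has_value_equality(cls) -> bool:
    """True when cls overrides the identity-based __eq__ inherited from object."""
    return cls.__eq__ is not object.__eq__


class PlainPojo:
    """No __eq__ override, so two instances with equal fields compare unequal."""

    def __init__(self, data: bytes, id_: str):
        self.data = data
        self.id_ = id_


@dataclass(frozen=True)
class ValuePojo:
    """The dataclass decorator generates a field-by-field __eq__."""

    data: bytes
    id_: str


class SketchCoder:
    """Hypothetical coder that is conservative about its equality claims."""

    def __init__(self, encoded_type):
        self.encoded_type = encoded_type

    def consistent_with_equals(self) -> bool:
        # Claim consistency only when the type provably has value equality.
        return has_value_equality(self.encoded_type)
```

Under this assumption, a coder for `PlainPojo` would report `False` and a coder for `ValuePojo` would report `True`. The open question in the comment, what the wider impact of returning `False` more often would be, is not answered by the sketch.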

> SchemaCoder inconsistent equality behavior for POJO
> ---
>
> Key: BEAM-8364
> URL: https://issues.apache.org/jira/browse/BEAM-8364
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql, sdk-java-core
>Affects Versions: 2.16.0
>Reporter: Neville Li
>Assignee: Brian Hulette
>Priority: Minor
>
> One can create a {{SchemaCoder}} for arbitrary type {{T}} with 
> {{SchemaCoder.of(schema, toRowFunction, fromRowFunction)}}. However, in cases 
> where {{T}} lacks proper equality behavior, e.g. a POJO, the resulting coder still 
> returns true for {{consistentWithEquals}}, and its {{structuralValue}}s fail the 
> equality check.
> This test reproduces the issue.
> {code:java}
> import org.apache.beam.sdk.schemas.Schema;
> import org.apache.beam.sdk.schemas.SchemaCoder;
> import org.apache.beam.sdk.values.Row;
> import org.junit.Test;
> import org.junit.runner.RunWith;
> import org.junit.runners.JUnit4;
> import java.nio.charset.Charset;
> import static org.junit.Assert.*;
>
> @RunWith(JUnit4.class)
> public class SchemaCoderTest {
>   public static class Pojo {
>     private final byte[] bytes;
>     private final String id;
>
>     public Pojo(byte[] bytes, String id) {
>       this.bytes = bytes;
>       this.id = id;
>     }
>
>     public byte[] getBytes() {
>       return bytes;
>     }
>
>     public String getId() {
>       return id;
>     }
>   }
>
>   @Test
>   public void testCoder() {
>     Schema schema = Schema.builder().addByteArrayField("bytes").addStringField("id").build();
>     SchemaCoder<Pojo> coder = SchemaCoder.of(
>         schema,
>         t -> Row.withSchema(schema).addValues(t.getBytes(), t.getId()).build(),
>         r -> new Pojo(r.getBytes("bytes"), r.getString("id")));
>     Pojo p1 = new Pojo("hello".getBytes(Charset.forName("UTF-8")), "world");
>     Pojo p2 = new Pojo("hello".getBytes(Charset.forName("UTF-8")), "world");
>     assertNotEquals(p1, p2); // EXPECTED, p1.equals(p2) == false
>     assertFalse(coder.consistentWithEquals()); // FAIL, returns true
>     assertEquals(coder.structuralValue(p1), coder.structuralValue(p2)); // FAIL
>   }
> }
> {code}





[jira] [Commented] (BEAM-8147) Net bind flake in Python precommit

2019-10-10 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948823#comment-16948823
 ] 

Kyle Weaver commented on BEAM-8147:
---

I never did anything about this, but nor have I seen this issue pop up again.

Thanks for the tip, Kenn. I should at least have copied the relevant error to JIRA. 
Seeing as I don't even recall precisely what the error was here, I'm going to 
close this.

> Net bind flake in Python precommit
> --
>
> Key: BEAM-8147
> URL: https://issues.apache.org/jira/browse/BEAM-8147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: flake
>
> [https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5807/consoleFull]





[jira] [Resolved] (BEAM-8147) Net bind flake in Python precommit

2019-10-10 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-8147.
---
Fix Version/s: Not applicable
   Resolution: Cannot Reproduce

> Net bind flake in Python precommit
> --
>
> Key: BEAM-8147
> URL: https://issues.apache.org/jira/browse/BEAM-8147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: flake
> Fix For: Not applicable
>
>
> [https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5807/consoleFull]





[jira] [Work logged] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8183?focusedWorklogId=326475=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326475
 ]

ASF GitHub Bot logged work on BEAM-8183:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:43
Start Date: 10/Oct/19 17:43
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9752: [BEAM-8183] restructure 
Flink portable jars to support multiple pipel…
URL: https://github.com/apache/beam/pull/9752#issuecomment-540695633
 
 
   Huh, it's failing to get the default job name on Jenkins too. Weird. Works 
on my machine.
 



Issue Time Tracking
---

Worklog Id: (was: 326475)
Time Spent: 1h 20m  (was: 1h 10m)

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."





[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-10 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948813#comment-16948813
 ] 

Brian Hulette commented on BEAM-8368:
-

Sorry about this issue [~ubaierbhat]. It sounds like just pinning to arrow 
0.13.0 is a workaround for you now though.

I know the arrow community has had a lot of trouble maintaining wheels for 
releasing to pypi, which we're depending on for our ParquetIO. They've been 
[looking for 
assistance|https://lists.apache.org/thread.html/128a2bec285ad45aa4189ebb39a15b39dcf6d91c4ab0278ff4f7cdea@%3Cdev.arrow.apache.org%3E]
 with it.

pyarrow 0.15.0 was just released; does that work on 10.15? If it does, maybe we 
could resolve this just by bumping up our lower bound. If it doesn't, we should 
file a Jira with Arrow and maybe add a <0.14 bound until it's resolved.

FWIW it looks like arrow's nightly wheel builds are using OSX 10.9: 
https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/travis.osx.yml#L28

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0
>Reporter: Ubaier Bhat
>Priority: Major
> Attachments: error_log.txt
>
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs but can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}





[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326470=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326470
 ]

ASF GitHub Bot logged work on BEAM-8365:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:30
Start Date: 10/Oct/19 17:30
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on issue #9743: [BEAM-8365] Project 
push-down for test table provider
URL: https://github.com/apache/beam/pull/9743#issuecomment-540690358
 
 
   > You probably want a few more test cases here:
   > 
   > 1. Empty list.
   > 2. All the columns.
   > 3. Duplicate columns.
   > 4. Invalid columns.
   > 
   > Otherwise LGTM
   
   Added tests for 1-3. Passing invalid columns breaks things and is hard to 
test. A list of selected columns should be generated by the rule and passed to 
the table provider via BeamIOSourceRel. It should never get into an invalid 
state, where false column names are being extracted.
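
The first three requested cases can be pictured with a plain-Python model of projection. This is a sketch under stated assumptions, not the Beam SQL API: `project` and the sample `rows` are hypothetical, and treating an empty field list as "one empty tuple per row" is a choice made for the illustration, since the thread does not say what the table provider returns in that case.

```python
def project(rows, field_names):
    """Return each row restricted to field_names, in the requested order."""
    return [tuple(row[name] for name in field_names) for row in rows]


# Hypothetical sample data standing in for a test table.
rows = [
    {"id": 1, "name": "a", "score": 0.5},
    {"id": 2, "name": "b", "score": 1.5},
]

empty = project(rows, [])                            # case 1: empty list
everything = project(rows, ["id", "name", "score"])  # case 2: all columns
duplicated = project(rows, ["id", "id"])             # case 3: duplicate columns
```

In this model an unknown field name raises `KeyError`. That corresponds to case 4, which the comment argues should never reach the table provider because the rule generates the list of selected columns.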
 



Issue Time Tracking
---

Worklog Id: (was: 326470)
Time Spent: 1h 20m  (was: 1h 10m)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection buildIOReader(
> PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code}
> which should return a `PCollection` with the fields specified in the `fieldNames` 
> list.
>  * Create a rule to push fields used by a Calc (in projects and in a 
> condition) down into TestTable IO.
>  * Update that same Calc (from the previous step) to have proper input and 
> output schemas, removing unused fields.





[jira] [Work logged] (BEAM-7939) Document ZetaSQL dialect in Beam SQL

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7939?focusedWorklogId=326468=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326468
 ]

ASF GitHub Bot logged work on BEAM-7939:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:28
Start Date: 10/Oct/19 17:28
Worklog Time Spent: 10m 
  Work Description: soyrice commented on issue #9306: [BEAM-7939] ZetaSQL 
dialect documentation
URL: https://github.com/apache/beam/pull/9306#issuecomment-540689878
 
 
   > What about "Beam Calcite SQL" versus "Beam Calcite"
   
   I like it. Adds a bit more clarity. Done.
 



Issue Time Tracking
---

Worklog Id: (was: 326468)
Time Spent: 2.5h  (was: 2h 20m)

> Document ZetaSQL dialect in Beam SQL
> 
>
> Key: BEAM-7939
> URL: https://issues.apache.org/jira/browse/BEAM-7939
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Cyrus Maden
>Assignee: Cyrus Maden
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Blocked by BEAM-7832. ZetaSQL dialect source will be merged from #9210.





[jira] [Work logged] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8378?focusedWorklogId=326460=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326460
 ]

ASF GitHub Bot logged work on BEAM-8378:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:14
Start Date: 10/Oct/19 17:14
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9762: [BEAM-8378] 
Downgrade build-scan plugin to 2.3
URL: https://github.com/apache/beam/pull/9762#issuecomment-540684203
 
 
   R: @tvalentyn @lgajowy 
 



Issue Time Tracking
---

Worklog Id: (was: 326460)
Time Spent: 20m  (was: 10m)

> Downgrade build-scan plugin to 2.3 so that build-scans can appear on 
> scan.gradle.org
> 
>
> Key: BEAM-8378
> URL: https://issues.apache.org/jira/browse/BEAM-8378
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8378?focusedWorklogId=326459=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326459
 ]

ASF GitHub Bot logged work on BEAM-8378:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:13
Start Date: 10/Oct/19 17:13
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9762: [BEAM-8378] 
Downgrade build-scan plugin to 2.3
URL: https://github.com/apache/beam/pull/9762
 
 
   Suggested as a solution in 
https://discuss.gradle.org/t/your-build-scan-could-not-be-displayed-what-does-this-mean/33302
   Tested locally and it worked.
   
   
   

[jira] [Commented] (BEAM-8140) Python API: PTransform should be immutable

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948801#comment-16948801
 ] 

Kenneth Knowles commented on BEAM-8140:
---

I don't use the Python SDK much, but perhaps 
{{beam.io.ReadFromText("/tmp/aaa.txt")}} is re-usable and immutable?

> Python API: PTransform should be immutable
> --
>
> Key: BEAM-8140
> URL: https://issues.apache.org/jira/browse/BEAM-8140
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chris Suchandk
>Priority: Major
>
> While the Java API seems fine the Python API is (at least) counterintuitive.
> Let's see the following example:
> {code:python}
> p1 = beam.Pipeline()
> p2 = beam.Pipeline()
> node = 'ReadTrainData' >> beam.io.ReadFromText("/tmp/aaa.txt")
> p1 | node 
> p2 | node  # fails here
> {code}
> The code above will fail because the _node_ somehow remembers that it was 
> already attached to _p1_. In fact, unlike in Java, the | (apply) method is 
> defined on the _PTransform_.
> If anything, only the pipeline object should be mutable here.
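
The complaint in the description can be modelled with a toy example. This is a sketch only: `SketchPipeline`, `StatefulTransform`, and `StatelessTransform` are hypothetical classes, not Beam's implementation, and the sketch assumes the failure comes from the transform keeping a reference to the first pipeline it was applied to, as the reporter suggests.

```python
class SketchPipeline:
    """Hypothetical pipeline that owns the record of applied transforms."""

    def __init__(self):
        self.applied = []

    def __or__(self, transform):
        # `pipeline | transform` delegates to the transform, as in the report.
        return transform.apply_to(self)


class StatefulTransform:
    """Remembers the first pipeline it was applied to (the reported behaviour)."""

    def __init__(self, label):
        self.label = label
        self.pipeline = None

    def apply_to(self, pipeline):
        if self.pipeline is not None and self.pipeline is not pipeline:
            raise RuntimeError(f"{self.label} is already attached to another pipeline")
        self.pipeline = pipeline
        pipeline.applied.append(self.label)
        return pipeline


class StatelessTransform:
    """Keeps no reference to any pipeline, so it can be reused freely."""

    def __init__(self, label):
        self.label = label

    def apply_to(self, pipeline):
        # Only the pipeline is mutated; the transform itself stays unchanged.
        pipeline.applied.append(self.label)
        return pipeline
```

In this model, applying one `StatefulTransform` to two pipelines raises on the second application, while a `StatelessTransform` can be applied to any number of pipelines because all bookkeeping lives on the pipeline object.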





[jira] [Commented] (BEAM-8140) Python API: PTransform should be immutable

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948799#comment-16948799
 ] 

Kenneth Knowles commented on BEAM-8140:
---

In Java you can apply a transform as many times as you want. But perhaps it is 
adding the name to the node that makes it mutable here?

> Python API: PTransform should be immutable
> --
>
> Key: BEAM-8140
> URL: https://issues.apache.org/jira/browse/BEAM-8140
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chris Suchandk
>Priority: Major
>
> While the Java API seems fine the Python API is (at least) counterintuitive.
> Let's see the following example:
> {code:python}
> p1 = beam.Pipeline()
> p2 = beam.Pipeline()
> node = 'ReadTrainData' >> beam.io.ReadFromText("/tmp/aaa.txt")
> p1 | node 
> p2 | node  # fails here
> {code}
> The code above will fail because the _node_ somehow remembers that it was 
> already attached to _p1_. In fact, unlike in Java, the | (apply) method is 
> defined on the _PTransform_.
> If anything, only the pipeline object should be mutable here.





[jira] [Commented] (BEAM-8140) Python API: PTransform should be immutable

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948798#comment-16948798
 ] 

Kenneth Knowles commented on BEAM-8140:
---

[~robertwb] seems like your sort of ticket?

> Python API: PTransform should be immutable
> --
>
> Key: BEAM-8140
> URL: https://issues.apache.org/jira/browse/BEAM-8140
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Chris Suchandk
>Priority: Major
>
> While the Java API seems fine the Python API is (at least) counterintuitive.
> Let's see the following example:
> {code:python}
> p1 = beam.Pipeline()
> p2 = beam.Pipeline()
> node = 'ReadTrainData' >> beam.io.ReadFromText("/tmp/aaa.txt")
> p1 | node 
> p2 | node //fails here {code}
> The code above will fail because the _node_ somehow remembers that it was 
> already attached to _p1_. In fact, unlike in Java, the | (apply) method is 
> defined on the _PTransform_.
> If any, only the pipeline object should be mutable here.





[jira] [Updated] (BEAM-8140) Python API: PTransform should be immutable

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8140:
--
Component/s: (was: beam-model)
 sdk-py-core

> Python API: PTransform should be immutable
> --
>
> Key: BEAM-8140
> URL: https://issues.apache.org/jira/browse/BEAM-8140
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chris Suchandk
>Priority: Major
>
> While the Java API seems fine the Python API is (at least) counterintuitive.
> Let's see the following example:
> {code:python}
> p1 = beam.Pipeline()
> p2 = beam.Pipeline()
> node = 'ReadTrainData' >> beam.io.ReadFromText("/tmp/aaa.txt")
> p1 | node 
> p2 | node //fails here {code}
> The code above will fail because the _node_ somehow remembers that it was 
> already attached to _p1_. In fact, unlike in Java, the | (apply) method is 
> defined on the _PTransform_.
> If any, only the pipeline object should be mutable here.





[jira] [Commented] (BEAM-8147) Net bind flake in Python precommit

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948797#comment-16948797
 ] 

Kenneth Knowles commented on BEAM-8147:
---

The console is gone now and the related one is resolved. Is this done? 
Incidentally, anyone with a Jenkins account may be able to click this button 
"Save Build Forever" so it is available when linked from the bug.

> Net bind flake in Python precommit
> --
>
> Key: BEAM-8147
> URL: https://issues.apache.org/jira/browse/BEAM-8147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: flake
>
> [https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5807/consoleFull]





[jira] [Created] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread Luke Cwik (Jira)
Luke Cwik created BEAM-8378:
---

 Summary: Downgrade build-scan plugin to 2.3 so that build-scans 
can appear on scan.gradle.org
 Key: BEAM-8378
 URL: https://issues.apache.org/jira/browse/BEAM-8378
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Luke Cwik
Assignee: Luke Cwik


[https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]





[jira] [Updated] (BEAM-8168) Python GCSFileSystem failing with gzip content encoding

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8168:
--
Status: Open  (was: Triage Needed)

> Python GCSFileSystem failing with gzip content encoding
> ---
>
> Key: BEAM-8168
> URL: https://issues.apache.org/jira/browse/BEAM-8168
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Affects Versions: 2.15.0
>Reporter: Daniel Ecer
>Priority: Major
>
> Google Storage supports gzip content encoding.
>  
> Apache Beam (Python) works correctly with .gz files that have no content 
> encoding.
> However, it fails to handle .gz files that have content encoding applied.
> e.g. (the following would work when run in a Jupyter notebook)
> {code:python}
> file_url_1 = 'gs://some-bucket/test1.gz'
> file_url_2 = 'gs://some-bucket/test2.gz'
> !echo 'my content' > /tmp/test
> # file 1 without content encoding
> !cat /tmp/test | gzip | gsutil cp - "{file_url_1}"
> # file 2 with content encoding
> !gsutil cp -Z /tmp/test "{file_url_2}"
> !gsutil cat "{file_url_1}" | zcat -
> # output: my content
> !gsutil cat "{file_url_2}" | zcat -
> # output: my content
> import apache_beam as beam
> from apache_beam.io.filesystem import CompressionTypes
> from apache_beam.io.filesystems import FileSystems
> print(beam.__version__)
> # output: 2.15.0
> with FileSystems.open(file_url_1, compression_type=CompressionTypes.UNCOMPRESSED) as fp:
>     print(fp.read(10))
> # output: b'\x1f\x8b\x08\x00\x10\xd6r]\x00\x03'
> with FileSystems.open(file_url_1) as fp:
>     print(fp.read(10))
> # output: b'my content'
> with FileSystems.open(file_url_2, compression_type=CompressionTypes.UNCOMPRESSED) as fp:
>     print(fp.read(10))
> # output: b'my content'
> # (here I would expect the gzipped byte code)
> with FileSystems.open(file_url_2) as fp:
>     print(fp.read(10))
> # exception: FailedToDecompressContent: Content purported to be compressed with gzip but failed to decompress.
> {code}
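
The outputs in the report suggest the difference between the two files: file 1 comes back as raw gzip bytes (starting with the gzip magic number `\x1f\x8b`), while file 2 arrives already decoded, so a second decompression attempt fails. A minimal standard-library sketch of a defensive read is shown below. It is not Beam's fix: `looks_gzipped` and `read_payload` are hypothetical helpers, and sniffing the magic bytes is only one possible workaround.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(data: bytes) -> bool:
    """Check the gzip magic number instead of trusting the .gz extension."""
    return data[:2] == GZIP_MAGIC


def read_payload(data: bytes) -> bytes:
    """Decompress only when the bytes really are a gzip stream."""
    return gzip.decompress(data) if looks_gzipped(data) else data


raw = b"my content\n"
still_compressed = gzip.compress(raw)  # like file 1: stored gzipped, served as-is
already_decoded = raw                  # like file 2: decoded before it is read
```

Both `read_payload(still_compressed)` and `read_payload(already_decoded)` return the original bytes in this model, whereas unconditionally decompressing `already_decoded` would raise an error.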
>  





[jira] [Commented] (BEAM-8168) Python GCSFileSystem failing with gzip content encoding

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948793#comment-16948793
 ] 

Kenneth Knowles commented on BEAM-8168:
---

[~chamikara] do you know about this?

> Python GCSFileSystem failing with gzip content encoding
> ---
>
> Key: BEAM-8168
> URL: https://issues.apache.org/jira/browse/BEAM-8168
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Affects Versions: 2.15.0
>Reporter: Daniel Ecer
>Priority: Major
>





[jira] [Updated] (BEAM-8147) Net bind flake in Python precommit

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8147:
--
Labels: flake  (was: )

> Net bind flake in Python precommit
> --
>
> Key: BEAM-8147
> URL: https://issues.apache.org/jira/browse/BEAM-8147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: flake
>
> [https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5807/consoleFull]





[jira] [Updated] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-8378:

Priority: Minor  (was: Major)

> Downgrade build-scan plugin to 2.3 so that build-scans can appear on 
> scan.gradle.org
> 
>
> Key: BEAM-8378
> URL: https://issues.apache.org/jira/browse/BEAM-8378
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
>
> [https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]





[jira] [Updated] (BEAM-8378) Downgrade build-scan plugin to 2.3 so that build-scans can appear on scan.gradle.org

2019-10-10 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-8378:

Status: Open  (was: Triage Needed)

> Downgrade build-scan plugin to 2.3 so that build-scans can appear on 
> scan.gradle.org
> 
>
> Key: BEAM-8378
> URL: https://issues.apache.org/jira/browse/BEAM-8378
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>
> [https://lists.apache.org/thread.html/7e00bbcd0b0520d954221bd93d31629c039be95c7b42890d27678b31@%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948794#comment-16948794
 ] 

Kenneth Knowles commented on BEAM-8166:
---

Looks done?

> Support Graceful shutdown of worker harness.
> 
>
> Key: BEAM-8166
> URL: https://issues.apache.org/jira/browse/BEAM-8166
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-go
>Reporter: Robert Burke
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ideally there should be a clear Shutdown control RPC that a runner can send 
> to a worker harness to trigger an orderly shutdown.
> Absent that, errors on the runner side shouldn't manifest as SDK worker 
> harness errors. SDKs should log gRPC errors and shut down gracefully.
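
A minimal sketch of the behaviour described above, written in Python rather than Go and using no Beam or gRPC APIs. The `Shutdown` message, `run_harness`, and the use of `ConnectionError` as a stand-in for a broken gRPC stream are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("worker_harness_sketch")


class Shutdown:
    """Hypothetical control message asking the worker to stop."""


def run_harness(control_stream, handle):
    """Process control messages until told to stop or the stream breaks.

    Returns a short string describing why the loop ended.
    """
    try:
        for message in control_stream:
            if isinstance(message, Shutdown):
                logger.info("Shutdown requested by runner; exiting cleanly.")
                return "shutdown-requested"
            handle(message)
    except ConnectionError as err:
        # Stand-in for a stream error caused by the runner going away:
        # log it and stop, instead of surfacing it as a worker failure.
        logger.warning("Control stream failed (%s); shutting down.", err)
        return "stream-error"
    logger.info("Control stream closed; exiting.")
    return "stream-closed"


def broken_stream():
    """Yield one unit of work, then fail like a dropped connection."""
    yield "work-1"
    raise ConnectionError("runner went away")


handled = []
print(run_harness(iter(["work-1", Shutdown()]), handled.append))
print(run_harness(broken_stream(), handled.append))
```

In both runs the loop returns a reason instead of raising, which is the property the issue asks for: runner-side failures end the worker in an orderly way.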





[jira] [Updated] (BEAM-8147) Net bind flake in Python precommit

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8147:
--
Status: Open  (was: Triage Needed)

> Net bind flake in Python precommit
> --
>
> Key: BEAM-8147
> URL: https://issues.apache.org/jira/browse/BEAM-8147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>
> [https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5807/consoleFull]





[jira] [Updated] (BEAM-8152) Provide a way to better control minor+patch versions of Python 3.x interpreters used to run Beam tests locally and on Jenkins.

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8152:
--
Status: Open  (was: Triage Needed)

> Provide a way to better control minor+patch versions of Python 3.x 
> interpreters used to run Beam tests locally and on Jenkins.
> --
>
> Key: BEAM-8152
> URL: https://issues.apache.org/jira/browse/BEAM-8152
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Currently, the Beam Python test infrastructure does not provide a 
> fine-grained way to control the Python interpreter version. The major+minor 
> version is typically selected by the virtual environment, and the patch 
> version of the interpreter is defined by the version of the python package 
> available on the machine that is running the tests. 
> For example, Ubuntu-based Jenkins machines use Python 3.5.2 for Python 3.5 
> test suites, while Debian-based SDK harness containers for Python 3.5 come 
> with Python 3.5.6, and the python3.5 package available on my dev machine is 
> Python 3.5.4. 
> Throughout the development of Python 3.5.x, CPython implementation details 
> have changed, and these changes affect certain code paths in Beam, such as 
> type inference. 
>  
> When we encounter such issues, it is difficult for Beam developers to test 
> their changes against a particular patch version of the Python interpreter, 
> both remotely and locally. Opening this issue to make it simpler.
> cc: [~markflyhigh] [~yifanzou] [~udim] [~altay] who may have opinions and 
> ideas about how to make this simpler.
> Note that there are separate questions: 
>   1) which patch versions of Python we should test against on Jenkins  
>   2) which patch versions of Python Beam should claim to support. 
> Regardless of the answers to those questions, we may want to make it easier 
> for an engineer to run a test suite against a particular patch version of 
> Python, and/or make it easier to switch which patch version is used by 
> Jenkins. 
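
As a small illustration of the patch-version differences discussed above, the sketch below reads the running interpreter's full version from `sys.version_info` and compares it with a required prefix. The helper names `patch_version` and `matches` are hypothetical and not part of Beam's test infrastructure.

```python
import sys


def patch_version(info=None):
    """Return (major, minor, micro) for the given or running interpreter."""
    info = sys.version_info if info is None else info
    return tuple(info[:3])


def matches(required, info=None):
    """True when the interpreter version starts with the `required` tuple."""
    return patch_version(info)[:len(required)] == tuple(required)


print("running on", ".".join(map(str, patch_version())))

# A 3.5.2 interpreter satisfies "3.5" but not the more specific "3.5.6".
print(matches((3, 5), info=(3, 5, 2)))
print(matches((3, 5, 6), info=(3, 5, 2)))
```

A test suite could call something like `matches` at start-up to fail fast, or at least log, when it runs on a patch version other than the one intended.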





[jira] [Updated] (BEAM-8127) Beam Dependency Update Request: google-cloud-bigtable

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8127:
--
Status: Open  (was: Triage Needed)

> Beam Dependency Update Request: google-cloud-bigtable
> -
>
> Key: BEAM-8127
> URL: https://issues.apache.org/jira/browse/BEAM-8127
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>
>  - 2019-09-02 12:02:43.329304 
> -
> Please consider upgrading the dependency google-cloud-bigtable. 
> The current version is 0.32.2. The latest version is 1.0.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Updated] (BEAM-8158) Loosen Python dependency restrictions: httplib2, oauth2client

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8158:
--
Status: Open  (was: Triage Needed)

> Loosen Python dependency restrictions: httplib2, oauth2client
> -
>
> Key: BEAM-8158
> URL: https://issues.apache.org/jira/browse/BEAM-8158
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Jonathan Jin
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> The Beam Python SDK's current pinned dependencies create dependency conflict 
> issues for my team at Twitter.
> I'd like the following expansions of the Python SDK's dependency ranges:
>  * oauth2client>=2.0.1,<*4* to at least oauth2client>=2.0.1,<=*4.1.2*
>  * httplib2>=0.8,<=*0.12.0* to at least httplib2>=0.8,<=*0.12.3*
> I understand, from pull request 
> [8653|https://github.com/apache/beam/pull/8653] by [~altay], that the latter 
> is blocked in turn by googledatastore.
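
To make the requested ranges concrete, here is a simplified sketch that checks a version against a lower and upper bound. It assumes plain dotted-integer versions and is not a full PEP 440 implementation; `parse` and `in_range` are hypothetical helpers.

```python
def parse(version):
    """Turn '4.1.2' into (4, 1, 2); assumes plain dotted integers."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper, upper_inclusive):
    """Check lower <= version < upper (or <= upper when inclusive)."""
    v, lo, hi = parse(version), parse(lower), parse(upper)
    return lo <= v and (v <= hi if upper_inclusive else v < hi)


# oauth2client 4.1.2 against the current range and the requested range.
print(in_range("4.1.2", "2.0.1", "4", upper_inclusive=False))
print(in_range("4.1.2", "2.0.1", "4.1.2", upper_inclusive=True))

# httplib2 0.12.3 against the current range and the requested range.
print(in_range("0.12.3", "0.8", "0.12.0", upper_inclusive=True))
print(in_range("0.12.3", "0.8", "0.12.3", upper_inclusive=True))
```

In each pair the first check fails and the second passes, which is the conflict the reporter describes: the newer releases fall just outside the currently pinned bounds.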





[jira] [Updated] (BEAM-8137) Worker pool option for Java SDK container

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8137:
--
Status: Open  (was: Triage Needed)

> Worker pool option for Java SDK container
> -
>
> Key: BEAM-8137
> URL: https://issues.apache.org/jira/browse/BEAM-8137
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Thomas Weise
>Priority: Major
>
> The worker pool option was added to the Python SDK container in BEAM-7980. 
> Support in the Java SDK container is simpler since it can rely on threading 
> and it should be added for feature parity.





[jira] [Updated] (BEAM-8184) Allow asynchronous execution in Go SDK

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8184:
--
Affects Version/s: (was: 2.15.0)

> Allow asynchronous execution in Go SDK
> --
>
> Key: BEAM-8184
> URL: https://issues.apache.org/jira/browse/BEAM-8184
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Jack Whelpton
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When developing streaming pipelines, it would be useful to have a means of 
> deploying a pipeline and exiting, without blocking on completion.





[jira] [Commented] (BEAM-8184) Allow asynchronous execution in Go SDK

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948791#comment-16948791
 ] 

Kenneth Knowles commented on BEAM-8184:
---

I notice this is merged. Can it be closed? I think the version does not apply 
since the Go SDK is not yet released.

> Allow asynchronous execution in Go SDK
> --
>
> Key: BEAM-8184
> URL: https://issues.apache.org/jira/browse/BEAM-8184
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When developing streaming pipelines, it would be useful to have a means of 
> deploying a pipeline and exiting, without blocking on completion.





[jira] [Commented] (BEAM-8189) Python DataflowRunner fails when using a Shared VPC from another project

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948792#comment-16948792
 ] 

Kenneth Knowles commented on BEAM-8189:
---

[~altay] do you know who would be the expert?

> Python DataflowRunner fails when using a Shared VPC from another project
> 
>
> Key: BEAM-8189
> URL: https://issues.apache.org/jira/browse/BEAM-8189
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.15.0
>Reporter: Miles Edwards
>Priority: Major
>
> h1. The Setup:
> I have two Projects on the Google Cloud Platform
> 1) Service Project for my Dataflow jobs
> 2) Host Project for Shared VPC & Subnetworks
> The Host Project has configured Firewall Rules for the Dataflow job. ie. 
> allow all traffic, allow all internal traffic, allow all traffic tagged with 
> 'dataflow' etc
>  
> h1. The Args
> {code:java}
> --project 
> --network 
> --subnetwork "https://www.googleapis.com/compute/v1/projects/ project name>/regions/ project>/subnetworks/"
> --service_account_email= for both projects, shared vpc network & subnetwork>
> {code}
> h1. The Problem
> The job will hang when performing shuffle operations. I will also see the 
> following warning:
> {code:java}
> The network miles-qa-vpc doesn't have rules that open TCP ports 1-65535 for 
> internal connection with other VMs. Only rules with a target tag 'dataflow' 
> or empty target tags set apply. If you don't specify such a rule, any 
> pipeline with more than one worker that shuffles data will hang. Causes: No 
> firewall rules associated with your network.
> {code}
>  
> h1. What I've Tried
> [StackOverflow|[https://stackoverflow.com/questions/57868089/google-dataflow-warnings-when-using-service-host-projects-shared-vpcs-firew]]
> 1. Only passing "subnetwork" arg without "network" but that only modifies the 
> warning to state "default" instead of "miles-qa-vpc", which sounds like a 
> logging error to me.
> 2. Firewall rules have been configured to:
>  - allow all traffic
>  - allow all internal traffic
>  - allow all traffic with the source tag 'dataflow'
>  - allow all traffic with the target tag 'dataflow'
> 3. Service Account has been configured to have Compute Network User 
> permissions in both projects.
> 4. Ensured subnetwork is in the same region as the job.
> 5. Network in the service project is happily serving a dedicated cluster for 
> other purposes in the host project.
> It genuinely seems like the spawned Compute Instances are not gaining the 
> configuration.
> I expect the Dataflow job not to report the firewall issue and successfully 
> deal with shuffling (GroupBys etc.)





[jira] [Updated] (BEAM-8184) Allow asynchronous execution in Go SDK

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8184:
--
Summary: Allow asynchronous execution in Go SDK  (was: Allow asynchronous 
execution)

> Allow asynchronous execution in Go SDK
> --
>
> Key: BEAM-8184
> URL: https://issues.apache.org/jira/browse/BEAM-8184
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When developing streaming pipelines, it would be useful to have a means of 
> deploying a pipeline and exiting, without blocking on completion.





[jira] [Updated] (BEAM-8184) Allow asynchronous execution in Go SDK

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8184:
--
Status: Open  (was: Triage Needed)

> Allow asynchronous execution in Go SDK
> --
>
> Key: BEAM-8184
> URL: https://issues.apache.org/jira/browse/BEAM-8184
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When developing streaming pipelines, it would be useful to have a means of 
> deploying a pipeline and exiting, without blocking on completion.





[jira] [Updated] (BEAM-8205) AvroSchemaTest failed on FlinkRunner

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8205:
--
Status: Open  (was: Triage Needed)

> AvroSchemaTest failed on FlinkRunner
> 
>
> Key: BEAM-8205
> URL: https://issues.apache.org/jira/browse/BEAM-8205
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, test-failures
>Reporter: Yueyang Qiu
>Assignee: Brian Hulette
>Priority: Major
>
> Jenkins link:
> [https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/testReport/]
>  
> Initial investigation:
> [https://github.com/apache/beam/pull/9454] added new ValidatesRunner tests. 
> They have been tested on Dataflow runner, but are failing on Flink runner.
>  





[jira] [Assigned] (BEAM-8364) SchemaCoder inconsistent equality behavior for POJO

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-8364:
-

Assignee: Brian Hulette

> SchemaCoder inconsistent equality behavior for POJO
> ---
>
> Key: BEAM-8364
> URL: https://issues.apache.org/jira/browse/BEAM-8364
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.16.0
>Reporter: Neville Li
>Assignee: Brian Hulette
>Priority: Minor
>
> One can create a {{SchemaCoder}} for an arbitrary type {{T}} with 
> {{SchemaCoder.of(schema, toRowFunction, fromRowFunction)}}. However, in cases 
> where {{T}} lacks proper equality behavior, e.g. a POJO, the resulting coder 
> still returns true for {{consistentWithEquals}} and returns 
> {{structuralValue}}s that fail the equality check.
> This test reproduces the issue.
> {code:java}
> import org.apache.beam.sdk.schemas.Schema;
> import org.apache.beam.sdk.schemas.SchemaCoder;
> import org.apache.beam.sdk.values.Row;
> import org.junit.Test;
> import org.junit.runner.RunWith;
> import org.junit.runners.JUnit4;
> import java.nio.charset.Charset;
> import static org.junit.Assert.*;
>
> @RunWith(JUnit4.class)
> public class SchemaCoderTest {
>   // POJO without equals()/hashCode(), so equality falls back to identity.
>   public static class Pojo {
>     private final byte[] bytes;
>     private final String id;
>
>     public Pojo(byte[] bytes, String id) {
>       this.bytes = bytes;
>       this.id = id;
>     }
>
>     public byte[] getBytes() {
>       return bytes;
>     }
>
>     public String getId() {
>       return id;
>     }
>   }
>
>   @Test
>   public void testCoder() {
>     Schema schema =
>         Schema.builder().addByteArrayField("bytes").addStringField("id").build();
>     SchemaCoder<Pojo> coder = SchemaCoder.of(
>         schema,
>         t -> Row.withSchema(schema).addValues(t.getBytes(), t.getId()).build(),
>         r -> new Pojo(r.getBytes("bytes"), r.getString("id")));
>     Pojo p1 = new Pojo("hello".getBytes(Charset.forName("UTF-8")), "world");
>     Pojo p2 = new Pojo("hello".getBytes(Charset.forName("UTF-8")), "world");
>     assertNotEquals(p1, p2); // EXPECTED, p1.equals(p2) == false
>     assertFalse(coder.consistentWithEquals()); // FAIL, returns true
>     assertEquals(coder.structuralValue(p1), coder.structuralValue(p2)); // FAIL
>   }
> }
> {code}
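
The expectation behind the test above can be restated with a short Python analogy. This is not Beam's API: `Pojo` and `structural_value` are hypothetical names, and the sketch only shows the general idea that objects using identity equality can still be compared through a derived value that compares field by field.

```python
class Pojo:
    """Plain object with no __eq__, so equality falls back to identity."""

    def __init__(self, data, ident):
        self.data = data
        self.ident = ident


def structural_value(pojo):
    """Hypothetical helper: a value that compares field by field."""
    return (bytes(pojo.data), pojo.ident)


p1 = Pojo(b"hello", "world")
p2 = Pojo(b"hello", "world")

# The objects themselves are unequal, but their structural values match.
print(p1 == p2)
print(structural_value(p1) == structural_value(p2))
```

In these terms, the report is that the coder claims the objects compare by value when they do not, and that the value it hands back for comparison does not behave like the tuple above.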





[jira] [Updated] (BEAM-8364) SchemaCoder inconsistent equality behavior for POJO

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8364:
--
Status: Open  (was: Triage Needed)

> SchemaCoder inconsistent equality behavior for POJO
> ---
>
> Key: BEAM-8364
> URL: https://issues.apache.org/jira/browse/BEAM-8364
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.16.0
>Reporter: Neville Li
>Assignee: Brian Hulette
>Priority: Minor
>





[jira] [Updated] (BEAM-8364) SchemaCoder inconsistent equality behavior for POJO

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8364:
--
Component/s: sdk-java-core

> SchemaCoder inconsistent equality behavior for POJO
> ---
>
> Key: BEAM-8364
> URL: https://issues.apache.org/jira/browse/BEAM-8364
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql, sdk-java-core
>Affects Versions: 2.16.0
>Reporter: Neville Li
>Assignee: Brian Hulette
>Priority: Minor
>





[jira] [Commented] (BEAM-8372) Allow submission of Flink UberJar directly to flink cluster.

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948787#comment-16948787
 ] 

Kenneth Knowles commented on BEAM-8372:
---

This is filed under python SDK so I think I don't understand it. Can you add 
more information? Is this actually a runners-flink feature (or both components)?

> Allow submission of Flink UberJar directly to flink cluster.
> 
>
> Key: BEAM-8372
> URL: https://issues.apache.org/jira/browse/BEAM-8372
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>






[jira] [Commented] (BEAM-8373) Update TestStreamPayload to use timestamp protos.

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948785#comment-16948785
 ] 

Kenneth Knowles commented on BEAM-8373:
---

Want to just do this? I think I scratched those out long ago. And indeed I 
didn't use the standard proto timestamp libs.

> Update TestStreamPayload to use timestamp protos.
> -
>
> Key: BEAM-8373
> URL: https://issues.apache.org/jira/browse/BEAM-8373
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Labels: newbie
>
> Currently the timestamp fields are all int64, with no indication of whether 
> the units are seconds, milliseconds, or microseconds. 
> https://github.com/apache/beam/blob/release-2.16.0/model/pipeline/src/main/proto/beam_runner_api.proto#L502
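
The ambiguity described above is easy to demonstrate: the same integer denotes very different instants depending on the assumed unit. The sketch below uses only the Python standard library; the raw value is an arbitrary example, not one taken from Beam, and `interpret` is a hypothetical helper. For comparison, the standard `google.protobuf.Timestamp` message removes the ambiguity by carrying explicit `seconds` and `nanos` fields.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def interpret(raw, unit):
    """Interpret an integer timestamp under an assumed unit.

    `unit` must be a timedelta keyword such as "seconds",
    "milliseconds" or "microseconds".
    """
    return EPOCH + timedelta(**{unit: raw})


raw = 1570700000  # arbitrary example value, not taken from Beam
for unit in ("seconds", "milliseconds", "microseconds"):
    print(unit, interpret(raw, unit).isoformat())
```

Read as seconds the value falls in October 2019; read as milliseconds or microseconds it falls in January 1970, which is why an untyped int64 field needs its unit documented or encoded in the type.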





[jira] [Assigned] (BEAM-8373) Update TestStreamPayload to use timestamp protos.

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-8373:
-

Assignee: Robert Bradshaw

> Update TestStreamPayload to use timestamp protos.
> -
>
> Key: BEAM-8373
> URL: https://issues.apache.org/jira/browse/BEAM-8373
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Labels: newbie
>
> Currently the timestamp fields are all int64, with no indication of whether 
> the units are seconds, milliseconds, or microseconds. 
> https://github.com/apache/beam/blob/release-2.16.0/model/pipeline/src/main/proto/beam_runner_api.proto#L502





[jira] [Updated] (BEAM-8373) Update TestStreamPayload to use timestamp protos.

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8373:
--
Status: Open  (was: Triage Needed)

> Update TestStreamPayload to use timestamp protos.
> -
>
> Key: BEAM-8373
> URL: https://issues.apache.org/jira/browse/BEAM-8373
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, testing
>Reporter: Robert Bradshaw
>Priority: Major
>  Labels: newbie
>
> Currently the timestamp fields are all int64, with no indication of whether 
> the units are seconds, milliseconds, or microseconds. 
> https://github.com/apache/beam/blob/release-2.16.0/model/pipeline/src/main/proto/beam_runner_api.proto#L502





[jira] [Updated] (BEAM-8374) PublishResult returned by SnsIO is missing sdkResponseMetadata and sdkHttpMetadata

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8374:
--
Status: Open  (was: Triage Needed)

> PublishResult returned by SnsIO is missing sdkResponseMetadata and 
> sdkHttpMetadata
> --
>
> Key: BEAM-8374
> URL: https://issues.apache.org/jira/browse/BEAM-8374
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.13.0, 2.14.0, 2.15.0
>Reporter: Jonothan Farr
>Assignee: Jonothan Farr
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the PublishResultCoder in SnsIO only serializes the messageId field 
> so the PublishResult returned by Beam returns null for 
> getSdkResponseMetadata() and getSdkHttpMetadata(). This makes it impossible 
> to check the HTTP status for errors, which is necessary since this is not 
> handled in SnsIO.
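
The effect described above, a coder that drops fields on a round trip, can be sketched in Python as an analogy. None of this is SnsIO or AWS SDK code: `PublishResult` here is a simplified stand-in with a hypothetical `http_status` field, and the JSON-based encoding is just one possible way to carry the extra data.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublishResult:
    """Simplified stand-in for the SDK's result object."""
    message_id: str
    http_status: Optional[int] = None


def encode_id_only(result):
    """Lossy coder: keeps only the message id."""
    return result.message_id.encode("utf-8")


def decode_id_only(data):
    """Rebuild a result from the id alone; the status is lost."""
    return PublishResult(message_id=data.decode("utf-8"))


def encode_full(result):
    """Coder that also carries the HTTP status."""
    payload = {"message_id": result.message_id,
               "http_status": result.http_status}
    return json.dumps(payload).encode("utf-8")


def decode_full(data):
    """Rebuild a result with every encoded field restored."""
    return PublishResult(**json.loads(data.decode("utf-8")))


original = PublishResult("msg-1", http_status=200)
print(decode_id_only(encode_id_only(original)))
print(decode_full(encode_full(original)))
```

After the lossy round trip the status is `None`, so downstream code cannot tell a successful publish from a failed one; the fuller encoding preserves it.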





[jira] [Work logged] (BEAM-8374) PublishResult returned by SnsIO is missing sdkResponseMetadata and sdkHttpMetadata

2019-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8374?focusedWorklogId=326453=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326453
 ]

ASF GitHub Bot logged work on BEAM-8374:


Author: ASF GitHub Bot
Created on: 10/Oct/19 17:01
Start Date: 10/Oct/19 17:01
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9758: [BEAM-8374] Fixes 
bug in SnsIO PublishResultCoder
URL: https://github.com/apache/beam/pull/9758#issuecomment-540679564
 
 
   @jhalaria @iemejia 
 



Issue Time Tracking
---

Worklog Id: (was: 326453)
Time Spent: 20m  (was: 10m)

> PublishResult returned by SnsIO is missing sdkResponseMetadata and 
> sdkHttpMetadata
> --
>
> Key: BEAM-8374
> URL: https://issues.apache.org/jira/browse/BEAM-8374
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.13.0, 2.14.0, 2.15.0
>Reporter: Jonothan Farr
>Assignee: Jonothan Farr
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the PublishResultCoder in SnsIO only serializes the messageId field 
> so the PublishResult returned by Beam returns null for 
> getSdkResponseMetadata() and getSdkHttpMetadata(). This makes it impossible 
> to check the HTTP status for errors, which is necessary since this is not 
> handled in SnsIO.





[jira] [Commented] (BEAM-8374) PublishResult returned by SnsIO is missing sdkResponseMetadata and sdkHttpMetadata

2019-10-10 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948784#comment-16948784
 ] 

Kenneth Knowles commented on BEAM-8374:
---

[~jhalaria] [~iemejia] do you have the ability to triage and code review this? I 
pinged you on the PR too.

> PublishResult returned by SnsIO is missing sdkResponseMetadata and 
> sdkHttpMetadata
> --
>
> Key: BEAM-8374
> URL: https://issues.apache.org/jira/browse/BEAM-8374
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.13.0, 2.14.0, 2.15.0
>Reporter: Jonothan Farr
>Assignee: Jonothan Farr
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the PublishResultCoder in SnsIO only serializes the messageId field 
> so the PublishResult returned by Beam returns null for 
> getSdkResponseMetadata() and getSdkHttpMetadata(). This makes it impossible 
> to check the HTTP status for errors, which is necessary since this is not 
> handled in SnsIO.





[jira] [Updated] (BEAM-8375) Correct the command for Start a Flink job server in `environments.md`

2019-10-10 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8375:
--
Status: Open  (was: Triage Needed)

> Correct the command for Start a Flink job server in `environments.md`
> -
>
> Key: BEAM-8375
> URL: https://issues.apache.org/jira/browse/BEAM-8375
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Flink 1.5 and 1.6 were dropped in BEAM-7962, so we should correct the 
> command `./gradlew :runners:flink:1.5:job-server:runShadow` in 
> environments.md.




