[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239593=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239593
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 09/May/19 05:27
Start Date: 09/May/19 05:27
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282342465
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py
 ##
 @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
 native_type_compatibility.convert_to_beam_types(typing_types),
 beam_types)
 
+  def test_is_sub_class(self):
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.Dict,
+derived=typing.Dict[bytes, int]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.Dict[bytes, int]))
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.List[bytes]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.Dict[bytes, int]))
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.Set,
+derived=typing.Set[int]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.Set[float]))
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.Tuple,
+derived=typing.Tuple[int]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.Tuple[bytes]))
+
+  @unittest.skipIf(sys.version_info >= (2, 7, 0),
 
 Review comment:
   To be honest, I don't know whether it is an advantage or not. I realised 
there was a difference and wrote the test to point it out. I should probably 
have pointed it out more clearly in the PR as well. Based on your comment, I 
assume that we want to keep the Py2 behaviour, so I will investigate whether I 
can achieve that with minimal changes. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 239593)
Time Spent: 3.5h  (was: 3h 20m)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_types 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing.<Type> != <Type>`, e.g.:
> {noformat}
>  FAIL: test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  --
>  Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py",
>  line 79, in test_convert_to_beam_type
>  beam_type, description)
>  AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>  
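
The mismatch in the failure above can be reproduced outside Beam. The following is a minimal sketch, assuming Python 3.7 or later, where subscripted `typing` generics are no longer classes and a plain `issubclass` call on them raises `TypeError`. The helper `safe_issubclass_sketch` is hypothetical and is not the `_safe_issubclass` implementation from the PR; it only illustrates comparing hints through their `__origin__` attribute.

```python
import typing


def safe_issubclass_sketch(derived, parent):
  """Hypothetical helper: compare typing hints by their runtime origin."""
  # On Python 3.7+, typing.Dict[bytes, int].__origin__ is the builtin dict.
  derived_origin = getattr(derived, '__origin__', None) or derived
  parent_origin = getattr(parent, '__origin__', None) or parent
  try:
    return issubclass(derived_origin, parent_origin)
  except TypeError:
    # Raised when either side is not a class (for example typing.Union).
    return False


print(safe_issubclass_sketch(typing.Dict[bytes, int], typing.Dict))
print(safe_issubclass_sketch(typing.Dict[bytes, int], typing.List))
```

Whether a helper like this should mirror the Py2 behaviour exactly is the open question in the review comment; the sketch takes no position on that.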



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239592=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239592
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 09/May/19 05:23
Start Date: 09/May/19 05:23
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282341953
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py
 ##
 @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
 native_type_compatibility.convert_to_beam_types(typing_types),
 beam_types)
 
+  def test_is_sub_class(self):
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.Dict,
+derived=typing.Dict[bytes, int]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.Dict[bytes, int]))
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.List[bytes]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
 
 Review comment:
   True, removed the duplicate
 



Issue Time Tracking
---

Worklog Id: (was: 239592)
Time Spent: 3h 20m  (was: 3h 10m)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_types 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing.<Type> != <Type>`, e.g.:
> {noformat}
>  FAIL: test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  --
>  Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py",
>  line 79, in test_convert_to_beam_type
>  beam_type, description)
>  AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>  





[jira] [Created] (BEAM-7255) UNEST with JOIN

2019-05-08 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7255:
--

 Summary: UNEST with JOIN
 Key: BEAM-7255
 URL: https://issues.apache.org/jira/browse/BEAM-7255
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Reporter: Rui Wang


UNNEST with JOIN does not work well. See: 
https://stackoverflow.com/questions/56028038/unnest-the-nested-pcollection-using-beamsql





[jira] [Created] (BEAM-7254) UNEST with JOIN

2019-05-08 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7254:
--

 Summary: UNEST with JOIN
 Key: BEAM-7254
 URL: https://issues.apache.org/jira/browse/BEAM-7254
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Reporter: Rui Wang


UNNEST with JOIN does not work well. See: 
https://stackoverflow.com/questions/56028038/unnest-the-nested-pcollection-using-beamsql





[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer

2019-05-08 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836026#comment-16836026
 ] 

Ankur Goenka commented on BEAM-7252:


[~robertwb] or I can take a look, but this is not a high priority.

> "beam:java:boundedsource" not supported with python optimizer
> -
>
> Key: BEAM-7252
> URL: https://issues.apache.org/jira/browse/BEAM-7252
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
>
> The Python pipeline optimizer does not handle external transforms.
>  
> Relevant error stack
> ==
> ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized)
> --
> Traceback (most recent call last):
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py",
>  line 174, in test_external_transforms
>  assert_that(res, equal_to([i for i in range(1, 10)]))
>  File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in 
> __exit__
>  self.run().wait_until_finish()
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 436, in wait_until_finish
>  self._job_id, self._state, self._last_error_message()))
> RuntimeError: Pipeline 
> test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 
> failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
> harness for instruction 4: Traceback (most recent call last):
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 157, in _execute
>  response = task()
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 190, in 
>  self._execute(lambda: worker.do_instruction(work), work)
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 333, in do_instruction
>  request.instruction_id)
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 353, in process_bundle
>  instruction_id, request.process_bundle_descriptor_reference)
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 305, in get
>  self.data_channel_factory)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 501, in __init__
>  self.ops = self.create_execution_tree(self.process_bundle_descriptor)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 545, in create_execution_tree
>  descriptor.transforms, key=topological_height, reverse=True)])
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 467, in wrapper
>  result = cache[args] = func(*args)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 528, in get_operation
>  in descriptor.transforms[transform_id].outputs.items()
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 527, in 
>  for tag, pcoll_id
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 467, in wrapper
>  result = cache[args] = func(*args)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 531, in get_operation
>  transform_id, transform_consumers)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 790, in create_operation
>  return creator(self, transform_id, transform_proto, payload, consumers)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 957, in create
>  parameter.source, factory.context),
>  File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in 
> from_runner_api
>  parameter_type, constructor = cls._known_urns[fn_proto.spec.urn]
> KeyError: u'urn:beam:java:boundedsource:v1'
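
The final `KeyError` in the trace above comes from a plain dictionary lookup of the transform's URN. As a rough sketch of that failure mode (the `UrnRegistrySketch` class, its method names and the `beam:python:example:v1` URN are hypothetical and are not Beam's API), a registry that only knows URNs registered on the Python side cannot construct an object identified by a Java-only URN:

```python
class UrnRegistrySketch(object):
  """Hypothetical stand-in for a URN-to-constructor registry."""

  def __init__(self):
    self._known_urns = {}

  def register(self, urn, constructor):
    self._known_urns[urn] = constructor

  def from_urn(self, urn, payload):
    try:
      constructor = self._known_urns[urn]
    except KeyError:
      # Report the unsupported URN instead of surfacing a bare KeyError.
      raise ValueError(
          'Unsupported URN %r; known URNs: %s'
          % (urn, sorted(self._known_urns)))
    return constructor(payload)


registry = UrnRegistrySketch()
registry.register('beam:python:example:v1', lambda payload: ('python', payload))

try:
  registry.from_urn('urn:beam:java:boundedsource:v1', b'')
except ValueError as e:
  print(e)
```

The sketch only makes the error message more descriptive; it does not address how the optimizer should treat external transforms.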





[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239561=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239561
 ]

ASF GitHub Bot logged work on BEAM-7253:


Author: ASF GitHub Bot
Created on: 09/May/19 01:50
Start Date: 09/May/19 01:50
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8537: [BEAM-7253] 
test_with_jar_packages_invalid_file_name test fails on Windows
URL: https://github.com/apache/beam/pull/8537#discussion_r282314288
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/stager.py
 ##
 @@ -200,10 +201,11 @@ def stage_job_resources(self,
 # Handle jar packages that should be staged for Java SDK Harness.
 jar_packages = options.view_as(
 DebugOptions).lookup_experiment('jar_packages')
+classpath_separator = ':' if platform.system() != 'Windows' else ';'
 
 Review comment:
   I don't think there is any problem with using ';' on all platforms. ':' is 
the standard classpath separator on Linux, so some Linux users may find it 
awkward to separate jar files with ';', but this is not a classpath anyway. Do 
you think it would be better to use a single separator character for all 
platforms?
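
One possible argument for a single ';' separator can be sketched as follows. This assumes, which the thread does not confirm, that the jar list is simply split on the separator; `split_jar_packages` is a hypothetical helper and not the code in `stager.py`. Windows paths contain ':' after the drive letter, so splitting on ':' breaks them apart, while ';' does not collide with the drive-letter colon:

```python
def split_jar_packages(value, separator):
  """Hypothetical helper: split a separator-joined list of jar paths."""
  return [path for path in value.split(separator) if path]


windows_jars = 'C:\\jars\\a.jar;C:\\jars\\b.jar'
linux_jars = '/opt/jars/a.jar;/opt/jars/b.jar'

# ';' keeps each path intact for both path styles.
print(split_jar_packages(windows_jars, ';'))
print(split_jar_packages(linux_jars, ';'))

# ':' also splits at the Windows drive letter, producing broken fragments.
print(split_jar_packages(windows_jars, ':'))
```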
 



Issue Time Tracking
---

Worklog Id: (was: 239561)
Time Spent: 40m  (was: 0.5h)

> test_with_jar_packages_invalid_file_name test fails on windows
> --
>
> Key: BEAM-7253
> URL: https://issues.apache.org/jira/browse/BEAM-7253
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Heejong Lee
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> test_with_jar_packages_invalid_file_name test fails on Windows, possibly 
> because the classpath separator on Windows (";") differs from Linux (":").





[jira] [Updated] (BEAM-7246) Create a Spanner IO for Python

2019-05-08 Thread Shehzaad Nakhoda (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shehzaad Nakhoda updated BEAM-7246:
---
Description: 
Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
Testing in this work item will be in the form of DirectRunner tests and manual 
testing.

Integration and performance tests are a separate work item (not included here).


See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
Google Cloud Spanner to the Database column for the Python/Batch row.

  was:
Add I/O support for Google Cloud Spanner for the Python SDK.
Testing in this work item will be in the form of DirectRunner tests and manual 
testing.

Integration and performance tests are a separate work item (not included here).


See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
Google Cloud Spanner to the Database column for the Python/Batch row.


> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-python-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Updated] (BEAM-7246) Create a Spanner IO for Python

2019-05-08 Thread Shehzaad Nakhoda (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shehzaad Nakhoda updated BEAM-7246:
---
Description: 
Add I/O support for Google Cloud Spanner for the Python SDK.
Testing in this work item will be in the form of DirectRunner tests and manual 
testing.

Integration and performance tests are a separate work item (not included here).


See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
Google Cloud Spanner to the Database column for the Python/Batch row.

  was:
Add I/O support for Google Cloud Spanner for the Python SDK.
Integration and performance tests are a separate work item (not included here).

See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
Google Cloud Spanner to the Database column for the Python/Batch row.


> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-python-gcp
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>
> Add I/O support for Google Cloud Spanner for the Python SDK.
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Work logged] (BEAM-7103) Adding AvroGenericCoder for simple dict type cross-language data transfer

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7103?focusedWorklogId=239558=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239558
 ]

ASF GitHub Bot logged work on BEAM-7103:


Author: ASF GitHub Bot
Created on: 09/May/19 01:26
Start Date: 09/May/19 01:26
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8342: [BEAM-7103] Adding 
AvroGenericCoder for cross-language data transfer
URL: https://github.com/apache/beam/pull/8342#issuecomment-490708363
 
 
   + @mxm 
 



Issue Time Tracking
---

Worklog Id: (was: 239558)
Time Spent: 50m  (was: 40m)

> Adding AvroGenericCoder for simple dict type cross-language data transfer
> -
>
> Key: BEAM-7103
> URL: https://issues.apache.org/jira/browse/BEAM-7103
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Adding AvroGenericCoder for simple dict type cross-language data transfer.





[jira] [Work logged] (BEAM-7103) Adding AvroGenericCoder for simple dict type cross-language data transfer

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7103?focusedWorklogId=239559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239559
 ]

ASF GitHub Bot logged work on BEAM-7103:


Author: ASF GitHub Bot
Created on: 09/May/19 01:26
Start Date: 09/May/19 01:26
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8342: [BEAM-7103] Adding 
AvroGenericCoder for cross-language data transfer
URL: https://github.com/apache/beam/pull/8342#issuecomment-490708363
 
 
   CC: @mxm 
 



Issue Time Tracking
---

Worklog Id: (was: 239559)
Time Spent: 1h  (was: 50m)

> Adding AvroGenericCoder for simple dict type cross-language data transfer
> -
>
> Key: BEAM-7103
> URL: https://issues.apache.org/jira/browse/BEAM-7103
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Adding AvroGenericCoder for simple dict type cross-language data transfer.





[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239550
 ]

ASF GitHub Bot logged work on BEAM-7253:


Author: ASF GitHub Bot
Created on: 09/May/19 01:16
Start Date: 09/May/19 01:16
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8537: [BEAM-7253] 
test_with_jar_packages_invalid_file_name test fails on Windows
URL: https://github.com/apache/beam/pull/8537#discussion_r282309479
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/stager.py
 ##
 @@ -200,10 +201,11 @@ def stage_job_resources(self,
 # Handle jar packages that should be staged for Java SDK Harness.
 jar_packages = options.view_as(
 DebugOptions).lookup_experiment('jar_packages')
+classpath_separator = ':' if platform.system() != 'Windows' else ';'
 
 Review comment:
   Can we use the same class path separator (';') for all platforms?
 



Issue Time Tracking
---

Worklog Id: (was: 239550)
Time Spent: 0.5h  (was: 20m)

> test_with_jar_packages_invalid_file_name test fails on windows
> --
>
> Key: BEAM-7253
> URL: https://issues.apache.org/jira/browse/BEAM-7253
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Heejong Lee
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> test_with_jar_packages_invalid_file_name test fails on Windows, possibly 
> because the classpath separator on Windows (";") differs from Linux (":").





[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239547=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239547
 ]

ASF GitHub Bot logged work on BEAM-7253:


Author: ASF GitHub Bot
Created on: 09/May/19 01:01
Start Date: 09/May/19 01:01
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8537: [BEAM-7253] 
test_with_jar_packages_invalid_file_name test fails on Windows
URL: https://github.com/apache/beam/pull/8537#issuecomment-490704176
 
 
   R: @aaltay 
 



Issue Time Tracking
---

Worklog Id: (was: 239547)
Time Spent: 20m  (was: 10m)

> test_with_jar_packages_invalid_file_name test fails on windows
> --
>
> Key: BEAM-7253
> URL: https://issues.apache.org/jira/browse/BEAM-7253
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Heejong Lee
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> test_with_jar_packages_invalid_file_name test fails on Windows, possibly 
> because the classpath separator on Windows (";") differs from Linux (":").





[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239544=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239544
 ]

ASF GitHub Bot logged work on BEAM-7253:


Author: ASF GitHub Bot
Created on: 09/May/19 00:57
Start Date: 09/May/19 00:57
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8537: [BEAM-7253] 
test_with_jar_packages_invalid_file_name test fails on Windows
URL: https://github.com/apache/beam/pull/8537
 
 
   Using ';' for classpath separator on Windows
   
   
   

[jira] [Created] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows

2019-05-08 Thread Heejong Lee (JIRA)
Heejong Lee created BEAM-7253:
-

 Summary: test_with_jar_packages_invalid_file_name test fails on 
windows
 Key: BEAM-7253
 URL: https://issues.apache.org/jira/browse/BEAM-7253
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Heejong Lee


test_with_jar_packages_invalid_file_name test fails on Windows, possibly 
because the classpath separator on Windows (";") differs from Linux (":").





[jira] [Work logged] (BEAM-7103) Adding AvroGenericCoder for simple dict type cross-language data transfer

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7103?focusedWorklogId=239539=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239539
 ]

ASF GitHub Bot logged work on BEAM-7103:


Author: ASF GitHub Bot
Created on: 09/May/19 00:43
Start Date: 09/May/19 00:43
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8342: [BEAM-7103] Adding 
AvroGenericCoder for cross-language data transfer
URL: https://github.com/apache/beam/pull/8342#issuecomment-485586933
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 239539)
Time Spent: 40m  (was: 0.5h)

> Adding AvroGenericCoder for simple dict type cross-language data transfer
> -
>
> Key: BEAM-7103
> URL: https://issues.apache.org/jira/browse/BEAM-7103
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Adding AvroGenericCoder for simple dict type cross-language data transfer.





[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB

2019-05-08 Thread Ahmed El.Hussaini (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835989#comment-16835989
 ] 

Ahmed El.Hussaini commented on BEAM-4567:
-

Sweet!

> Can't use mongo connector with Atlas MongoDB
> 
>
> Key: BEAM-4567
> URL: https://issues.apache.org/jira/browse/BEAM-4567
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-mongodb
>Affects Versions: 2.4.0
> Environment: Google Cloud Dataflow
>Reporter: Lucas de Sio Rosa
>Assignee: Ahmed El.Hussaini
>Priority: Major
>  Labels: mongodb
> Fix For: 2.12.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I can't use the MongoDB connector with a managed Atlas instance. The current 
> implementation makes use of splitVector, which is a high-privilege function 
> that cannot be assigned to any user in Atlas.
> An open Jira issue for MongoDB suggests using $sample and $bucketAuto to 
> circumvent this necessity.
> Following is the exception thrown (removed some identifiable information):
> Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: com.mongodb.MongoCommandException: Command failed with error 13: 'not authorized on  to execute command { splitVector: ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 }' on server . The full response is { "ok" : 0.0, "errmsg" : "not authorized on  to execute command { splitVector: \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 }", "code" : 13, "codeName" : "Unauthorized" }
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>  at br.dotz.datalake.ingest.mongodb.MongoDBCollectorPipeline.main(MongoDBCollectorPipeline.java:27)
> Caused by: com.mongodb.MongoCommandException: Command failed with error 13: 'not authorized on  to execute command { splitVector: ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 }' on server . The full response is { "ok" : 0.0, "errmsg" : "not authorized on  to execute command { splitVector: \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 }", "code" : 13, "codeName" : "Unauthorized" }
>  at com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115)
>  at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114)
>  at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
>  at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
>  at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173)
>  at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215)
>  at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:186)
>  at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:178)
>  at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:91)
>  at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
>  at com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
>  at com.mongodb.Mongo.execute(Mongo.java:772)
>  at com.mongodb.Mongo$2.execute(Mongo.java:759)
>  at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
>  at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
>  at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
>  at org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
>  at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:210)
>  at org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:87)
>  at org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:62)





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239538=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239538
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 09/May/19 00:38
Start Date: 09/May/19 00:38
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #8535: [BEAM-6693] 
ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#issuecomment-490700663
 
 
   Not ready to review.
 



Issue Time Tracking
---

Worklog Id: (was: 239538)
Time Spent: 1h 20m  (was: 1h 10m)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java
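
For readers unfamiliar with approximate distinct counting, the idea can be sketched in plain Python. This is only an illustration under stated assumptions: it uses a k-minimum-values sketch (keep the `sample_size` smallest 64-bit hashes and extrapolate from the largest of them), which is one common technique and is not claimed to match the Java transform or the code in the PR. The names `hash64` and `approximate_unique` are hypothetical.

```python
import hashlib
import heapq


def hash64(value):
    """Map a value to a deterministic 64-bit integer via its repr (assumption)."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def approximate_unique(elements, sample_size=1024):
    """Estimate the number of distinct elements with a k-minimum-values sketch."""
    if sample_size < 2:
        raise ValueError("sample_size must be at least 2")
    heap = []        # max-heap (values negated) holding the smallest hashes seen
    members = set()  # hashes currently in the heap, for duplicate detection
    for element in elements:
        h = hash64(element)
        if h in members:
            continue
        if len(heap) < sample_size:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < sample_size:
        # Fewer distinct hashes than the sample size: the count is exact.
        return len(heap)
    kth_smallest = -heap[0]
    # If k hashes fall below kth_smallest out of a 2**64 space, the full
    # space is expected to hold about (k - 1) * 2**64 / kth_smallest.
    return (sample_size - 1) * 2**64 // kth_smallest
```

Memory stays bounded by `sample_size` regardless of input size. A per-key variant would keep one such sketch per key.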





[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=239535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239535
 ]

ASF GitHub Bot logged work on BEAM-7138:


Author: ASF GitHub Bot
Created on: 09/May/19 00:32
Start Date: 09/May/19 00:32
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8396: [BEAM-7138] keep 
Java serialized coder in wire coder construction
URL: https://github.com/apache/beam/pull/8396
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 239535)
Time Spent: 3h  (was: 2h 50m)

> keep Java serialized coder in length-prefixed wire coder construction
> -
>
> Key: BEAM-7138
> URL: https://issues.apache.org/jira/browse/BEAM-7138
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Don't replace the Java serialized coder with a byte array coder in 
> length-prefixed wire coder construction.





[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=239534=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239534
 ]

ASF GitHub Bot logged work on BEAM-7138:


Author: ASF GitHub Bot
Created on: 09/May/19 00:32
Start Date: 09/May/19 00:32
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8396: [BEAM-7138] keep Java 
serialized coder in wire coder construction
URL: https://github.com/apache/beam/pull/8396#issuecomment-490699720
 
 
   Closing this PR. This PR does not fix the source of the problem.
 



Issue Time Tracking
---

Worklog Id: (was: 239534)
Time Spent: 3h  (was: 2h 50m)

> keep Java serialized coder in length-prefixed wire coder construction
> -
>
> Key: BEAM-7138
> URL: https://issues.apache.org/jira/browse/BEAM-7138
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Don't replace the Java serialized coder with a byte array coder in 
> length-prefixed wire coder construction.





[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB

2019-05-08 Thread Ahmed El.Hussaini (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835986#comment-16835986
 ] 

Ahmed El.Hussaini commented on BEAM-4567:
-

Hello [~jcornejo],

In order to use MongoDbIO with Atlas, you need to explicitly call 
`withBucketAuto(true)` when creating the `Read` object.
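
The general idea behind bucket-based splitting can be illustrated without a database. The following is a minimal pure-Python sketch, assuming the goal is to cut a sorted key space into roughly equal-sized ranges that can then be read independently. `bucket_boundaries` is a hypothetical helper; it is not part of MongoDbIO or the MongoDB driver, and it reports inclusive upper bounds for simplicity, which may differ from what `$bucketAuto` returns.

```python
def bucket_boundaries(sorted_keys, num_buckets):
    """Return (min_key, max_key, count) tuples for roughly equal-sized buckets."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be positive")
    total = len(sorted_keys)
    buckets = []
    start = 0
    for i in range(num_buckets):
        # Spread the remainder over the first buckets so sizes differ by at most 1.
        size = total // num_buckets + (1 if i < total % num_buckets else 0)
        if size == 0:
            break  # more buckets requested than keys available
        chunk = sorted_keys[start:start + size]
        buckets.append((chunk[0], chunk[-1], len(chunk)))
        start += size
    return buckets
```

Each resulting range could then back one split of the source, so no privileged command is needed to compute the boundaries.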

> Can't use mongo connector with Atlas MongoDB
> 
>
> Key: BEAM-4567
> URL: https://issues.apache.org/jira/browse/BEAM-4567
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-mongodb
>Affects Versions: 2.4.0
> Environment: Google Cloud Dataflow
>Reporter: Lucas de Sio Rosa
>Assignee: Ahmed El.Hussaini
>Priority: Major
>  Labels: mongodb
> Fix For: 2.12.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I can't use the MongoDB connector with a managed Atlas instance. The current 
> implementation makes use of splitVector, which is a high-privilege function 
> that cannot be assigned to any user in Atlas.
> An open Jira issue for MongoDB suggests using $sample and $bucketAuto to 
> circumvent this necessity.





[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=239533=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239533
 ]

ASF GitHub Bot logged work on BEAM-7138:


Author: ASF GitHub Bot
Created on: 09/May/19 00:28
Start Date: 09/May/19 00:28
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8396: [BEAM-7138] keep Java 
serialized coder in wire coder construction
URL: https://github.com/apache/beam/pull/8396#issuecomment-490699040
 
 
   > Reads are deliberately translated by the Runner to be able to support 
unbounded sources.
   
   This might not be true for the portability framework. I don't know whether 
an unbounded `Read` transform works on the Flink portable runner or not. 
But if it does, I think it's because 
`FlinkStreamingPortablePipelineTranslator` does not use `WireCoder` for 
translating the `Read` transform. In the case of 
`FlinkBatchPortablePipelineTranslator`, it uses
   ```
   outputCoder = WireCoders.instantiateRunnerWireCoder(collectionNode, 
pipeline.getComponents());
   ```
   When a `Read` transform run by the Flink runner itself produces a 
`PCollection` of elements that should be encoded with `SerializableCoder`, it 
will throw an exception because `SerializableCoder` is not supported by a 
runner wire coder.
   
   If we generate elements with SDF, everything should be okay, since the source 
`PCollection` is the output of a `DoFn` and the SDK harness supports any coder.
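
The reason a runner can move elements whose coder it does not understand is length prefixing: the SDK side encodes each element, and the runner only handles opaque byte strings with a length in front. A minimal sketch of that framing, assuming a base-128 varint length prefix; all helper names are hypothetical and this is not Beam's coder code:

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode a varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def length_prefix(payload):
    """Frame already-encoded element bytes so they can be split without decoding."""
    return encode_varint(len(payload)) + payload


def split_length_prefixed(stream):
    """Recover the opaque payloads from a concatenation of framed elements."""
    payloads = []
    pos = 0
    while pos < len(stream):
        size, pos = decode_varint(stream, pos)
        payloads.append(stream[pos:pos + size])
        pos += size
    return payloads
```

With this framing the receiving side never has to interpret the payload, which is why replacing the element coder with a plain byte-array coder is normally safe on the runner side.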
 



Issue Time Tracking
---

Worklog Id: (was: 239533)
Time Spent: 2h 50m  (was: 2h 40m)

> keep Java serialized coder in length-prefixed wire coder construction
> -
>
> Key: BEAM-7138
> URL: https://issues.apache.org/jira/browse/BEAM-7138
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Don't replace the Java serialized coder with a byte array coder in 
> length-prefixed wire coder construction.





[jira] [Work logged] (BEAM-7143) adding withConsumerConfigUpdates

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7143?focusedWorklogId=239537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239537
 ]

ASF GitHub Bot logged work on BEAM-7143:


Author: ASF GitHub Bot
Created on: 09/May/19 00:35
Start Date: 09/May/19 00:35
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8398: [BEAM-7143] adding 
withConsumerConfigUpdates
URL: https://github.com/apache/beam/pull/8398#issuecomment-490700040
 
 
   run java precommit
 



Issue Time Tracking
---

Worklog Id: (was: 239537)
Time Spent: 40m  (was: 0.5h)

> adding withConsumerConfigUpdates
> 
>
> Key: BEAM-7143
> URL: https://issues.apache.org/jira/browse/BEAM-7143
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kafka
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> To modify `ConsumerConfig` for the main consumer, we use 
> `updateConsumerProperties`. However, to modify `ConsumerConfig` for the offset 
> consumer, the right method is `withOffsetConsumerConfigOverrides`. It would 
> be good to align the two names to improve usability.





[jira] [Work logged] (BEAM-7143) adding withConsumerConfigUpdates

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7143?focusedWorklogId=239536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239536
 ]

ASF GitHub Bot logged work on BEAM-7143:


Author: ASF GitHub Bot
Created on: 09/May/19 00:34
Start Date: 09/May/19 00:34
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8398: [BEAM-7143] adding 
withConsumerConfigUpdates
URL: https://github.com/apache/beam/pull/8398#issuecomment-490700040
 
 
   run java precommit
 



Issue Time Tracking
---

Worklog Id: (was: 239536)
Time Spent: 0.5h  (was: 20m)

> adding withConsumerConfigUpdates
> 
>
> Key: BEAM-7143
> URL: https://issues.apache.org/jira/browse/BEAM-7143
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kafka
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> To modify `ConsumerConfig` for the main consumer, we use 
> `updateConsumerProperties`. However, to modify `ConsumerConfig` for the offset 
> consumer, the right method is `withOffsetConsumerConfigOverrides`. It would 
> be good to align the two names to improve usability.





[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB

2019-05-08 Thread Javier Cornejo (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835985#comment-16835985
 ] 

Javier Cornejo commented on BEAM-4567:
--

I am sorry, I already did it. I had to use aggregation. Thanks again!!

> Can't use mongo connector with Atlas MongoDB
> 
>
> Key: BEAM-4567
> URL: https://issues.apache.org/jira/browse/BEAM-4567
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-mongodb
>Affects Versions: 2.4.0
> Environment: Google Cloud Dataflow
>Reporter: Lucas de Sio Rosa
>Assignee: Ahmed El.Hussaini
>Priority: Major
>  Labels: mongodb
> Fix For: 2.12.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I can't use the MongoDB connector with a managed Atlas instance. The current 
> implementation makes use of splitVector, which is a high-privilege function 
> that cannot be assigned to any user in Atlas.
> An open Jira issue for MongoDB suggests using $sample and $bucketAuto to 
> circumvent this necessity.





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239529=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239529
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 09/May/19 00:17
Start Date: 09/May/19 00:17
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #8535: 
[BEAM-6693] ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282301017
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context):
   @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None)
   def from_runner_api_parameter(unused_parameter, unused_context):
     return Impulse()
+
+
+class ApproximateUniqueGlobally(PTransform):
 
 Review comment:
   Does `stats.py` sound good?
 



Issue Time Tracking
---

Worklog Id: (was: 239529)
Time Spent: 1h 10m  (was: 1h)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239528=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239528
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 09/May/19 00:17
Start Date: 09/May/19 00:17
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #8535: 
[BEAM-6693] ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282301017
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context):
   @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None)
   def from_runner_api_parameter(unused_parameter, unused_context):
     return Impulse()
+
+
+class ApproximateUniqueGlobally(PTransform):
 
 Review comment:
   Does stats.py sound good?
 



Issue Time Tracking
---

Worklog Id: (was: 239528)
Time Spent: 1h  (was: 50m)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239531=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239531
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 09/May/19 00:22
Start Date: 09/May/19 00:22
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282301351
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py
 ##
 @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
         native_type_compatibility.convert_to_beam_types(typing_types),
         beam_types)
 
+  def test_is_sub_class(self):
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Dict,
+        derived=typing.Dict[bytes, int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.List[bytes]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Set,
+        derived=typing.Set[int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Set[float]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Tuple,
+        derived=typing.Tuple[int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Tuple[bytes]))
+
+  @unittest.skipIf(sys.version_info >= (2, 7, 0),
 
 Review comment:
   Why is it correct for this function to return different results on Py2 and 
Py3?
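
A side note on the `skipIf` condition in the quoted diff: `sys.version_info` compares like a tuple, so `sys.version_info >= (2, 7, 0)` holds on Python 2.7 and on every 3.x release, which means the decorated test would be skipped on all of them. A small sketch of the comparison, assuming the intent was to skip on Python 3 only; the function names are hypothetical:

```python
import sys


def skip_condition_from_diff(version_info):
    """Condition as written in the diff: true for 2.7.0 and everything newer."""
    return tuple(version_info[:3]) >= (2, 7, 0)


def skip_on_py3_only(version_info):
    """Assumed intent: skip the Python 2 specific check on Python 3 and later."""
    return version_info[0] >= 3


if __name__ == "__main__":
    # On any currently supported interpreter the first condition is True.
    print(skip_condition_from_diff(sys.version_info))
    print(skip_on_py3_only(sys.version_info))
```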
 



Issue Time Tracking
---

Worklog Id: (was: 239531)
Time Spent: 3h 10m  (was: 3h)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_types 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing. != `. eg:
> {noformat}
>  FAIL: test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  --
>  Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py",
>  line 79, in test_convert_to_beam_type
>  beam_type, description)
>  AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>  
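
Some background that may help when reading this failure: on Python 3.7 and later, subscripted `typing` generics such as `typing.Dict[bytes, int]` are alias objects rather than classes, and passing them to the built-in `issubclass` raises `TypeError`. One possible workaround, shown here only as a sketch and not as the code in the PR, is to compare the underlying runtime classes through the `__origin__` attribute. The helper name `safe_issubclass_sketch` is hypothetical.

```python
import typing


def _runtime_class(tp):
    """Return the runtime class behind a typing alias, or tp itself."""
    origin = getattr(tp, "__origin__", None)
    return origin if origin is not None else tp


def safe_issubclass_sketch(derived, parent):
    """Compare typing generics by their runtime classes (Python 3.7+ assumed).

    Type parameters are ignored, so typing.Dict[bytes, int] and typing.Dict
    are treated as the same container kind regardless of argument order.
    """
    try:
        return issubclass(_runtime_class(derived), _runtime_class(parent))
    except TypeError:
        # Something that is neither a class nor a class-backed alias.
        return False
```

Because parameters are dropped, this sketch cannot distinguish `typing.List[int]` from `typing.List[bytes]`; that limitation is deliberate here and may not be acceptable for the real conversion code.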





[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239530=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239530
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 09/May/19 00:22
Start Date: 09/May/19 00:22
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282301745
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py
 ##
 @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
         native_type_compatibility.convert_to_beam_types(typing_types),
         beam_types)
 
+  def test_is_sub_class(self):
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Dict,
+        derived=typing.Dict[bytes, int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.List[bytes]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Set,
+        derived=typing.Set[int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Set[float]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Tuple,
+        derived=typing.Tuple[int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Tuple[bytes]))
+
+  @unittest.skipIf(sys.version_info >= (2, 7, 0),
+                   'Order dosent matter in python 3')
+  def test_is_sub_class_order(self):
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Dict[bytes, int],
+        derived=typing.Dict))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.List[bytes],
+        derived=typing.List))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Set[int],
+        derived=typing.Set))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Tuple[int],
+        derived=typing.Tuple))
+
+  @unittest.skipIf(sys.version_info.major != '3',
 
 Review comment:
   Comparing against the exact major version 3 is discouraged, as one day there 
may be a Python 4; see also: 
https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection
   
   `if sys.version_info.major < 3` would be better.
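A side note on the decorator under review: `sys.version_info.major` is an integer, so it never equals the string `'3'`, and the `!=` test is true on every Python version. A minimal sketch of the difference, using hypothetical helper names and plain tuples in place of `sys.version_info`:

```python
import sys


def skip_condition_as_written(version_info):
    # Mirrors `sys.version_info.major != '3'`: an int never equals a str,
    # so this is True on every Python version.
    return version_info[0] != '3'


def is_python2(version_info):
    # Mirrors the reviewer's suggestion: true only for major versions below 3.
    return version_info[0] < 3


if __name__ == '__main__':
    print(skip_condition_as_written(sys.version_info))
    print(is_python2(sys.version_info))
```

With the comparison written against the integer range, a future major version is treated like Python 3 rather than like Python 2.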
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 239530)
Time Spent: 3h 10m  (was: 3h)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_types 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing.<type> != <type>`, e.g.:
> {noformat}
>  FAIL: test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  --
>  Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py",
>  line 79, in test_convert_to_beam_type
>  beam_type, description)
>  AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>  




[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239532
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 09/May/19 00:22
Start Date: 09/May/19 00:22
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282299000
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py
 ##
 @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
 native_type_compatibility.convert_to_beam_types(typing_types),
 beam_types)
 
+  def test_is_sub_class(self):
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.Dict,
+derived=typing.Dict[bytes, int]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.Dict[bytes, int]))
+self.assertTrue(native_type_compatibility._safe_issubclass(
+parent=typing.List,
+derived=typing.List[bytes]))
+self.assertFalse(native_type_compatibility._safe_issubclass(
 
 Review comment:
   This is a duplicate of line 105.
   Also, it is sufficient to cover the distinct classes of use cases for the 
function. 
   For example, between
   ```
   self.assertTrue(native_type_compatibility._safe_issubclass(
   parent=typing.Set,
   derived=typing.Set[int]))
   ```
   and
   ```
   self.assertTrue(native_type_compatibility._safe_issubclass(
   parent=typing.Tuple,
   derived=typing.Tuple[int]))
   ```
   I would keep only one scenario.
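One way to keep a single scenario per class of use case is a small table of cases. The sketch below assumes Python 3.8+ (for `typing.get_origin`) and uses a hypothetical stand-in, `origin_is_subclass`, which compares the runtime origins of two hints; it is not Beam's `_safe_issubclass` and may differ from it in behaviour.

```python
import typing


def origin_is_subclass(parent, derived):
    """Hypothetical stand-in: compare the runtime origins of two typing hints."""
    parent_origin = typing.get_origin(parent) or parent
    derived_origin = typing.get_origin(derived) or derived
    return issubclass(derived_origin, parent_origin)


# One row per class of use case: (parent, derived, expected result).
CASES = [
    (typing.Dict, typing.Dict[bytes, int], True),   # same container, parametrized
    (typing.List, typing.Dict[bytes, int], False),  # different containers
]


def run_cases(cases=CASES):
    """Return the rows whose actual result differs from the expected one."""
    return [(parent, derived, expected)
            for parent, derived, expected in cases
            if origin_is_subclass(parent, derived) is not expected]


if __name__ == '__main__':
    print(run_cases())
```

Adding a new class of use case then means adding one row rather than another near-identical assertion.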
 



Issue Time Tracking
---

Worklog Id: (was: 239532)
Time Spent: 3h 10m  (was: 3h)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>





[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer

2019-05-08 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835981#comment-16835981
 ] 

Ahmet Altay commented on BEAM-7252:
---

Got it. Who would be a good owner for this? 

> "beam:java:boundedsource" not supported with python optimizer
> -
>
> Key: BEAM-7252
> URL: https://issues.apache.org/jira/browse/BEAM-7252
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
>
> python pipeline optimizer does not handle external transforms.
>  
> Relevant error stack
> ==
> ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized)
> --
> Traceback (most recent call last):
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py",
>  line 174, in test_external_transforms
>  assert_that(res, equal_to([i for i in range(1, 10)]))
>  File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in 
> __exit__
>  self.run().wait_until_finish()
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 436, in wait_until_finish
>  self._job_id, self._state, self._last_error_message()))
> RuntimeError: Pipeline 
> test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 
> failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
> harness for instruction 4: Traceback (most recent call last):
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 157, in _execute
>  response = task()
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 190, in <lambda>
>  self._execute(lambda: worker.do_instruction(work), work)
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 333, in do_instruction
>  request.instruction_id)
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 353, in process_bundle
>  instruction_id, request.process_bundle_descriptor_reference)
>  File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
> line 305, in get
>  self.data_channel_factory)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 501, in __init__
>  self.ops = self.create_execution_tree(self.process_bundle_descriptor)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 545, in create_execution_tree
>  descriptor.transforms, key=topological_height, reverse=True)])
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 467, in wrapper
>  result = cache[args] = func(*args)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 528, in get_operation
>  in descriptor.transforms[transform_id].outputs.items()
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 527, in 
>  for tag, pcoll_id
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 467, in wrapper
>  result = cache[args] = func(*args)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 531, in get_operation
>  transform_id, transform_consumers)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 790, in create_operation
>  return creator(self, transform_id, transform_proto, payload, consumers)
>  File 
> "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
> line 957, in create
>  parameter.source, factory.context),
>  File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in 
> from_runner_api
>  parameter_type, constructor = cls._known_urns[fn_proto.spec.urn]
> KeyError: u'urn:beam:java:boundedsource:v1'
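The last frame shows a dictionary lookup keyed by URN failing. A minimal sketch of that failure mode, using a hypothetical registry (`KNOWN_URNS`, `register_urn`, `from_urn` and the example URN are illustrative and not Beam's implementation):

```python
KNOWN_URNS = {}  # hypothetical registry: URN string -> constructor


def register_urn(urn, constructor):
    """Associate a URN with the callable that builds an object from a payload."""
    KNOWN_URNS[urn] = constructor


def from_urn(urn, payload):
    # A plain dict lookup raises KeyError for any URN that was never registered.
    constructor = KNOWN_URNS[urn]
    return constructor(payload)


# Only a Python-side URN is registered in this sketch.
register_urn('urn:example:python_source:v1',
             lambda payload: ('python_source', payload))


if __name__ == '__main__':
    print(from_urn('urn:example:python_source:v1', b'data'))
    try:
        from_urn('urn:beam:java:boundedsource:v1', b'')
    except KeyError as err:
        print('unregistered URN:', err)
```

In this sketch a URN produced by another SDK fails at lookup time simply because nothing on the Python side registered it.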





[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB

2019-05-08 Thread Javier Cornejo (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835978#comment-16835978
 ] 

Javier Cornejo commented on BEAM-4567:
--

Hello [~iemejia]

I can't see how [BEAM-6241] solved the problem. I use Filters and limit, and 
MongoIO is still using the splitVector command. Could you give me a hand? 

Thanks for the great job.

Regards

> Can't use mongo connector with Atlas MongoDB
> 
>
> Key: BEAM-4567
> URL: https://issues.apache.org/jira/browse/BEAM-4567
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-mongodb
>Affects Versions: 2.4.0
> Environment: Google Cloud Dataflow
>Reporter: Lucas de Sio Rosa
>Assignee: Ahmed El.Hussaini
>Priority: Major
>  Labels: mongodb
> Fix For: 2.12.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I can't use the MongoDB connector with a managed Atlas instance. The current 
> implementation makes use of splitVector, which is a high-privilege function 
> that cannot be assigned to any user in Atlas.
> An open Jira issue for MongoDB suggests using $sample and $bucketAuto to 
> circumvent this necessity.
> The exception thrown follows (some identifiable information removed):
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> com.mongodb.MongoCommandException: Command failed with error 13: 'not 
> authorized on  to execute command \{ splitVector: 
> ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 
> }' on server . The full response is \{ "ok" : 0.0, "errmsg" : "not 
> authorized on  to execute command { splitVector: 
> \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: 
> 1 }", "code" : 13, "codeName" : "Unauthorized" }
>  
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>  
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>  
>  at 
> br.dotz.datalake.ingest.mongodb.MongoDBCollectorPipeline.main(MongoDBCollectorPipeline.java:27)
>  
> Caused by: com.mongodb.MongoCommandException: Command failed with error 13: 
> 'not authorized on  to execute command \{ splitVector: 
> ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 
> }' on server . The full response is \{ "ok" : 0.0, "errmsg" : "not 
> authorized on  to execute command { splitVector: 
> \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: 
> 1 }", "code" : 13, "codeName" : "Unauthorized" }
>  
>  at 
> com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115)
>  
>  at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114)
>  
>  at 
> com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
>  
>  at 
> com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
>  
>  at 
> com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173)
>  
>  at 
> com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215)
>  
>  at 
> com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:186)
>  
>  at 
> com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:178)
>  
>  at 
> com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:91)
>  
>  at 
> com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
>  
>  at 
> com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
>  
>  at com.mongodb.Mongo.execute(Mongo.java:772)
>  
>  at com.mongodb.Mongo$2.execute(Mongo.java:759)
>  
>  at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
>  
>  at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
>  
>  at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
>  
>  at 
> org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
>  
>  at 
> org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:210)
>  
>  at 
> org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:87)
>  
>  at 
> org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:62)
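The `$bucketAuto` suggestion mentioned in the description can be sketched as follows. This is only an illustration of the aggregation stage, not how MongoDbIO computes its splits: the pipeline is built as plain Python data, and `bucket_auto_pipeline` and `buckets_to_filters` are hypothetical helper names. The sketch relies on the documented `$bucketAuto` output, where each bucket has an `_id` with an inclusive `min` and a `max` that is exclusive except in the final bucket.

```python
def bucket_auto_pipeline(num_splits):
    """Aggregation pipeline that groups documents into num_splits buckets by _id."""
    return [{'$bucketAuto': {'groupBy': '$_id', 'buckets': num_splits}}]


def buckets_to_filters(buckets):
    """Turn $bucketAuto output documents into one range filter per bucket."""
    filters = []
    for index, bucket in enumerate(buckets):
        bounds = bucket['_id']
        is_last = index == len(buckets) - 1
        # The upper bound is exclusive, except for the final bucket.
        upper_op = '$lte' if is_last else '$lt'
        filters.append({'_id': {'$gte': bounds['min'], upper_op: bounds['max']}})
    return filters


if __name__ == '__main__':
    # Hand-written sample of what the stage could return for ids 0..9.
    sample = [
        {'_id': {'min': 0, 'max': 5}, 'count': 5},
        {'_id': {'min': 5, 'max': 9}, 'count': 5},
    ]
    print(bucket_auto_pipeline(2))
    print(buckets_to_filters(sample))
```

Each resulting filter could then drive one independent read, which is the role the splitVector ranges play in the failing code path.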





[jira] [Resolved] (BEAM-7102) Adding `jar_packages` experiment option for Python SDK

2019-05-08 Thread Heejong Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee resolved BEAM-7102.
---
   Resolution: Fixed
Fix Version/s: 2.13.0

> Adding `jar_packages` experiment option for Python SDK
> --
>
> Key: BEAM-7102
> URL: https://issues.apache.org/jira/browse/BEAM-7102
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Adding `jar_packages` experiment option for Python SDK for staging Jar 
> artifacts from Python pipeline. This is required for running cross-language 
> transforms.





[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer

2019-05-08 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835972#comment-16835972
 ] 

Ankur Goenka commented on BEAM-7252:


The Python pipeline optimizer was not tested earlier. I am adding tests for it in 
PR [https://github.com/apache/beam/pull/8488].

The bug is specific to Python pipelines optimized with the experimental flag 
"peroptimize=all".

> "beam:java:boundedsource" not supported with python optimizer
> -
>
> Key: BEAM-7252
> URL: https://issues.apache.org/jira/browse/BEAM-7252
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
>
> python pipeline optimizer does not handle external transforms.





[jira] [Comment Edited] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer

2019-05-08 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835972#comment-16835972
 ] 

Ankur Goenka edited comment on BEAM-7252 at 5/8/19 11:52 PM:
-

The Python pipeline optimizer was not tested earlier. I am adding tests for it in 
PR [https://github.com/apache/beam/pull/8488].

The bug is specific to Python pipelines optimized with the experimental flag 
"peroptimize=all".

I am planning to disable this test; we can re-enable it when we resolve this 
Jira.


was (Author: angoenka):
Python pipeline optimizer was not tested earlier. I am adding tests for it in 
PR [https://github.com/apache/beam/pull/8488]

The bug is specifically for Python optimized pipelines using experimental flag 
"peroptimize=all"

> "beam:java:boundedsource" not supported with python optimizer
> -
>
> Key: BEAM-7252
> URL: https://issues.apache.org/jira/browse/BEAM-7252
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
>
> python pipeline optimizer does not handle external transforms.





[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer

2019-05-08 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835969#comment-16835969
 ] 

Ahmet Altay commented on BEAM-7252:
---

Is this a newly failing test? Do you have a link?

> "beam:java:boundedsource" not supported with python optimizer
> -
>
> Key: BEAM-7252
> URL: https://issues.apache.org/jira/browse/BEAM-7252
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
>
> python pipeline optimizer does not handle external transforms.





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239515
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 08/May/19 23:42
Start Date: 08/May/19 23:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8535: [BEAM-6693] 
ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282294788
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -518,12 +525,12 @@ def _process_argspec_fn(self):
   def is_process_bounded(self):
 """Checks if an object is a bound method on an instance."""
 if not isinstance(self.process, types.MethodType):
-  return False # Not a method
+  return False  # Not a method
 
 Review comment:
   How did you reformat the code? As long as the linter still passes, this is fine. 
However, it is generally easier to review when format changes are separated from 
other changes.
 



Issue Time Tracking
---

Worklog Id: (was: 239515)
Time Spent: 40m  (was: 0.5h)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239516
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 08/May/19 23:42
Start Date: 08/May/19 23:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8535: [BEAM-6693] 
ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282295018
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context):
   @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None)
   def from_runner_api_parameter(unused_parameter, unused_context):
     return Impulse()
+
+
+class ApproximateUniqueGlobally(PTransform):
 
 Review comment:
   That makes sense. Also, core.py is probably not the right place anyway.
 



Issue Time Tracking
---

Worklog Id: (was: 239516)
Time Spent: 50m  (was: 40m)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239514
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 08/May/19 23:42
Start Date: 08/May/19 23:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8535: [BEAM-6693] 
ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282294445
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -125,6 +125,7 @@ def get_version():
 'pyvcf>=0.6.8,<0.7.0; python_version < "3.0"',
 'pyyaml>=3.12,<4.0.0',
 'typing>=3.6.0,<3.7.0; python_version < "3.5.0"',
+'mmh3>=2.5.1; python_version >= "2.7"',
 
 Review comment:
   A few questions:
   - What is this dependency? The PyPI page says this is a wrapper. Does it 
require other things to be installed? Is this the only option we have?
   - You do not need the python_version >= "2.7" part, because all our SDKs 
only support py >= 2.7.
   - Can you add an upper bound here? Maybe <3.0.0.
 



Issue Time Tracking
---

Worklog Id: (was: 239514)
Time Spent: 0.5h  (was: 20m)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Created] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer

2019-05-08 Thread Ankur Goenka (JIRA)
Ankur Goenka created BEAM-7252:
--

 Summary: "beam:java:boundedsource" not supported with python 
optimizer
 Key: BEAM-7252
 URL: https://issues.apache.org/jira/browse/BEAM-7252
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Ankur Goenka


The Python pipeline optimizer does not handle external transforms.

 

Relevant error stack


==
ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized)
--
Traceback (most recent call last):
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py",
 line 174, in test_external_transforms
 assert_that(res, equal_to([i for i in range(1, 10)]))
 File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in 
__exit__
 self.run().wait_until_finish()
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py",
 line 436, in wait_until_finish
 self._job_id, self._state, self._last_error_message()))
RuntimeError: Pipeline 
test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 
failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
harness for instruction 4: Traceback (most recent call last):
 File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
line 157, in _execute
 response = task()
 File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
line 190, in 
 self._execute(lambda: worker.do_instruction(work), work)
 File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
line 333, in do_instruction
 request.instruction_id)
 File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
line 353, in process_bundle
 instruction_id, request.process_bundle_descriptor_reference)
 File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", 
line 305, in get
 self.data_channel_factory)
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 501, in __init__
 self.ops = self.create_execution_tree(self.process_bundle_descriptor)
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 545, in create_execution_tree
 descriptor.transforms, key=topological_height, reverse=True)])
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 467, in wrapper
 result = cache[args] = func(*args)
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 528, in get_operation
 in descriptor.transforms[transform_id].outputs.items()
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 527, in 
 for tag, pcoll_id
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 467, in wrapper
 result = cache[args] = func(*args)
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 531, in get_operation
 transform_id, transform_consumers)
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 790, in create_operation
 return creator(self, transform_id, transform_proto, payload, consumers)
 File 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", 
line 957, in create
 parameter.source, factory.context),
 File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in 
from_runner_api
 parameter_type, constructor = cls._known_urns[fn_proto.spec.urn]
KeyError: u'urn:beam:java:boundedsource:v1'
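
The last frame of the traceback is a plain dictionary lookup keyed by URN, so a URN that was never registered surfaces as a bare KeyError. A minimal sketch of that registry pattern, with a more descriptive error for the unknown case, might look like the following. `UrnRegistry`, `register` and `from_urn` are hypothetical names for illustration and are not Beam's API; only the `_known_urns` attribute name is borrowed from the traceback.

```python
class UrnRegistry(object):
    """Hypothetical registry mapping URN strings to constructor callables."""

    def __init__(self):
        self._known_urns = {}

    def register(self, urn, constructor):
        """Associate a URN with the callable that rebuilds an object from a payload."""
        self._known_urns[urn] = constructor

    def from_urn(self, urn, payload):
        """Look up the constructor for a URN and apply it to the payload."""
        try:
            constructor = self._known_urns[urn]
        except KeyError:
            # Without this branch the caller only sees KeyError: '<urn>'.
            raise ValueError(
                "No constructor registered for URN %r; known URNs: %s"
                % (urn, sorted(self._known_urns)))
        return constructor(payload)


registry = UrnRegistry()
registry.register("urn:example:known:v1", lambda payload: ("known", payload))
```

With such a registry, a transform written in another SDK would need either a registered stand-in on the Python side or a code path that leaves it untouched; which of the two applies here is not stated in the report.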





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239511=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239511
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 08/May/19 23:32
Start Date: 08/May/19 23:32
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #8535: 
[BEAM-6693] ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282293233
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context):
   @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None)
   def from_runner_api_parameter(unused_parameter, unused_context):
     return Impulse()
+
+
+class ApproximateUniqueGlobally(PTransform):
 
 Review comment:
   Since the file is growing, would it be better to move each transform into 
its own file? I am happy to make that change.
 



Issue Time Tracking
---

Worklog Id: (was: 239511)
Time Spent: 20m  (was: 10m)

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239509
 ]

ASF GitHub Bot logged work on BEAM-6693:


Author: ASF GitHub Bot
Created on: 08/May/19 23:30
Start Date: 08/May/19 23:30
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #8535: 
[BEAM-6693] ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282292979
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -518,12 +525,12 @@ def _process_argspec_fn(self):
   def is_process_bounded(self):
     """Checks if an object is a bound method on an instance."""
     if not isinstance(self.process, types.MethodType):
-      return False # Not a method
+      return False  # Not a method
 
 Review comment:
   The formatting of some lines changed after I reformatted the code.
   The actual code changes start from line 2289.
 



Issue Time Tracking
---

Worklog Id: (was: 239509)
Time Spent: 10m
Remaining Estimate: 0h

> ApproximateUnique transform for Python SDK
> --
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Hannah Jiang
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a 
> PCollection and the number of distinct values associated with each key in a 
> PCollection of KVs.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java





[jira] [Commented] (BEAM-7131) Spark portable runner appears to be repeating work (in TFX example)

2019-05-08 Thread Kyle Weaver (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835962#comment-16835962
 ] 

Kyle Weaver commented on BEAM-7131:
---

[~robertwb] Narrowed it down to the simplest pipeline that exhibits this 
behavior:

If I have pcolls B and C that both depend on A, the Spark portable runner will 
compute A B C A B C (whereas the Flink and legacy Spark runners compute only 
once, A B C).
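
The shape of that pipeline, and the difference between recomputing a shared input for every consumer and computing it once, can be modelled without any runner. This is a toy model only: `Node` and `evaluate` are hypothetical names, and it says nothing about how the Spark portable runner actually schedules work. In particular, the toy recomputes only A, whereas the behavior reported above repeats all of A B C.

```python
class Node(object):
    """Hypothetical stand-in for a PCollection: a name, a function, its inputs."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = tuple(inputs)


def evaluate(outputs, cache_results):
    """Evaluate each requested output and return the order of computations."""
    order = []
    cache = {}

    def compute(node):
        # With caching on, a node that was already computed is reused.
        if cache_results and node.name in cache:
            return cache[node.name]
        args = [compute(dep) for dep in node.inputs]
        value = node.fn(*args)
        order.append(node.name)
        cache[node.name] = value
        return value

    for node in outputs:
        compute(node)
    return order


# B and C both depend on A.
a = Node("A", lambda: [1, 2, 3])
b = Node("B", lambda xs: [x + 1 for x in xs], [a])
c = Node("C", lambda xs: [x * 2 for x in xs], [a])

without_cache = evaluate([b, c], cache_results=False)
with_cache = evaluate([b, c], cache_results=True)
```

Comparing `without_cache` with `with_cache` shows the extra evaluation of the shared input when nothing is reused.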

> Spark portable runner appears to be repeating work (in TFX example)
> ---
>
> Key: BEAM-7131
> URL: https://issues.apache.org/jira/browse/BEAM-7131
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>
> I've been trying to run the TFX Chicago taxi example [1] on the Spark 
> portable runner. TFDV works fine, but the preprocess step 
> (preprocess_flink.sh [2]) fails with the following error:
> RuntimeError: AlreadyExistsError: file already exists [while running 
> 'WriteTransformFn/WriteTransformFn']
> Assets are being written multiple times to different temp directories, which 
> is okay, but the error occurs when they are copied to the same permanent 
> output directory. Specifically, the copy tree operation in transform_fn_io.py 
> [3] is run twice with the same output directory. The error doesn't occur when 
> that code is modified to allow overwriting existing files, but that's only a 
> shallow fix. While the TF transform should probably be made idempotent, this 
> is also an issue with the Spark runner, which shouldn't be repeating work 
> like this regularly (in the absence of a failure condition).
> [1] [https://github.com/tensorflow/tfx/tree/master/tfx/examples/chicago_taxi]
> [2] 
> [https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi/preprocess_flink.sh]
> [3] 
> [https://github.com/tensorflow/transform/blob/master/tensorflow_transform/beam/tft_beam_io/transform_fn_io.py#L33-L45]





[jira] [Assigned] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-7251:
---

Assignee: Udi Meiri

> Testing BigQuery client fails queries if job results aren't immediately 
> available
> -
>
> Key: BEAM-7251
> URL: https://issues.apache.org/jira/browse/BEAM-7251
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Correction: the test client is using a synchronous query with a default 
> timeout of 10s: 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
> This matches the timestamps below (5:29:19 to 5:29:29).
> Also note that this method only returns the first page of results.
> ---
> Adding functionality to fetch query results should solve this issue, which is 
> probably causing test flakiness.
> Log:
> May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
> checkForPAssertSuccess
> INFO: Success result for Dataflow job 
> 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 
> expected assertions.
> May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher 
> matchesSafely
> INFO: Verifying Bigquery data
> May 05, 2019 5:29:29 PM 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
> SEVERE: 
> testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
> java.lang.AssertionError: 
> Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
>  but: The query job hasn't completed. Got response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145)
> Exception in thread "main" java.lang.IllegalStateException: Tests failed, 
> check output logs for details.
>   at 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154)
> But checking BQ logs on the console 

[jira] [Updated] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-7251:

Status: Open  (was: Triage Needed)

> Testing BigQuery client fails queries if job results aren't immediately 
> available
> -
>
> Key: BEAM-7251
> URL: https://issues.apache.org/jira/browse/BEAM-7251
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Correction: the test client is using a synchronous query with a default 
> timeout of 10s: 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
> This matches the timestamps below (5:29:19 to 5:29:29).
> Also note that this method only returns the first page of results.
> ---
> Adding functionality to fetch query results should solve this issue, which is 
> probably causing test flakiness.
> Log:
> May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
> checkForPAssertSuccess
> INFO: Success result for Dataflow job 
> 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 
> expected assertions.
> May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher 
> matchesSafely
> INFO: Verifying Bigquery data
> May 05, 2019 5:29:29 PM 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
> SEVERE: 
> testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
> java.lang.AssertionError: 
> Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
>  but: The query job hasn't completed. Got response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145)
> Exception in thread "main" java.lang.IllegalStateException: Tests failed, 
> check output logs for details.
>   at 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154)
> But checking BQ logs on the console reveals that the query job 

[jira] [Work logged] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7251?focusedWorklogId=239505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239505
 ]

ASF GitHub Bot logged work on BEAM-7251:


Author: ASF GitHub Bot
Created on: 08/May/19 23:26
Start Date: 08/May/19 23:26
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8536: [BEAM-7251] 
Increase timeout for test BQ queries.
URL: https://github.com/apache/beam/pull/8536
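
For context, the issue describes a synchronous query call that returns after a default timeout of 10s with "jobComplete": false, and suggests fetching the query results afterwards. A minimal sketch of that polling idea follows. It is not the change made in this pull request, which raises the timeout, and `wait_for_query`, `run_query` and `get_results` are hypothetical callables standing in for a real client. Only the `jobComplete` and `jobReference.jobId` fields are taken from the response shown in the issue.

```python
import time


def wait_for_query(run_query, get_results, timeout_s=60.0,
                   poll_interval_s=1.0, sleep=time.sleep,
                   clock=time.monotonic):
    """Run a query, then poll until the response reports jobComplete.

    run_query() and get_results(job_id) are assumed to return dicts shaped
    like the response in the issue. Raises TimeoutError once timeout_s
    has elapsed without the job completing.
    """
    deadline = clock() + timeout_s
    response = run_query()
    while not response.get("jobComplete", False):
        job_id = response["jobReference"]["jobId"]
        if clock() >= deadline:
            raise TimeoutError(
                "query job %s did not complete within %ss"
                % (job_id, timeout_s))
        sleep(poll_interval_s)
        response = get_results(job_id)
    return response
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in a test without real waiting. Paging through results, which the issue also mentions, is not covered by this sketch.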
 
 

[jira] [Work logged] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7251?focusedWorklogId=239506=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239506
 ]

ASF GitHub Bot logged work on BEAM-7251:


Author: ASF GitHub Bot
Created on: 08/May/19 23:27
Start Date: 08/May/19 23:27
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8536: [BEAM-7251] Increase 
timeout for test BQ queries.
URL: https://github.com/apache/beam/pull/8536#issuecomment-490687401
 
 
   run java postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 239506)
Time Spent: 20m  (was: 10m)

> Testing BigQuery client fails queries if job results aren't immediately 
> available
> -
>
> Key: BEAM-7251
> URL: https://issues.apache.org/jira/browse/BEAM-7251
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Correction: the test client is using a synchronous query with a default 
> timeout of 10s: 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
> This matches the timestamps below (5:29:19 to 5:29:29).
> Also note that this method only returns the first page of results.
> ---
> Adding functionality to fetch query results should solve this issue, which is 
> probably causing test flakiness.
> Log:
> May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
> checkForPAssertSuccess
> INFO: Success result for Dataflow job 
> 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 
> expected assertions.
> May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher 
> matchesSafely
> INFO: Verifying Bigquery data
> May 05, 2019 5:29:29 PM 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
> SEVERE: 
> testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
> java.lang.AssertionError: 
> Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
>  but: The query job hasn't completed. Got response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at 

[jira] [Work logged] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7251?focusedWorklogId=239507=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239507
 ]

ASF GitHub Bot logged work on BEAM-7251:


Author: ASF GitHub Bot
Created on: 08/May/19 23:27
Start Date: 08/May/19 23:27
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8536: [BEAM-7251] Increase 
timeout for test BQ queries.
URL: https://github.com/apache/beam/pull/8536#issuecomment-490687488
 
 
   R: @tvalentyn 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 239507)
Time Spent: 0.5h  (was: 20m)

> Testing BigQuery client fails queries if job results aren't immediately 
> available
> -
>
> Key: BEAM-7251
> URL: https://issues.apache.org/jira/browse/BEAM-7251
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: Udi Meiri
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Correction: the test client is using a synchronous query with a default 
> timeout of 10s: 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
> This matches the timestamps below (5:29:19 to 5:29:29).
> Also note that this method only returns the first page of results.
> ---
> Adding functionality to fetch query results should solve this issue, which is 
> probably causing test flakiness.
> Log:
> May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
> checkForPAssertSuccess
> INFO: Success result for Dataflow job 
> 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 
> expected assertions.
> May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher 
> matchesSafely
> INFO: Verifying Bigquery data
> May 05, 2019 5:29:29 PM 
> com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
> SEVERE: 
> testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
> java.lang.AssertionError: 
> Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
>  but: The query job hasn't completed. Got response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
>   at 
> org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at 

[jira] [Closed] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary

2019-05-08 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik closed BEAM-7235.
---
   Resolution: Fixed
Fix Version/s: 2.13.0

> GrpcWindmillServer creates commit streams before necessary
> --
>
> Key: BEAM-7235
> URL: https://issues.apache.org/jira/browse/BEAM-7235
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Sam Whittle
> Assignee: Sam Whittle
> Priority: Minor
> Fix For: 2.13.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> This can cause spammy logs if there are no commits before the stream deadline 
> is reached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239499=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239499
 ]

ASF GitHub Bot logged work on BEAM-7235:


Author: ASF GitHub Bot
Created on: 08/May/19 22:50
Start Date: 08/May/19 22:50
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8512: [BEAM-7235] 
StreamingDataflowWorker creates commit stream only when commit available
URL: https://github.com/apache/beam/pull/8512
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 239499)
Time Spent: 1.5h  (was: 1h 20m)

> GrpcWindmillServer creates commit streams before necessary
> --
>
> Key: BEAM-7235
> URL: https://issues.apache.org/jira/browse/BEAM-7235
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Sam Whittle
> Assignee: Sam Whittle
> Priority: Minor
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> This can cause spammy logs if there are no commits before the stream deadline 
> is reached.





[jira] [Updated] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-7251:

Description: 
Correction: the test client is using a synchronous query with a default timeout 
of 10s: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
This matches the timestamps below (5:29:19 to 5:29:29).

Also note that this method only returns the first page of results.

---
Adding functionality to fetch query results should solve this issue, which is 
probably causing test flakiness.
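
A minimal sketch of the fix described above, assuming a test client that can re-fetch results by job ID. `FakeBigQueryClient`, `query_with_retries`, and their parameters are hypothetical names used only for illustration and are not the Beam test client; the `jobComplete` and `pageToken` fields follow the response shape shown in the log below and in the BigQuery REST reference linked above.

```python
import time


class FakeBigQueryClient:
    """Hypothetical stand-in for a BigQuery test client (not a real API)."""

    def __init__(self, polls_until_done, pages):
        self._remaining = polls_until_done
        self._pages = pages

    def query(self, sql, timeout_ms):
        # Mimics a synchronous query whose timeout expires before the job ends.
        return {"jobComplete": False, "jobReference": {"jobId": "job_1"}}

    def get_query_results(self, job_id, page_token=None):
        if self._remaining > 0:
            self._remaining -= 1
            return {"jobComplete": False}
        index = int(page_token or 0)
        response = {"jobComplete": True, "rows": self._pages[index]}
        if index + 1 < len(self._pages):
            response["pageToken"] = str(index + 1)
        return response


def query_with_retries(client, sql, timeout_ms=10000, max_polls=10,
                       sleep_s=0.0):
    """Runs a query, polls until the job completes, then reads every page."""
    response = client.query(sql, timeout_ms)
    job_id = response["jobReference"]["jobId"]
    polls = 0
    while not response.get("jobComplete"):
        if polls >= max_polls:
            raise TimeoutError("query job %s did not complete" % job_id)
        time.sleep(sleep_s)
        response = client.get_query_results(job_id)
        polls += 1
    rows = list(response.get("rows", []))
    # Follow page tokens so that more than the first page is returned.
    while response.get("pageToken"):
        response = client.get_query_results(job_id, response["pageToken"])
        rows.extend(response.get("rows", []))
    return rows
```

With this shape, a response with `jobComplete` set to false becomes a reason to poll again rather than an immediate assertion failure, and the poll limit keeps a stuck job from hanging the test.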

Log:
May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
checkForPAssertSuccess
INFO: Success result for Dataflow job 2019-05-05_17_25_26-4118012232925193147. 
Found 0 success, 0 failures out of 0 expected assertions.
May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher 
matchesSafely
INFO: Verifying Bigquery data
May 05, 2019 5:29:29 PM 
com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
SEVERE: 
testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
java.lang.AssertionError: 
Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
 but: The query job hasn't completed. Got response: 
{"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
at 
org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
at 
org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at 
org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
at 
org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
at 
org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145)

Exception in thread "main" java.lang.IllegalStateException: Tests failed, check 
output logs for details.
at 
com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154)

But checking BQ logs on the console reveals that the query job did run:
2019-05-05 17:29:29.601 PDT
Bigquery
query
queries
2019-05-05 17:29:31.956 PDT
Bigquery
jobcompleted
job_cZkICLalRsrnivu78BX1y3UwMhIz


  was:
Adding functionality to fetch query results should solve this issue, which is 
probably causing test flakiness.

Log:
May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
checkForPAssertSuccess
INFO: Success result for Dataflow job 2019-05-05_17_25_26-4118012232925193147. 
Found 0 success, 0 failures out of 0 expected assertions.
May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher 

[jira] [Work logged] (BEAM-6916) Reorganize Beam SQL docs

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6916?focusedWorklogId=239490=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239490
 ]

ASF GitHub Bot logged work on BEAM-6916:


Author: ASF GitHub Bot
Created on: 08/May/19 22:28
Start Date: 08/May/19 22:28
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #8455: [BEAM-6916] Reorg 
Beam SQL docs and add Calcite section
URL: https://github.com/apache/beam/pull/8455#issuecomment-490673528
 
 
   LGTM
 



Issue Time Tracking
---

Worklog Id: (was: 239490)
Time Spent: 1h 50m  (was: 1h 40m)
Remaining Estimate: 166h 10m  (was: 166h 20m)

> Reorganize Beam SQL docs
> 
>
> Key: BEAM-6916
> URL: https://issues.apache.org/jira/browse/BEAM-6916
> Project: Beam
> Issue Type: New Feature
> Components: website
> Reporter: Rose Nguyen
> Assignee: Rose Nguyen
> Priority: Minor
> Original Estimate: 168h
> Time Spent: 1h 50m
> Remaining Estimate: 166h 10m
>
> This page describes the Calcite SQL dialect supported by Beam SQL.





[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239488
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 08/May/19 22:25
Start Date: 08/May/19 22:25
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-490672901
 
 
   I removed them in this PR since it was causing some tests to fail 
(validatesPortableRunner test for the removed code), it's just that the failing 
test wasn't exercised and should've been deleted anyway. Plus I'd prefer to get 
it all done in one stroke if possible.
 



Issue Time Tracking
---

Worklog Id: (was: 239488)
Time Spent: 4h 40m  (was: 4.5h)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
> Issue Type: New Feature
> Components: runner-direct, test-failures, testing
> Reporter: Mikhail Gryzykhin
> Assignee: Daniel Oliveira
> Priority: Major
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in this thread:
> https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing reference runner code is still under discussion. It is 
> suggested to create PR that removes relevant code and start voting there.
>  
>  





[jira] [Updated] (BEAM-7240) Kinesis IO Watermark Computation Improvements

2019-05-08 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7240:
---
Status: Open  (was: Triage Needed)

> Kinesis IO Watermark Computation Improvements
> -
>
> Key: BEAM-7240
> URL: https://issues.apache.org/jira/browse/BEAM-7240
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kinesis
> Reporter: Ajo Thomas
> Assignee: Ajo Thomas
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, watermarks in kinesis IO are computed taking into account the 
> record arrival time in a {{KinesisRecord}}. The arrival time might not always 
> be the right representation of the event time. The user of the IO should be 
> able to specify how they want to extract the event time from the 
> KinesisRecord. 
> As per the current logic, the end user of the IO cannot control watermark 
> computation in any way. A user should be able to control watermark 
> computation through custom heuristics or configurable parameters, such as a 
> time duration after which the watermark advances if no data was received 
> (for example, because a shard stalled). The watermark should advance and 
> not stall in that case.
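
The two requests above can be sketched together as follows. This is only an illustration under stated assumptions: `WatermarkPolicy`, `event_time_fn`, and `idle_threshold_s` are hypothetical names and not part of the Kinesis IO API, and timestamps are plain numbers in seconds.

```python
class WatermarkPolicy:
    """Hypothetical watermark policy; not the Kinesis IO API."""

    def __init__(self, event_time_fn, idle_threshold_s):
        # event_time_fn lets the user choose how event time is extracted
        # from a record instead of always using the arrival time.
        self._event_time_fn = event_time_fn
        self._idle_threshold_s = idle_threshold_s
        self._watermark = float("-inf")
        self._last_record_seen_at = None

    def on_record(self, record, now):
        # The watermark never moves backwards.
        self._watermark = max(self._watermark, self._event_time_fn(record))
        self._last_record_seen_at = now

    def current_watermark(self, now):
        idle = (self._last_record_seen_at is not None and
                now - self._last_record_seen_at > self._idle_threshold_s)
        if idle:
            # No data for longer than the threshold (for example, a stalled
            # shard): advance the watermark instead of holding it.
            self._watermark = max(self._watermark,
                                  now - self._idle_threshold_s)
        return self._watermark
```

The idle threshold is the configurable parameter mentioned in the description; a real implementation would also need to decide how per-shard watermarks are combined, which this sketch leaves out.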





[jira] [Work logged] (BEAM-4150) Standardize use of PCollection coder proto attribute

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4150?focusedWorklogId=239483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239483
 ]

ASF GitHub Bot logged work on BEAM-4150:


Author: ASF GitHub Bot
Created on: 08/May/19 22:06
Start Date: 08/May/19 22:06
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8533: [BEAM-4150] 
Downgrade missing coder error logs to info logs.
URL: https://github.com/apache/beam/pull/8533
 
 
   This log message shows up frequently in the error logs. Users cannot do much 
about it and it is not fatal.
   
   
   

[jira] [Work logged] (BEAM-562) DoFn Reuse: Add new methods to DoFn

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-562?focusedWorklogId=239480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239480
 ]

ASF GitHub Bot logged work on BEAM-562:
---

Author: ASF GitHub Bot
Created on: 08/May/19 21:42
Start Date: 08/May/19 21:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #7994: [BEAM-562] Add 
DoFn.setup and DoFn.teardown to Python SDK
URL: https://github.com/apache/beam/pull/7994#issuecomment-490661743
 
 
   Run Python PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 239480)
Time Spent: 9h  (was: 8h 50m)

> DoFn Reuse: Add new methods to DoFn
> ---
>
> Key: BEAM-562
> URL: https://issues.apache.org/jira/browse/BEAM-562
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Ahmet Altay
> Assignee: Yifan Mai
> Priority: Major
> Labels: sdk-consistency
> Time Spent: 9h
> Remaining Estimate: 0h
>
> Java SDK added setup and teardown methods to the DoFns. This makes DoFns 
> reusable and provides performance improvements. Python SDK should add support 
> for these new DoFn methods:
> Proposal doc: 
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f#
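
To illustrate why setup and teardown make a DoFn reusable, here is a plain-Python sketch that does not import Beam. `ExpensiveClient`, `EnrichFn`, and `run_bundles` are hypothetical; only the method names `setup` and `teardown` come from the issue, and the harness is an assumption about the lifecycle, not how a runner is implemented.

```python
class ExpensiveClient:
    """Hypothetical resource that is costly to create."""

    instances_created = 0

    def __init__(self):
        ExpensiveClient.instances_created += 1
        self.closed = False

    def lookup(self, key):
        return key.upper()

    def close(self):
        self.closed = True


class EnrichFn:
    """DoFn-like class: setup once, process many elements, teardown once."""

    def setup(self):
        # Created once per instance and reused across bundles.
        self.client = ExpensiveClient()

    def process(self, element):
        yield self.client.lookup(element)

    def teardown(self):
        self.client.close()


def run_bundles(fn, bundles):
    """Hypothetical harness: one instance processes every bundle."""
    fn.setup()
    try:
        return [out
                for bundle in bundles
                for element in bundle
                for out in fn.process(element)]
    finally:
        fn.teardown()
```

The performance benefit mentioned above comes from creating the client once in `setup` rather than once per bundle or per element.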





[jira] [Updated] (BEAM-7246) Create a Spanner IO for Python

2019-05-08 Thread Shehzaad Nakhoda (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shehzaad Nakhoda updated BEAM-7246:
---
Description: 
Add I/O support for Google Cloud Spanner for the Python SDK.
Integration and performance tests are a separate work item (not included here).

See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
Google Cloud Spanner to the Database column for the Python/Batch row.

> Create a Spanner IO for Python
> --
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
> Issue Type: Bug
> Components: io-python-gcp
> Reporter: Reuven Lax
> Assignee: Shehzaad Nakhoda
> Priority: Major
>
> Add I/O support for Google Cloud Spanner for the Python SDK.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Assigned] (BEAM-562) DoFn Reuse: Add new methods to DoFn

2019-05-08 Thread Ahmet Altay (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-562:


Assignee: Yifan Mai  (was: Shehzaad Nakhoda)

> DoFn Reuse: Add new methods to DoFn
> ---
>
> Key: BEAM-562
> URL: https://issues.apache.org/jira/browse/BEAM-562
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Ahmet Altay
> Assignee: Yifan Mai
> Priority: Major
> Labels: sdk-consistency
> Time Spent: 8h 50m
> Remaining Estimate: 0h
>
> Java SDK added setup and teardown methods to the DoFns. This makes DoFns 
> reusable and provides performance improvements. Python SDK should add support 
> for these new DoFn methods:
> Proposal doc: 
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f#





[jira] [Work logged] (BEAM-562) DoFn Reuse: Add new methods to DoFn

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-562?focusedWorklogId=239479=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239479
 ]

ASF GitHub Bot logged work on BEAM-562:
---

Author: ASF GitHub Bot
Created on: 08/May/19 21:37
Start Date: 08/May/19 21:37
Worklog Time Spent: 10m 
  Work Description: yifanmai commented on issue #7994: [BEAM-562] Add 
DoFn.setup and DoFn.teardown to Python SDK
URL: https://github.com/apache/beam/pull/7994#issuecomment-490660293
 
 
   @kennknowles @aaltay tests are passing now. PTAL?
 



Issue Time Tracking
---

Worklog Id: (was: 239479)
Time Spent: 8h 50m  (was: 8h 40m)

> DoFn Reuse: Add new methods to DoFn
> ---
>
> Key: BEAM-562
> URL: https://issues.apache.org/jira/browse/BEAM-562
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Ahmet Altay
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Labels: sdk-consistency
> Time Spent: 8h 50m
> Remaining Estimate: 0h
>
> Java SDK added setup and teardown methods to the DoFns. This makes DoFns 
> reusable and provides performance improvements. Python SDK should add support 
> for these new DoFn methods:
> Proposal doc: 
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f#



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-562) DoFn Reuse: Add new methods to DoFn

2019-05-08 Thread Shehzaad Nakhoda (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835919#comment-16835919
 ] 

Shehzaad Nakhoda commented on BEAM-562:
---

[~altay] thanks for the heads up. Can you please assign this to 
[~myffi...@gmail.com]?

> DoFn Reuse: Add new methods to DoFn
> ---
>
> Key: BEAM-562
> URL: https://issues.apache.org/jira/browse/BEAM-562
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Ahmet Altay
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Labels: sdk-consistency
> Time Spent: 8h 40m
> Remaining Estimate: 0h
>
> Java SDK added setup and teardown methods to the DoFns. This makes DoFns 
> reusable and provides performance improvements. Python SDK should add support 
> for these new DoFn methods:
> Proposal doc: 
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f#



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available

2019-05-08 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-7251:
---

 Summary: Testing BigQuery client fails queries if job results 
aren't immediately available
 Key: BEAM-7251
 URL: https://issues.apache.org/jira/browse/BEAM-7251
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp
Reporter: Udi Meiri


Adding functionality to fetch query results should solve this issue, which is 
probably causing test flakiness.

Log:
May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
checkForPAssertSuccess
INFO: Success result for Dataflow job 2019-05-05_17_25_26-4118012232925193147. 
Found 0 success, 0 failures out of 0 expected assertions.
May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher matchesSafely
INFO: Verifying Bigquery data
May 05, 2019 5:29:29 PM com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
SEVERE: testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
java.lang.AssertionError: 
Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
 but: The query job hasn't completed. Got response: {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145)

Exception in thread "main" java.lang.IllegalStateException: Tests failed, check output logs for details.
at com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154)

But checking BQ logs on the console reveals that the query job did run:
2019-05-05 17:29:29.601 PDT  Bigquery  query  queries
2019-05-05 17:29:31.956 PDT  Bigquery  jobcompleted  job_cZkICLalRsrnivu78BX1y3UwMhIz
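
The timestamps suggest the matcher read the query response roughly two seconds before the job finished. A minimal sketch of polling until completion, written in Python for brevity. `wait_for_job` and `fetch_status` are hypothetical names, not Beam or BigQuery client APIs; only the `jobComplete` field is taken from the response in the log.

```python
import time


def wait_for_job(fetch_status, timeout_s=30.0, interval_s=1.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the response reports jobComplete.

    fetch_status is a hypothetical callable returning a dict shaped like
    the queryResponse in the log (assumption, not a real client call).
    """
    deadline = clock() + timeout_s
    while True:
        response = fetch_status()
        if response.get("jobComplete"):
            return response
        if clock() >= deadline:
            raise TimeoutError("query job did not complete in time")
        sleep(interval_s)


# Simulated responses: incomplete on the first poll, complete on the second.
_responses = iter([{"jobComplete": False}, {"jobComplete": True}])
final_response = wait_for_job(lambda: next(_responses), sleep=lambda _: None)
```

Injecting `sleep` and `clock` keeps the sketch testable without real waiting; whether the matcher should retry this way is a design question for the issue, not something the log settles.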




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7250) Determine whether Java SDF API should be changed to work via annotations

2019-05-08 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-7250:
---

 Summary: Determine whether Java SDF API should be changed to work 
via annotations
 Key: BEAM-7250
 URL: https://issues.apache.org/jira/browse/BEAM-7250
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Pablo Estrada


This would be akin to the Python change in 
[https://github.com/apache/beam/pull/8430]

This is discussed here: 
[https://lists.apache.org/thread.html/7e1ebc970891778c2dbedec7e9846ab221ef12f38e689895567f1f4e@%3Cdev.beam.apache.org%3E]





[jira] [Assigned] (BEAM-7240) Kinesis IO Watermark Computation Improvements

2019-05-08 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-7240:
---

Assignee: Ajo Thomas

> Kinesis IO Watermark Computation Improvements
> -
>
> Key: BEAM-7240
> URL: https://issues.apache.org/jira/browse/BEAM-7240
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kinesis
>Reporter: Ajo Thomas
>Assignee: Ajo Thomas
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, watermarks in Kinesis IO are computed from the record arrival 
> time in a {{KinesisRecord}}. The arrival time might not always be the right 
> representation of the event time. The user of the IO should be able to 
> specify how they want to extract the event time from the KinesisRecord. 
> As per the current logic, the end user of the IO cannot control watermark 
> computation in any way. A user should be able to control watermark 
> computation through custom heuristics or configurable parameters, such as a 
> time duration after which the watermark advances if no data was received 
> (for example, because a shard stalled; the watermark should advance rather 
> than stall in that case).
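
A minimal sketch of the requested behaviour, in Python for brevity rather than the Java of KinesisIO. The class name, its parameters, and the dict-shaped record are all hypothetical and are not part of any Beam API; the sketch only illustrates a user-supplied event-time extractor combined with an idle-advance threshold.

```python
class SketchWatermarkPolicy:
    """Hypothetical policy: user-defined event time plus idle advancement."""

    def __init__(self, extract_event_time, idle_advance_after):
        self._extract = extract_event_time      # record -> event time (seconds)
        self._idle_after = idle_advance_after   # seconds without data
        self._watermark = float("-inf")
        self._last_seen_at = None

    def observe(self, record, now):
        # Track the largest event time seen so the watermark never regresses.
        self._watermark = max(self._watermark, self._extract(record))
        self._last_seen_at = now

    def current(self, now):
        # If the shard has been silent long enough, advance the watermark
        # to "now minus the idle threshold" instead of letting it stall.
        idle = self._last_seen_at is not None and (
            now - self._last_seen_at >= self._idle_after)
        if idle:
            self._watermark = max(self._watermark, now - self._idle_after)
        return self._watermark


policy = SketchWatermarkPolicy(lambda record: record["event_time"], 60)
policy.observe({"event_time": 100}, now=105)
watermark_while_active = policy.current(now=110)
watermark_after_stall = policy.current(now=400)
```

Here the watermark follows the extracted event time while data flows and moves forward on its own once the shard has been idle for longer than the threshold.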





[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239477
 ]

ASF GitHub Bot logged work on BEAM-7235:


Author: ASF GitHub Bot
Created on: 08/May/19 21:00
Start Date: 08/May/19 21:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8512: [BEAM-7235] 
StreamingDataflowWorker creates commit stream only when commit available
URL: https://github.com/apache/beam/pull/8512#discussion_r282247785
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java
 ##
 @@ -1508,8 +1513,10 @@ private void streamingCommitLoop() {
   break;
 }
   }
-  commitStream.flush();
-  streamPool.releaseStream(commitStream);
+  if (commitStream) {
 
 Review comment:
   You built the runner that submits the pipeline and not the worker component.
   
   I think you meant to do `./gradlew :beam-runners-google-cloud-dataflow-java:worker:build`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 239477)
Time Spent: 1h 20m  (was: 1h 10m)

> GrpcWindmillServer creates commit streams before necessary
> --
>
> Key: BEAM-7235
> URL: https://issues.apache.org/jira/browse/BEAM-7235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This can cause spammy logs if there are no commits before the stream deadline 
> is reached.





[jira] [Work logged] (BEAM-6916) Reorganize Beam SQL docs

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6916?focusedWorklogId=239476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239476
 ]

ASF GitHub Bot logged work on BEAM-6916:


Author: ASF GitHub Bot
Created on: 08/May/19 20:54
Start Date: 08/May/19 20:54
Worklog Time Spent: 10m 
  Work Description: melap commented on issue #8455: [BEAM-6916] Reorg Beam 
SQL docs and add Calcite section
URL: https://github.com/apache/beam/pull/8455#issuecomment-490647139
 
 
   Staged: 
http://apache-beam-website-pull-requests.storage.googleapis.com/8455/documentation/dsls/sql/overview/index.html
 



Issue Time Tracking
---

Worklog Id: (was: 239476)
Time Spent: 1h 40m  (was: 1.5h)
Remaining Estimate: 166h 20m  (was: 166.5h)

> Reorganize Beam SQL docs
> 
>
> Key: BEAM-6916
> URL: https://issues.apache.org/jira/browse/BEAM-6916
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Rose Nguyen
>Assignee: Rose Nguyen
>Priority: Minor
>   Original Estimate: 168h
>  Time Spent: 1h 40m
>  Remaining Estimate: 166h 20m
>
> This page describes the Calcite SQL dialect supported by Beam SQL.





[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239474=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239474
 ]

ASF GitHub Bot logged work on BEAM-6959:


Author: ASF GitHub Bot
Created on: 08/May/19 20:49
Start Date: 08/May/19 20:49
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8531: [BEAM-6959] Add Flink 
tests for Go SDK
URL: https://github.com/apache/beam/pull/8531#issuecomment-490645594
 
 
   > in probably a different PR?
   Yep, that will be the follow-up.
 



Issue Time Tracking
---

Worklog Id: (was: 239474)
Time Spent: 40m  (was: 0.5h)

> Run Go SDK  Post Commit tests against the Flink Runner.
> ---
>
> Key: BEAM-6959
> URL: https://issues.apache.org/jira/browse/BEAM-6959
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, sdk-go, testing
>Reporter: Robert Burke
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> See parent task BEAM-6958





[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239471
 ]

ASF GitHub Bot logged work on BEAM-7235:


Author: ASF GitHub Bot
Created on: 08/May/19 20:44
Start Date: 08/May/19 20:44
Worklog Time Spent: 10m 
  Work Description: scwhittle commented on pull request #8512: [BEAM-7235] 
StreamingDataflowWorker creates commit stream only when commit available
URL: https://github.com/apache/beam/pull/8512#discussion_r282242048
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java
 ##
 @@ -1508,8 +1513,10 @@ private void streamingCommitLoop() {
   break;
 }
   }
-  commitStream.flush();
-  streamPool.releaseStream(commitStream);
+  if (commitStream) {
 
 Review comment:
   Oops, fixed.
   
   I ran `./gradlew :beam-runners-google-cloud-dataflow-java:build`. Does that 
not actually build this? Or perhaps I ran it in the wrong branch.
 



Issue Time Tracking
---

Worklog Id: (was: 239471)
Time Spent: 1h 10m  (was: 1h)

> GrpcWindmillServer creates commit streams before necessary
> --
>
> Key: BEAM-7235
> URL: https://issues.apache.org/jira/browse/BEAM-7235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This can cause spammy logs if there are no commits before the stream deadline 
> is reached.





[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239475=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239475
 ]

ASF GitHub Bot logged work on BEAM-6959:


Author: ASF GitHub Bot
Created on: 08/May/19 20:49
Start Date: 08/May/19 20:49
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8531: [BEAM-6959] Add Flink 
tests for Go SDK
URL: https://github.com/apache/beam/pull/8531#issuecomment-490645691
 
 
   > in probably a different PR?
   
   Yep, that will be the follow-up.
 



Issue Time Tracking
---

Worklog Id: (was: 239475)
Time Spent: 50m  (was: 40m)

> Run Go SDK  Post Commit tests against the Flink Runner.
> ---
>
> Key: BEAM-6959
> URL: https://issues.apache.org/jira/browse/BEAM-6959
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, sdk-go, testing
>Reporter: Robert Burke
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> See parent task BEAM-6958





[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239473=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239473
 ]

ASF GitHub Bot logged work on BEAM-6959:


Author: ASF GitHub Bot
Created on: 08/May/19 20:49
Start Date: 08/May/19 20:49
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8531: [BEAM-6959] Add Flink 
tests for Go SDK
URL: https://github.com/apache/beam/pull/8531#issuecomment-490645594
 
 
   > in probably a different PR?
   Yep, that will be the follow-up.
 



Issue Time Tracking
---

Worklog Id: (was: 239473)
Time Spent: 0.5h  (was: 20m)

> Run Go SDK  Post Commit tests against the Flink Runner.
> ---
>
> Key: BEAM-6959
> URL: https://issues.apache.org/jira/browse/BEAM-6959
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, sdk-go, testing
>Reporter: Robert Burke
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> See parent task BEAM-6958





[jira] [Commented] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-08 Thread Brachi Packter (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835896#comment-16835896
 ] 

Brachi Packter commented on BEAM-7230:
--

Awesome, I'll check it out.




> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k, although I set 
> the connection pool size to 300), most of them sleeping.





[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239472
 ]

ASF GitHub Bot logged work on BEAM-6959:


Author: ASF GitHub Bot
Created on: 08/May/19 20:46
Start Date: 08/May/19 20:46
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #8531: [BEAM-6959] Add 
Flink tests for Go SDK
URL: https://github.com/apache/beam/pull/8531#issuecomment-490644734
 
 
   Woohoo! Thanks for writing this Kyle!
   I presume the next step (in probably a different PR?) would be to add the 
task to the post commits, and have it tracked/triggered by Jenkins appropriately.
   Finally, add the appropriate badge to the PR template: 
https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md
 



Issue Time Tracking
---

Worklog Id: (was: 239472)
Time Spent: 20m  (was: 10m)

> Run Go SDK  Post Commit tests against the Flink Runner.
> ---
>
> Key: BEAM-6959
> URL: https://issues.apache.org/jira/browse/BEAM-6959
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, sdk-go, testing
>Reporter: Robert Burke
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See parent task BEAM-6958





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=239466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239466
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 08/May/19 20:33
Start Date: 08/May/19 20:33
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on issue #8530: [BEAM-6988] solved 
problem related to updates of the str object
URL: https://github.com/apache/beam/pull/8530#issuecomment-490640523
 
 
   R: @aaltay  @fredo838  @Juta  @tvalentyn 
 



Issue Time Tracking
---

Worklog Id: (was: 239466)
Time Spent: 20m  (was: 10m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", line 510, in __ror__
>  result = p.apply(self, pvalueish, label)
>  File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py", line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py", line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required positional argument: 'chars'
> {noformat}
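
For context, a small standalone sketch (plain Python, no Beam) of what `str.strip` looks like when passed around as a function, as the failing test does. `describe_signature` is a helper introduced here for illustration only; how Beam's `getcallargs_forhints` consumes the signature is not shown and is not assumed.

```python
import inspect


def describe_signature(func):
    """Return the introspected signature as text, or None if unavailable."""
    try:
        return str(inspect.signature(func))
    except (TypeError, ValueError):
        # Some builtins expose no signature metadata on some Python versions.
        return None


words = ['xa', 'bbx', 'xcx']

# Accessed on the class, str.strip takes the string itself as first argument.
stripped = [str.strip(word, 'x') for word in words]

# What a signature-based checker gets to see; the text varies by version.
strip_signature = describe_signature(str.strip)
```

Printing `strip_signature` on the interpreter under test is a quick way to see what a signature-based type checker has to work with there.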





[jira] [Assigned] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklas Hansson reassigned BEAM-6877:


Assignee: niklas Hansson

> TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode 
> changes
> 
>
> Key: BEAM-6877
> URL: https://issues.apache.org/jira/browse/BEAM-6877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>
> Type inference doesn't work on Python 3.6 due to [bytecode to wordcode 
> changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes].
> Type inference always returns Any on Python 3.6, so this is not critical.
> Affected tests are:
>  *transforms.ptransform_test*:
>  - test_combine_properly_pipeline_type_checks_using_decorator
>  - test_mean_globally_pipeline_checking_satisfied
>  - test_mean_globally_runtime_checking_satisfied
>  - test_count_globally_pipeline_type_checking_satisfied
>  - test_count_globally_runtime_type_checking_satisfied
>  - test_pardo_type_inference
>  - test_pipeline_inference
>  - test_inferred_bad_kv_type
> *typehints.trivial_inference_test*:
>  - all tests in TrivialInferenceTest
> *io.gcp.pubsub_test.TestReadFromPubSubOverride*:
> * test_expand_with_other_options
> * test_expand_with_subscription
> * test_expand_with_topic
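
The wordcode change referenced above can be observed with the standard `dis` module. A minimal sketch, independent of Beam's inference code: since CPython 3.6 every instruction occupies a two-byte unit, so code that steps through `co_code` assuming the older variable-length encoding misreads it. `add_one` is just a sample function.

```python
import dis


def add_one(x):
    return x + 1


raw_bytecode = add_one.__code__.co_code
offsets = [ins.offset for ins in dis.get_instructions(add_one)]

# With wordcode, the raw code length and every instruction offset are
# multiples of two; pre-3.6 bytecode used 1- or 3-byte instructions.
is_wordcode = len(raw_bytecode) % 2 == 0 and all(o % 2 == 0 for o in offsets)
```

Using `dis.get_instructions` instead of indexing `co_code` by hand is one way to stay independent of the encoding; whether that fits Beam's trivial inference is left open here.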





[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239468=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239468
 ]

ASF GitHub Bot logged work on BEAM-7235:


Author: ASF GitHub Bot
Created on: 08/May/19 20:37
Start Date: 08/May/19 20:37
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8512: [BEAM-7235] 
StreamingDataflowWorker creates commit stream only when commit available
URL: https://github.com/apache/beam/pull/8512#discussion_r282239271
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java
 ##
 @@ -1508,8 +1513,10 @@ private void streamingCommitLoop() {
   break;
 }
   }
-  commitStream.flush();
-  streamPool.releaseStream(commitStream);
+  if (commitStream) {
 
 Review comment:
   ```suggestion
 if (commitStream != null) {
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 239468)
Time Spent: 1h  (was: 50m)

> GrpcWindmillServer creates commit streams before necessary
> --
>
> Key: BEAM-7235
> URL: https://issues.apache.org/jira/browse/BEAM-7235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This can cause spammy logs if there are no commits before the stream deadline 
> is reached.





[jira] [Commented] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes

2019-05-08 Thread niklas Hansson (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835889#comment-16835889
 ] 

niklas Hansson commented on BEAM-6877:
--

I will start to work on this :)

> TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode 
> changes
> 
>
> Key: BEAM-6877
> URL: https://issues.apache.org/jira/browse/BEAM-6877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>
> Type inference doesn't work on Python 3.6 due to [bytecode to wordcode 
> changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes].
> Type inference always returns Any on Python 3.6, so this is not critical.
> Affected tests are:
>  *transforms.ptransform_test*:
>  - test_combine_properly_pipeline_type_checks_using_decorator
>  - test_mean_globally_pipeline_checking_satisfied
>  - test_mean_globally_runtime_checking_satisfied
>  - test_count_globally_pipeline_type_checking_satisfied
>  - test_count_globally_runtime_type_checking_satisfied
>  - test_pardo_type_inference
>  - test_pipeline_inference
>  - test_inferred_bad_kv_type
> *typehints.trivial_inference_test*:
>  - all tests in TrivialInferenceTest
> *io.gcp.pubsub_test.TestReadFromPubSubOverride*:
> * test_expand_with_other_options
> * test_expand_with_subscription
> * test_expand_with_topic





[jira] [Work started] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6877 started by niklas Hansson.

> TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode 
> changes
> 
>
> Key: BEAM-6877
> URL: https://issues.apache.org/jira/browse/BEAM-6877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>
> Type inference doesn't work on Python 3.6 due to [bytecode to wordcode 
> changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes].
> Type inference always returns Any on Python 3.6, so this is not critical.
> Affected tests are:
>  *transforms.ptransform_test*:
>  - test_combine_properly_pipeline_type_checks_using_decorator
>  - test_mean_globally_pipeline_checking_satisfied
>  - test_mean_globally_runtime_checking_satisfied
>  - test_count_globally_pipeline_type_checking_satisfied
>  - test_count_globally_runtime_type_checking_satisfied
>  - test_pardo_type_inference
>  - test_pipeline_inference
>  - test_inferred_bad_kv_type
> *typehints.trivial_inference_test*:
>  - all tests in TrivialInferenceTest
> *io.gcp.pubsub_test.TestReadFromPubSubOverride*:
> * test_expand_with_other_options
> * test_expand_with_subscription
> * test_expand_with_topic





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=239464=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239464
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 08/May/19 20:31
Start Date: 08/May/19 20:31
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8530: [BEAM-6988] 
solved problem related to updates of the str object
URL: https://github.com/apache/beam/pull/8530
 
 
   Update test apache_beam.typehints.typed_pipeline_test.MainInputTest. The 
problem is related to changes to the `strip` method of the `str` object in 
Python 3. See the answer to this Stack Overflow question: 
https://stackoverflow.com/questions/46241389/two-different-definition-of-strip-method-in-python-2-7-14rc1-official-document
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239467=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239467
 ]

ASF GitHub Bot logged work on BEAM-7235:


Author: ASF GitHub Bot
Created on: 08/May/19 20:36
Start Date: 08/May/19 20:36
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8512: [BEAM-7235] 
StreamingDataflowWorker creates commit stream only when commit available
URL: https://github.com/apache/beam/pull/8512#discussion_r282239133
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java
 ##
 @@ -1459,7 +1459,9 @@ private void streamingCommitLoop() {
 Commit commit = null;
 while (running.get()) {
   // Batch commits as long as there are more and we can fit them in the current request.
-  CommitWorkStream commitStream = streamPool.getStream();
+  // We lazily initialize the commit stream to make sure that we only create one after
+  // we have a commit.
+  CommitWorkStream commitStream = null;
 
 Review comment:
   Thanks, makes sense.
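
   The lazy-initialization pattern under review, as a language-neutral sketch in Python rather than the worker's Java. `CountingPool` and `commit_batch` are hypothetical stand-ins, not the real `streamPool` or `streamingCommitLoop`; a list plays the role of the commit stream.

```python
class CountingPool:
    """Hypothetical stand-in for a stream pool; counts acquire/release."""

    def __init__(self):
        self.acquired = 0
        self.released = 0

    def get_stream(self):
        self.acquired += 1
        return []  # a list stands in for a commit stream

    def release_stream(self, stream):
        self.released += 1


def commit_batch(pool, commits):
    stream = None  # nothing is acquired until a commit actually exists
    for commit in commits:
        if stream is None:
            stream = pool.get_stream()
        stream.append(commit)
    # Explicit None check: only release a stream that was acquired.
    if stream is not None:
        pool.release_stream(stream)
    return stream


pool = CountingPool()
empty_result = commit_batch(pool, [])
acquired_after_empty = pool.acquired
full_result = commit_batch(pool, ["commit-1", "commit-2"])
```

   With no commits the pool is never touched, which is the point of the change; the explicit `is not None` check corresponds to the explicit null comparison a Java `if` requires.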
 



Issue Time Tracking
---

Worklog Id: (was: 239467)
Time Spent: 50m  (was: 40m)

> GrpcWindmillServer creates commit streams before necessary
> --
>
> Key: BEAM-7235
> URL: https://issues.apache.org/jira/browse/BEAM-7235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This can cause spammy logs if there are no commits before the stream deadline 
> is reached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239465=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239465
 ]

ASF GitHub Bot logged work on BEAM-6959:


Author: ASF GitHub Bot
Created on: 08/May/19 20:31
Start Date: 08/May/19 20:31
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #8531: [BEAM-6959] Add 
Flink tests for Go SDK
URL: https://github.com/apache/beam/pull/8531
 
 
   Adds a Gradle task to run tests using the Go SDK and the Flink runner. (To 
run locally, the user can run `run_integration_tests.sh` with flags configured 
according to their own Cloud setup).
   
   N.B. I reused the existing Python code that finds an unused port for the job 
server. This can race: another process may grab the selected port before the 
job server claims it, though that is relatively unlikely.
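
   For readers unfamiliar with this pattern, here is a minimal sketch of how such a port lookup typically works and where the race window sits. It is an illustration under assumptions, not the code reused in this PR; `pick_unused_port` and `bind_and_hold` are hypothetical names.

```python
import socket


def pick_unused_port():
    """Asks the OS for a free port, then releases it (this is the racy step)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]


def bind_and_hold():
    """Alternative: keep the socket open so the port stays reserved."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    return s, s.getsockname()[1]


port = pick_unused_port()
# Between this line and the server's own bind(), another process may take `port`.

held_socket, held_port = bind_and_hold()
held_socket.close()
```

   Holding the socket only helps when the server can accept an already-bound socket; when the server is a separate process, retrying on a bind failure is the usual fallback.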
   
   R: @robertwb @angoenka @lostluck 
   

[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239463=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239463
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 08/May/19 20:30
Start Date: 08/May/19 20:30
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8518: [BEAM-6908] 
Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#issuecomment-490639266
 
 
   PTAL @tvalentyn 
 



Issue Time Tracking
---

Worklog Id: (was: 239463)
Time Spent: 9.5h  (was: 9h 20m)

> Add Python3 performance benchmarks
> --
>
> Key: BEAM-6908
> URL: https://issues.apache.org/jira/browse/BEAM-6908
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> Similar to 
> [beam_PerformanceTests_Python|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_Python/],
>  we want to have a Python3 benchmark running on Jenkins to detect performance 
> regression during code adoption.





[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239451=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239451
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 08/May/19 20:15
Start Date: 08/May/19 20:15
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8518: 
[BEAM-6908] Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#discussion_r282231279
 
 

 ##
 File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy
 ##
 @@ -18,46 +18,107 @@
 
 import CommonJobProperties as commonJobProperties
 
-// This job runs the Beam Python performance tests on PerfKit Benchmarker.
-job('beam_PerformanceTests_Python'){
-  // Set default Beam job properties.
-  commonJobProperties.setTopLevelMainJobProperties(delegate)
-
-  // Run job in postcommit every 6 hours, don't trigger every push.
-  commonJobProperties.setAutoJob(
-  delegate,
-  'H */6 * * *')
-
-  // Allows triggering this build against pull requests.
-  commonJobProperties.enablePhraseTriggeringFromPullRequest(
-  delegate,
-  'Python SDK Performance Test',
-  'Run Python Performance Test')
-
-  def pipelineArgs = [
-  project: 'apache-beam-testing',
-  staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it',
-  temp_location: 'gs://temp-storage-for-end-to-end-tests/temp-it',
-  output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
-  ]
-  def pipelineArgList = []
-  pipelineArgs.each({
-key, value -> pipelineArgList.add("--$key=$value")
-  })
-  def pipelineArgsJoined = pipelineArgList.join(',')
-
-  def argMap = [
-  beam_sdk : 'python',
-  benchmarks   : 'beam_integration_benchmark',
-  bigquery_table   : 'beam_performance.wordcount_py_pkb_results',
-  beam_it_class: 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-  beam_it_module   : 'sdks/python',
-  beam_prebuilt: 'true',  // skip beam prebuild
-  beam_python_sdk_location : 'build/apache-beam.tar.gz',
-  beam_runner  : 'TestDataflowRunner',
-  beam_it_timeout  : '1200',
-  beam_it_args : pipelineArgsJoined,
-  ]
-
-  commonJobProperties.buildPerformanceTest(delegate, argMap)
+
+class PerformanceTestConfigurations {
+  String jobName
+  String jobDescription
+  String jobTriggerPhrase
+  String buildSchedule = 'H */6 * * *'  // every 6 hours
+  String benchmarkName = 'beam_integration_benchmark'
+  String sdk = 'python'
+  String bigqueryTable
+  String itClass
+  String itModule
+  Boolean skipPrebuild = false
+  String pythonSdkLocation
+  String runner = 'TestDataflowRunner'
+  Integer itTimeout = 1200
+  Map extraPipelineArgs
+}
+
+// Common pipeline args for Dataflow job.
+def dataflowPipelineArgs = [
+project : 'apache-beam-testing',
+staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it',
+temp_location   : 'gs://temp-storage-for-end-to-end-tests/temp-it',
+]
+
+
+// Configurations of each Jenkins job.
+def testConfigurations = [
+new PerformanceTestConfigurations(
+jobName   : 'beam_PerformanceTests_Python',
+jobDescription: 'Python SDK Performance Test',
+jobTriggerPhrase  : 'Run Python Performance Test',
+bigqueryTable : 'beam_performance.wordcount_py_pkb_results',
+skipPrebuild  : true,
+pythonSdkLocation : 'build/apache-beam.tar.gz',
+itClass   : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
+itModule  : 'sdks/python',
+extraPipelineArgs : dataflowPipelineArgs + [
+output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
+],
+),
+new PerformanceTestConfigurations(
+jobName   : 'beam_PerformanceTests_Python35',
+jobDescription: 'Python35 SDK Performance Test',
+jobTriggerPhrase  : 'Run Python35 Performance Test',
+bigqueryTable : 'beam_performance.wordcount_py35_pkb_results',
+skipPrebuild  : true,
+pythonSdkLocation : 'test-suites/dataflow/py35/build/apache-beam.tar.gz',
+itClass   : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
+itModule  : 'sdks/python/test-suites/dataflow/py35',
+extraPipelineArgs : dataflowPipelineArgs + [
+output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
+],
+)
+]
+
+
+for (testConfig in testConfigurations) {
+  createPythonPerformanceTestJob(testConfig)
+}
+
+
+private void createPythonPerformanceTestJob(PerformanceTestConfigurations testConfig) {
+  // This job runs the 

[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239447=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239447
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 08/May/19 20:14
Start Date: 08/May/19 20:14
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8518: 
[BEAM-6908] Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#discussion_r282230690
 
 

 ##
 File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy
 ##
 @@ -18,46 +18,107 @@
 
 import CommonJobProperties as commonJobProperties
 
-// This job runs the Beam Python performance tests on PerfKit Benchmarker.
-job('beam_PerformanceTests_Python'){
-  // Set default Beam job properties.
-  commonJobProperties.setTopLevelMainJobProperties(delegate)
-
-  // Run job in postcommit every 6 hours, don't trigger every push.
-  commonJobProperties.setAutoJob(
-  delegate,
-  'H */6 * * *')
-
-  // Allows triggering this build against pull requests.
-  commonJobProperties.enablePhraseTriggeringFromPullRequest(
-  delegate,
-  'Python SDK Performance Test',
-  'Run Python Performance Test')
-
-  def pipelineArgs = [
-  project: 'apache-beam-testing',
-  staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it',
-  temp_location: 'gs://temp-storage-for-end-to-end-tests/temp-it',
-  output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
-  ]
-  def pipelineArgList = []
-  pipelineArgs.each({
-key, value -> pipelineArgList.add("--$key=$value")
-  })
-  def pipelineArgsJoined = pipelineArgList.join(',')
-
-  def argMap = [
-  beam_sdk : 'python',
-  benchmarks   : 'beam_integration_benchmark',
-  bigquery_table   : 'beam_performance.wordcount_py_pkb_results',
-  beam_it_class: 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-  beam_it_module   : 'sdks/python',
-  beam_prebuilt: 'true',  // skip beam prebuild
-  beam_python_sdk_location : 'build/apache-beam.tar.gz',
-  beam_runner  : 'TestDataflowRunner',
-  beam_it_timeout  : '1200',
-  beam_it_args : pipelineArgsJoined,
-  ]
-
-  commonJobProperties.buildPerformanceTest(delegate, argMap)
+
+class PerformanceTestConfigurations {
+  String jobName
+  String jobDescription
+  String jobTriggerPhrase
+  String buildSchedule = 'H */6 * * *'  // every 6 hours
+  String benchmarkName = 'beam_integration_benchmark'
+  String sdk = 'python'
+  String bigqueryTable
+  String itClass
+  String itModule
+  Boolean skipPrebuild = false
+  String pythonSdkLocation
+  String runner = 'TestDataflowRunner'
+  Integer itTimeout = 1200
+  Map extraPipelineArgs
+}
+
+// Common pipeline args for Dataflow job.
+def dataflowPipelineArgs = [
+project : 'apache-beam-testing',
+staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it',
+temp_location   : 'gs://temp-storage-for-end-to-end-tests/temp-it',
+]
+
+
+// Configurations of each Jenkins job.
+def testConfigurations = [
+new PerformanceTestConfigurations(
+jobName   : 'beam_PerformanceTests_Python',
+jobDescription: 'Python SDK Performance Test',
+jobTriggerPhrase  : 'Run Python Performance Test',
+bigqueryTable : 'beam_performance.wordcount_py_pkb_results',
+skipPrebuild  : true,
+pythonSdkLocation : 'build/apache-beam.tar.gz',
+itClass   : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
+itModule  : 'sdks/python',
+extraPipelineArgs : dataflowPipelineArgs + [
+output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
+],
+),
+new PerformanceTestConfigurations(
+jobName   : 'beam_PerformanceTests_Python35',
+jobDescription: 'Python35 SDK Performance Test',
+jobTriggerPhrase  : 'Run Python35 Performance Test',
+bigqueryTable : 'beam_performance.wordcount_py35_pkb_results',
+skipPrebuild  : true,
+pythonSdkLocation : 'test-suites/dataflow/py35/build/apache-beam.tar.gz',
+itClass   : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
+itModule  : 'sdks/python/test-suites/dataflow/py35',
+extraPipelineArgs : dataflowPipelineArgs + [
+output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
+],
+)
+]
+
+
+for (testConfig in testConfigurations) {
+  createPythonPerformanceTestJob(testConfig)
+}
+
+
+private void createPythonPerformanceTestJob(PerformanceTestConfigurations testConfig) {
+  // This job runs the 

[jira] [Commented] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-08 Thread JIRA


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835851#comment-16835851
 ] 

Ismaël Mejía commented on BEAM-7230:


Thinking about it, the included DataSourceProviderFn implementation, 
`PoolableDataSourceProvider`, instantiates one poolable DataSource per JVM, so it 
should cover your case. If you get the chance to try it, it would be great to 
know whether it works. Thanks
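
The "one pool per JVM" idea can be illustrated with a small language-neutral sketch. The Python below is an analogy only (one pool per process), not JdbcIO's implementation; `get_shared_pool` and `make_pool` are hypothetical names and a bare object stands in for a real connection pool.

```python
import threading

_pool = None
_pool_lock = threading.Lock()


def get_shared_pool(factory):
    """Returns the single pool for this process, creating it on first use."""
    global _pool
    with _pool_lock:
        if _pool is None:
            _pool = factory()
        return _pool


created = []


def make_pool():
    created.append(object())  # placeholder for a real connection pool
    return created[-1]


first = get_shared_pool(make_pool)
second = get_shared_pool(make_pool)  # reuses the pool created above
```

If each function instance built its own pool instead, the total number of connections would scale with the number of instances rather than with the configured pool size.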

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> but I still see a huge number of connections in GCP SQL (4k, while I set the 
> connection pool to 300), and most of them are in the sleep state.





[jira] [Resolved] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly

2019-05-08 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-7238.

   Resolution: Fixed
Fix Version/s: 2.13.0

> Make sfl4j bindings runtimeOnly/testRuntimeOnly
> ---
>
> Key: BEAM-7238
> URL: https://issues.apache.org/jira/browse/BEAM-7238
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.13.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Multiple modules include slf4j bindings in the compile or testCompile 
> scope. This is an issue because it may break logging, in particular for logs 
> that are reused by others. Concrete case: slf4j-simple in runners-core and the 
> logging in the specific runners.
>  





[jira] [Work logged] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7238?focusedWorklogId=239428=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239428
 ]

ASF GitHub Bot logged work on BEAM-7238:


Author: ASF GitHub Bot
Created on: 08/May/19 19:27
Start Date: 08/May/19 19:27
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #8515: [BEAM-7238] 
Make sfl4j bindings runtimeOnly/testRuntimeOnly
URL: https://github.com/apache/beam/pull/8515#discussion_r282213704
 
 

 ##
 File path: runners/core-java/build.gradle
 ##
 @@ -45,6 +45,6 @@ dependencies {
   shadowTest library.java.mockito_core
   shadowTest library.java.junit
   shadowTest library.java.slf4j_api
-  shadowTest library.java.slf4j_simple
   shadowTest library.java.jackson_dataformat_yaml
+  testRuntimeOnly library.java.slf4j_simple
 
 Review comment:
   This is a great improvement. I will try to check in which modules it is worth 
making the switch (in particular the ones that do shading) to enable it.
 



Issue Time Tracking
---

Worklog Id: (was: 239428)
Time Spent: 1h 50m  (was: 1h 40m)

> Make sfl4j bindings runtimeOnly/testRuntimeOnly
> ---
>
> Key: BEAM-7238
> URL: https://issues.apache.org/jira/browse/BEAM-7238
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Multiple modules include slf4j bindings in the compile or testCompile 
> scope. This is an issue because it may break logging, in particular for logs 
> that are reused by others. Concrete case: slf4j-simple in runners-core and the 
> logging in the specific runners.
>  





[jira] [Work logged] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7238?focusedWorklogId=239425=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239425
 ]

ASF GitHub Bot logged work on BEAM-7238:


Author: ASF GitHub Bot
Created on: 08/May/19 19:26
Start Date: 08/May/19 19:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #8515: [BEAM-7238] 
Make sfl4j bindings runtimeOnly/testRuntimeOnly
URL: https://github.com/apache/beam/pull/8515
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 239425)
Time Spent: 1.5h  (was: 1h 20m)

> Make sfl4j bindings runtimeOnly/testRuntimeOnly
> ---
>
> Key: BEAM-7238
> URL: https://issues.apache.org/jira/browse/BEAM-7238
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Multiple modules include slf4j bindings in the compile or testCompile 
> scope. This is an issue because it may break logging, in particular for logs 
> that are reused by others. Concrete case: slf4j-simple in runners-core and the 
> logging in the specific runners.
>  





[jira] [Work logged] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7238?focusedWorklogId=239426=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239426
 ]

ASF GitHub Bot logged work on BEAM-7238:


Author: ASF GitHub Bot
Created on: 08/May/19 19:26
Start Date: 08/May/19 19:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #8515: [BEAM-7238] Make 
sfl4j bindings runtimeOnly/testRuntimeOnly
URL: https://github.com/apache/beam/pull/8515#issuecomment-490618081
 
 
   Thanks!
 



Issue Time Tracking
---

Worklog Id: (was: 239426)
Time Spent: 1h 40m  (was: 1.5h)

> Make sfl4j bindings runtimeOnly/testRuntimeOnly
> ---
>
> Key: BEAM-7238
> URL: https://issues.apache.org/jira/browse/BEAM-7238
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Multiple modules include slf4j bindings in the compile or testCompile 
> scope. This is an issue because it may break logging, in particular for logs 
> that are reused by others. Concrete case: slf4j-simple in runners-core and the 
> logging in the specific runners.
>  





[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239415
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 08/May/19 19:12
Start Date: 08/May/19 19:12
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-490613569
 
 
   This PR LGTM, provided we have community consensus on deprecating the Java 
reference runner. 
 



Issue Time Tracking
---

Worklog Id: (was: 239415)
Time Spent: 4.5h  (was: 4h 20m)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> This ticket is about deprecating the Java Portable Reference Runner.
>  
> Discussion is happening in [this 
> thread|https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E].
>  
>  
> Current summary: disable the beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing the reference runner code is still under discussion. It is 
> suggested to create a PR that removes the relevant code and start voting there.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-05-08 Thread niklas Hansson (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835821#comment-16835821
 ] 

niklas Hansson commented on BEAM-7060:
--

[~udim] sounds great! I will look at the docs a bit as well :) 

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
>
> The existing [typehints implementation in 
> Beam|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/] 
> relies heavily on internal details of the CPython implementation, and some of 
> its assumptions broke as of Python 3.6; see, for 
> example, https://issues.apache.org/jira/browse/BEAM-6877. This makes 
> typehints support unusable on Python 3.6 as of now. The [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate the in-house typehints implementation.
> - Continue to support the in-house implementation, which at this point is stale 
> code and has other known issues.
> - Attempt to use off-the-shelf libraries for supporting 
> type annotations, such as Pytype, Mypy, or PyAnnotate.
> With respect to this decision, we also need to plan immediate next steps to unblock 
> adoption of Beam for Python 3.6+ users. One potential option may be to have 
> the Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].





[jira] [Work started] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6988 started by niklas Hansson.

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}
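
The traceback above comes from binding type hints against the signature of the built-in `str.strip`. The sketch below uses only the standard library and makes no claim about how Beam's `getcallargs_forhints` works internally. It shows what the test's transform is expected to compute and how the signature of a built-in method can be inspected defensively, since not every interpreter version exposes one.

```python
import inspect

words = ['xa', 'bbx', 'xcx']
# The per-element result that beam.Map(str.strip, 'x') is expected to produce.
stripped = [str.strip(w, 'x') for w in words]

# Built-in methods do not always expose a signature, so guard the lookup.
try:
    params = list(inspect.signature(str.strip).parameters)
except (ValueError, TypeError):
    params = None  # no introspectable signature on this interpreter
```

When a signature is available, `chars` appears as a parameter of the unbound method alongside `self`, which is the argument named in the error message.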





[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239407
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 08/May/19 19:01
Start Date: 08/May/19 19:01
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8518: 
[BEAM-6908] Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#discussion_r282204772
 
 

 ##
 File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy
 ##
 @@ -18,46 +18,107 @@
 
 import CommonJobProperties as commonJobProperties
 
-// This job runs the Beam Python performance tests on PerfKit Benchmarker.
-job('beam_PerformanceTests_Python'){
-  // Set default Beam job properties.
-  commonJobProperties.setTopLevelMainJobProperties(delegate)
-
-  // Run job in postcommit every 6 hours, don't trigger every push.
-  commonJobProperties.setAutoJob(
-  delegate,
-  'H */6 * * *')
-
-  // Allows triggering this build against pull requests.
-  commonJobProperties.enablePhraseTriggeringFromPullRequest(
-  delegate,
-  'Python SDK Performance Test',
-  'Run Python Performance Test')
-
-  def pipelineArgs = [
-  project: 'apache-beam-testing',
-  staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it',
-  temp_location: 'gs://temp-storage-for-end-to-end-tests/temp-it',
-  output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
-  ]
-  def pipelineArgList = []
-  pipelineArgs.each({
-key, value -> pipelineArgList.add("--$key=$value")
-  })
-  def pipelineArgsJoined = pipelineArgList.join(',')
-
-  def argMap = [
-  beam_sdk : 'python',
-  benchmarks   : 'beam_integration_benchmark',
-  bigquery_table   : 'beam_performance.wordcount_py_pkb_results',
-  beam_it_class: 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-  beam_it_module   : 'sdks/python',
-  beam_prebuilt: 'true',  // skip beam prebuild
-  beam_python_sdk_location : 'build/apache-beam.tar.gz',
-  beam_runner  : 'TestDataflowRunner',
-  beam_it_timeout  : '1200',
-  beam_it_args : pipelineArgsJoined,
-  ]
-
-  commonJobProperties.buildPerformanceTest(delegate, argMap)
+
+class PerformanceTestConfigurations {
+  String jobName
+  String jobDescription
+  String jobTriggerPhrase
+  String buildSchedule = 'H */6 * * *'  // every 6 hours
+  String benchmarkName = 'beam_integration_benchmark'
+  String sdk = 'python'
+  String bigqueryTable
+  String itClass
+  String itModule
+  Boolean skipPrebuild = false
+  String pythonSdkLocation
+  String runner = 'TestDataflowRunner'
+  Integer itTimeout = 1200
+  Map extraPipelineArgs
+}
+
+// Common pipeline args for Dataflow job.
+def dataflowPipelineArgs = [
+project : 'apache-beam-testing',
+staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it',
+temp_location   : 'gs://temp-storage-for-end-to-end-tests/temp-it',
+]
+
+
+// Configurations of each Jenkins job.
+def testConfigurations = [
+new PerformanceTestConfigurations(
+jobName   : 'beam_PerformanceTests_Python',
+jobDescription: 'Python SDK Performance Test',
+jobTriggerPhrase  : 'Run Python Performance Test',
+bigqueryTable : 'beam_performance.wordcount_py_pkb_results',
+skipPrebuild  : true,
+pythonSdkLocation : 'build/apache-beam.tar.gz',
+itClass   : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
+itModule  : 'sdks/python',
+extraPipelineArgs : dataflowPipelineArgs + [
+output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output'
+],
+),
+new PerformanceTestConfigurations(
+jobName   : 'beam_PerformanceTests_Python35',
+jobDescription: 'Python35 SDK Performance Test',
+jobTriggerPhrase  : 'Run Python35 Performance Test',
+bigqueryTable : 'beam_performance.wordcount_py35_pkb_results',
+skipPrebuild  : true,
+pythonSdkLocation : 'test-suites/dataflow/py35/build/apache-beam.tar.gz',
 
 Review comment:
   Currently, the tar file is generated in the build directory of the Gradle 
module where the IT is located, so we need to specify its location per test. We 
could populate sdkLocation from itModule directly, and in the future we could 
refactor the Gradle build to generate the tar file only once. 
https://github.com/markflyhigh/beam/pull/6 is a draft.
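
The idea of populating the SDK location from itModule could look roughly like the sketch below. It is written in Python purely for illustration (the job definition itself is Groovy). It assumes every IT module writes its tarball to `<module>/build/apache-beam.tar.gz`, which the comment implies but does not spell out. The helper name `sdk_location_for` is hypothetical and not part of the PR.

```python
import posixpath

# Assumption: each Gradle module that hosts an IT builds its own tarball here.
SDK_TARBALL = 'apache-beam.tar.gz'


def sdk_location_for(it_module):
    """Return the assumed SDK tarball path for a Gradle module directory."""
    # Strip a trailing slash so 'sdks/python/' and 'sdks/python' agree.
    return posixpath.join(it_module.rstrip('/'), 'build', SDK_TARBALL)


print(sdk_location_for('sdks/python'))
```

With such a helper, a test configuration would only need to state itModule; the tarball path would follow from it until the Gradle build is changed to produce a single shared tarball.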
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL 

[jira] [Assigned] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklas Hansson reassigned BEAM-6988:


Assignee: niklas Hansson

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>
> {noformat}
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 53, in test_non_function
>     result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", line 510, in __ror__
>     result = p.apply(self, pvalueish, label)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py", line 514, in apply
>     transform.type_check_inputs(pvalueish)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", line 753, in type_check_inputs
>     hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py", line 283, in getcallargs_forhints
>     raise TypeCheckError(e)
> apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required positional argument: 'chars'
> {noformat}
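
The failing call hands `str.strip`, a built-in method descriptor rather than a Python function, to `beam.Map` together with one extra argument. As a minimal sketch of how such a callable can be inspected, the snippet below uses only the standard library. `describe_callable` is a hypothetical helper, not Beam's `getcallargs_forhints`, and returning `None` when no signature is available is an assumption about what a fix might want.

```python
import inspect


def describe_callable(fn):
    """Return the parameter names of fn, or None if it has no signature."""
    try:
        return list(inspect.signature(fn).parameters)
    except (TypeError, ValueError):
        # Some built-ins expose no introspectable signature at all.
        return None


# Parameter names the interpreter reports for the built-in, if any.
print(describe_callable(str.strip))

# The plain-Python equivalent of what the test expects the pipeline to compute.
print([str.strip(s, 'x') for s in ['xa', 'bbx', 'xcx']])
```

Comparing the reported parameters with the arguments actually supplied (the element plus `'x'`) is one way to see which positional slot the type-hint binding leaves unfilled.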



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklas Hansson reassigned BEAM-6535:


Assignee: (was: niklas Hansson)

> TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on 
> Python 3
> --
>
> Key: BEAM-6535
> URL: https://issues.apache.org/jira/browse/BEAM-6535
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Priority: Minor
>
> This is the last remaining typehints test still failing on Python 3. It fails 
> with:
> {code:java}
> ==
> FAIL: testTupleListComprehension (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest)
> --
> Traceback (most recent call last):
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 127, in testTupleListComprehension
>     [typehints.Tuple[str, typehints.Iterable[int]]])
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 35, in assertReturnType
>     self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs))
> AssertionError: List[Tuple[str, int]] != Any
> --
> {code}
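
The return type here is inferred from the function's compiled bytecode, so one thing worth checking is how the running interpreter compiles the comprehension. This is an assumption about where to look; the ticket does not name the cause. The sketch below uses only the standard library. `tuple_list` is a hypothetical stand-in for the function under test, and `nested_code_names` is an illustrative helper, not part of Beam.

```python
import types


def nested_code_names(fn):
    """Return the names of code objects stored in fn's constants."""
    return [const.co_name
            for const in fn.__code__.co_consts
            if isinstance(const, types.CodeType)]


def tuple_list(xs):
    # Hypothetical stand-in: builds a list of tuples from an iterable.
    return [(x, ord(x)) for x in xs]


# On CPython versions that compile a list comprehension into its own code
# object, this list includes '<listcomp>'; on versions that inline
# comprehensions it is empty.
print(nested_code_names(tuple_list))
```

If the comprehension body lives in a separate code object, an inference pass that only walks the outer function's instructions sees a call to an opaque function, and falling back to `Any` would be consistent with the assertion error above.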





[jira] [Work stopped] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6535 stopped by niklas Hansson.

> TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on 
> Python 3
> --
>
> Key: BEAM-6535
> URL: https://issues.apache.org/jira/browse/BEAM-6535
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Minor
>
> This is the last remaining typehints test still failing on Python 3. It fails 
> with:
> {code:java}
> ==
> FAIL: testTupleListComprehension (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest)
> --
> Traceback (most recent call last):
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 127, in testTupleListComprehension
>     [typehints.Tuple[str, typehints.Iterable[int]]])
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 35, in assertReturnType
>     self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs))
> AssertionError: List[Tuple[str, int]] != Any
> --
> {code}





[jira] [Assigned] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklas Hansson reassigned BEAM-6535:


Assignee: niklas Hansson

> TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on 
> Python 3
> --
>
> Key: BEAM-6535
> URL: https://issues.apache.org/jira/browse/BEAM-6535
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Minor
>
> This is the last remaining typehints test still failing on Python 3. It fails 
> with:
> {code:java}
> ==
> FAIL: testTupleListComprehension (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest)
> --
> Traceback (most recent call last):
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 127, in testTupleListComprehension
>     [typehints.Tuple[str, typehints.Iterable[int]]])
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 35, in assertReturnType
>     self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs))
> AssertionError: List[Tuple[str, int]] != Any
> --
> {code}





[jira] [Assigned] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3

2019-05-08 Thread niklas Hansson (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklas Hansson reassigned BEAM-6535:


Assignee: (was: niklas Hansson)

> TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on 
> Python 3
> --
>
> Key: BEAM-6535
> URL: https://issues.apache.org/jira/browse/BEAM-6535
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Priority: Minor
>
> This is the last remaining typehints test still failing on Python 3. It fails 
> with:
> {code:java}
> ==
> FAIL: testTupleListComprehension (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest)
> --
> Traceback (most recent call last):
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 127, in testTupleListComprehension
>     [typehints.Tuple[str, typehints.Iterable[int]]])
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", line 35, in assertReturnType
>     self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs))
> AssertionError: List[Tuple[str, int]] != Any
> --
> {code}





[jira] [Created] (BEAM-7249) Ability to cancel Cloud Bigtable reads

2019-05-08 Thread Max (JIRA)
Max created BEAM-7249:
-

 Summary: Ability to cancel Cloud Bigtable reads
 Key: BEAM-7249
 URL: https://issues.apache.org/jira/browse/BEAM-7249
 Project: Beam
  Issue Type: New Feature
  Components: io-python-gcp
Reporter: Max








[jira] [Work logged] (BEAM-7121) Provide deterministic version of Python's ProtoCoder

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7121?focusedWorklogId=239390=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239390
 ]

ASF GitHub Bot logged work on BEAM-7121:


Author: ASF GitHub Bot
Created on: 08/May/19 18:39
Start Date: 08/May/19 18:39
Worklog Time Spent: 10m 
  Work Description: yifanmai commented on issue #8377: [BEAM-7121] Add 
deterministic proto coder
URL: https://github.com/apache/beam/pull/8377#issuecomment-490602509
 
 
   Added microbenchmarks:
   ```
   small_message_with_map, ProtoCoder, 1000 element(s): run 1 of 20, per element time cost: 1.59318e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 2 of 20, per element time cost: 1.4535e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 3 of 20, per element time cost: 1.4518e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 4 of 20, per element time cost: 1.38509e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 5 of 20, per element time cost: 1.60599e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 6 of 20, per element time cost: 1.2928e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 7 of 20, per element time cost: 1.17121e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 8 of 20, per element time cost: 1.18802e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 9 of 20, per element time cost: 1.24202e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 10 of 20, per element time cost: 1.71099e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 11 of 20, per element time cost: 1.61979e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 12 of 20, per element time cost: 1.1765e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 13 of 20, per element time cost: 1.28272e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 14 of 20, per element time cost: 1.13361e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 15 of 20, per element time cost: 1.11871e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 16 of 20, per element time cost: 1.20051e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 17 of 20, per element time cost: 1.2253e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 18 of 20, per element time cost: 1.1971e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 19 of 20, per element time cost: 1.1992e-05 sec
   small_message_with_map, ProtoCoder, 1000 element(s): run 20 of 20, per element time cost: 1.24671e-05 sec

   large_message_with_map, ProtoCoder, 1000 element(s): run 1 of 20, per element time cost: 3.93291e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 2 of 20, per element time cost: 3.4997e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 3 of 20, per element time cost: 4.39081e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 4 of 20, per element time cost: 3.50349e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 5 of 20, per element time cost: 3.8208e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 6 of 20, per element time cost: 3.45418e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 7 of 20, per element time cost: 3.6797e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 8 of 20, per element time cost: 3.8079e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 9 of 20, per element time cost: 3.90592e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 10 of 20, per element time cost: 3.8846e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 11 of 20, per element time cost: 3.8914e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 12 of 20, per element time cost: 3.8697e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 13 of 20, per element time cost: 3.9917e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 14 of 20, per element time cost: 4.14879e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 15 of 20, per element time cost: 3.91412e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 16 of 20, per element time cost: 4.4534e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 17 of 20, per element time cost: 3.98979e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 18 of 20, per element time cost: 4.05421e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 19 of 20, per element time cost: 4.1333e-05 sec
   large_message_with_map, ProtoCoder, 1000 element(s): run 20 of 20, per element time cost: 4.23539e-05 sec
   ```

[jira] [Created] (BEAM-7248) Support "beam:runner:executable_stage:v1" on fn_api_runner

2019-05-08 Thread Ankur Goenka (JIRA)
Ankur Goenka created BEAM-7248:
--

 Summary: Support "beam:runner:executable_stage:v1" on fn_api_runner
 Key: BEAM-7248
 URL: https://issues.apache.org/jira/browse/BEAM-7248
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Ankur Goenka


fn_api_runner.py does not support translation of executable stage transforms. 

We should support executable stage transforms because the job server produces 
executable stages, and supporting them would make fn_api_runner behave more like a portable runner.





[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239375
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 08/May/19 18:15
Start Date: 08/May/19 18:15
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-490594092
 
 
   R: @HuangLED 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 239375)
Time Spent: 4h 20m  (was: 4h 10m)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing the reference runner code is still under discussion. It is 
> suggested to create a PR that removes the relevant code and start voting there.
>  
>  





[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-05-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239374=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239374
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 08/May/19 18:14
Start Date: 08/May/19 18:14
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-490593555
 
 
   R: @lukecwik 
 



Issue Time Tracking
---

Worklog Id: (was: 239374)
Time Spent: 4h 10m  (was: 4h)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing the reference runner code is still under discussion. It is 
> suggested to create a PR that removes the relevant code and start voting there.
>  
>  




