[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239593 ]

ASF GitHub Bot logged work on BEAM-6985:
Author: ASF GitHub Bot
Created on: 09/May/19 05:27
Start Date: 09/May/19 05:27
Worklog Time Spent: 10m

Work Description: NikeNano commented on pull request #8453: [BEAM-6985] TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282342465

## File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py ##

@@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
         native_type_compatibility.convert_to_beam_types(typing_types),
         beam_types)
 
+  def test_is_sub_class(self):
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Dict,
+        derived=typing.Dict[bytes, int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.List[bytes]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Set,
+        derived=typing.Set[int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Set[float]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Tuple,
+        derived=typing.Tuple[int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Tuple[bytes]))
+
+  @unittest.skipIf(sys.version_info >= (2, 7, 0),

Review comment: To be honest I don't know if it is an advantage or not. I realised there was a difference and made the test to point it out. Maybe I also should have pointed it out more clearly in the PR as well.
Based upon your comment, I assume that we want to keep the Py2 behaviour; I will investigate further whether I can achieve that with minimal changes.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 239593)
Time Spent: 3.5h (was: 3h 20m)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: Niklas Hansson
> Priority: Major
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> * test_convert_to_beam_type (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> * test_convert_to_beam_types (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing.<type>` != `<type>`, e.g.:
> {noformat}
> FAIL: test_convert_to_beam_type (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> --
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py", line 79, in test_convert_to_beam_type
>     beam_type, description)
> AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
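The review above centres on a `_safe_issubclass` helper. As a rough illustration of why such a helper is needed on Python 3.7+, where plain `issubclass` raises `TypeError` for subscripted generics like `typing.Dict[bytes, int]`, here is a minimal sketch; `safe_issubclass` and its behaviour are a hypothetical stand-in, not Beam's actual implementation:

```python
import typing

def safe_issubclass(derived, parent):
    """Best-effort issubclass for typing generics (hypothetical sketch).

    Subscripted generics cannot be passed to issubclass() directly on
    Python 3.7+, so compare the underlying origin classes instead.
    """
    def origin(tp):
        # typing.Dict[bytes, int].__origin__ is dict on Python 3.7+;
        # plain classes fall through unchanged.
        return getattr(tp, "__origin__", tp)
    try:
        return issubclass(origin(derived), origin(parent))
    except TypeError:
        return False
```

The parent/derived pairs exercised by the new test method map directly onto calls such as `safe_issubclass(typing.Dict[bytes, int], typing.Dict)`.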
[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239592&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239592 ]

ASF GitHub Bot logged work on BEAM-6985:
Author: ASF GitHub Bot
Created on: 09/May/19 05:23
Start Date: 09/May/19 05:23
Worklog Time Spent: 10m

Work Description: NikeNano commented on pull request #8453: [BEAM-6985] TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#discussion_r282341953

## File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py ##

@@ -103,6 +98,64 @@ def test_convert_to_beam_types(self):
         native_type_compatibility.convert_to_beam_types(typing_types),
         beam_types)
 
+  def test_is_sub_class(self):
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.Dict,
+        derived=typing.Dict[bytes, int]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.Dict[bytes, int]))
+    self.assertTrue(native_type_compatibility._safe_issubclass(
+        parent=typing.List,
+        derived=typing.List[bytes]))
+    self.assertFalse(native_type_compatibility._safe_issubclass(

Review comment: True, removed the duplicate

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 239592)
Time Spent: 3h 20m (was: 3h 10m)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: Niklas Hansson
> Priority: Major
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> * test_convert_to_beam_type (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> * test_convert_to_beam_types (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing.<type>` != `<type>`, e.g.:
> {noformat}
> FAIL: test_convert_to_beam_type (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> --
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py", line 79, in test_convert_to_beam_type
>     beam_type, description)
> AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
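The quoted failure reports a mismatch via each side's repr string (`typing.Dict[bytes, int]` vs `Dict[bytes, int]`). A small, version-agnostic observation, nothing Beam-specific: independently constructed `typing` aliases compare equal structurally, so equality rather than repr/string comparison is the robust check across Python releases:

```python
import typing

# Two independently constructed aliases are structurally equal, even
# though the repr of subscripted generics has changed across 3.x releases.
a = typing.Dict[bytes, int]
b = typing.Dict[bytes, int]

assert a == b                 # structural equality holds
assert a.__origin__ is dict   # runtime origin class on Python 3.7+
assert a.__args__ == (bytes, int)
```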
[jira] [Created] (BEAM-7255) UNNEST with JOIN
Rui Wang created BEAM-7255: -- Summary: UNNEST with JOIN Key: BEAM-7255 URL: https://issues.apache.org/jira/browse/BEAM-7255 Project: Beam Issue Type: Bug Components: dsl-sql Reporter: Rui Wang UNNEST with JOIN does not work well. See: https://stackoverflow.com/questions/56028038/unnest-the-nested-pcollection-using-beamsql -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7254) UNNEST with JOIN
Rui Wang created BEAM-7254: -- Summary: UNNEST with JOIN Key: BEAM-7254 URL: https://issues.apache.org/jira/browse/BEAM-7254 Project: Beam Issue Type: Bug Components: dsl-sql Reporter: Rui Wang UNNEST with JOIN does not work well. See: https://stackoverflow.com/questions/56028038/unnest-the-nested-pcollection-using-beamsql -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer
[ https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836026#comment-16836026 ]

Ankur Goenka commented on BEAM-7252:
[~robertwb] or I can take a look, but this is not of high priority.

> "beam:java:boundedsource" not supported with python optimizer
>
> Key: BEAM-7252
> URL: https://issues.apache.org/jira/browse/BEAM-7252
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Ankur Goenka
> Priority: Major
>
> python pipeline optimizer does not handle external transforms.
>
> Relevant error stack
> ==
> ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized)
> --
> Traceback (most recent call last):
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py", line 174, in test_external_transforms
>     assert_that(res, equal_to([i for i in range(1, 10)]))
>   File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py", line 436, in wait_until_finish
>     self._job_id, self._state, self._last_error_message()))
> RuntimeError: Pipeline test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 failed in state FAILED: java.lang.RuntimeException: Error received from SDK harness for instruction 4: Traceback (most recent call last):
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 157, in _execute
>     response = task()
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 190, in
>     self._execute(lambda: worker.do_instruction(work), work)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 333, in do_instruction
>     request.instruction_id)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 353, in process_bundle
>     instruction_id, request.process_bundle_descriptor_reference)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 305, in get
>     self.data_channel_factory)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 501, in __init__
>     self.ops = self.create_execution_tree(self.process_bundle_descriptor)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 545, in create_execution_tree
>     descriptor.transforms, key=topological_height, reverse=True)])
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 467, in wrapper
>     result = cache[args] = func(*args)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 528, in get_operation
>     in descriptor.transforms[transform_id].outputs.items()
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 527, in
>     for tag, pcoll_id
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 467, in wrapper
>     result = cache[args] = func(*args)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 531, in get_operation
>     transform_id, transform_consumers)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 790, in create_operation
>     return creator(self, transform_id, transform_proto, payload, consumers)
>   File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 957, in create
>     parameter.source, factory.context),
>   File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in from_runner_api
>     parameter_type, constructor = cls._known_urns[fn_proto.spec.urn]
> KeyError: u'urn:beam:java:boundedsource:v1'
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
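The traceback above bottoms out in a raw dictionary lookup, `cls._known_urns[fn_proto.spec.urn]`, which surfaces as a bare `KeyError`. A generic sketch (hypothetical class and method names, not Beam's code) of how such a registry lookup can fail fast with an actionable message instead:

```python
class UrnRegistry:
    """Hypothetical registry mapping transform URNs to constructors."""

    def __init__(self):
        self._known_urns = {}

    def register(self, urn, constructor):
        self._known_urns[urn] = constructor

    def lookup(self, urn):
        # Report the offending URN and the known alternatives instead of
        # surfacing a bare KeyError deep inside bundle processing.
        try:
            return self._known_urns[urn]
        except KeyError:
            raise NotImplementedError(
                "no constructor registered for URN %r; known URNs: %s"
                % (urn, sorted(self._known_urns)))
```

With such a guard, an unsupported URN like `urn:beam:java:boundedsource:v1` would raise a `NotImplementedError` naming the URN rather than the opaque `KeyError` shown above.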
[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows
[ https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239561 ]

ASF GitHub Bot logged work on BEAM-7253:
Author: ASF GitHub Bot
Created on: 09/May/19 01:50
Start Date: 09/May/19 01:50
Worklog Time Spent: 10m

Work Description: ihji commented on pull request #8537: [BEAM-7253] test_with_jar_packages_invalid_file_name test fails on Windows
URL: https://github.com/apache/beam/pull/8537#discussion_r282314288

## File path: sdks/python/apache_beam/runners/portability/stager.py ##

@@ -200,10 +201,11 @@ def stage_job_resources(self,
     # Handle jar packages that should be staged for Java SDK Harness.
     jar_packages = options.view_as(
         DebugOptions).lookup_experiment('jar_packages')
+    classpath_separator = ':' if platform.system() != 'Windows' else ';'

Review comment: I think there's no issue with using ';' on all platforms. ':' is the standard classpath separator on Linux systems, so some people on Linux may find using ';' to separate jar files awkward, but it's not a classpath anyway. Do you think it would be better to use a single character of choice for all platforms?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 239561)
Time Spent: 40m (was: 0.5h)

> test_with_jar_packages_invalid_file_name test fails on windows
>
> Key: BEAM-7253
> URL: https://issues.apache.org/jira/browse/BEAM-7253
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Heejong Lee
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> test_with_jar_packages_invalid_file_name test fails on windows. Possibly a different classpath separator on windows (";") as compared to linux (":").
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
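The separator question in this thread maps directly onto Python's standard library: `os.pathsep` is already ';' on Windows and ':' elsewhere. A sketch of the idea under discussion (`join_jar_paths` is a hypothetical helper name; the actual stager code differs):

```python
import os

def join_jar_paths(jars, separator=None):
    """Join jar file paths for staging (hypothetical helper).

    os.pathsep is ';' on Windows and ':' on POSIX, so the explicit
    platform.system() check in the reviewed diff can be avoided.
    """
    if separator is None:
        separator = os.pathsep  # platform-appropriate default
    return separator.join(jars)
```

Passing an explicit `separator=';'` yields the same string on every platform, which is the "single character of choice for all platforms" option the reviewers weigh.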
[jira] [Updated] (BEAM-7246) Create a Spanner IO for Python
[ https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shehzaad Nakhoda updated BEAM-7246: --- Description: Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only). Testing in this work item will be in the form of DirectRunner tests and manual testing. Integration and performance tests are a separate work item (not included here). See https://beam.apache.org/documentation/io/built-in/. The goal is to add Google Cloud Spanner to the Database column for the Python/Batch row. was: Add I/O support for Google Cloud Spanner for the Python SDK. Testing in this work item will be in the form of DirectRunner tests and manual testing. Integration and performance tests are a separate work item (not included here). See https://beam.apache.org/documentation/io/built-in/. The goal is to add Google Cloud Spanner to the Database column for the Python/Batch row. > Create a Spanner IO for Python > -- > > Key: BEAM-7246 > URL: https://issues.apache.org/jira/browse/BEAM-7246 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > > Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only). > Testing in this work item will be in the form of DirectRunner tests and > manual testing. > Integration and performance tests are a separate work item (not included > here). > See https://beam.apache.org/documentation/io/built-in/. The goal is to add > Google Cloud Spanner to the Database column for the Python/Batch row. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7246) Create a Spanner IO for Python
[ https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shehzaad Nakhoda updated BEAM-7246: --- Description: Add I/O support for Google Cloud Spanner for the Python SDK. Testing in this work item will be in the form of DirectRunner tests and manual testing. Integration and performance tests are a separate work item (not included here). See https://beam.apache.org/documentation/io/built-in/. The goal is to add Google Cloud Spanner to the Database column for the Python/Batch row. was: Add I/O support for Google Cloud Spanner for the Python SDK. Integration and performance tests are a separate work item (not included here). See https://beam.apache.org/documentation/io/built-in/. The goal is to add Google Cloud Spanner to the Database column for the Python/Batch row. > Create a Spanner IO for Python > -- > > Key: BEAM-7246 > URL: https://issues.apache.org/jira/browse/BEAM-7246 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > > Add I/O support for Google Cloud Spanner for the Python SDK. > Testing in this work item will be in the form of DirectRunner tests and > manual testing. > Integration and performance tests are a separate work item (not included > here). > See https://beam.apache.org/documentation/io/built-in/. The goal is to add > Google Cloud Spanner to the Database column for the Python/Batch row. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7103) Adding AvroGenericCoder for simple dict type cross-language data transfer
[ https://issues.apache.org/jira/browse/BEAM-7103?focusedWorklogId=239558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239558 ] ASF GitHub Bot logged work on BEAM-7103: Author: ASF GitHub Bot Created on: 09/May/19 01:26 Start Date: 09/May/19 01:26 Worklog Time Spent: 10m Work Description: ihji commented on issue #8342: [BEAM-7103] Adding AvroGenericCoder for cross-language data transfer URL: https://github.com/apache/beam/pull/8342#issuecomment-490708363 + @mxm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239558) Time Spent: 50m (was: 40m) > Adding AvroGenericCoder for simple dict type cross-language data transfer > - > > Key: BEAM-7103 > URL: https://issues.apache.org/jira/browse/BEAM-7103 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core, sdk-py-core >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Adding AvroGenericCoder for simple dict type cross-language data transfer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7103) Adding AvroGenericCoder for simple dict type cross-language data transfer
[ https://issues.apache.org/jira/browse/BEAM-7103?focusedWorklogId=239559&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239559 ] ASF GitHub Bot logged work on BEAM-7103: Author: ASF GitHub Bot Created on: 09/May/19 01:26 Start Date: 09/May/19 01:26 Worklog Time Spent: 10m Work Description: ihji commented on issue #8342: [BEAM-7103] Adding AvroGenericCoder for cross-language data transfer URL: https://github.com/apache/beam/pull/8342#issuecomment-490708363 CC: @mxm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239559) Time Spent: 1h (was: 50m) > Adding AvroGenericCoder for simple dict type cross-language data transfer > - > > Key: BEAM-7103 > URL: https://issues.apache.org/jira/browse/BEAM-7103 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core, sdk-py-core >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Adding AvroGenericCoder for simple dict type cross-language data transfer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows
[ https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239550&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239550 ]

ASF GitHub Bot logged work on BEAM-7253:
Author: ASF GitHub Bot
Created on: 09/May/19 01:16
Start Date: 09/May/19 01:16
Worklog Time Spent: 10m

Work Description: aaltay commented on pull request #8537: [BEAM-7253] test_with_jar_packages_invalid_file_name test fails on Windows
URL: https://github.com/apache/beam/pull/8537#discussion_r282309479

## File path: sdks/python/apache_beam/runners/portability/stager.py ##

@@ -200,10 +201,11 @@ def stage_job_resources(self,
     # Handle jar packages that should be staged for Java SDK Harness.
     jar_packages = options.view_as(
         DebugOptions).lookup_experiment('jar_packages')
+    classpath_separator = ':' if platform.system() != 'Windows' else ';'

Review comment: Can we use the same class path separator (';') for all platforms?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 239550)
Time Spent: 0.5h (was: 20m)

> test_with_jar_packages_invalid_file_name test fails on windows
>
> Key: BEAM-7253
> URL: https://issues.apache.org/jira/browse/BEAM-7253
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Heejong Lee
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> test_with_jar_packages_invalid_file_name test fails on windows. possibly different class path separator on windows ";" as compared to linux ":". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows
[ https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239547 ] ASF GitHub Bot logged work on BEAM-7253: Author: ASF GitHub Bot Created on: 09/May/19 01:01 Start Date: 09/May/19 01:01 Worklog Time Spent: 10m Work Description: ihji commented on issue #8537: [BEAM-7253] test_with_jar_packages_invalid_file_name test fails on Windows URL: https://github.com/apache/beam/pull/8537#issuecomment-490704176 R: @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239547) Time Spent: 20m (was: 10m) > test_with_jar_packages_invalid_file_name test fails on windows > -- > > Key: BEAM-7253 > URL: https://issues.apache.org/jira/browse/BEAM-7253 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Heejong Lee >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > test_with_jar_packages_invalid_file_name test fails on windows. possibly > different class path separator on windows ";" as compared to linux ":". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows
[ https://issues.apache.org/jira/browse/BEAM-7253?focusedWorklogId=239544&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239544 ] ASF GitHub Bot logged work on BEAM-7253: Author: ASF GitHub Bot Created on: 09/May/19 00:57 Start Date: 09/May/19 00:57 Worklog Time Spent: 10m Work Description: ihji commented on pull request #8537: [BEAM-7253] test_with_jar_packages_invalid_file_name test fails on Windows URL: https://github.com/apache/beam/pull/8537 Using ';' for classpath separator on Windows Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
[jira] [Created] (BEAM-7253) test_with_jar_packages_invalid_file_name test fails on windows
Heejong Lee created BEAM-7253: - Summary: test_with_jar_packages_invalid_file_name test fails on windows Key: BEAM-7253 URL: https://issues.apache.org/jira/browse/BEAM-7253 Project: Beam Issue Type: Bug Components: sdk-py-core Reporter: Heejong Lee test_with_jar_packages_invalid_file_name test fails on windows. possibly different class path separator on windows ";" as compared to linux ":". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7103) Adding AvroGenericCoder for simple dict type cross-language data transfer
[ https://issues.apache.org/jira/browse/BEAM-7103?focusedWorklogId=239539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239539 ] ASF GitHub Bot logged work on BEAM-7103: Author: ASF GitHub Bot Created on: 09/May/19 00:43 Start Date: 09/May/19 00:43 Worklog Time Spent: 10m Work Description: ihji commented on issue #8342: [BEAM-7103] Adding AvroGenericCoder for cross-language data transfer URL: https://github.com/apache/beam/pull/8342#issuecomment-485586933 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239539) Time Spent: 40m (was: 0.5h) > Adding AvroGenericCoder for simple dict type cross-language data transfer > - > > Key: BEAM-7103 > URL: https://issues.apache.org/jira/browse/BEAM-7103 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core, sdk-py-core >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Adding AvroGenericCoder for simple dict type cross-language data transfer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB
[ https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835989#comment-16835989 ] Ahmed El.Hussaini commented on BEAM-4567: - Sweet! > Can't use mongo connector with Atlas MongoDB > > > Key: BEAM-4567 > URL: https://issues.apache.org/jira/browse/BEAM-4567 > Project: Beam > Issue Type: Bug > Components: io-java-mongodb >Affects Versions: 2.4.0 > Environment: Google Cloud Dataflow >Reporter: Lucas de Sio Rosa >Assignee: Ahmed El.Hussaini >Priority: Major > Labels: mongodb > Fix For: 2.12.0 > > Original Estimate: 168h > Remaining Estimate: 168h > > I can't use the MongoDB connector with a managed Atlas instance. The current > implementation makes use of splitVector, which is a high-privilege function > that cannot be assigned to any user in Atlas. > An open Jira issue for MongoDB suggests using $sample and $bucketAuto to > circumvent this necessity. > Following is the exception thrown (removed some identifiable information): > Exception in thread "main" > org.apache.beam.sdk.Pipeline$PipelineExecutionException: > com.mongodb.MongoCommandException: Command failed with error 13: 'not > authorized on to execute command \{ splitVector: > ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 > }' on server . The full response is \{ "ok" : 0.0, "errmsg" : "not > authorized on to execute command { splitVector: > \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: > 1 }", "code" : 13, "codeName" : "Unauthorized" } > > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) > > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297) > > at > br.dotz.datalake.ingest.mongodb.MongoDBCollectorPipeline.main(MongoDBCollectorPipeline.java:27) > > Caused by: com.mongodb.MongoCommandException: Command failed with error 13: > 'not authorized on to execute command \{ splitVector: > ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 > }' on server .
The full response is \{ "ok" : 0.0, "errmsg" : "not > authorized on to execute command { splitVector: > \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: > 1 }", "code" : 13, "codeName" : "Unauthorized" } > > at > com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115) > > at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114) > > at > com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159) > > at > com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286) > > at > com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:186) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:178) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:91) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84) > > at > com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55) > > at com.mongodb.Mongo.execute(Mongo.java:772) > > at com.mongodb.Mongo$2.execute(Mongo.java:759) > > at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130) > > at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124) > > at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114) > > at > org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332) > > at > org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:210) > > at > 
org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:87) > > at > org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:62) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239538=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239538 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 09/May/19 00:38 Start Date: 09/May/19 00:38 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#issuecomment-490700663 Not ready to review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239538) Time Spent: 1h 20m (was: 1h 10m) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a > PCollection and the number of distinct values associated with each key in a > PCollection KVs. > it should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction
[ https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=239535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239535 ] ASF GitHub Bot logged work on BEAM-7138: Author: ASF GitHub Bot Created on: 09/May/19 00:32 Start Date: 09/May/19 00:32 Worklog Time Spent: 10m Work Description: ihji commented on pull request #8396: [BEAM-7138] keep Java serialized coder in wire coder construction URL: https://github.com/apache/beam/pull/8396 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239535) Time Spent: 3h (was: 2h 50m) > keep Java serialized coder in length-prefixed wire coder construction > - > > Key: BEAM-7138 > URL: https://issues.apache.org/jira/browse/BEAM-7138 > Project: Beam > Issue Type: Improvement > Components: java-fn-execution >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > don't replace Java serialized coder with byte array coder in length-prefixed > wire coder construction. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction
[ https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=239534=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239534 ] ASF GitHub Bot logged work on BEAM-7138: Author: ASF GitHub Bot Created on: 09/May/19 00:32 Start Date: 09/May/19 00:32 Worklog Time Spent: 10m Work Description: ihji commented on issue #8396: [BEAM-7138] keep Java serialized coder in wire coder construction URL: https://github.com/apache/beam/pull/8396#issuecomment-490699720 Closing this PR. This PR does not fix the source of the problem. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239534) Time Spent: 3h (was: 2h 50m) > keep Java serialized coder in length-prefixed wire coder construction > - > > Key: BEAM-7138 > URL: https://issues.apache.org/jira/browse/BEAM-7138 > Project: Beam > Issue Type: Improvement > Components: java-fn-execution >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > don't replace Java serialized coder with byte array coder in length-prefixed > wire coder construction. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB
[ https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835986#comment-16835986 ]

Ahmed El.Hussaini commented on BEAM-4567:
-----------------------------------------

Hello [~jcornejo], in order to use MongoDbIO with Atlas you need to explicitly call `withBucketAuto(true)` when creating the `Read` object.
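For context on what `withBucketAuto(true)` switches to: the issue description notes that a `$bucketAuto` aggregation, which any read-capable Atlas user may run, can replace the privileged `splitVector` command for computing split points. A minimal sketch of such an aggregation pipeline, expressed as plain Python data (the helper name `bucket_auto_split_pipeline` and the parameter `num_splits` are illustrative, not MongoDbIO's internals):

```python
def bucket_auto_split_pipeline(num_splits):
    """Build a MongoDB aggregation pipeline that groups documents into
    num_splits roughly equal-sized buckets by _id; each bucket's min/max
    bounds can serve as split points for parallel reads (sketch only)."""
    return [
        {"$bucketAuto": {"groupBy": "$_id", "buckets": num_splits}},
    ]

# With pymongo and a live server, one would run something like:
#   splits = list(collection.aggregate(bucket_auto_split_pipeline(10)))
# where each result document carries an _id of the form
# {"min": ..., "max": ...} describing one bucket's bounds.

pipeline = bucket_auto_split_pipeline(10)
print(pipeline[0]["$bucketAuto"]["buckets"])  # 10
```

Because `$bucketAuto` is an ordinary aggregation stage, it needs only read access, which is why it works where `splitVector` is denied on Atlas.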
[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction
[ https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=239533=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239533 ] ASF GitHub Bot logged work on BEAM-7138: Author: ASF GitHub Bot Created on: 09/May/19 00:28 Start Date: 09/May/19 00:28 Worklog Time Spent: 10m Work Description: ihji commented on issue #8396: [BEAM-7138] keep Java serialized coder in wire coder construction URL: https://github.com/apache/beam/pull/8396#issuecomment-490699040 > Reads are deliberately translated by the Runner to be able to support unbounded sources. This might not be true for portability framework. I don't know whether unbounded `Read` transform works just fine on Flink portable runner or not. But, if it does, I think it's because `FlinkStreamingPortablePipelineTranslator` does not use `WireCoder` for translating `Read` transform. In case of `FlinkBatchPortablePipelineTranslator`, it uses ``` outputCoder = WireCoders.instantiateRunnerWireCoder(collectionNode, pipeline.getComponents()); ``` When `Read` transform run by Flink runner itself produces `PCollection` of something that should be encoded with `SerializableCoder`, it will throw the exception because `SerializableCoder` is not supported by a runner wire coder. If we generate elements with SDF, everything should be okay since the source `PCollection` is the output of `DoFn` and SDK harness supports any coders. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239533) Time Spent: 2h 50m (was: 2h 40m) > keep Java serialized coder in length-prefixed wire coder construction > - > > Key: BEAM-7138 > URL: https://issues.apache.org/jira/browse/BEAM-7138 > Project: Beam > Issue Type: Improvement > Components: java-fn-execution >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > don't replace Java serialized coder with byte array coder in length-prefixed > wire coder construction. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
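The "length-prefixed wire coder" under discussion wraps an element coder so that a runner which cannot interpret the payload (e.g. Java-serialized bytes from `SerializableCoder`) can still frame and skip over it: the encoding is a varint length followed by the raw bytes. A rough, self-contained Python illustration of that framing idea (a sketch of the general technique, not Beam's actual coder code):

```python
def encode_varint(n):
    """Encode a non-negative int as an unsigned LEB128 varint,
    the little-endian base-128 form used for length prefixes."""
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        out.append(bits | (0x80 if n else 0))
        if not n:
            return bytes(out)

def length_prefix(payload):
    """Frame opaque payload bytes so a reader can skip them
    without understanding their encoding."""
    return encode_varint(len(payload)) + payload

def read_length_prefixed(buf, pos=0):
    """Return (payload, next_pos): decode the varint length, then
    slice out exactly that many bytes -- no payload knowledge needed."""
    shift = n = 0
    while True:
        b = buf[pos]
        pos += 1
        n |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            break
    return buf[pos:pos + n], pos + n

framed = length_prefix(b"java-serialized-coder-bytes")
payload, _ = read_length_prefixed(framed)
print(payload == b"java-serialized-coder-bytes")  # True
```

This is why replacing an unknown coder with a byte-array coder inside the length prefix loses information: the framing survives either way, but only the original coder id lets the SDK harness decode the payload again.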
[jira] [Work logged] (BEAM-7143) adding withConsumerConfigUpdates
[ https://issues.apache.org/jira/browse/BEAM-7143?focusedWorklogId=239537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239537 ] ASF GitHub Bot logged work on BEAM-7143: Author: ASF GitHub Bot Created on: 09/May/19 00:35 Start Date: 09/May/19 00:35 Worklog Time Spent: 10m Work Description: ihji commented on issue #8398: [BEAM-7143] adding withConsumerConfigUpdates URL: https://github.com/apache/beam/pull/8398#issuecomment-490700040 run java precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239537) Time Spent: 40m (was: 0.5h) > adding withConsumerConfigUpdates > > > Key: BEAM-7143 > URL: https://issues.apache.org/jira/browse/BEAM-7143 > Project: Beam > Issue Type: Improvement > Components: io-java-kafka >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > To modify `ConsumerConfig` for main consumer, we use > `updateConsumerProperties`. However, to modify `ConsumerConfig` for offset > consumer, the right method is `withOffsetConsumerConfigOverrides`. It would > be good to match both names for improving usability. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7143) adding withConsumerConfigUpdates
[ https://issues.apache.org/jira/browse/BEAM-7143?focusedWorklogId=239536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239536 ] ASF GitHub Bot logged work on BEAM-7143: Author: ASF GitHub Bot Created on: 09/May/19 00:34 Start Date: 09/May/19 00:34 Worklog Time Spent: 10m Work Description: ihji commented on issue #8398: [BEAM-7143] adding withConsumerConfigUpdates URL: https://github.com/apache/beam/pull/8398#issuecomment-490700040 run java precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239536) Time Spent: 0.5h (was: 20m) > adding withConsumerConfigUpdates > > > Key: BEAM-7143 > URL: https://issues.apache.org/jira/browse/BEAM-7143 > Project: Beam > Issue Type: Improvement > Components: io-java-kafka >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > To modify `ConsumerConfig` for main consumer, we use > `updateConsumerProperties`. However, to modify `ConsumerConfig` for offset > consumer, the right method is `withOffsetConsumerConfigOverrides`. It would > be good to match both names for improving usability. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB
[ https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835985#comment-16835985 ]

Javier Cornejo commented on BEAM-4567:
--------------------------------------

I am sorry, I already did it. I had to use aggregation. Thanks again!!
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239529=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239529 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 09/May/19 00:17 Start Date: 09/May/19 00:17 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282301017 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context): @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None) def from_runner_api_parameter(unused_parameter, unused_context): return Impulse() + + +class ApproximateUniqueGlobally(PTransform): Review comment: Does `stats.py` sound good? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239529) Time Spent: 1h 10m (was: 1h) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a > PCollection and the number of distinct values associated with each key in a > PCollection KVs. > it should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239528=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239528 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 09/May/19 00:17 Start Date: 09/May/19 00:17 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282301017 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context): @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None) def from_runner_api_parameter(unused_parameter, unused_context): return Impulse() + + +class ApproximateUniqueGlobally(PTransform): Review comment: Does stats.py sound good? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239528) Time Spent: 1h (was: 50m) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a > PCollection and the number of distinct values associated with each key in a > PCollection KVs. > it should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
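The transform being ported estimates distinct counts without holding every element. One standard estimator in this family is k-minimum-values: keep only the `sample_size` smallest normalized hashes; if fewer distinct hashes than the sample size are ever seen the count is exact, otherwise the density of the retained hashes extrapolates to an estimate. The sketch below illustrates that idea in plain Python — it is a simplified stand-in, not the code from PR #8535 or the Java `ApproximateUnique` implementation:

```python
import hashlib

def approximate_unique(elements, sample_size=64):
    """Estimate the number of distinct elements using a
    k-minimum-values sketch over normalized 64-bit hashes."""
    hashes = set()
    for e in elements:
        h = int.from_bytes(hashlib.sha1(repr(e).encode()).digest()[:8], "big")
        hashes.add(h / 2**64)  # normalize into [0, 1)
    if len(hashes) <= sample_size:
        return len(hashes)  # sample never filled up: count is exact
    kth = sorted(hashes)[sample_size - 1]
    # sample_size - 1 hashes landed in [0, kth); if hashes are uniform,
    # the same density holds over all of [0, 1), giving the estimate
    return int((sample_size - 1) / kth)

print(approximate_unique(["a", "b", "a"]))  # 2 -- exact, below the sample size
est = approximate_unique(range(10_000), sample_size=64)
print(est)  # an estimate; error shrinks as sample_size grows
```

A real implementation would keep a bounded heap of the k smallest hashes instead of materializing every hash, but the estimation arithmetic is the same.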
[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239531=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239531 ] ASF GitHub Bot logged work on BEAM-6985: Author: ASF GitHub Bot Created on: 09/May/19 00:22 Start Date: 09/May/19 00:22 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8453: [BEAM-6985] TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates URL: https://github.com/apache/beam/pull/8453#discussion_r282301351 ## File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py ## @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self): native_type_compatibility.convert_to_beam_types(typing_types), beam_types) + def test_is_sub_class(self): +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Dict, +derived=typing.Dict[bytes, int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Dict[bytes, int])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.List[bytes])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Dict[bytes, int])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Set, +derived=typing.Set[int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Set[float])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Tuple, +derived=typing.Tuple[int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Tuple[bytes])) + + @unittest.skipIf(sys.version_info >= (2, 7, 0), Review comment: Why is this a correct behavior for this function to return different results on Py2 and Py3? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239531) Time Spent: 3h 10m (was: 3h) > TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ > > > Key: BEAM-6985 > URL: https://issues.apache.org/jira/browse/BEAM-6985 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > The following tests are failing: > * test_convert_nested_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > > * test_convert_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > > * test_convert_to_beam_types > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > With similar errors, where `typing. != `. eg: > {noformat} > FAIL: test_convert_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > -- > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py", > line 79, in test_convert_to_beam_type > beam_type, description) > AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict > {noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239530=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239530 ] ASF GitHub Bot logged work on BEAM-6985: Author: ASF GitHub Bot Created on: 09/May/19 00:22 Start Date: 09/May/19 00:22 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8453: [BEAM-6985] TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates URL: https://github.com/apache/beam/pull/8453#discussion_r282301745 ## File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py ## @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self): native_type_compatibility.convert_to_beam_types(typing_types), beam_types) + def test_is_sub_class(self): +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Dict, +derived=typing.Dict[bytes, int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Dict[bytes, int])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.List[bytes])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Dict[bytes, int])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Set, +derived=typing.Set[int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Set[float])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Tuple, +derived=typing.Tuple[int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Tuple[bytes])) + + @unittest.skipIf(sys.version_info >= (2, 7, 0), + 'Order dosent matter in python 3') + def test_is_sub_class_order(self): +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Dict[bytes, int], +derived=typing.Dict)) +self.assertTrue(native_type_compatibility._safe_issubclass( 
+parent=typing.List[bytes], +derived=typing.List)) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Set[int], +derived=typing.Set)) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Tuple[int], +derived=typing.Tuple)) + + @unittest.skipIf(sys.version_info.major != '3', Review comment: It is discouraged to compare against the exact version "3", as one day there may be Python 4; see also: https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection `if sys.version_info.major < 3` will be better This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
eg: > {noformat} > FAIL: test_convert_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > -- > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py", > line 79, in test_convert_to_beam_type > beam_type, description) > AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict > {noformat} > -- This
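Beyond the style point in the review above, the quoted condition has a concrete bug worth noting: `sys.version_info.major` is an `int`, so comparing it to the string `'3'` is always unequal and the skip fires on every interpreter. A small illustration of the broken check, the integer comparison the reviewer suggests, and the feature-detection style the linked porting guide recommends:

```python
import sys

# sys.version_info.major is an int, so comparing it to the string '3'
# is always unequal -- this condition is True on every interpreter:
broken = sys.version_info.major != '3'
print(broken)  # True

# Comparing against an int works, and a "less than 3" test keeps
# working correctly if a Python 4 ever appears:
legacy = sys.version_info.major < 3
print(legacy)  # False on any Python 3 interpreter

# Feature detection instead of version detection: probe for the
# capability itself (importlib.metadata is stdlib from Python 3.8):
try:
    from importlib import metadata  # noqa: F401
    has_metadata = True
except ImportError:
    has_metadata = False
```

The broken form silently skips the test everywhere, which is worse than a failing test because nothing flags the dead code path.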
[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=239532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239532 ] ASF GitHub Bot logged work on BEAM-6985: Author: ASF GitHub Bot Created on: 09/May/19 00:22 Start Date: 09/May/19 00:22 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8453: [BEAM-6985] TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates URL: https://github.com/apache/beam/pull/8453#discussion_r282299000 ## File path: sdks/python/apache_beam/typehints/native_type_compatibility_test.py ## @@ -103,6 +98,64 @@ def test_convert_to_beam_types(self): native_type_compatibility.convert_to_beam_types(typing_types), beam_types) + def test_is_sub_class(self): +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.Dict, +derived=typing.Dict[bytes, int])) +self.assertFalse(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.Dict[bytes, int])) +self.assertTrue(native_type_compatibility._safe_issubclass( +parent=typing.List, +derived=typing.List[bytes])) +self.assertFalse(native_type_compatibility._safe_issubclass( Review comment: This is a duplicate of line 105. Also, it is sufficient to focus on different classes of usecases for the function. For example between ``` self.assertTrue(native_type_compatibility._safe_issubclass( parent=typing.Set, derived=typing.Set[int])) ``` and ``` self.assertTrue(native_type_compatibility._safe_issubclass( parent=typing.Tuple, derived=typing.Tuple[int])) ``` I would keep only one scenario. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239532) Time Spent: 3h 10m (was: 3h) > TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ > > > Key: BEAM-6985 > URL: https://issues.apache.org/jira/browse/BEAM-6985 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > The following tests are failing: > * test_convert_nested_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > > * test_convert_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > > * test_convert_to_beam_types > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > With similar errors, where `typing.<type> != <type>`. eg: > {noformat} > FAIL: test_convert_to_beam_type > (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest) > -- > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py", > line 79, in test_convert_to_beam_type > beam_type, description) > AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict > {noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer
[ https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835981#comment-16835981 ] Ahmet Altay commented on BEAM-7252: --- Got it. Who would be a good owner for this? > "beam:java:boundedsource" not supported with python optimizer > - > > Key: BEAM-7252 > URL: https://issues.apache.org/jira/browse/BEAM-7252 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Ankur Goenka >Priority: Major > > python pipeline optimizer does not handle external transforms. > > Relevant error stack > == > ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized) > -- > Traceback (most recent call last): > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py", > line 174, in test_external_transforms > assert_that(res, equal_to([i for i in range(1, 10)])) > File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in > __exit__ > self.run().wait_until_finish() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py", > line 436, in wait_until_finish > self._job_id, self._state, self._last_error_message())) > RuntimeError: Pipeline > test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 > failed in state FAILED: java.lang.RuntimeException: Error received from SDK > harness for instruction 4: Traceback (most recent call last): > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 333, in do_instruction > request.instruction_id) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 353, in process_bundle > instruction_id, 
request.process_bundle_descriptor_reference) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 305, in get > self.data_channel_factory) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 501, in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 545, in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True)]) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 528, in get_operation > in descriptor.transforms[transform_id].outputs.items() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 527, in > for tag, pcoll_id > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 531, in get_operation > transform_id, transform_consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 790, in create_operation > return creator(self, transform_id, transform_proto, payload, consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 957, in create > parameter.source, factory.context), > File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in > from_runner_api > parameter_type, constructor = cls._known_urns[fn_proto.spec.urn] > KeyError: u'urn:beam:java:boundedsource:v1' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
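The `KeyError: u'urn:beam:java:boundedsource:v1'` at the bottom of the trace is a plain dict lookup in `_known_urns` with no handler registered for that URN. A minimal sketch of the registry pattern (hypothetical names, not Beam's actual classes) shows why an unsupported transform surfaces as a bare `KeyError`, and how a lookup wrapper can fail with a clearer message instead:

```python
class UrnRegistry:
    """Toy URN -> constructor registry (hypothetical; Beam keeps a similar
    mapping, cls._known_urns, queried in from_runner_api)."""

    def __init__(self):
        self._known_urns = {}

    def register(self, urn, constructor):
        self._known_urns[urn] = constructor

    def lookup(self, urn):
        try:
            return self._known_urns[urn]
        except KeyError:
            # Name the unknown URN and list the known ones, instead of
            # letting a bare KeyError bubble up from deep in the worker.
            raise NotImplementedError(
                'No handler registered for %r; known URNs: %s'
                % (urn, sorted(self._known_urns)))

registry = UrnRegistry()
registry.register('beam:transform:pardo:v1', object)
registry.lookup('beam:transform:pardo:v1')            # resolves
# registry.lookup('urn:beam:java:boundedsource:v1')   # raises NotImplementedError
```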
[jira] [Commented] (BEAM-4567) Can't use mongo connector with Atlas MongoDB
[ https://issues.apache.org/jira/browse/BEAM-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835978#comment-16835978 ] Javier Cornejo commented on BEAM-4567: -- Hello [~iemejia] I can't see how [BEAM-6241] solved the problem. I use the Filters and limit and MongoIO is still using the splitVector command. Could you give a hand? Thanks for the great job. Regards > Can't use mongo connector with Atlas MongoDB > > > Key: BEAM-4567 > URL: https://issues.apache.org/jira/browse/BEAM-4567 > Project: Beam > Issue Type: Bug > Components: io-java-mongodb >Affects Versions: 2.4.0 > Environment: Google Cloud Dataflow >Reporter: Lucas de Sio Rosa >Assignee: Ahmed El.Hussaini >Priority: Major > Labels: mongodb > Fix For: 2.12.0 > > Original Estimate: 168h > Remaining Estimate: 168h > > I can't use the MongoDB connector with a managed Atlas instance. The current > implementation makes use of splitVector, which is a high-privilege function > that cannot be assigned to any user in Atlas. > An open Jira issue for MongoDB suggests using $sample and $bucketAuto to > circumvent this necessity. > Following is the exception thrown (removed some identifiable information): > Exception in thread "main" > org.apache.beam.sdk.Pipeline$PipelineExecutionException: > com.mongodb.MongoCommandException: Command failed with error 13: 'not > authorized on to execute command \{ splitVector: > ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 > }' on server . 
The full response is \{ "ok" : 0.0, "errmsg" : "not > authorized on to execute command { splitVector: > \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: > 1 }", "code" : 13, "codeName" : "Unauthorized" } > > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) > > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297) > > at > br.dotz.datalake.ingest.mongodb.MongoDBCollectorPipeline.main(MongoDBCollectorPipeline.java:27) > > Caused by: com.mongodb.MongoCommandException: Command failed with error 13: > 'not authorized on to execute command \{ splitVector: > ".", keyPattern: { _id: 1 }, force: false, maxChunkSize: 1 > }' on server . The full response is \{ "ok" : 0.0, "errmsg" : "not > authorized on to execute command { splitVector: > \".\", keyPattern: { _id: 1 }, force: false, maxChunkSize: > 1 }", "code" : 13, "codeName" : "Unauthorized" } > > at > com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115) > > at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114) > > at > com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159) > > at > com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286) > > at > com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:186) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:178) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:91) > > at > com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84) > > at > 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55) > > at com.mongodb.Mongo.execute(Mongo.java:772) > > at com.mongodb.Mongo$2.execute(Mongo.java:759) > > at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130) > > at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124) > > at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114) > > at > org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332) > > at > org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:210) > > at > org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:87) > > at > org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:62) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
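The workaround mentioned in the issue (replace the privileged `splitVector` command with `$sample` plus `$bucketAuto`) can be sketched as an aggregation pipeline. The function below only builds the pipeline document; `$sample` and `$bucketAuto` are real MongoDB operators, but the sample size and the choice to bucket on `_id` are illustrative assumptions, not Beam's eventual implementation:

```python
def split_point_pipeline(num_splits, sample_size=10000):
    """Build a $sample + $bucketAuto aggregation whose bucket boundaries
    can serve as split points, avoiding the privileged splitVector command."""
    return [
        # Sample documents first so bucketing stays cheap on large collections.
        {'$sample': {'size': sample_size}},
        # Ask the server for num_splits evenly populated _id ranges.
        {'$bucketAuto': {'groupBy': '$_id', 'buckets': num_splits}},
    ]

# With pymongo this would run as (not executed here):
#   for bucket in collection.aggregate(split_point_pipeline(4), allowDiskUse=True):
#       ...  # bucket['_id'] holds the min/max boundary of each range
```

Each returned bucket carries a min/max boundary, which is the same information `splitVector` provides, obtained with ordinary read privileges.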
[jira] [Resolved] (BEAM-7102) Adding `jar_packages` experiment option for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heejong Lee resolved BEAM-7102. --- Resolution: Fixed Fix Version/s: 2.13.0 > Adding `jar_packages` experiment option for Python SDK > -- > > Key: BEAM-7102 > URL: https://issues.apache.org/jira/browse/BEAM-7102 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Fix For: 2.13.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Adding `jar_packages` experiment option for Python SDK for staging Jar > artifacts from Python pipeline. This is required for running cross-language > transforms. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
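The `jar_packages` option rides in through the generic experiments mechanism as a `key=value` entry. A hypothetical helper illustrating how a runner might pull jar paths back out of the experiment strings; the comma-separated value format and the helper's name are assumptions for illustration, not the SDK's documented contract:

```python
def jar_packages_from_experiments(experiments):
    """Extract jar paths from experiment strings such as
    'jar_packages=gs://bucket/a.jar,gs://bucket/b.jar' (format assumed)."""
    jars = []
    for exp in experiments:
        if exp.startswith('jar_packages='):
            jars.extend(p for p in exp[len('jar_packages='):].split(',') if p)
    return jars
```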
[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer
[ https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835972#comment-16835972 ] Ankur Goenka commented on BEAM-7252: Python pipeline optimizer was not tested earlier. I am adding tests for it in PR [https://github.com/apache/beam/pull/8488] The bug is specifically for Python optimized pipelines using the experimental flag "pre_optimize=all" > "beam:java:boundedsource" not supported with python optimizer > - > > Key: BEAM-7252 > URL: https://issues.apache.org/jira/browse/BEAM-7252 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Ankur Goenka >Priority: Major > > python pipeline optimizer does not handle external transforms. > > Relevant error stack > == > ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized) > -- > Traceback (most recent call last): > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py", > line 174, in test_external_transforms > assert_that(res, equal_to([i for i in range(1, 10)])) > File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in > __exit__ > self.run().wait_until_finish() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py", > line 436, in wait_until_finish > self._job_id, self._state, self._last_error_message())) > RuntimeError: Pipeline > test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 > failed in state FAILED: java.lang.RuntimeException: Error received from SDK > harness for instruction 4: Traceback (most recent call last): > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 333, in do_instruction > 
request.instruction_id) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 353, in process_bundle > instruction_id, request.process_bundle_descriptor_reference) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 305, in get > self.data_channel_factory) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 501, in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 545, in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True)]) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 528, in get_operation > in descriptor.transforms[transform_id].outputs.items() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 527, in > for tag, pcoll_id > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 531, in get_operation > transform_id, transform_consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 790, in create_operation > return creator(self, transform_id, transform_proto, payload, consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 957, in create > parameter.source, factory.context), > File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in > from_runner_api > parameter_type, constructor = cls._known_urns[fn_proto.spec.urn] > KeyError: u'urn:beam:java:boundedsource:v1' -- This message was sent 
by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer
[ https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835972#comment-16835972 ] Ankur Goenka edited comment on BEAM-7252 at 5/8/19 11:52 PM: - Python pipeline optimizer was not tested earlier. I am adding tests for it in PR [https://github.com/apache/beam/pull/8488] The bug is specifically for Python optimized pipelines using the experimental flag "pre_optimize=all" I am planning to disable this test and we can re-enable it when we resolve this Jira. was (Author: angoenka): Python pipeline optimizer was not tested earlier. I am adding tests for it in PR [https://github.com/apache/beam/pull/8488] The bug is specifically for Python optimized pipelines using the experimental flag "pre_optimize=all" > "beam:java:boundedsource" not supported with python optimizer > - > > Key: BEAM-7252 > URL: https://issues.apache.org/jira/browse/BEAM-7252 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Ankur Goenka >Priority: Major > > python pipeline optimizer does not handle external transforms. 
> > Relevant error stack > == > ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized) > -- > Traceback (most recent call last): > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py", > line 174, in test_external_transforms > assert_that(res, equal_to([i for i in range(1, 10)])) > File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in > __exit__ > self.run().wait_until_finish() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py", > line 436, in wait_until_finish > self._job_id, self._state, self._last_error_message())) > RuntimeError: Pipeline > test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 > failed in state FAILED: java.lang.RuntimeException: Error received from SDK > harness for instruction 4: Traceback (most recent call last): > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 333, in do_instruction > request.instruction_id) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 353, in process_bundle > instruction_id, request.process_bundle_descriptor_reference) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 305, in get > self.data_channel_factory) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 501, in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 545, in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True)]) > File > 
"/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 528, in get_operation > in descriptor.transforms[transform_id].outputs.items() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 527, in > for tag, pcoll_id > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 531, in get_operation > transform_id, transform_consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 790, in create_operation > return creator(self, transform_id, transform_proto, payload, consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 957, in create > parameter.source, factory.context), > File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in > from_runner_api > parameter_type, constructor = cls._known_urns[fn_proto.spec.urn] > KeyError: u'urn:beam:java:boundedsource:v1' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer
[ https://issues.apache.org/jira/browse/BEAM-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835969#comment-16835969 ] Ahmet Altay commented on BEAM-7252: --- Is this a newly failing test? Do you have a link? > "beam:java:boundedsource" not supported with python optimizer > - > > Key: BEAM-7252 > URL: https://issues.apache.org/jira/browse/BEAM-7252 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Ankur Goenka >Priority: Major > > python pipeline optimizer does not handle external transforms. > > Relevant error stack > == > ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized) > -- > Traceback (most recent call last): > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py", > line 174, in test_external_transforms > assert_that(res, equal_to([i for i in range(1, 10)])) > File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in > __exit__ > self.run().wait_until_finish() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py", > line 436, in wait_until_finish > self._job_id, self._state, self._last_error_message())) > RuntimeError: Pipeline > test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 > failed in state FAILED: java.lang.RuntimeException: Error received from SDK > harness for instruction 4: Traceback (most recent call last): > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 333, in do_instruction > request.instruction_id) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 353, in process_bundle > instruction_id, 
request.process_bundle_descriptor_reference) > File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 305, in get > self.data_channel_factory) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 501, in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 545, in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True)]) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 528, in get_operation > in descriptor.transforms[transform_id].outputs.items() > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 527, in > for tag, pcoll_id > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 467, in wrapper > result = cache[args] = func(*args) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 531, in get_operation > transform_id, transform_consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 790, in create_operation > return creator(self, transform_id, transform_proto, payload, consumers) > File > "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", > line 957, in create > parameter.source, factory.context), > File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in > from_runner_api > parameter_type, constructor = cls._known_urns[fn_proto.spec.urn] > KeyError: u'urn:beam:java:boundedsource:v1' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239515 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 08/May/19 23:42 Start Date: 08/May/19 23:42 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282294788 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -518,12 +525,12 @@ def _process_argspec_fn(self): def is_process_bounded(self): """Checks if an object is a bound method on an instance.""" if not isinstance(self.process, types.MethodType): - return False # Not a method + return False # Not a method Review comment: How did you reformat the code? As long as the linter still passes, this is fine. However, it is generally easier to review when format changes are separated from other changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239515) Time Spent: 40m (was: 0.5h) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a > PCollection and the number of distinct values associated with each key in a > PCollection KVs. 
> it should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
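The `is_process_bounded` check quoted in the diff above relies on the fact that in Python 3 an attribute looked up on an instance is a `types.MethodType` only when it is a real method defined on the class; a plain function assigned onto the instance is not. A standalone illustration (the `_DoFn` class here is made up for the example, not Beam's `DoFn`):

```python
import types

class _DoFn:
    def process(self, element):
        return [element]

def is_process_bound(obj):
    # A method defined on the class and looked up via the instance is a
    # bound MethodType; a function stored on the instance is just a function.
    return isinstance(obj.process, types.MethodType)

fn = _DoFn()                              # class-defined process: bound method
patched = _DoFn()
patched.process = lambda element: [element]  # instance attribute: plain function
```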
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239516 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 08/May/19 23:42 Start Date: 08/May/19 23:42 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282295018 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context): @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None) def from_runner_api_parameter(unused_parameter, unused_context): return Impulse() + + +class ApproximateUniqueGlobally(PTransform): Review comment: That make sense. Also probably core.py is not the right place anyway. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239516) Time Spent: 50m (was: 40m) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a > PCollection and the number of distinct values associated with each key in a > PCollection KVs. > it should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239514 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 08/May/19 23:42 Start Date: 08/May/19 23:42 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282294445 ## File path: sdks/python/setup.py ## @@ -125,6 +125,7 @@ def get_version(): 'pyvcf>=0.6.8,<0.7.0; python_version < "3.0"', 'pyyaml>=3.12,<4.0.0', 'typing>=3.6.0,<3.7.0; python_version < "3.5.0"', +'mmh3>=2.5.1; python_version >= "2.7"', Review comment: A few questions: - What is this dependency? The pypi page says this is a wrapper. Does it require other things to be installed? Is this the only option we have? - You do not need the python_version >= "2.7" part, because all our SDKs only support py >= 2.7. - Can you add an upper bound here? Maybe <3.0.0 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239514) Time Spent: 0.5h (was: 20m) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a > PCollection and the number of distinct values associated with each key in a > PCollection KVs. 
> it should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
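The `mmh3` dependency under discussion supplies the fast hash that ApproximateUnique-style estimators need: hash every element, keep only the k smallest hash values, and read the distinct count off how densely they pack the hash space. A sketch of that generic k-minimum-values (KMV) estimator; `hashlib.md5` stands in for `mmh3` so the example has no third-party dependency, and the exact formula is the textbook KMV estimate, not necessarily what the PR implements:

```python
import hashlib
import heapq

def approximate_unique_count(elements, sample_size=64):
    """Estimate distinct elements by keeping the sample_size smallest
    64-bit hashes (KMV sketch); md5 stands in for mmh3 here."""
    HASH_SPACE = 2 ** 64
    heap = []     # max-heap (via negation) holding the smallest hashes seen
    kept = set()  # hash values currently in the heap, to skip duplicates
    for e in elements:
        h = int.from_bytes(hashlib.md5(repr(e).encode()).digest()[:8], 'big')
        if h in kept:
            continue
        if len(heap) < sample_size:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # Smaller than the current k-th smallest: swap it in.
            kept.discard(-heapq.heappushpop(heap, -h))
            kept.add(h)
    if len(heap) < sample_size:
        return len(heap)  # fewer distinct hashes than the sample: exact count
    kth_smallest = -heap[0]
    # k uniform hashes landed in [0, kth_smallest), so the total number of
    # distinct hashes is roughly k * HASH_SPACE / kth_smallest.
    return int(sample_size * HASH_SPACE / kth_smallest)
```

A larger `sample_size` buys a tighter estimate at the cost of memory, which is presumably why the Java counterpart exposes it as a tuning knob.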
[jira] [Created] (BEAM-7252) "beam:java:boundedsource" not supported with python optimizer
Ankur Goenka created BEAM-7252: -- Summary: "beam:java:boundedsource" not supported with python optimizer Key: BEAM-7252 URL: https://issues.apache.org/jira/browse/BEAM-7252 Project: Beam Issue Type: Improvement Components: sdk-py-core Reporter: Ankur Goenka python pipeline optimizer does not handle external transforms. Relevant error stack == ERROR: test_external_transforms (__main__.FlinkRunnerTestOptimized) -- Traceback (most recent call last): File "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/flink_runner_test.py", line 174, in test_external_transforms assert_that(res, equal_to([i for i in range(1, 10)])) File "/tmp/beam/beam/sdks/python/apache_beam/pipeline.py", line 426, in __exit__ self.run().wait_until_finish() File "/tmp/beam/beam/sdks/python/apache_beam/runners/portability/portable_runner.py", line 436, in wait_until_finish self._job_id, self._state, self._last_error_message())) RuntimeError: Pipeline test_external_transforms_1557358286.71_f49d7fd6-7c14-4ded-8946-3ac3dad4d4c9 failed in state FAILED: java.lang.RuntimeException: Error received from SDK harness for instruction 4: Traceback (most recent call last): File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 157, in _execute response = task() File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 190, in self._execute(lambda: worker.do_instruction(work), work) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 333, in do_instruction request.instruction_id) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 353, in process_bundle instruction_id, request.process_bundle_descriptor_reference) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 305, in get self.data_channel_factory) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 501, in __init__ self.ops = 
self.create_execution_tree(self.process_bundle_descriptor) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 545, in create_execution_tree descriptor.transforms, key=topological_height, reverse=True)]) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 467, in wrapper result = cache[args] = func(*args) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 528, in get_operation in descriptor.transforms[transform_id].outputs.items() File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 527, in for tag, pcoll_id File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 467, in wrapper result = cache[args] = func(*args) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 531, in get_operation transform_id, transform_consumers) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 790, in create_operation return creator(self, transform_id, transform_proto, payload, consumers) File "/tmp/beam/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 957, in create parameter.source, factory.context), File "/tmp/beam/beam/sdks/python/apache_beam/utils/urns.py", line 113, in from_runner_api parameter_type, constructor = cls._known_urns[fn_proto.spec.urn] KeyError: u'urn:beam:java:boundedsource:v1' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239511 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 08/May/19 23:32 Start Date: 08/May/19 23:32 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282293233 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -2270,3 +2286,163 @@ def to_runner_api_parameter(self, unused_context): @PTransform.register_urn(common_urns.primitives.IMPULSE.urn, None) def from_runner_api_parameter(unused_parameter, unused_context): return Impulse() + + +class ApproximateUniqueGlobally(PTransform): Review comment: Since the file is growing, is it better to separate each transform into its own file? I am happy to make that change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239511) Time Spent: 20m (was: 10m) > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core > Reporter: Ahmet Altay > Assignee: Hannah Jiang > Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > Add a PTransform for estimating the number of distinct elements in a PCollection and the number of distinct values associated with each key in a PCollection of KVs.
> It should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java
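For context on what the ticket asks for: the Java ApproximateUnique linked above estimates distinct counts from a bounded sample of element hashes rather than holding every element. A minimal k-minimum-values sketch of the same idea follows; it is an illustration only, not Beam's implementation or API, and the function name is made up:

```python
import hashlib


def approximate_unique(elements, sample_size=256):
    """Estimate the number of distinct elements via k-minimum-values.

    Keeps the sample_size smallest normalized hashes; if fewer than
    sample_size distinct hashes are ever seen, the count is exact.
    Illustrative sketch only, not Beam's actual algorithm.
    """
    max_hash = float(2 ** 64)
    smallest = set()  # distinct normalized hashes, pruned to sample_size
    for e in elements:
        # Deterministic 64-bit hash mapped into [0, 1).
        h = int.from_bytes(
            hashlib.sha1(str(e).encode()).digest()[:8], 'big') / max_hash
        smallest.add(h)
        if len(smallest) > sample_size:
            smallest.remove(max(smallest))  # O(k) prune; fine for a sketch
    if len(smallest) < sample_size:
        return len(smallest)  # saw fewer than k distinct values: exact
    # For uniform hashes, the kth smallest value is ~ k / n,
    # so n is roughly (k - 1) / kth_smallest.
    return int((sample_size - 1) / max(smallest))
```

Larger `sample_size` trades memory for accuracy (relative error shrinks roughly as 1/sqrt(sample_size)), which is the same knob the Java transform exposes.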
[jira] [Work logged] (BEAM-6693) ApproximateUnique transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239509 ] ASF GitHub Bot logged work on BEAM-6693: Author: ASF GitHub Bot Created on: 08/May/19 23:30 Start Date: 08/May/19 23:30 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK URL: https://github.com/apache/beam/pull/8535#discussion_r282292979 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -518,12 +525,12 @@ def _process_argspec_fn(self): def is_process_bounded(self): """Checks if an object is a bound method on an instance.""" if not isinstance(self.process, types.MethodType): - return False # Not a method + return False # Not a method Review comment: The formatting of some lines changed after I reformatted the code; the real code changes start at line 2289. Issue Time Tracking --- Worklog Id: (was: 239509) Time Spent: 10m Remaining Estimate: 0h > ApproximateUnique transform for Python SDK > -- > > Key: BEAM-6693 > URL: https://issues.apache.org/jira/browse/BEAM-6693 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core > Reporter: Ahmet Altay > Assignee: Hannah Jiang > Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Commented] (BEAM-7131) Spark portable runner appears to be repeating work (in TFX example)
[ https://issues.apache.org/jira/browse/BEAM-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835962#comment-16835962 ] Kyle Weaver commented on BEAM-7131: --- [~robertwb] Narrowed it down to the simplest pipeline that exhibits this behavior: if I have pcolls B and C that both depend on A, the Spark portable runner will compute A B C A B C (whereas the Flink and legacy Spark runners compute each only once: A B C). > Spark portable runner appears to be repeating work (in TFX example) > --- > > Key: BEAM-7131 > URL: https://issues.apache.org/jira/browse/BEAM-7131 > Project: Beam > Issue Type: Bug > Components: runner-spark > Reporter: Kyle Weaver > Assignee: Kyle Weaver > Priority: Major > > I've been trying to run the TFX Chicago taxi example [1] on the Spark portable runner. TFDV works fine, but the preprocess step (preprocess_flink.sh [2]) fails with the following error: > RuntimeError: AlreadyExistsError: file already exists [while running 'WriteTransformFn/WriteTransformFn'] > Assets are being written multiple times to different temp directories, which is okay, but the error occurs when they are copied to the same permanent output directory. Specifically, the copy tree operation in transform_fn_io.py [3] is run twice with the same output directory. The error doesn't occur when that code is modified to allow overwriting existing files, but that's only a shallow fix. While the TF transform should probably be made idempotent, this is also an issue with the Spark runner, which shouldn't be repeating work like this regularly (in the absence of a failure condition). > [1] [https://github.com/tensorflow/tfx/tree/master/tfx/examples/chicago_taxi] > [2] [https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi/preprocess_flink.sh] > [3] [https://github.com/tensorflow/transform/blob/master/tensorflow_transform/beam/tft_beam_io/transform_fn_io.py#L33-L45]
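The repeated work described above is the classic fan-out problem: when the intermediate result of A is not cached or materialized, every downstream consumer re-triggers A's whole upstream chain. A plain-Python model of the two evaluation strategies (the `make_stage` helper is hypothetical, not runner code):

```python
calls = []


def make_stage(name, fn, upstream=None, cache=False):
    """Model a pipeline stage. With cache=False every consumer re-runs
    the upstream chain (the repeated-work behavior); with cache=True the
    stage materializes its output once and reuses it."""
    memo = {}

    def run():
        if cache and 'out' in memo:
            return memo['out']
        data = upstream() if upstream else None
        calls.append(name)  # record each time this stage actually executes
        out = fn(data)
        if cache:
            memo['out'] = out
        return out

    return run


# Without caching: B and C each recompute A.
a = make_stage('A', lambda _: list(range(5)))
b = make_stage('B', lambda xs: [x * 2 for x in xs], upstream=a)
c = make_stage('C', lambda xs: [x + 1 for x in xs], upstream=a)
b(); c()
print(calls)  # ['A', 'B', 'A', 'C']

# With A cached (what Flink and the legacy Spark runner effectively
# achieve by materializing shared inputs), A runs once.
calls.clear()
a = make_stage('A', lambda _: list(range(5)), cache=True)
b = make_stage('B', lambda xs: [x * 2 for x in xs], upstream=a)
c = make_stage('C', lambda xs: [x + 1 for x in xs], upstream=a)
b(); c()
print(calls)  # ['A', 'B', 'C']
```

In Spark terms, the cached case corresponds to persisting (or otherwise reusing) the RDD backing A before both B and C consume it.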
[jira] [Assigned] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
[ https://issues.apache.org/jira/browse/BEAM-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri reassigned BEAM-7251: --- Assignee: Udi Meiri > Testing BigQuery client fails queries if job results aren't immediately > available > - > > Key: BEAM-7251 > URL: https://issues.apache.org/jira/browse/BEAM-7251 > Project: Beam > Issue Type: Bug > Components: io-java-gcp > Reporter: Udi Meiri > Assignee: Udi Meiri > Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Correction: the test client is using a synchronous query with a default > timeout of 10s: > https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query > This matches the timestamps below (5:29:19 to 5:29:29). > Also note that this method only returns the first page of results. > --- > Adding functionality to fetch query results should solve this issue, which is > probably causing test flakiness. > Log: > May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner > checkForPAssertSuccess > INFO: Success result for Dataflow job > 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 > expected assertions. > May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher > matchesSafely > INFO: Verifying Bigquery data > May 05, 2019 5:29:29 PM > com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main > SEVERE: > testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT) > java.lang.AssertionError: > Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f) > but: The query job hasn't completed. 
Got response: > {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"} > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) > at > org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138) > at > org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90) > at > org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299) > at > org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199) > at > org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70) > at > org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145) > Exception in thread "main" java.lang.IllegalStateException: Tests failed, > check output logs for details. > at > com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154) > But checking BQ logs on the console
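The fix direction described in the issue — wait for the query job to actually finish and then page through all results, instead of trusting one synchronous call with a 10 s timeout — can be sketched as follows. `get_query_results` is an injected stand-in for BigQuery's `jobs.getQueryResults` call; the `jobComplete`/`rows`/`pageToken` fields mirror the JSON response quoted above, but the helper itself is hypothetical:

```python
import time


def fetch_all_rows(get_query_results, job_id, poll_interval=0.0):
    """Poll until the job completes, then follow pageToken to collect
    every page of rows. get_query_results(job_id, page_token) stands in
    for the BigQuery jobs.getQueryResults API call."""
    # 1. Wait for completion instead of failing on jobComplete=False.
    while True:
        resp = get_query_results(job_id, None)
        if resp.get('jobComplete'):
            break
        time.sleep(poll_interval)
    # 2. Collect every page, not just the first.
    rows = list(resp.get('rows', []))
    token = resp.get('pageToken')
    while token:
        resp = get_query_results(job_id, token)
        rows.extend(resp.get('rows', []))
        token = resp.get('pageToken')
    return rows


# Fake client for illustration: incomplete twice, then two result pages.
responses = [
    {'jobComplete': False},
    {'jobComplete': False},
    {'jobComplete': True, 'rows': [1, 2], 'pageToken': 'p2'},
    {'jobComplete': True, 'rows': [3]},
]


def fake_client(job_id, page_token):
    return responses.pop(0)


print(fetch_all_rows(fake_client, 'job_x'))  # [1, 2, 3]
```

A production version would also bound the polling loop with an overall deadline so a stuck job fails the test deterministically rather than hanging.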
[jira] [Updated] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
[ https://issues.apache.org/jira/browse/BEAM-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri updated BEAM-7251: Status: Open (was: Triage Needed) > Testing BigQuery client fails queries if job results aren't immediately > available > - > > Key: BEAM-7251 > URL: https://issues.apache.org/jira/browse/BEAM-7251 > Project: Beam > Issue Type: Bug > Components: io-java-gcp > Reporter: Udi Meiri > Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h
[jira] [Work logged] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
[ https://issues.apache.org/jira/browse/BEAM-7251?focusedWorklogId=239505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239505 ] ASF GitHub Bot logged work on BEAM-7251: Author: ASF GitHub Bot Created on: 08/May/19 23:26 Start Date: 08/May/19 23:26 Worklog Time Spent: 10m Work Description: udim commented on pull request #8536: [BEAM-7251] Increase timeout for test BQ queries. URL: https://github.com/apache/beam/pull/8536
[jira] [Work logged] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
[ https://issues.apache.org/jira/browse/BEAM-7251?focusedWorklogId=239506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239506 ] ASF GitHub Bot logged work on BEAM-7251: Author: ASF GitHub Bot Created on: 08/May/19 23:27 Start Date: 08/May/19 23:27 Worklog Time Spent: 10m Work Description: udim commented on issue #8536: [BEAM-7251] Increase timeout for test BQ queries. URL: https://github.com/apache/beam/pull/8536#issuecomment-490687401 run java postcommit Issue Time Tracking --- Worklog Id: (was: 239506) Time Spent: 20m (was: 10m) > Testing BigQuery client fails queries if job results aren't immediately > available > - > > Key: BEAM-7251 > URL: https://issues.apache.org/jira/browse/BEAM-7251 > Project: Beam > Issue Type: Bug > Components: io-java-gcp > Reporter: Udi Meiri > Priority: Major > Time Spent: 20m > Remaining Estimate: 0h
[jira] [Work logged] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
[ https://issues.apache.org/jira/browse/BEAM-7251?focusedWorklogId=239507&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239507 ] ASF GitHub Bot logged work on BEAM-7251: Author: ASF GitHub Bot Created on: 08/May/19 23:27 Start Date: 08/May/19 23:27 Worklog Time Spent: 10m Work Description: udim commented on issue #8536: [BEAM-7251] Increase timeout for test BQ queries. URL: https://github.com/apache/beam/pull/8536#issuecomment-490687488 R: @tvalentyn Issue Time Tracking --- Worklog Id: (was: 239507) Time Spent: 0.5h (was: 20m) > Testing BigQuery client fails queries if job results aren't immediately > available > - > > Key: BEAM-7251 > URL: https://issues.apache.org/jira/browse/BEAM-7251 > Project: Beam > Issue Type: Bug > Components: io-java-gcp > Reporter: Udi Meiri > Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h
[jira] [Closed] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary
[ https://issues.apache.org/jira/browse/BEAM-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik closed BEAM-7235. --- Resolution: Fixed Fix Version/s: 2.13.0 > GrpcWindmillServer creates commit streams before necessary > -- > > Key: BEAM-7235 > URL: https://issues.apache.org/jira/browse/BEAM-7235 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Sam Whittle >Assignee: Sam Whittle >Priority: Minor > Fix For: 2.13.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > This can cause spammy logs if there are no commits before the stream deadline > is reached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary
[ https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239499 ] ASF GitHub Bot logged work on BEAM-7235: Author: ASF GitHub Bot Created on: 08/May/19 22:50 Start Date: 08/May/19 22:50 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8512: [BEAM-7235] StreamingDataflowWorker creates commit stream only when commit available URL: https://github.com/apache/beam/pull/8512 Issue Time Tracking --- Worklog Id: (was: 239499) Time Spent: 1.5h (was: 1h 20m) > GrpcWindmillServer creates commit streams before necessary > -- > > Key: BEAM-7235 > URL: https://issues.apache.org/jira/browse/BEAM-7235 > Project: Beam > Issue Type: Bug > Components: runner-dataflow > Reporter: Sam Whittle > Assignee: Sam Whittle > Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h
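The pattern behind the fix merged above — open the commit stream lazily, on the first commit, rather than eagerly at startup — keeps idle workers from holding streams that only hit their deadline and spam the logs. A rough illustrative sketch (made-up names, not the Dataflow worker's actual classes):

```python
class LazyCommitStream:
    """Open the underlying stream only on first use (illustrative
    sketch of the lazy-creation pattern, not Dataflow worker code)."""

    def __init__(self, open_stream):
        self._open_stream = open_stream
        self._stream = None

    def commit(self, work_item):
        if self._stream is None:      # created only when a commit exists
            self._stream = self._open_stream()
        self._stream.append(work_item)


opened = []


def open_stream():
    opened.append('stream')  # stands in for opening an expensive gRPC stream
    return []                # stands in for the stream object


committer = LazyCommitStream(open_stream)
assert opened == []            # no commits yet, so no stream was opened
committer.commit('item-1')
committer.commit('item-2')
assert opened == ['stream']    # one stream, opened on the first commit
```

The same idea applies to any resource with an idle deadline: defer acquisition until there is work, so idleness costs nothing.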
[jira] [Updated] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
[ https://issues.apache.org/jira/browse/BEAM-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri updated BEAM-7251:
Description:
Correction: the test client is using a synchronous query with a default timeout of 10s: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
This matches the timestamps below (5:29:19 to 5:29:29). Also note that this method only returns the first page of results.
---
Adding functionality to fetch query results should solve this issue, which is probably causing test flakiness.
Log:
May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner checkForPAssertSuccess
INFO: Success result for Dataflow job 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 expected assertions.
May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher matchesSafely
INFO: Verifying Bigquery data
May 05, 2019 5:29:29 PM com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main
SEVERE: testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT)
java.lang.AssertionError:
Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f)
     but: The query job hasn't completed. Got response: {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"}
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
	at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138)
	at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90)
	at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55)
	at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
	at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
	at org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199)
	at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70)
	at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145) Exception in thread "main" java.lang.IllegalStateException: Tests failed, check output logs for details. at com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154) But checking BQ logs on the console reveals that the query job did run: 2019-05-05 17:29:29.601 PDT Bigquery query queries 2019-05-05 17:29:31.956 PDT Bigquery jobcompleted job_cZkICLalRsrnivu78BX1y3UwMhIz was: Adding functionality to fetch query results should solve this issue, which is probably causing test flakiness. Log: May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner checkForPAssertSuccess INFO: Success result for Dataflow job 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 expected assertions. May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher
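The failure mode described for BEAM-7251 (a synchronous `jobs.query` call returning `jobComplete: false` after its ~10s timeout) is fixed by polling for job completion and then following `pageToken` through every result page. A minimal sketch of that loop, using a stand-in client interface rather than the real BigQuery API surface:

```python
# Hedged sketch: poll until the query job completes, then page through all
# results. `get_query_results`, `FakeClient`, and the response dict shape are
# illustrative stand-ins, not the actual BigQuery client API.

def fetch_all_rows(client, job_id, backoff=lambda: None):
    """Poll until the job completes, then follow pageToken to the end."""
    page_token = None
    rows = []
    while True:
        resp = client.get_query_results(job_id, page_token=page_token)
        if not resp['jobComplete']:
            backoff()  # in real code, sleep with exponential backoff here
            continue
        rows.extend(resp.get('rows', []))
        page_token = resp.get('pageToken')
        if page_token is None:
            return rows

class FakeClient:
    """Simulates a job that is incomplete once, then returns two pages."""
    def __init__(self):
        self.calls = 0

    def get_query_results(self, job_id, page_token=None):
        self.calls += 1
        if self.calls == 1:
            return {'jobComplete': False}  # mirrors the log excerpt above
        if page_token is None:
            return {'jobComplete': True, 'rows': [1, 2], 'pageToken': 'p2'}
        return {'jobComplete': True, 'rows': [3]}

print(fetch_all_rows(FakeClient(), 'job_x'))  # [1, 2, 3]
```

With this structure the matcher would wait out the `jobComplete: false` response seen in the log instead of failing immediately.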
[jira] [Work logged] (BEAM-6916) Reorganize Beam SQL docs
[ https://issues.apache.org/jira/browse/BEAM-6916?focusedWorklogId=239490=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239490 ] ASF GitHub Bot logged work on BEAM-6916: Author: ASF GitHub Bot Created on: 08/May/19 22:28 Start Date: 08/May/19 22:28 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #8455: [BEAM-6916] Reorg Beam SQL docs and add Calcite section URL: https://github.com/apache/beam/pull/8455#issuecomment-490673528 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239490) Time Spent: 1h 50m (was: 1h 40m) Remaining Estimate: 166h 10m (was: 166h 20m) > Reorganize Beam SQL docs > > > Key: BEAM-6916 > URL: https://issues.apache.org/jira/browse/BEAM-6916 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Rose Nguyen >Assignee: Rose Nguyen >Priority: Minor > Original Estimate: 168h > Time Spent: 1h 50m > Remaining Estimate: 166h 10m > > This page describes the Calcite SQL dialect supported by Beam SQL. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner
[ https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239488 ] ASF GitHub Bot logged work on BEAM-6880: Author: ASF GitHub Bot Created on: 08/May/19 22:25 Start Date: 08/May/19 22:25 Worklog Time Spent: 10m Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove deprecated Reference Runner code. URL: https://github.com/apache/beam/pull/8380#issuecomment-490672901 I removed them in this PR since it was causing some tests to fail (validatesPortableRunner test for the removed code), it's just that the failing test wasn't exercised and should've been deleted anyway. Plus I'd prefer to get it all done in one stroke if possible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239488) Time Spent: 4h 40m (was: 4.5h) > Deprecate Java Portable Reference Runner > > > Key: BEAM-6880 > URL: https://issues.apache.org/jira/browse/BEAM-6880 > Project: Beam > Issue Type: New Feature > Components: runner-direct, test-failures, testing >Reporter: Mikhail Gryzykhin >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > This ticket is about deprecating Java Portable Reference runner. > > Discussion is happening in [this > thread|https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E] > > > Current summary is: disable beam_PostCommit_Java_PVR_Reference job. > Keeping or removing reference runner code is still under discussion. It is > suggested to create PR that removes relevant code and start voting there. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7240) Kinesis IO Watermark Computation Improvements
[ https://issues.apache.org/jira/browse/BEAM-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-7240: --- Status: Open (was: Triage Needed) > Kinesis IO Watermark Computation Improvements > - > > Key: BEAM-7240 > URL: https://issues.apache.org/jira/browse/BEAM-7240 > Project: Beam > Issue Type: Improvement > Components: io-java-kinesis >Reporter: Ajo Thomas >Assignee: Ajo Thomas >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Currently, watermarks in kinesis IO are computed taking into account the > record arrival time in a {{KinesisRecord}}. The arrival time might not always > be the right representation of the event time. The user of the IO should be > able to specify how they want to extract the event time from the > KinesisRecord. > As per the current logic, the end user of the IO cannot control watermark > computation in any way. A user should be able to control watermark > computation through some custom heuristics or configurable params like time > duration to advance the watermark if no data was received (could be due to a > shard getting stalled. The watermark should advance and not be stalled in > that case). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
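The two improvements requested in BEAM-7240 can be sketched together: a user-supplied function that extracts event time from a record (instead of arrival time), and a watermark that keeps advancing when a shard goes idle. This is an illustrative policy object, not the actual KinesisIO API:

```python
# Hedged sketch of a user-controlled watermark policy. All names here
# (CustomWatermarkPolicy, event_time_fn, idle_threshold_s) are illustrative
# assumptions, not Beam's KinesisIO surface.

import time

class CustomWatermarkPolicy:
    def __init__(self, event_time_fn, idle_threshold_s=60, clock=time.time):
        self.event_time_fn = event_time_fn  # user decides what "event time" is
        self.idle_threshold_s = idle_threshold_s
        self.clock = clock
        self.watermark = float('-inf')
        self.last_record_at = clock()

    def on_record(self, record):
        # Watermark follows the user-extracted event time, monotonically.
        self.watermark = max(self.watermark, self.event_time_fn(record))
        self.last_record_at = self.clock()

    def current_watermark(self):
        # If the shard stalls past the idle threshold, advance the watermark
        # with wall time so downstream windows are not held back forever.
        idle = self.clock() - self.last_record_at
        if idle > self.idle_threshold_s:
            return max(self.watermark, self.clock() - self.idle_threshold_s)
        return self.watermark
```

The idle branch is what addresses the stalled-shard case called out in the issue: without it, one quiet shard pins the pipeline's overall watermark.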
[jira] [Work logged] (BEAM-4150) Standardize use of PCollection coder proto attribute
[ https://issues.apache.org/jira/browse/BEAM-4150?focusedWorklogId=239483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239483 ] ASF GitHub Bot logged work on BEAM-4150: Author: ASF GitHub Bot Created on: 08/May/19 22:06 Start Date: 08/May/19 22:06 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #8533: [BEAM-4150] Downgrade missing coder error logs to info logs. URL: https://github.com/apache/beam/pull/8533 This log message shows up frequently in the error logs. Users cannot do much about it and they are not fatal. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
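The rationale in PR #8533 (the message is frequent, non-fatal, and not user-actionable) is the standard criterion for logging at INFO rather than ERROR. A minimal illustration, with an illustrative logger name and message rather than Beam's actual ones:

```python
# Sketch of the severity change described above. The logger name and message
# text are assumptions for illustration, not Beam's real ones.

import logging

logger = logging.getLogger('example.coders')

def note_missing_coder(type_name):
    # Previously this would have been logger.error(...). Downgraded because
    # the pipeline proceeds with a fallback coder and users cannot act on it.
    logger.info('No coder found for %s; using fallback coder', type_name)
```

The message still reaches anyone who raises their log level to INFO while debugging, without polluting error-level logs.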
[jira] [Work logged] (BEAM-562) DoFn Reuse: Add new methods to DoFn
[ https://issues.apache.org/jira/browse/BEAM-562?focusedWorklogId=239480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239480 ] ASF GitHub Bot logged work on BEAM-562: --- Author: ASF GitHub Bot Created on: 08/May/19 21:42 Start Date: 08/May/19 21:42 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7994: [BEAM-562] Add DoFn.setup and DoFn.teardown to Python SDK URL: https://github.com/apache/beam/pull/7994#issuecomment-490661743 Run Python PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239480) Time Spent: 9h (was: 8h 50m) > DoFn Reuse: Add new methods to DoFn > --- > > Key: BEAM-562 > URL: https://issues.apache.org/jira/browse/BEAM-562 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Yifan Mai >Priority: Major > Labels: sdk-consistency > Time Spent: 9h > Remaining Estimate: 0h > > Java SDK added setup and teardown methods to the DoFns. This makes DoFns > reusable and provide performance improvements. Python SDK should add support > for these new DoFn methods: > Proposal doc: > https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f# -- This message was sent by Atlassian JIRA (v7.6.3#76005)
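The lifecycle that PR #7994 brings to the Python SDK mirrors the Java SDK: `setup()` runs once per DoFn instance before any bundle, `teardown()` once at the end, bracketing the existing `start_bundle`/`process`/`finish_bundle` hooks. A self-contained sketch with a stand-in harness (the harness is illustrative, not Beam runner code):

```python
# Hedged sketch of the DoFn setup/teardown lifecycle. DbWriteDoFn and
# run_bundle are illustrative stand-ins; a real DoFn would subclass
# apache_beam.DoFn and be driven by a runner.

class DbWriteDoFn:
    def setup(self):
        self.conn = 'connection'  # e.g. open an expensive client once
        self.events = ['setup']

    def start_bundle(self):
        self.events.append('start_bundle')

    def process(self, element):
        self.events.append('process:%s' % element)
        yield element

    def finish_bundle(self):
        self.events.append('finish_bundle')

    def teardown(self):
        self.events.append('teardown')  # release the client exactly once

def run_bundle(dofn, elements):
    """Stand-in for a runner driving one DoFn instance through one bundle."""
    dofn.setup()
    dofn.start_bundle()
    out = [o for e in elements for o in dofn.process(e)]
    dofn.finish_bundle()
    dofn.teardown()
    return out, dofn.events
```

The performance win named in the issue comes from reusing the instance: `setup` runs once even when the runner feeds the same DoFn many bundles, instead of reopening the connection per bundle.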
[jira] [Updated] (BEAM-7246) Create a Spanner IO for Python
[ https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shehzaad Nakhoda updated BEAM-7246: --- Description: Add I/O support for Google Cloud Spanner for the Python SDK. Integration and performance tests are a separate work item (not included here). See https://beam.apache.org/documentation/io/built-in/. The goal is to add Google Cloud Spanner to the Database column for the Python/Batch row. > Create a Spanner IO for Python > -- > > Key: BEAM-7246 > URL: https://issues.apache.org/jira/browse/BEAM-7246 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > > Add I/O support for Google Cloud Spanner for the Python SDK. > Integration and performance tests are a separate work item (not included > here). > See https://beam.apache.org/documentation/io/built-in/. The goal is to add > Google Cloud Spanner to the Database column for the Python/Batch row. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-562) DoFn Reuse: Add new methods to DoFn
[ https://issues.apache.org/jira/browse/BEAM-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay reassigned BEAM-562: Assignee: Yifan Mai (was: Shehzaad Nakhoda) > DoFn Reuse: Add new methods to DoFn > --- > > Key: BEAM-562 > URL: https://issues.apache.org/jira/browse/BEAM-562 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Yifan Mai >Priority: Major > Labels: sdk-consistency > Time Spent: 8h 50m > Remaining Estimate: 0h > > Java SDK added setup and teardown methods to the DoFns. This makes DoFns > reusable and provide performance improvements. Python SDK should add support > for these new DoFn methods: > Proposal doc: > https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f# -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-562) DoFn Reuse: Add new methods to DoFn
[ https://issues.apache.org/jira/browse/BEAM-562?focusedWorklogId=239479=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239479 ] ASF GitHub Bot logged work on BEAM-562: --- Author: ASF GitHub Bot Created on: 08/May/19 21:37 Start Date: 08/May/19 21:37 Worklog Time Spent: 10m Work Description: yifanmai commented on issue #7994: [BEAM-562] Add DoFn.setup and DoFn.teardown to Python SDK URL: https://github.com/apache/beam/pull/7994#issuecomment-490660293 @kennknowles @aaltay tests are passing now. PTAL? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239479) Time Spent: 8h 50m (was: 8h 40m) > DoFn Reuse: Add new methods to DoFn > --- > > Key: BEAM-562 > URL: https://issues.apache.org/jira/browse/BEAM-562 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Shehzaad Nakhoda >Priority: Major > Labels: sdk-consistency > Time Spent: 8h 50m > Remaining Estimate: 0h > > Java SDK added setup and teardown methods to the DoFns. This makes DoFns > reusable and provide performance improvements. Python SDK should add support > for these new DoFn methods: > Proposal doc: > https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f# -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-562) DoFn Reuse: Add new methods to DoFn
[ https://issues.apache.org/jira/browse/BEAM-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835919#comment-16835919 ] Shehzaad Nakhoda commented on BEAM-562: --- [~altay] thanks for the heads up. Can you please assign this to [~myffi...@gmail.com]? > DoFn Reuse: Add new methods to DoFn > --- > > Key: BEAM-562 > URL: https://issues.apache.org/jira/browse/BEAM-562 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Shehzaad Nakhoda >Priority: Major > Labels: sdk-consistency > Time Spent: 8h 40m > Remaining Estimate: 0h > > Java SDK added setup and teardown methods to the DoFns. This makes DoFns > reusable and provide performance improvements. Python SDK should add support > for these new DoFn methods: > Proposal doc: > https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f# -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7251) Testing BigQuery client fails queries if job results aren't immediately available
Udi Meiri created BEAM-7251: --- Summary: Testing BigQuery client fails queries if job results aren't immediately available Key: BEAM-7251 URL: https://issues.apache.org/jira/browse/BEAM-7251 Project: Beam Issue Type: Bug Components: io-java-gcp Reporter: Udi Meiri Adding functionality to fetch query results should solve this issue, which is probably causing test flakiness. Log: May 05, 2019 5:29:19 PM org.apache.beam.runners.dataflow.TestDataflowRunner checkForPAssertSuccess INFO: Success result for Dataflow job 2019-05-05_17_25_26-4118012232925193147. Found 0 success, 0 failures out of 0 expected assertions. May 05, 2019 5:29:19 PM org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher matchesSafely INFO: Verifying Bigquery data May 05, 2019 5:29:29 PM com.google.cloud.dataflow.testing.DataflowJUnitTestRunner main SEVERE: testE2eBigQueryTornadoesWithStorageApi(org.apache.beam.examples.cookbook.BigQueryTornadoesIT) java.lang.AssertionError: Expected: Expected checksum is (1ab4c7ec460b94bbb3c3885b178bf0e6bed56e1f) but: The query job hasn't completed. 
Got response: {"jobComplete":false,"jobReference":{"jobId":"job_cZkICLalRsrnivu78BX1y3UwMhIz","location":"US","projectId":"xxx"},"kind":"bigquery#queryResponse"} at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:138) at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:90) at org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:55) at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313) at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299) at org.apache.beam.examples.cookbook.BigQueryTornadoes.runBigQueryTornadoes(BigQueryTornadoes.java:199) at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.runE2EBigQueryTornadoesTest(BigQueryTornadoesIT.java:70) at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2eBigQueryTornadoesWithStorageApi(BigQueryTornadoesIT.java:95) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) 
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:145) Exception in thread "main" java.lang.IllegalStateException: Tests failed, check output logs for details. at com.google.cloud.dataflow.testing.DataflowJUnitTestRunner.main(DataflowJUnitTestRunner.java:154) But checking BQ logs on the console reveals that the query job did run: 2019-05-05 17:29:29.601 PDT Bigquery query queries 2019-05-05 17:29:31.956 PDT Bigquery jobcompleted job_cZkICLalRsrnivu78BX1y3UwMhIz -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7250) Determine whether Java SDF API should be changed to work via annotations
Pablo Estrada created BEAM-7250: --- Summary: Determine whether Java SDF API should be changed to work via annotations Key: BEAM-7250 URL: https://issues.apache.org/jira/browse/BEAM-7250 Project: Beam Issue Type: Bug Components: sdk-java-core Reporter: Pablo Estrada This would be akin to the Python change in [https://github.com/apache/beam/pull/8430] This is discussed here: [https://lists.apache.org/thread.html/7e1ebc970891778c2dbedec7e9846ab221ef12f38e689895567f1f4e@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-7240) Kinesis IO Watermark Computation Improvements
[ https://issues.apache.org/jira/browse/BEAM-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-7240: --- Assignee: Ajo Thomas > Kinesis IO Watermark Computation Improvements > - > > Key: BEAM-7240 > URL: https://issues.apache.org/jira/browse/BEAM-7240 > Project: Beam > Issue Type: Improvement > Components: io-java-kinesis >Reporter: Ajo Thomas >Assignee: Ajo Thomas >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Currently, watermarks in kinesis IO are computed taking into account the > record arrival time in a {{KinesisRecord}}. The arrival time might not always > be the right representation of the event time. The user of the IO should be > able to specify how they want to extract the event time from the > KinesisRecord. > As per the current logic, the end user of the IO cannot control watermark > computation in any way. A user should be able to control watermark > computation through some custom heuristics or configurable params like time > duration to advance the watermark if no data was received (could be due to a > shard getting stalled. The watermark should advance and not be stalled in > that case). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary
[ https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239477 ] ASF GitHub Bot logged work on BEAM-7235: Author: ASF GitHub Bot Created on: 08/May/19 21:00 Start Date: 08/May/19 21:00 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8512: [BEAM-7235] StreamingDataflowWorker creates commit stream only when commit available URL: https://github.com/apache/beam/pull/8512#discussion_r282247785 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java ## @@ -1508,8 +1513,10 @@ private void streamingCommitLoop() { break; } } - commitStream.flush(); - streamPool.releaseStream(commitStream); + if (commitStream) { Review comment: You built the runner that submits the pipeline and not the worker component. I think you meant to do `./gradlew :beam-runners-google-cloud-dataflow-java:worker:build` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239477) Time Spent: 1h 20m (was: 1h 10m) > GrpcWindmillServer creates commit streams before necessary > -- > > Key: BEAM-7235 > URL: https://issues.apache.org/jira/browse/BEAM-7235 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Sam Whittle >Assignee: Sam Whittle >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > This can cause spammy logs if there are no commits before the stream deadline > is reached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
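The change under review in PR #8512 is to acquire a commit stream from the pool only once a commit is actually dequeued, so an idle worker never holds an open stream (and never hits the stream deadline that causes the spammy logs). A language-neutral sketch of that control flow, with illustrative names standing in for `StreamingDataflowWorker` internals:

```python
# Hedged sketch of lazy stream acquisition. StreamPool, streaming_commit_loop,
# and the dict-based "stream" are illustrative stand-ins for the Java worker's
# commit-stream pool, not actual Beam code.

import queue

class StreamPool:
    def __init__(self):
        self.opened = 0

    def acquire(self):
        self.opened += 1
        return {'commits': []}

    def release(self, stream):
        pass  # a real pool would flush and return the stream here

def streaming_commit_loop(pool, commit_queue):
    stream = None
    while True:
        try:
            commit = commit_queue.get_nowait()
        except queue.Empty:
            break
        if stream is None:       # open lazily, on the first real commit
            stream = pool.acquire()
        stream['commits'].append(commit)
    if stream is not None:       # flush/release only if ever opened
        pool.release(stream)
    return stream
```

The `if stream is not None` guard at the end is the analogue of the null check the review comments revolve around: the old code unconditionally flushed and released a stream it had eagerly created.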
[jira] [Work logged] (BEAM-6916) Reorganize Beam SQL docs
[ https://issues.apache.org/jira/browse/BEAM-6916?focusedWorklogId=239476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239476 ] ASF GitHub Bot logged work on BEAM-6916: Author: ASF GitHub Bot Created on: 08/May/19 20:54 Start Date: 08/May/19 20:54 Worklog Time Spent: 10m Work Description: melap commented on issue #8455: [BEAM-6916] Reorg Beam SQL docs and add Calcite section URL: https://github.com/apache/beam/pull/8455#issuecomment-490647139 Staged: http://apache-beam-website-pull-requests.storage.googleapis.com/8455/documentation/dsls/sql/overview/index.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239476) Time Spent: 1h 40m (was: 1.5h) Remaining Estimate: 166h 20m (was: 166.5h) > Reorganize Beam SQL docs > > > Key: BEAM-6916 > URL: https://issues.apache.org/jira/browse/BEAM-6916 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Rose Nguyen >Assignee: Rose Nguyen >Priority: Minor > Original Estimate: 168h > Time Spent: 1h 40m > Remaining Estimate: 166h 20m > > This page describes the Calcite SQL dialect supported by Beam SQL. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.
[ https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239474=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239474 ] ASF GitHub Bot logged work on BEAM-6959: Author: ASF GitHub Bot Created on: 08/May/19 20:49 Start Date: 08/May/19 20:49 Worklog Time Spent: 10m Work Description: ibzib commented on issue #8531: [BEAM-6959] Add Flink tests for Go SDK URL: https://github.com/apache/beam/pull/8531#issuecomment-490645594 > in probably a different PR? Yep, that will be the follow-up. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239474) Time Spent: 40m (was: 0.5h) > Run Go SDK Post Commit tests against the Flink Runner. > --- > > Key: BEAM-6959 > URL: https://issues.apache.org/jira/browse/BEAM-6959 > Project: Beam > Issue Type: Sub-task > Components: runner-flink, sdk-go, testing >Reporter: Robert Burke >Assignee: Kyle Weaver >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > See parent task BEAM-6958 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary
[ https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239471 ] ASF GitHub Bot logged work on BEAM-7235: Author: ASF GitHub Bot Created on: 08/May/19 20:44 Start Date: 08/May/19 20:44 Worklog Time Spent: 10m Work Description: scwhittle commented on pull request #8512: [BEAM-7235] StreamingDataflowWorker creates commit stream only when commit available URL: https://github.com/apache/beam/pull/8512#discussion_r282242048 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java ## @@ -1508,8 +1513,10 @@ private void streamingCommitLoop() { break; } } - commitStream.flush(); - streamPool.releaseStream(commitStream); + if (commitStream) { Review comment: Oops, fixed, I ran the following ./gradlew :beam-runners-google-cloud-dataflow-java:build does that not actually build this? Or perhaps I ran it in the wrong branch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239471) Time Spent: 1h 10m (was: 1h) > GrpcWindmillServer creates commit streams before necessary > -- > > Key: BEAM-7235 > URL: https://issues.apache.org/jira/browse/BEAM-7235 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Sam Whittle >Assignee: Sam Whittle >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > This can cause spammy logs if there are no commits before the stream deadline > is reached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-7230) Using JdbcIO creates huge amount of connections
[ https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835896#comment-16835896 ] Brachi Packter commented on BEAM-7230: -- Awesome, I'll check it out. > Using JdbcIO creates huge amount of connections > --- > > Key: BEAM-7230 > URL: https://issues.apache.org/jira/browse/BEAM-7230 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Affects Versions: 2.11.0 >Reporter: Brachi Packter >Assignee: Ismaël Mejía >Priority: Major > > I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, > and still I see a huge amount of connections in GCP SQL (4k while I set the > connection pool to 300), and most of them in sleep. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.
[ https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239472 ] ASF GitHub Bot logged work on BEAM-6959: Author: ASF GitHub Bot Created on: 08/May/19 20:46 Start Date: 08/May/19 20:46 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8531: [BEAM-6959] Add Flink tests for Go SDK URL: https://github.com/apache/beam/pull/8531#issuecomment-490644734 Woohoo! Thanks for writing this Kyle! I presume the next step (in probably a different PR?) would be to add the task to the post commits, and have it tracked/triggered by Jenkins appropriately. Finally, add the appropriate badge to the PR template: https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239472) Time Spent: 20m (was: 10m) > Run Go SDK Post Commit tests against the Flink Runner. > --- > > Key: BEAM-6959 > URL: https://issues.apache.org/jira/browse/BEAM-6959 > Project: Beam > Issue Type: Sub-task > Components: runner-flink, sdk-go, testing >Reporter: Robert Burke >Assignee: Kyle Weaver >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > See parent task BEAM-6958 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=239466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239466 ] ASF GitHub Bot logged work on BEAM-6988: Author: ASF GitHub Bot Created on: 08/May/19 20:33 Start Date: 08/May/19 20:33 Worklog Time Spent: 10m Work Description: NikeNano commented on issue #8530: [BEAM-6988] solved problem related to updates of the str object URL: https://github.com/apache/beam/pull/8530#issuecomment-490640523 R: @aaltay @fredo838 @Juta @tvalentyn This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239466) Time Spent: 20m (was: 10m) > TypeHints Py3 Error: test_non_function > (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+ > - > > Key: BEAM-6988 > URL: https://issues.apache.org/jira/browse/BEAM-6988 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py", > line 53, in test_non_function > result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x') > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", > line 510, in _ror_ > result = p.apply(self, pvalueish, label) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py", > line 514, in apply > 
transform.type_check_inputs(pvalueish) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", > line 753, in type_check_inputs > hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1]) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py", > line 283, in getcallargs_forhints > raise TypeCheckError(e) > apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required > positional argument: 'chars'{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
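The Python 3 behavior behind this traceback can be reproduced outside Beam: on Python 3, `str.strip` is a plain function (unbound methods were removed), so it can be called with the instance passed explicitly, and its optional `chars` parameter is visible to `inspect`, which is the kind of introspection the type-hint machinery performs. A minimal, Beam-free sketch (assumes CPython 3.x):

```python
import inspect

# On Python 3, str.strip is a plain function: the instance can be
# passed explicitly, and 'chars' is an optional second argument.
assert str.strip('xax', 'x') == 'a'

# CPython exposes a text signature for many builtins, so inspect can
# report the parameters, including the optional 'chars'.
sig = inspect.signature(str.strip)
assert 'chars' in sig.parameters
```

This is only an illustration of the introspection difference, not the Beam fix itself.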
[jira] [Assigned] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes
[ https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklas Hansson reassigned BEAM-6877: Assignee: niklas Hansson > TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode > changes > > > Key: BEAM-6877 > URL: https://issues.apache.org/jira/browse/BEAM-6877 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > > Type inference doesn't work on Python 3.6 due to [bytecode to wordcode > changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes]. > Type inference always returns Any on Python 3.6, so this is not critical. > Affected tests are: > *transforms.ptransform_test*: > - test_combine_properly_pipeline_type_checks_using_decorator > - test_mean_globally_pipeline_checking_satisfied > - test_mean_globally_runtime_checking_satisfied > - test_count_globally_pipeline_type_checking_satisfied > - test_count_globally_runtime_type_checking_satisfied > - test_pardo_type_inference > - test_pipeline_inference > - test_inferred_bad_kv_type > *typehints.trivial_inference_test*: > - all tests in TrivialInferenceTest > *io.gcp.pubsub_test.TestReadFromPubSubOverride*: > * test_expand_with_other_options > * test_expand_with_subscription > * test_expand_with_topic -- This message was sent by Atlassian JIRA (v7.6.3#76005)
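The bytecode-to-wordcode change referenced above means that from CPython 3.6 onward every instruction occupies exactly two bytes (one opcode byte, one argument byte, with EXTENDED_ARG handling larger arguments), which breaks any opcode walker that assumes variable-width instructions. A quick check (assumes CPython 3.6+):

```python
import dis

def sample(x):
    return x + 1

# Since CPython 3.6 ("wordcode"), each instruction is exactly 2 bytes,
# so the raw bytecode length is always even.
assert len(sample.__code__.co_code) % 2 == 0

# For the same reason, instruction offsets advance in steps of 2.
offsets = [ins.offset for ins in dis.get_instructions(sample)]
assert all(off % 2 == 0 for off in offsets)
```

Beam's trivial_inference walks this bytecode directly, which is why the fixed-width change matters for these tests.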
[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary
[ https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239468=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239468 ] ASF GitHub Bot logged work on BEAM-7235: Author: ASF GitHub Bot Created on: 08/May/19 20:37 Start Date: 08/May/19 20:37 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8512: [BEAM-7235] StreamingDataflowWorker creates commit stream only when commit available URL: https://github.com/apache/beam/pull/8512#discussion_r282239271 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java ## @@ -1508,8 +1513,10 @@ private void streamingCommitLoop() { break; } } - commitStream.flush(); - streamPool.releaseStream(commitStream); + if (commitStream) { Review comment: ```suggestion if (commitStream != null) { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239468) Time Spent: 1h (was: 50m) > GrpcWindmillServer creates commit streams before necessary > -- > > Key: BEAM-7235 > URL: https://issues.apache.org/jira/browse/BEAM-7235 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Sam Whittle >Assignee: Sam Whittle >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > This can cause spammy logs if there are no commits before the stream deadline > is reached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes
[ https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835889#comment-16835889 ] niklas Hansson commented on BEAM-6877: -- I will start to work on this :) > TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode > changes > > > Key: BEAM-6877 > URL: https://issues.apache.org/jira/browse/BEAM-6877 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > > Type inference doesn't work on Python 3.6 due to [bytecode to wordcode > changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes]. > Type inference always returns Any on Python 3.6, so this is not critical. > Affected tests are: > *transforms.ptransform_test*: > - test_combine_properly_pipeline_type_checks_using_decorator > - test_mean_globally_pipeline_checking_satisfied > - test_mean_globally_runtime_checking_satisfied > - test_count_globally_pipeline_type_checking_satisfied > - test_count_globally_runtime_type_checking_satisfied > - test_pardo_type_inference > - test_pipeline_inference > - test_inferred_bad_kv_type > *typehints.trivial_inference_test*: > - all tests in TrivialInferenceTest > *io.gcp.pubsub_test.TestReadFromPubSubOverride*: > * test_expand_with_other_options > * test_expand_with_subscription > * test_expand_with_topic -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes
[ https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-6877 started by niklas Hansson. > TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode > changes > > > Key: BEAM-6877 > URL: https://issues.apache.org/jira/browse/BEAM-6877 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > > Type inference doesn't work on Python 3.6 due to [bytecode to wordcode > changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes]. > Type inference always returns Any on Python 3.6, so this is not critical. > Affected tests are: > *transforms.ptransform_test*: > - test_combine_properly_pipeline_type_checks_using_decorator > - test_mean_globally_pipeline_checking_satisfied > - test_mean_globally_runtime_checking_satisfied > - test_count_globally_pipeline_type_checking_satisfied > - test_count_globally_runtime_type_checking_satisfied > - test_pardo_type_inference > - test_pipeline_inference > - test_inferred_bad_kv_type > *typehints.trivial_inference_test*: > - all tests in TrivialInferenceTest > *io.gcp.pubsub_test.TestReadFromPubSubOverride*: > * test_expand_with_other_options > * test_expand_with_subscription > * test_expand_with_topic -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=239464=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239464 ] ASF GitHub Bot logged work on BEAM-6988: Author: ASF GitHub Bot Created on: 08/May/19 20:31 Start Date: 08/May/19 20:31 Worklog Time Spent: 10m Work Description: NikeNano commented on pull request #8530: [BEAM-6988] solved problem related to updates of the str object URL: https://github.com/apache/beam/pull/8530 Update test apache_beam.typehints.typed_pipeline_test.MainInputTest. The problem is that in Python 3 the str object does not expose strip as an unbound method. See the answer to this Stack Overflow question: https://stackoverflow.com/questions/46241389/two-different-definition-of-strip-method-in-python-2-7-14rc1-official-document Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | --- | --- | --- | --- Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/) [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/) |
[jira] [Work logged] (BEAM-7235) GrpcWindmillServer creates commit streams before necessary
[ https://issues.apache.org/jira/browse/BEAM-7235?focusedWorklogId=239467=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239467 ] ASF GitHub Bot logged work on BEAM-7235: Author: ASF GitHub Bot Created on: 08/May/19 20:36 Start Date: 08/May/19 20:36 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8512: [BEAM-7235] StreamingDataflowWorker creates commit stream only when commit available URL: https://github.com/apache/beam/pull/8512#discussion_r282239133 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java ## @@ -1459,7 +1459,9 @@ private void streamingCommitLoop() { Commit commit = null; while (running.get()) { // Batch commits as long as there are more and we can fit them in the current request. - CommitWorkStream commitStream = streamPool.getStream(); + // We lazily initialize the commit stream to make sure that we only create one after + // we have a commit. + CommitWorkStream commitStream = null; Review comment: Thanks, makes sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239467) Time Spent: 50m (was: 40m) > GrpcWindmillServer creates commit streams before necessary > -- > > Key: BEAM-7235 > URL: https://issues.apache.org/jira/browse/BEAM-7235 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Sam Whittle >Assignee: Sam Whittle >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > This can cause spammy logs if there are no commits before the stream deadline > is reached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
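The lazy-initialization pattern discussed in this diff (acquire the stream only once there is a commit to send, and release it on every exit path) is language-agnostic. A hedged Python sketch of the same idea, with made-up `StreamPool`/`commit_loop` names standing in for the worker's CommitWorkStream pool:

```python
class StreamPool:
    """Toy pool standing in for the worker's commit-stream pool."""
    def __init__(self):
        self.created = 0

    def get_stream(self):
        self.created += 1
        return object()

    def release(self, stream):
        pass

def commit_loop(pool, commits):
    # Acquire the stream lazily: only after the first commit arrives,
    # so an idle loop never opens (and lets time out) an unused stream.
    stream = None
    sent = 0
    for commit in commits:
        if stream is None:
            stream = pool.get_stream()
        sent += 1  # stand-in for actually sending the commit
    if stream is not None:  # mirrors the `commitStream != null` guard
        pool.release(stream)
    return sent

pool = StreamPool()
assert commit_loop(pool, []) == 0 and pool.created == 0
assert commit_loop(pool, [1, 2]) == 2 and pool.created == 1
```

The guard at the end is the Python analogue of the null check the review converged on; Java has no truthiness, so `if (commitStream)` does not compile there.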
[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.
[ https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=239465=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239465 ] ASF GitHub Bot logged work on BEAM-6959: Author: ASF GitHub Bot Created on: 08/May/19 20:31 Start Date: 08/May/19 20:31 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #8531: [BEAM-6959] Add Flink tests for Go SDK URL: https://github.com/apache/beam/pull/8531 Adds a Gradle task to run tests using the Go SDK and the Flink runner. (To run locally, the user can run `run_integration_tests.sh` with flags configured according to their own Cloud setup). N.B. I re-used the existing Python code to get an unused port for the job server, which has the potential to result in a race condition if a port it selects is immediately snapped up by another process before that port can be claimed by the job server, though such an event would be relatively unlikely. R: @robertwb @angoenka @lostluck
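The unused-port helper and the race the author notes above follow a common pattern (this is a generic sketch, not the Beam helper itself):

```python
import socket

def pick_unused_port():
    # Ask the OS for any free port by binding to port 0, then release it.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(('localhost', 0))
    _, port = sock.getsockname()
    sock.close()
    # Race window: between close() above and the job server binding the
    # port, another process could claim it. Unlikely, but possible.
    return port

port = pick_unused_port()
assert 0 < port < 65536
```

The only robust fix is to have the consuming process bind the port itself (e.g. bind to 0 and report the port it got), which is presumably the follow-up work alluded to.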
[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks
[ https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239463=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239463 ] ASF GitHub Bot logged work on BEAM-6908: Author: ASF GitHub Bot Created on: 08/May/19 20:30 Start Date: 08/May/19 20:30 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8518: [BEAM-6908] Refactor Python performance test groovy file for easy configuration URL: https://github.com/apache/beam/pull/8518#issuecomment-490639266 PTAL @tvalentyn This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239463) Time Spent: 9.5h (was: 9h 20m) > Add Python3 performance benchmarks > -- > > Key: BEAM-6908 > URL: https://issues.apache.org/jira/browse/BEAM-6908 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Time Spent: 9.5h > Remaining Estimate: 0h > > Similar to > [beam_PerformanceTests_Python|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_Python/], > we want to have a Python3 benchmark running on Jenkins to detect performance > regression during code adoption. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks
[ https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239451=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239451 ] ASF GitHub Bot logged work on BEAM-6908: Author: ASF GitHub Bot Created on: 08/May/19 20:15 Start Date: 08/May/19 20:15 Worklog Time Spent: 10m Work Description: markflyhigh commented on pull request #8518: [BEAM-6908] Refactor Python performance test groovy file for easy configuration URL: https://github.com/apache/beam/pull/8518#discussion_r282231279 ## File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy ## @@ -18,46 +18,107 @@ import CommonJobProperties as commonJobProperties -// This job runs the Beam Python performance tests on PerfKit Benchmarker. -job('beam_PerformanceTests_Python'){ - // Set default Beam job properties. - commonJobProperties.setTopLevelMainJobProperties(delegate) - - // Run job in postcommit every 6 hours, don't trigger every push. - commonJobProperties.setAutoJob( - delegate, - 'H */6 * * *') - - // Allows triggering this build against pull requests. 
- commonJobProperties.enablePhraseTriggeringFromPullRequest( - delegate, - 'Python SDK Performance Test', - 'Run Python Performance Test') - - def pipelineArgs = [ - project: 'apache-beam-testing', - staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it', - temp_location: 'gs://temp-storage-for-end-to-end-tests/temp-it', - output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' - ] - def pipelineArgList = [] - pipelineArgs.each({ -key, value -> pipelineArgList.add("--$key=$value") - }) - def pipelineArgsJoined = pipelineArgList.join(',') - - def argMap = [ - beam_sdk : 'python', - benchmarks : 'beam_integration_benchmark', - bigquery_table : 'beam_performance.wordcount_py_pkb_results', - beam_it_class: 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', - beam_it_module : 'sdks/python', - beam_prebuilt: 'true', // skip beam prebuild - beam_python_sdk_location : 'build/apache-beam.tar.gz', - beam_runner : 'TestDataflowRunner', - beam_it_timeout : '1200', - beam_it_args : pipelineArgsJoined, - ] - - commonJobProperties.buildPerformanceTest(delegate, argMap) + +class PerformanceTestConfigurations { + String jobName + String jobDescription + String jobTriggerPhrase + String buildSchedule = 'H */6 * * *' // every 6 hours + String benchmarkName = 'beam_integration_benchmark' + String sdk = 'python' + String bigqueryTable + String itClass + String itModule + Boolean skipPrebuild = false + String pythonSdkLocation + String runner = 'TestDataflowRunner' + Integer itTimeout = 1200 + Map extraPipelineArgs +} + +// Common pipeline args for Dataflow job. +def dataflowPipelineArgs = [ +project : 'apache-beam-testing', +staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it', +temp_location : 'gs://temp-storage-for-end-to-end-tests/temp-it', +] + + +// Configurations of each Jenkins job. 
+def testConfigurations = [ +new PerformanceTestConfigurations( +jobName : 'beam_PerformanceTests_Python', +jobDescription: 'Python SDK Performance Test', +jobTriggerPhrase : 'Run Python Performance Test', +bigqueryTable : 'beam_performance.wordcount_py_pkb_results', +skipPrebuild : true, +pythonSdkLocation : 'build/apache-beam.tar.gz', +itClass : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', +itModule : 'sdks/python', +extraPipelineArgs : dataflowPipelineArgs + [ +output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' +], +), +new PerformanceTestConfigurations( +jobName : 'beam_PerformanceTests_Python35', +jobDescription: 'Python35 SDK Performance Test', +jobTriggerPhrase : 'Run Python35 Performance Test', +bigqueryTable : 'beam_performance.wordcount_py35_pkb_results', +skipPrebuild : true, +pythonSdkLocation : 'test-suites/dataflow/py35/build/apache-beam.tar.gz', +itClass : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', +itModule : 'sdks/python/test-suites/dataflow/py35', +extraPipelineArgs : dataflowPipelineArgs + [ +output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' +], +) +] + + +for (testConfig in testConfigurations) { + createPythonPerformanceTestJob(testConfig) +} + + +private void createPythonPerformanceTestJob(PerformanceTestConfigurations testConfig) { + // This job runs the
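The refactor in this diff replaces one hard-coded Jenkins job with a list of configuration objects iterated by a factory function. The same data-driven pattern, sketched in Python with made-up field names (the real code is Groovy job DSL):

```python
from dataclasses import dataclass, field

@dataclass
class PerfTestConfig:
    # Hypothetical fields mirroring PerformanceTestConfigurations.
    job_name: str
    trigger_phrase: str
    bigquery_table: str
    schedule: str = 'H */6 * * *'  # default: every 6 hours, as in the diff
    runner: str = 'TestDataflowRunner'
    extra_args: dict = field(default_factory=dict)

def create_job(config):
    # Stand-in for createPythonPerformanceTestJob: one job per config.
    return {'name': config.job_name, 'cron': config.schedule,
            'runner': config.runner}

configs = [
    PerfTestConfig('beam_PerformanceTests_Python',
                   'Run Python Performance Test',
                   'beam_performance.wordcount_py_pkb_results'),
    PerfTestConfig('beam_PerformanceTests_Python35',
                   'Run Python35 Performance Test',
                   'beam_performance.wordcount_py35_pkb_results'),
]
jobs = [create_job(c) for c in configs]
assert [j['name'] for j in jobs] == [
    'beam_PerformanceTests_Python', 'beam_PerformanceTests_Python35']
```

Adding a new Python version then means appending one config entry rather than copying a whole job definition, which is the point of the refactor.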
[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks
[ https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239447=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239447 ] ASF GitHub Bot logged work on BEAM-6908: Author: ASF GitHub Bot Created on: 08/May/19 20:14 Start Date: 08/May/19 20:14 Worklog Time Spent: 10m Work Description: markflyhigh commented on pull request #8518: [BEAM-6908] Refactor Python performance test groovy file for easy configuration URL: https://github.com/apache/beam/pull/8518#discussion_r282230690 ## File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy ## @@ -18,46 +18,107 @@ import CommonJobProperties as commonJobProperties -// This job runs the Beam Python performance tests on PerfKit Benchmarker. -job('beam_PerformanceTests_Python'){ - // Set default Beam job properties. - commonJobProperties.setTopLevelMainJobProperties(delegate) - - // Run job in postcommit every 6 hours, don't trigger every push. - commonJobProperties.setAutoJob( - delegate, - 'H */6 * * *') - - // Allows triggering this build against pull requests. 
- commonJobProperties.enablePhraseTriggeringFromPullRequest( - delegate, - 'Python SDK Performance Test', - 'Run Python Performance Test') - - def pipelineArgs = [ - project: 'apache-beam-testing', - staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it', - temp_location: 'gs://temp-storage-for-end-to-end-tests/temp-it', - output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' - ] - def pipelineArgList = [] - pipelineArgs.each({ -key, value -> pipelineArgList.add("--$key=$value") - }) - def pipelineArgsJoined = pipelineArgList.join(',') - - def argMap = [ - beam_sdk : 'python', - benchmarks : 'beam_integration_benchmark', - bigquery_table : 'beam_performance.wordcount_py_pkb_results', - beam_it_class: 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', - beam_it_module : 'sdks/python', - beam_prebuilt: 'true', // skip beam prebuild - beam_python_sdk_location : 'build/apache-beam.tar.gz', - beam_runner : 'TestDataflowRunner', - beam_it_timeout : '1200', - beam_it_args : pipelineArgsJoined, - ] - - commonJobProperties.buildPerformanceTest(delegate, argMap) + +class PerformanceTestConfigurations { + String jobName + String jobDescription + String jobTriggerPhrase + String buildSchedule = 'H */6 * * *' // every 6 hours + String benchmarkName = 'beam_integration_benchmark' + String sdk = 'python' + String bigqueryTable + String itClass + String itModule + Boolean skipPrebuild = false + String pythonSdkLocation + String runner = 'TestDataflowRunner' + Integer itTimeout = 1200 + Map extraPipelineArgs +} + +// Common pipeline args for Dataflow job. +def dataflowPipelineArgs = [ +project : 'apache-beam-testing', +staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it', +temp_location : 'gs://temp-storage-for-end-to-end-tests/temp-it', +] + + +// Configurations of each Jenkins job. 
+def testConfigurations = [ +new PerformanceTestConfigurations( +jobName : 'beam_PerformanceTests_Python', +jobDescription: 'Python SDK Performance Test', +jobTriggerPhrase : 'Run Python Performance Test', +bigqueryTable : 'beam_performance.wordcount_py_pkb_results', +skipPrebuild : true, +pythonSdkLocation : 'build/apache-beam.tar.gz', +itClass : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', +itModule : 'sdks/python', +extraPipelineArgs : dataflowPipelineArgs + [ +output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' +], +), +new PerformanceTestConfigurations( +jobName : 'beam_PerformanceTests_Python35', +jobDescription: 'Python35 SDK Performance Test', +jobTriggerPhrase : 'Run Python35 Performance Test', +bigqueryTable : 'beam_performance.wordcount_py35_pkb_results', +skipPrebuild : true, +pythonSdkLocation : 'test-suites/dataflow/py35/build/apache-beam.tar.gz', +itClass : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', +itModule : 'sdks/python/test-suites/dataflow/py35', +extraPipelineArgs : dataflowPipelineArgs + [ +output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' +], +) +] + + +for (testConfig in testConfigurations) { + createPythonPerformanceTestJob(testConfig) +} + + +private void createPythonPerformanceTestJob(PerformanceTestConfigurations testConfig) { + // This job runs the
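The refactored Groovy job above renders a map of Dataflow pipeline options into the comma-joined `--key=value` string passed as `beam_it_args`. A minimal sketch of that same transformation in Python (the helper name is hypothetical, for illustration only):

```python
# Sketch (hypothetical helper): the Groovy job turns a map of pipeline
# options into the comma-joined "--key=value" string that the
# beam_it_args setting expects.
pipeline_args = {
    "project": "apache-beam-testing",
    "staging_location": "gs://temp-storage-for-end-to-end-tests/staging-it",
    "temp_location": "gs://temp-storage-for-end-to-end-tests/temp-it",
}

def join_pipeline_args(args):
    # Render each entry as --key=value, then join with commas.
    return ",".join("--%s=%s" % (k, v) for k, v in args.items())

print(join_pipeline_args(pipeline_args))
```

The same logic appears twice in the diff (once in the old inline job body, once in the shared helper the refactor moves it into).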
[jira] [Commented] (BEAM-7230) Using JdbcIO creates huge amount of connections
[ https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835851#comment-16835851 ] Ismaël Mejía commented on BEAM-7230: Thinking about it, the included DataSourceProviderFn implementation `PoolableDataSourceProvider` instantiates a poolable DataSource per JVM, so it should cover your case. If you have the chance to try it, it would be great to know whether it works. Thanks > Using JdbcIO creates huge amount of connections > --- > > Key: BEAM-7230 > URL: https://issues.apache.org/jira/browse/BEAM-7230 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Affects Versions: 2.11.0 >Reporter: Brachi Packter >Assignee: Ismaël Mejía >Priority: Major > > I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, > and still I see a huge amount of connections in GCP SQL (4k while I set the > connection pool to 300), and most of them sleeping. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
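The fix being suggested is to share one pool per JVM rather than creating a DataSource per DoFn instance. The per-process singleton idea can be sketched as follows — the names below are hypothetical toy stand-ins, not Beam's actual `PoolableDataSourceProvider`:

```python
import threading

class ConnectionPool:
    """Toy stand-in for a pooled DataSource: caps total connections."""
    def __init__(self, max_size):
        self.max_size = max_size
        self.created = 0
        self._lock = threading.Lock()

    def connect(self):
        # Count a new physical connection only while under the cap;
        # past the cap, callers reuse existing pooled connections.
        with self._lock:
            if self.created < self.max_size:
                self.created += 1

_pool = None
_pool_lock = threading.Lock()

def get_pool(max_size=300):
    # One pool per process (the "per JVM" idea): every DoFn instance in
    # this worker shares the same pool instead of opening its own
    # connections, which keeps the server-side connection count bounded.
    global _pool
    with _pool_lock:
        if _pool is None:
            _pool = ConnectionPool(max_size)
    return _pool
```

Without the shared pool, each of hundreds of DoFn instances opens its own connections, which is how a pool "limit" of 300 can still yield thousands of server-side connections.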
[jira] [Resolved] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly
[ https://issues.apache.org/jira/browse/BEAM-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía resolved BEAM-7238. Resolution: Fixed Fix Version/s: 2.13.0 > Make sfl4j bindings runtimeOnly/testRuntimeOnly > --- > > Key: BEAM-7238 > URL: https://issues.apache.org/jira/browse/BEAM-7238 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Fix For: 2.13.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Multiple modules include slf4j bindings in the compile or testCompile > scope. This is an issue because it may break logging, in particular in modules > that are reused by others. Concrete case: slf4j-simple in runners-core and the > logging in the specific runners. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly
[ https://issues.apache.org/jira/browse/BEAM-7238?focusedWorklogId=239428=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239428 ] ASF GitHub Bot logged work on BEAM-7238: Author: ASF GitHub Bot Created on: 08/May/19 19:27 Start Date: 08/May/19 19:27 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #8515: [BEAM-7238] Make sfl4j bindings runtimeOnly/testRuntimeOnly URL: https://github.com/apache/beam/pull/8515#discussion_r282213704 ## File path: runners/core-java/build.gradle ## @@ -45,6 +45,6 @@ dependencies { shadowTest library.java.mockito_core shadowTest library.java.junit shadowTest library.java.slf4j_api - shadowTest library.java.slf4j_simple shadowTest library.java.jackson_dataformat_yaml + testRuntimeOnly library.java.slf4j_simple Review comment: This is a great improvement. I will try to check in which modules it is worth doing the switch (in particular the ones that shade stuff) to enable it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239428) Time Spent: 1h 50m (was: 1h 40m) > Make sfl4j bindings runtimeOnly/testRuntimeOnly > --- > > Key: BEAM-7238 > URL: https://issues.apache.org/jira/browse/BEAM-7238 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > > Multiple modules include slf4j bindings in the compile or testCompile > scope. This is an issue because it may break logging, in particular in modules > that are reused by others. Concrete case: slf4j-simple in runners-core and the > logging in the specific runners. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly
[ https://issues.apache.org/jira/browse/BEAM-7238?focusedWorklogId=239425=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239425 ] ASF GitHub Bot logged work on BEAM-7238: Author: ASF GitHub Bot Created on: 08/May/19 19:26 Start Date: 08/May/19 19:26 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #8515: [BEAM-7238] Make sfl4j bindings runtimeOnly/testRuntimeOnly URL: https://github.com/apache/beam/pull/8515 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239425) Time Spent: 1.5h (was: 1h 20m) > Make sfl4j bindings runtimeOnly/testRuntimeOnly > --- > > Key: BEAM-7238 > URL: https://issues.apache.org/jira/browse/BEAM-7238 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > Multiple modules are including sfl4j bindings in the compile or testCompile > scope this is an issue because this may break loggin in particular in logs > that are reused by others. Concrete case sfl4j-simple runners-core and the > logging in the specific runners. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7238) Make sfl4j bindings runtimeOnly/testRuntimeOnly
[ https://issues.apache.org/jira/browse/BEAM-7238?focusedWorklogId=239426=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239426 ] ASF GitHub Bot logged work on BEAM-7238: Author: ASF GitHub Bot Created on: 08/May/19 19:26 Start Date: 08/May/19 19:26 Worklog Time Spent: 10m Work Description: iemejia commented on issue #8515: [BEAM-7238] Make sfl4j bindings runtimeOnly/testRuntimeOnly URL: https://github.com/apache/beam/pull/8515#issuecomment-490618081 Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239426) Time Spent: 1h 40m (was: 1.5h) > Make sfl4j bindings runtimeOnly/testRuntimeOnly > --- > > Key: BEAM-7238 > URL: https://issues.apache.org/jira/browse/BEAM-7238 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > > Multiple modules are including sfl4j bindings in the compile or testCompile > scope this is an issue because this may break loggin in particular in logs > that are reused by others. Concrete case sfl4j-simple runners-core and the > logging in the specific runners. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner
[ https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239415 ] ASF GitHub Bot logged work on BEAM-6880: Author: ASF GitHub Bot Created on: 08/May/19 19:12 Start Date: 08/May/19 19:12 Worklog Time Spent: 10m Work Description: HuangLED commented on issue #8380: [BEAM-6880] Remove deprecated Reference Runner code. URL: https://github.com/apache/beam/pull/8380#issuecomment-490613569 this PR LGTM, if we do have community consensus on deprecating java reference runner. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239415) Time Spent: 4.5h (was: 4h 20m) > Deprecate Java Portable Reference Runner > > > Key: BEAM-6880 > URL: https://issues.apache.org/jira/browse/BEAM-6880 > Project: Beam > Issue Type: New Feature > Components: runner-direct, test-failures, testing >Reporter: Mikhail Gryzykhin >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > This ticket is about deprecating Java Portable Reference runner. > > Discussion is happening in [this > thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]] > > > Current summary is: disable beam_PostCommit_Java_PVR_Reference job. > Keeping or removing reference runner code is still under discussion. It is > suggested to create PR that removes relevant code and start voting there. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.
[ https://issues.apache.org/jira/browse/BEAM-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835821#comment-16835821 ] niklas Hansson commented on BEAM-7060: -- [~udim] sounds great! I will look at the docs a bit as well :) > Design Py3-compatible typehints annotation support in Beam 3. > - > > Key: BEAM-7060 > URL: https://issues.apache.org/jira/browse/BEAM-7060 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Udi Meiri >Priority: Major > > Existing [Typehints implementation in > Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/ > ] heavily relies on internal details of the CPython implementation, and some of > the assumptions of this implementation broke as of Python 3.6, see for > example: https://issues.apache.org/jira/browse/BEAM-6877, which makes > typehints support unusable on Python 3.6 as of now. [Python 3 Kanban > Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail] > lists several specific typehints-related breakages, prefixed with "TypeHints > Py3 Error". > We need to decide whether to: > - Deprecate the in-house typehints implementation. > - Continue to support the in-house implementation, which at this point is stale > code and has other known issues. > - Attempt to use some off-the-shelf libraries for supporting > type annotations, like Pytype, Mypy, PyAnnotate. > WRT this decision we also need to plan on immediate next steps to unblock > adoption of Beam for Python 3.6+ users. One potential option may be to have > the Beam SDK ignore any typehint annotations on Py 3.6+. > cc: [~udim], [~altay], [~robertwb]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-6988 started by niklas Hansson. > TypeHints Py3 Error: test_non_function > (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+ > - > > Key: BEAM-6988 > URL: https://issues.apache.org/jira/browse/BEAM-6988 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > > {noformat} > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py", > line 53, in test_non_function > result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x') > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", > line 510, in _ror_ > result = p.apply(self, pvalueish, label) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py", > line 514, in apply > transform.type_check_inputs(pvalueish) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", > line 753, in type_check_inputs > hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1]) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py", > line 283, in getcallargs_forhints > raise TypeCheckError(e) > apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required > positional argument: 'chars'{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
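For reference, the failing test passes `str.strip` (an unbound builtin method) plus an extra positional argument to `beam.Map`. The per-element call itself works on any Python 3; the Py3.7+ failure happens earlier, when `getcallargs_forhints` introspects the builtin's signature. A quick sketch of the runtime behavior the test expects:

```python
# beam.Map(str.strip, 'x') applies str.strip(element, 'x') to each element.
# Calling the unbound method directly shows the intended per-element behavior.
assert str.strip('xa', 'x') == 'a'

stripped = [str.strip(s, 'x') for s in ['xa', 'bbx', 'xcx']]
assert stripped == ['a', 'bb', 'c']
print(stripped)
```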
[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks
[ https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=239407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239407 ] ASF GitHub Bot logged work on BEAM-6908: Author: ASF GitHub Bot Created on: 08/May/19 19:01 Start Date: 08/May/19 19:01 Worklog Time Spent: 10m Work Description: markflyhigh commented on pull request #8518: [BEAM-6908] Refactor Python performance test groovy file for easy configuration URL: https://github.com/apache/beam/pull/8518#discussion_r282204772 ## File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy ## @@ -18,46 +18,107 @@ import CommonJobProperties as commonJobProperties -// This job runs the Beam Python performance tests on PerfKit Benchmarker. -job('beam_PerformanceTests_Python'){ - // Set default Beam job properties. - commonJobProperties.setTopLevelMainJobProperties(delegate) - - // Run job in postcommit every 6 hours, don't trigger every push. - commonJobProperties.setAutoJob( - delegate, - 'H */6 * * *') - - // Allows triggering this build against pull requests. 
- commonJobProperties.enablePhraseTriggeringFromPullRequest( - delegate, - 'Python SDK Performance Test', - 'Run Python Performance Test') - - def pipelineArgs = [ - project: 'apache-beam-testing', - staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it', - temp_location: 'gs://temp-storage-for-end-to-end-tests/temp-it', - output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' - ] - def pipelineArgList = [] - pipelineArgs.each({ -key, value -> pipelineArgList.add("--$key=$value") - }) - def pipelineArgsJoined = pipelineArgList.join(',') - - def argMap = [ - beam_sdk : 'python', - benchmarks : 'beam_integration_benchmark', - bigquery_table : 'beam_performance.wordcount_py_pkb_results', - beam_it_class: 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', - beam_it_module : 'sdks/python', - beam_prebuilt: 'true', // skip beam prebuild - beam_python_sdk_location : 'build/apache-beam.tar.gz', - beam_runner : 'TestDataflowRunner', - beam_it_timeout : '1200', - beam_it_args : pipelineArgsJoined, - ] - - commonJobProperties.buildPerformanceTest(delegate, argMap) + +class PerformanceTestConfigurations { + String jobName + String jobDescription + String jobTriggerPhrase + String buildSchedule = 'H */6 * * *' // every 6 hours + String benchmarkName = 'beam_integration_benchmark' + String sdk = 'python' + String bigqueryTable + String itClass + String itModule + Boolean skipPrebuild = false + String pythonSdkLocation + String runner = 'TestDataflowRunner' + Integer itTimeout = 1200 + Map extraPipelineArgs +} + +// Common pipeline args for Dataflow job. +def dataflowPipelineArgs = [ +project : 'apache-beam-testing', +staging_location: 'gs://temp-storage-for-end-to-end-tests/staging-it', +temp_location : 'gs://temp-storage-for-end-to-end-tests/temp-it', +] + + +// Configurations of each Jenkins job. 
+def testConfigurations = [ +new PerformanceTestConfigurations( +jobName : 'beam_PerformanceTests_Python', +jobDescription: 'Python SDK Performance Test', +jobTriggerPhrase : 'Run Python Performance Test', +bigqueryTable : 'beam_performance.wordcount_py_pkb_results', +skipPrebuild : true, +pythonSdkLocation : 'build/apache-beam.tar.gz', +itClass : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it', +itModule : 'sdks/python', +extraPipelineArgs : dataflowPipelineArgs + [ +output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output' +], +), +new PerformanceTestConfigurations( +jobName : 'beam_PerformanceTests_Python35', +jobDescription: 'Python35 SDK Performance Test', +jobTriggerPhrase : 'Run Python35 Performance Test', +bigqueryTable : 'beam_performance.wordcount_py35_pkb_results', +skipPrebuild : true, +pythonSdkLocation : 'test-suites/dataflow/py35/build/apache-beam.tar.gz', Review comment: Currently, tar file is generated in build directory of the Gradle module where IT is located. We need to specify location per test. We can populate sdkLocation from itModule directly and in the future we could refactor Gradle build to generate tar file only once. https://github.com/markflyhigh/beam/pull/6 is a draft. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL
[jira] [Assigned] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
[ https://issues.apache.org/jira/browse/BEAM-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklas Hansson reassigned BEAM-6988: Assignee: niklas Hansson > TypeHints Py3 Error: test_non_function > (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+ > - > > Key: BEAM-6988 > URL: https://issues.apache.org/jira/browse/BEAM-6988 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Major > > {noformat} > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py", > line 53, in test_non_function > result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x') > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", > line 510, in _ror_ > result = p.apply(self, pvalueish, label) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py", > line 514, in apply > transform.type_check_inputs(pvalueish) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", > line 753, in type_check_inputs > hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1]) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py", > line 283, in getcallargs_forhints > raise TypeCheckError(e) > apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required > positional argument: 'chars'{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3
[ https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklas Hansson reassigned BEAM-6535: Assignee: (was: niklas Hansson) > TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on > Python 3 > -- > > Key: BEAM-6535 > URL: https://issues.apache.org/jira/browse/BEAM-6535 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Priority: Minor > > This is the last remaining typehints test still failing on Python 3. It fails > with: > {code:java} > == > FAIL: testTupleListComprehension > (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest) > -- > Traceback (most recent call last): > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 127, in testTupleListComprehension > [typehints.Tuple[str, typehints.Iterable[int]]]) > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 35, in assertReturnType > self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs)) > AssertionError: List[Tuple[str, int]] != Any > -- > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work stopped] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3
[ https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-6535 stopped by niklas Hansson. > TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on > Python 3 > -- > > Key: BEAM-6535 > URL: https://issues.apache.org/jira/browse/BEAM-6535 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Minor > > This is the last remaining typehints test still failing on Python 3. It fails > with: > {code:java} > == > FAIL: testTupleListComprehension > (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest) > -- > Traceback (most recent call last): > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 127, in testTupleListComprehension > [typehints.Tuple[str, typehints.Iterable[int]]]) > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 35, in assertReturnType > self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs)) > AssertionError: List[Tuple[str, int]] != Any > -- > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3
[ https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklas Hansson reassigned BEAM-6535: Assignee: niklas Hansson > TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on > Python 3 > -- > > Key: BEAM-6535 > URL: https://issues.apache.org/jira/browse/BEAM-6535 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Assignee: niklas Hansson >Priority: Minor > > This is the last remaining typehints test still failing on Python 3. It fails > with: > {code:java} > == > FAIL: testTupleListComprehension > (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest) > -- > Traceback (most recent call last): > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 127, in testTupleListComprehension > [typehints.Tuple[str, typehints.Iterable[int]]]) > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 35, in assertReturnType > self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs)) > AssertionError: List[Tuple[str, int]] != Any > -- > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-6535) TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on Python 3
[ https://issues.apache.org/jira/browse/BEAM-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklas Hansson reassigned BEAM-6535: Assignee: (was: niklas Hansson) > TypeHints Py3 Error: TrivialInferenceTest.testTupleListComprehension fails on > Python 3 > -- > > Key: BEAM-6535 > URL: https://issues.apache.org/jira/browse/BEAM-6535 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Robbe >Priority: Minor > > This is the last remaining typehints test still failing on Python 3. It fails > with: > {code:java} > == > FAIL: testTupleListComprehension > (apache_beam.typehints.trivial_inference_test.TrivialInferenceTest) > -- > Traceback (most recent call last): > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 127, in testTupleListComprehension > [typehints.Tuple[str, typehints.Iterable[int]]]) > File > "/home/robbe/workspace/beam/sdks/python/apache_beam/typehints/trivial_inference_test.py", > line 35, in assertReturnType > self.assertEquals(expected, trivial_inference.infer_return_type(f, inputs)) > AssertionError: List[Tuple[str, int]] != Any > -- > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
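The test discussed above infers the return type of a list comprehension that builds tuples. A hypothetical function of that shape (not the exact test body) shows what Beam's trivial inference is expected to derive — a List of Tuples, rather than falling back to `Any`:

```python
from typing import Iterable, List, Tuple

def pair_up(names: Iterable[str], counts: Iterable[int]) -> List[Tuple[str, int]]:
    # A tuple-building list comprehension of the kind
    # testTupleListComprehension exercises: trivial inference should
    # derive List[Tuple[str, int]] for the return value here.
    return [(n, c) for n, c in zip(names, counts)]

assert pair_up(['a', 'b'], [1, 2]) == [('a', 1), ('b', 2)]
```

On Python 3 the bytecode emitted for comprehensions changed (they compile to a nested function), which is the kind of CPython detail the trivial inference relies on and why the inferred type degrades to `Any`.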
[jira] [Created] (BEAM-7249) Ability to cancel Cloud Bigtable reads
Max created BEAM-7249: - Summary: Ability to cancel Cloud Bigtable reads Key: BEAM-7249 URL: https://issues.apache.org/jira/browse/BEAM-7249 Project: Beam Issue Type: New Feature Components: io-python-gcp Reporter: Max -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7121) Provide deterministic version of Python's ProtoCoder
[ https://issues.apache.org/jira/browse/BEAM-7121?focusedWorklogId=239390=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239390 ] ASF GitHub Bot logged work on BEAM-7121: Author: ASF GitHub Bot Created on: 08/May/19 18:39 Start Date: 08/May/19 18:39 Worklog Time Spent: 10m Work Description: yifanmai commented on issue #8377: [BEAM-7121] Add deterministic proto coder URL: https://github.com/apache/beam/pull/8377#issuecomment-490602509 Added microbenchmarks:
```
small_message_with_map, ProtoCoder, 1000 element(s): run 1 of 20, per element time cost: 1.59318e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 2 of 20, per element time cost: 1.4535e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 3 of 20, per element time cost: 1.4518e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 4 of 20, per element time cost: 1.38509e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 5 of 20, per element time cost: 1.60599e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 6 of 20, per element time cost: 1.2928e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 7 of 20, per element time cost: 1.17121e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 8 of 20, per element time cost: 1.18802e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 9 of 20, per element time cost: 1.24202e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 10 of 20, per element time cost: 1.71099e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 11 of 20, per element time cost: 1.61979e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 12 of 20, per element time cost: 1.1765e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 13 of 20, per element time cost: 1.28272e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 14 of 20, per element time cost: 1.13361e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 15 of 20, per element time cost: 1.11871e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 16 of 20, per element time cost: 1.20051e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 17 of 20, per element time cost: 1.2253e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 18 of 20, per element time cost: 1.1971e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 19 of 20, per element time cost: 1.1992e-05 sec
small_message_with_map, ProtoCoder, 1000 element(s): run 20 of 20, per element time cost: 1.24671e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 1 of 20, per element time cost: 3.93291e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 2 of 20, per element time cost: 3.4997e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 3 of 20, per element time cost: 4.39081e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 4 of 20, per element time cost: 3.50349e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 5 of 20, per element time cost: 3.8208e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 6 of 20, per element time cost: 3.45418e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 7 of 20, per element time cost: 3.6797e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 8 of 20, per element time cost: 3.8079e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 9 of 20, per element time cost: 3.90592e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 10 of 20, per element time cost: 3.8846e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 11 of 20, per element time cost: 3.8914e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 12 of 20, per element time cost: 3.8697e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 13 of 20, per element time cost: 3.9917e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 14 of 20, per element time cost: 4.14879e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 15 of 20, per element time cost: 3.91412e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 16 of 20, per element time cost: 4.4534e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 17 of 20, per element time cost: 3.98979e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 18 of 20, per element time cost: 4.05421e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 19 of 20, per element time cost: 4.1333e-05 sec
large_message_with_map, ProtoCoder, 1000 element(s): run 20 of 20, per element time cost: 4.23539e-05 sec
```
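The per-element figures above come from a microbenchmark loop. A generic sketch of that style of harness, using `pickle.dumps` as a stand-in encoder (this is an illustration, not Beam's actual ProtoCoder benchmark):

```python
import pickle
import time

def per_element_cost(encode, elements, runs=3):
    # Encode every element, then report wall time divided by element
    # count -- the same "per element time cost" figure printed above.
    costs = []
    for run in range(1, runs + 1):
        start = time.perf_counter()
        for element in elements:
            encode(element)
        cost = (time.perf_counter() - start) / len(elements)
        costs.append(cost)
        print("run %d of %d, per element time cost: %g sec" % (run, runs, cost))
    return costs

# pickle.dumps stands in for a coder's encode method here.
costs = per_element_cost(pickle.dumps, [{"key": i} for i in range(1000)])
```

Repeating the measurement (20 runs in the log above) and inspecting the spread guards against one-off noise from GC pauses or warm-up effects.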
[jira] [Created] (BEAM-7248) Support "beam:runner:executable_stage:v1" on fn_api_runner
Ankur Goenka created BEAM-7248: -- Summary: Support "beam:runner:executable_stage:v1" on fn_api_runner Key: BEAM-7248 URL: https://issues.apache.org/jira/browse/BEAM-7248 Project: Beam Issue Type: Improvement Components: sdk-py-core Reporter: Ankur Goenka fn_api_runner.py does not support translation of executable stage transforms. We should support executable stage transforms, since the job server produces executable stages, and this will bring fn_api_runner closer to a portable runner. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner
[ https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239375 ] ASF GitHub Bot logged work on BEAM-6880: Author: ASF GitHub Bot Created on: 08/May/19 18:15 Start Date: 08/May/19 18:15 Worklog Time Spent: 10m Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove deprecated Reference Runner code. URL: https://github.com/apache/beam/pull/8380#issuecomment-490594092 R: @HuangLED This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239375) Time Spent: 4h 20m (was: 4h 10m) > Deprecate Java Portable Reference Runner > > > Key: BEAM-6880 > URL: https://issues.apache.org/jira/browse/BEAM-6880 > Project: Beam > Issue Type: New Feature > Components: runner-direct, test-failures, testing >Reporter: Mikhail Gryzykhin >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > This ticket is about deprecating Java Portable Reference runner. > > Discussion is happening in [this > thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]] > > > Current summary is: disable beam_PostCommit_Java_PVR_Reference job. > Keeping or removing reference runner code is still under discussion. It is > suggested to create PR that removes relevant code and start voting there. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner
[ https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=239374=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239374 ] ASF GitHub Bot logged work on BEAM-6880: Author: ASF GitHub Bot Created on: 08/May/19 18:14 Start Date: 08/May/19 18:14 Worklog Time Spent: 10m Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove deprecated Reference Runner code. URL: https://github.com/apache/beam/pull/8380#issuecomment-490593555 R: @lukecwik This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 239374) Time Spent: 4h 10m (was: 4h) > Deprecate Java Portable Reference Runner > > > Key: BEAM-6880 > URL: https://issues.apache.org/jira/browse/BEAM-6880 > Project: Beam > Issue Type: New Feature > Components: runner-direct, test-failures, testing >Reporter: Mikhail Gryzykhin >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > This ticket is about deprecating Java Portable Reference runner. > > Discussion is happening in [this > thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]] > > > Current summary is: disable beam_PostCommit_Java_PVR_Reference job. > Keeping or removing reference runner code is still under discussion. It is > suggested to create PR that removes relevant code and start voting there. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)